6.6.3 Number of Samples and Statistical Validity

The number of RMF samples that will be present in a database
observation is determined by the sampling rate, the length of
each measurement interval, and the number of intervals in the
observation.  The sampling rate is determined by the CYCLE
statement, and the interval length by the INTERVAL statement.
Both are specified in the RMF control member ERBRMFnn in the
system parameter library SYS1.PARMLIB (the default value of
"nn" is "00").  Each RMF file in the CA MICS database contains
common data elements that provide this information:

o  CYCLETM  - The length of an RMF cycle, in seconds.

o  DURATION - The length of the measurement interval.  In the
              DETAIL timespan this is the length of an RMF
              interval.  In higher timespans, it is the total
              duration of all intervals that were in the
              summarized observation.
 
o  INTERVLS - The number of RMF intervals.  In the DETAIL
              timespan, this is always 1.  In higher
              timespans, it is the total number of RMF
              intervals that were in the summarized
              observation.

The number of samples expected in an observation in any
timespan is equal to DURATION/CYCLETM.  However, in the
summarized file HARDTA, fffRSMP is set to the sum of the
fffRSMP values in ALL the records that are being summarized.
For example, DTARSMP in a single HARDTA record is the sum of
DVARSMP over all the HARDVA records for that time interval
and for devices of that type.  For this reason, fffRSMP may
be much larger than DURATION and CYCLETM would predict.
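
The arithmetic above can be sketched as follows.  This is an
illustration only, not CA MICS code.  It assumes DURATION is
expressed in seconds, like CYCLETM; the function name and all
numeric values are hypothetical.

```python
def expected_samples(duration_seconds: float, cycletm_seconds: float) -> float:
    """Expected number of RMF samples: DURATION / CYCLETM.

    Assumes both values are in seconds (an assumption for this sketch).
    """
    if cycletm_seconds <= 0:
        raise ValueError("CYCLETM must be positive")
    return duration_seconds / cycletm_seconds


# Hypothetical DETAIL observation: one 900-second interval, 1-second cycle.
per_interval = expected_samples(900, 1.0)

# Hypothetical higher-timespan observation that summarizes four such
# intervals: DURATION is the total of all four, so the ratio still holds.
summarized = expected_samples(4 * 900, 1.0)

# Hypothetical device-type summary: the sample count is the SUM of the
# per-device counts, so it exceeds DURATION / CYCLETM.
per_device_counts = [900, 900, 900]   # three devices, one interval each
summed_count = sum(per_device_counts)

print(per_interval, summarized, summed_count)
```

With three devices in the summary, the summed count is three
times the value that DURATION/CYCLETM alone would predict.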

The statistical confidence that you have in a measurement is
related to the number of samples from which it is derived.
There is a short discussion of this subject in the section
"INTERVAL and CYCLE Options" in the RMF User's Guide.
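
As a rough illustration of why the sample count matters, the
sketch below uses the textbook standard error of a sampled
proportion, sqrt(p(1-p)/n).  It assumes that samples are
independent, which may not hold for real RMF data, and it is
not taken from the RMF User's Guide.  The function name and
values are hypothetical.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent samples."""
    if n <= 0:
        raise ValueError("sample count must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return math.sqrt(p * (1.0 - p) / n)


# Same observed proportion, different numbers of samples.
se_small = proportion_standard_error(0.5, 100)
se_large = proportion_standard_error(0.5, 10_000)

# 100 times as many samples gives about one tenth of the uncertainty.
print(se_small, se_large, se_small / se_large)
```

Under these assumptions, the uncertainty shrinks with the
square root of the number of samples.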

Use the value of the fffRSMP field whenever you calculate
percentages or averages from the raw sample counts that were
processed into a CA MICS file.  This avoids taking an
"average of averages" when the measurements do not contain
the same number of samples.
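
The difference between the two approaches can be sketched as
follows.  This is an illustration only, not CA MICS code.  Each
record is a hypothetical pair of a raw count (samples in which
a condition was observed) and a sample count (the fffRSMP
value); the function name and values are invented.

```python
def weighted_percentage(records: list[tuple[int, int]]) -> float:
    """Percentage computed from summed raw counts and summed sample counts.

    Each record is (observed_count, sample_count).
    """
    total_observed = sum(observed for observed, _ in records)
    total_samples = sum(samples for _, samples in records)
    if total_samples == 0:
        return 0.0
    return 100.0 * total_observed / total_samples


# Two hypothetical records with very different sample counts.
records = [(90, 100), (100, 1000)]

# "Average of averages": each record counts equally, whatever its size.
naive = sum(100.0 * observed / samples for observed, samples in records) / len(records)

# Weighted by sample count: each SAMPLE counts equally.
weighted = weighted_percentage(records)

print(naive, round(weighted, 2))
```

Here the naive result is 50 percent, while the sample-weighted
result is about 17.27 percent, because the second record
contributes ten times as many samples as the first.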
