3.3.2.2 Basic Hypotheses


The problems associated with the equivalent workload element
approach are best introduced by stating the two hypotheses
that Artis proposed for testing workload characterizations
(ART76).  These hypotheses are the "representative
hypothesis" and the "stationary hypothesis."

The representative hypothesis tests how well the proposed
characterization represents each element of the workload.
Rather than testing the proximity of the mean value of the
workload to the proposed characterization, you must evaluate
the differences between the proposed characterization and
each element of the workload.

For example, consider a pathological system that each day
runs 100 testing jobs, each using less than 5 CPU seconds,
and 100 production jobs, each using about 10 CPU minutes.
(Of course, real workloads are much more complex.)  Averaging
these jobs yields an equivalent job that requires about 5 CPU
minutes.  Unfortunately, no job in the workload has resource
requirements similar to this equivalent job.  Therefore,
rather than simplifying the problem by reducing the number of
workload elements to be considered, we have changed the
nature of the problem.
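
The arithmetic behind this example can be sketched in a few
lines of Python.  The figures are assumptions chosen to match
the text: each testing job is taken to use the 5-second upper
bound and each production job exactly 600 CPU seconds.  The
variable names are illustrative only.

```python
# Hypothetical daily workload matching the example (CPU seconds).
# Assumption: testing jobs use the 5-second upper bound and
# production jobs use exactly 10 minutes.
test_jobs = [5.0] * 100
prod_jobs = [600.0] * 100
workload = test_jobs + prod_jobs

# The equivalent job is the mean of all workload elements.
equivalent = sum(workload) / len(workload)

# Distance from the equivalent job to the closest real job.
nearest_gap = min(abs(cpu - equivalent) for cpu in workload)

print(f"equivalent job: {equivalent / 60:.2f} CPU minutes")
print(f"closest real job is {nearest_gap / 60:.2f} CPU minutes away")
```

Under these assumptions the equivalent job comes to about
5.04 CPU minutes, and the closest real job is still about
4.96 CPU minutes away from it.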

The characteristics of a system that runs 200 equivalent
(that is, average) jobs per day would be much different from
those of the hypothetical system discussed above.  Note that
the equivalent workload element can pass the representative
hypothesis only for systems where all of the workload
elements are approximately equal.
To be valid, any proposed workload characterization must pass
the representative hypothesis test.
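
One possible way to make the element-by-element comparison
concrete is sketched below.  This is not Artis's test
statistic: the function name, the 25 percent relative
tolerance, and the sample data are all assumptions made for
illustration.  The sketch counts the share of workload
elements that lie close to the proposed characterization.

```python
def fraction_represented(workload, characterization, tolerance=0.25):
    """Share of elements within a relative tolerance of the
    proposed characterization.  The 25% default tolerance is an
    illustrative assumption, not a value from the source."""
    limit = tolerance * characterization
    hits = sum(1 for cpu in workload
               if abs(cpu - characterization) <= limit)
    return hits / len(workload)

# Hypothetical bimodal workload from the example (CPU seconds).
bimodal = [5.0] * 100 + [600.0] * 100
bimodal_mean = sum(bimodal) / len(bimodal)
single = fraction_represented(bimodal, bimodal_mean)

# Hypothetical workload whose elements are approximately equal.
similar = [300.0, 310.0, 295.0, 305.0]
similar_mean = sum(similar) / len(similar)
uniform = fraction_represented(similar, similar_mean)

print(single, uniform)
```

With these inputs the equivalent job represents none of the
bimodal workload but all of the near-uniform one, which is
the behavior described above.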

The stationary hypothesis evaluates the likelihood that the
proposed workload characterization is valid for future
workloads.  To test this hypothesis, you must wait and test
the characterization's applicability over a number of weeks
or months.  (Alternatively, you may build the
characterization from older historical data and compare it
with more recent history that was not used to build the
characterization.)
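
The historical comparison can be sketched as a simple holdout
check.  Everything here is an assumption made for
illustration: the function names, the use of the mean as the
characterization, and the four periods of data, in which
production jobs grow from 600 to 660 CPU seconds.  What
amount of drift is acceptable is left to the analyst.

```python
def equivalent_job(jobs):
    """Mean CPU seconds per job."""
    return sum(jobs) / len(jobs)

def holdout_drift(history, split):
    """Build the characterization from history[:split] and return
    its relative difference from each later period.

    history is a list of per-period job lists, oldest first."""
    older = [cpu for period in history[:split] for cpu in period]
    baseline = equivalent_job(older)
    return [abs(equivalent_job(period) - baseline) / baseline
            for period in history[split:]]

# Hypothetical history: two stable periods, then growth in the
# CPU time of production jobs.
history = [
    [5.0] * 100 + [600.0] * 100,
    [5.0] * 100 + [600.0] * 100,
    [5.0] * 100 + [630.0] * 100,
    [5.0] * 100 + [660.0] * 100,
]

for drift in holdout_drift(history, split=2):
    print(f"relative drift: {drift:.1%}")
```

The later periods are never used to build the baseline, so
the drift values measure how well the older characterization
carries forward.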

You should never consider a workload characterization as a
tool for forecasting unless this hypothesis has been tested.
In the hypothetical system discussed above, even a small
change in the ratio of testing jobs to production jobs would
significantly change the workload characterization.  Moreover,
even small changes in the average resource consumption of
testing or production jobs would have the same effect.
Therefore, it is unlikely that any equivalent workload
element approach would ever pass the stationary hypothesis
test.
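
The sensitivity to the job mix can be worked through
numerically.  The sketch below assumes the same hypothetical
figures as the example (5 and 600 CPU seconds, 200 jobs per
day) and recomputes the equivalent job as jobs move from
production to testing.  The constants and function name are
illustrative only.

```python
TEST_CPU = 5.0      # assumed CPU seconds per testing job
PROD_CPU = 600.0    # assumed CPU seconds per production job
TOTAL_JOBS = 200    # jobs per day in the example

def equivalent_job(test_count):
    """Mean CPU seconds per job for a given number of testing jobs."""
    prod_count = TOTAL_JOBS - test_count
    total_cpu = test_count * TEST_CPU + prod_count * PROD_CPU
    return total_cpu / TOTAL_JOBS

baseline = equivalent_job(100)
for test_count in (100, 105, 110, 120):
    value = equivalent_job(test_count)
    change = (value - baseline) / baseline
    print(f"{test_count} testing jobs: {value:.1f} CPU s ({change:+.1%})")
```

Under these assumptions, every five jobs that move from
production to testing lower the equivalent job by about 15
CPU seconds, or roughly 5 percent of its original value.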

In general, very few proposed workload characterizations will
ever pass the stationary hypothesis test.  This does not mean
that most workload characterizations are invalid; it does
mean that most are unsuitable for use as forecasting units.