3.3.2.3 Application of Workload Characterization


Perhaps one of the most common workload characterization
studies is the determination of job classes.  Typically, the
analyst guesses at a solution and then evaluates how well it
fits the system's workload.  This is known as the ad hoc
approach.  For example, consider a hypothetical system where
the following job class structure has been proposed.


         JOB CLASS    CPU MINS    PRINT LINES
         =========    ========    ===========
             A            2           5,000

             B            5          20,000

             C          unlim        unlim


If 60 percent of the jobs are assigned to class A, 25 percent
to class B, and 15 percent to class C, the analyst might
assume that the classes provide a good representation of the
workload.
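
The class-share check described above can be sketched in a few
lines of Python.  This is only an illustration under stated
assumptions: the names CLASS_LIMITS, assign_class, and
class_shares are hypothetical, the sample jobs are made up, the
limits are treated as inclusive upper bounds, and a job is
assumed to go to the first class whose limits it fits.

```python
from collections import Counter

# Limits from the proposed structure: (class, max CPU minutes, max print lines).
# None stands for "unlim".  Treating the limits as inclusive is an assumption.
CLASS_LIMITS = [
    ("A", 2.0, 5_000),
    ("B", 5.0, 20_000),
    ("C", None, None),
]

def assign_class(cpu_mins, print_lines):
    """Return the first class whose limits the job fits within."""
    for name, max_cpu, max_lines in CLASS_LIMITS:
        cpu_ok = max_cpu is None or cpu_mins <= max_cpu
        lines_ok = max_lines is None or print_lines <= max_lines
        if cpu_ok and lines_ok:
            return name
    raise ValueError("no class accepts this job")

def class_shares(jobs):
    """Return the percentage of jobs that land in each class."""
    counts = Counter(assign_class(cpu, lines) for cpu, lines in jobs)
    return {name: 100.0 * counts[name] / len(jobs) for name, _, _ in CLASS_LIMITS}

# Made-up sample: (CPU minutes, print lines) per job.
sample_jobs = [(0.05, 500), (1.5, 4_000), (1.9, 1_000), (3.0, 10_000), (12.0, 80_000)]
shares = class_shares(sample_jobs)
```

A distribution that looks balanced here says nothing about how
well each class limit describes the jobs inside the class,
which is the point of the paragraphs that follow.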

Applying the representative hypothesis test could reveal
problems.  For example, if 30 percent of the jobs use less
than 5 CPU seconds and print fewer than 2,000 lines, then they
would be poorly represented by the class A limits.  Moreover,
their service characteristics will be much poorer when they
are randomly mixed with the larger jobs in class A than they
would be if the jobs were assigned to a class by themselves.
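
One way to look for such a group is to measure the share of
jobs that fall far below the class limits.  The sketch below
assumes job records are available as (CPU seconds, print
lines) pairs; the function name small_job_share and the sample
data are hypothetical, and the thresholds are the ones used in
the example above.

```python
def small_job_share(jobs, cpu_secs_limit=5.0, lines_limit=2_000):
    """Percentage of jobs below both thresholds (CPU seconds, print lines)."""
    small = [j for j in jobs if j[0] < cpu_secs_limit and j[1] < lines_limit]
    return 100.0 * len(small) / len(jobs)

# Made-up sample of ten jobs: (CPU seconds, print lines).
jobs = [
    (2, 300), (3, 1_500), (4, 900), (60, 4_000), (90, 4_500),
    (100, 3_000), (110, 2_500), (45, 1_000), (200, 15_000), (600, 50_000),
]
share = small_job_share(jobs)
```

A large share of jobs this far under the 2-minute, 5,000-line
limits suggests that class A is describing two different
populations at once.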

Another problem resulting from arbitrarily established job
classes is the specification of a job class limit that
bisects a natural structure in the data.  For example, if 15
percent of the jobs normally require between 110 and 130 CPU
seconds, imposing a two-minute limit could force users to
submit these jobs in class B to ensure that they are not
canceled at the two-minute limit.  The representative
hypothesis test also identifies this problem, since jobs that
need approximately two CPU minutes are poorly represented by
the class B limits.
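
A limit that cuts through a natural grouping can be spotted by
counting the jobs that sit in a narrow band around it.  The
following is a minimal sketch, assuming CPU times are available
in seconds; share_near_limit, the 10-second band, and the
sample values are hypothetical choices made for illustration.

```python
def share_near_limit(cpu_secs, limit_secs=120.0, band_secs=10.0):
    """Return (percent of jobs within the band, count at or under the limit, count over it)."""
    lo, hi = limit_secs - band_secs, limit_secs + band_secs
    near = [s for s in cpu_secs if lo <= s <= hi]
    under = sum(1 for s in near if s <= limit_secs)
    return 100.0 * len(near) / len(cpu_secs), under, len(near) - under

# Made-up CPU times in seconds for twenty jobs.
cpu_secs = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90,
            112, 118, 127, 150, 200, 250, 300, 400, 500, 600]
near_pct, under, over = share_near_limit(cpu_secs)
```

When a noticeable share of jobs lies in the band and the band
is split between both sides of the limit, the limit is a
candidate for bisecting a natural structure in the data.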

The characteristics of an installation's job mix are
continually evolving.  As a result, it is probably
unreasonable to apply the stationary hypothesis test to a job
class structure.  However, you should examine the job class
structure on a quarterly or semiannual basis.

Both the equivalent job and ad hoc batch approaches to
workload characterization discussed here show the problems
that result from using mean values or arbitrarily selected
limits.  To meet the requirements of the first hypothesis, we
must exploit the natural structure of the workload.
Statistical pattern recognition techniques (clustering) offer
a powerful tool for determining whether a natural structure
exists in an installation's workload data.
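
To make the idea concrete, the sketch below groups jobs with
plain k-means (Lloyd's algorithm).  K-means is used here only
as one familiar clustering technique; the text does not say
which algorithm is applied.  The function name kmeans, the
log10 scaling of both measures, the choice of three starting
centers, and the sample jobs are all assumptions made for
illustration.

```python
import math

def kmeans(points, centers, iterations=20):
    """Plain k-means (Lloyd's algorithm) on 2-D points, starting from given centers."""
    centers = list(centers)
    groups = []
    for _ in range(iterations):
        # Assign every point to its nearest current center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else old
            for old, g in zip(centers, groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups

# Made-up jobs: (CPU seconds, print lines).
jobs = [(3, 500), (4, 800), (2, 300),
        (115, 4_000), (120, 4_500), (125, 5_000),
        (900, 60_000), (1_200, 80_000)]
# Log scaling keeps the large jobs from dominating the distances (an assumption).
features = [(math.log10(cpu), math.log10(lines)) for cpu, lines in jobs]
# Starting centers picked by hand from the data for a repeatable run.
centers, groups = kmeans(features, [features[0], features[3], features[6]])
```

In this sample the jobs near two CPU minutes form a group of
their own, which is the kind of natural structure a fixed
two-minute limit would cut in half.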