

15.3.2 Cluster Performance Summary Report


Workload Characterization Analysis                                          1
Cluster Performance Summary
For: Monday, June 23, 2003

Clustering Execution Options

  Cluster Input Data:              SAMPLE
  Clustering Method:               RADIUS
  Maximum Cluster Radius:          3.0
  Training Sample Size:            2000
  Sample Trim Limit:               97.5 Pct.
  Sampling Percentage :            2.0 Pct.
  Include Outliers in Clusters:    YES
  Include Sparse Clusters:         YES
  Sparse Cluster Limit:            0.5 Pct.
  Report Cluster Population:       YES
  Report Using Account Codes:      YES
  Report Sparse Clusters :         YES
  Report Outliers Separately:      YES
  Create Cluster Index Graph:      YES
  Create Population Graph:         YES
  Input CA MICS Files:             BATJOB
  Input Dataset Name:              'COGDA01.AUDIT.BATJOB'
  Clustering Variables:            JOBTCBTM JOBEXCPS
  Reporting Variables:             JOBNLR
  Delete obs. if zero:             JOBTCBTM
  Save Input Observations as:      SAVEOBS
    In User Dataset:               'COGDA01.O2.CAPACITY'
  Save Cluster Observations as:    CLUSOUT
    In User Dataset:               'COGDA01.O2.CAPACITY'
  SYSID's Selected:                XAL1
  Zones Selected:                  1 2 3
  Hours Selected:                  00 23
  Date Ranges Selected:            01JAN01 31DEC03

  Note: Global Data Selection exit in force.
        Selection of input data may be affected.
  Note: Non-CA MICS data element derivation exit in use.

_____________________________________________________________________________________

Cluster Feature Contents

Feature                                             --------Outlier Limits-------
Name      ------------- Description --------------         Lower         Upper   Cases
________  ________________________________________  ____________  ____________  ______
JOBTCBTM  Job TCB CPU Time                            0:00:00.00    0:01:05.70      50
JOBEDASD  DASD EXCPS                                        0.00     38,933.00      50
                                                                                ======
                                                                                   100

Note: A given case may be classified as an outlier more than once if
      multiple analysis elements contain abnormal values.  Therefore, the
      totals in this report will not necessarily agree with those of the
      Population section.
_____________________________________________________________________________________

Workload Characterization Analysis                                          2
Cluster Performance Summary
For: Monday, June 23, 2003

Cluster Population Summary

Cluster   Radius    Normal    Outlying      Total         % of   Clustering
                      Obs.        Obs.       Obs.   Population        Index
_______   ______    ______    ________    _______   __________   __________
    1       0.00         0          14         14         0.70         0.00
    2*      0.82         0           2          2         0.10         0.82
    3*      0.82         0           6          6         0.30         0.51
    4*      1.17         0           2          2         0.10         1.17
    5*      1.20         0           2          2         0.10         1.20
    6*      1.39         4           0          4         0.20         1.01
    7*      1.43         0           3          3         0.15         1.02
    8*      1.50         0           2          2         0.10         1.50
    9*      1.55         0           5          5         0.25         1.22
   10       1.61     1,541           0      1,541        77.05         0.34
   11       1.71        15           0         15         0.75         0.77
   12*      1.86         0           9          9         0.45         1.26
   13*      2.01         0           2          2         0.10         2.01
   14       2.04         4           8         12         0.60         1.19
   15       2.07       129           0        129         6.45         0.95
   16       2.22        12           0         12         0.60         0.80
   17       2.25         0          11         11         0.55         1.69
   18*      2.29         0           2          2         0.10         2.29
   19*      2.45         0           2          2         0.10         2.45
   20*      2.54         0           3          3         0.15         1.99
   21*      2.70         0           2          2         0.10         2.70
   22*      2.71         2           6          8         0.40         1.68
   23       2.72        39           0         39         1.95         1.70
   24       2.83       173           0        173         8.65         0.98
                    ======      ======    =======       ======
                     1,919          81      2,000       100.00

'*' Indicates that the cluster is sparsely populated.  Sparse clusters are
    defined as having a population that is less than 0.5% of the total
    population.  In this study, the sparse cluster population limit is
    10 cases.
_____________________________________________________________________________________


 Figure 15-2.  Cluster Performance Summary report

The Cluster Performance Summary report is presented in 
three sections as described below:

The first section lists the execution options you 
selected for the study being performed, and 
documents which reports and graphs will be produced.

For an explanation of the execution options and their 
respective usage, see Section 15.5 in this guide.

The second section of the report presents the feature 
contents of the clusters, including the limits used to 
determine outliers, and the number of times (cases) 
observations were declared as outliers because of a 
given feature value.

FEATURE       The name of the data element chosen for
NAME:         clustering.  This is normally a CA MICS data
              element but can be a computed (user defined)
              element if required.

DESCRIPTION:  The SAS label of the selected data element,
              from either the CA MICS GENLIB definition or
              supplied by the user.

OUTLIER       Statistical bounds to determine if a given data
LIMITS:       value is considered "normal" or represents an
              abnormal condition.  These values are computed
              based on the Sample Trim Limit value specified
              on the Execution Options panel.

LOWER:        Cases (observations) whose value for this
              feature is less than this defined value are
              considered outliers.  Because most performance
              measurement data is non-negative, the current
              implementation uses a value of zero for this
              boundary.

UPPER:        Cases (observations) whose value for this
              feature is greater than this defined value are
              considered outliers.

CASES:        The number of cases (observations) that were
              declared as outliers because of this feature
              value.  Note that a given case may be flagged
              multiple times and therefore the number of
              cases presented in this section may exceed the
              total number of outlying observations in the
              next report section.

              A footnote to this effect is printed at the
              bottom of this report section.
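
The relationship between the outlier limits and the CASES
count can be illustrated with a short Python sketch.  The
guide does not state exactly how the upper limit is derived
from the Sample Trim Limit, so the nearest-rank percentile
used here is an assumption, as are the function names and
the ten-job sample; the lower limit of zero follows the
LOWER description above.

```python
import math

def upper_limit(values, trim_pct):
    """Nearest-rank percentile (an assumed method): the smallest sample
    value with at least trim_pct percent of the sample at or below it."""
    ordered = sorted(values)
    rank = math.ceil(trim_pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def count_outlier_cases(observations, features, trim_pct, lower=0.0):
    """Return (limits, cases per feature, number of outlying observations)."""
    limits = {f: upper_limit([obs[f] for obs in observations], trim_pct)
              for f in features}
    cases = {f: 0 for f in features}
    outlying_obs = 0
    for obs in observations:
        flagged = [f for f in features
                   if obs[f] < lower or obs[f] > limits[f]]
        for f in flagged:
            cases[f] += 1          # one case per abnormal feature value
        if flagged:
            outlying_obs += 1      # the observation itself counts once
    return limits, cases, outlying_obs

# Hypothetical ten-job sample (CPU seconds, EXCP count), 80% trim limit.
sample = [{"cpu": c, "excps": e} for c, e in [
    (1, 90), (2, 20), (3, 30), (4, 40), (5, 50),
    (6, 60), (7, 70), (8, 80), (9, 10), (10, 100)]]
limits, cases, outlying_obs = count_outlier_cases(sample, ["cpu", "excps"], 80)
print(limits, cases, outlying_obs)
```

In this sample the per-feature cases sum to 4 while only 3
observations are outlying, because one job exceeds both
limits.  The sample report shows the same effect: 100 cases
in this section against 81 outlying observations in the
population section.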

The last report section presents a summary of the size,
population and general performance of each cluster.

CLUSTER       The cluster number.  Cluster numbers are
NUMBER:       assigned sequentially to the patterns that are
              identified by the algorithm.  Note that the
              order in which the clusters are identified is
              not an indicator of merit.

RADIUS:       The geometric distance from the cluster center
              (centroid) to the outer boundary of the cluster,
              expressed in terms of Standard Deviations.  The
              outer limit of this value is defined as the
              Maximum Cluster Radius value on the Execution
              Options panel.

NORMAL        The count of observations within this cluster
OBS:          where all feature values were found to be
              "normal" in a statistical sense.  In this
              implementation, normal feature values are those
              that do not exceed the value determined by the
              Sample Trim Limit.  For example, if the Sample
              Trim Limit is 97.5%, then all feature values of
              a "normal" observation would reside at or below
              the 97.5th percentile of the sample.

OUTLYING      The count of observations within this cluster
OBS:          where one or more feature values were found to
              be "outliers" in a statistical sense.  In this
              implementation, outlying feature values are
              those that are greater than the value
              determined by the Sample Trim Limit.  For
              example, if the Sample Trim Limit is 97.5%,
              then at least one feature value of an
              "outlying" observation would reside above the
              97.5th percentile of the sample.

TOTAL OBS:    The sum of NORMAL and OUTLYING observations.

% OF          The percent of the total sample population
POPULATION:   represented by this cluster.

CLUSTERING    Formerly called the Performance Index, this
INDEX:        metric was renamed in this implementation to
              avoid confusion with similar terms in the z/OS
              Workload Manager.  It is the Root Mean Square
              of the distances of all cases (observations)
              within the cluster, and serves as a simple
              measure of clustering effectiveness.  The lower
              this value, the tighter the fit of the cluster
              data.  This is not to say that outliers are not
              present; only that they are not distorting the
              cluster shape by their presence.
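
As a rough illustration of the Clustering Index, the
following Python sketch computes a Root Mean Square of
distances for two small clusters.  It assumes that the
distances are Euclidean distances from each case to the
cluster centroid, measured on standardized feature values;
the guide does not spell this out, and the function name
and sample points are hypothetical.

```python
import math

def clustering_index(points, centroid):
    """Root mean square of the Euclidean distances from each point to
    the centroid (assumed interpretation of the Clustering Index)."""
    squared = [math.dist(p, centroid) ** 2 for p in points]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical standardized feature values for two clusters,
# both centered on the origin.
tight = [(0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1)]
loose = [(3.0, 4.0), (-3.0, -4.0)]

tight_index = clustering_index(tight, (0.0, 0.0))
loose_index = clustering_index(loose, (0.0, 0.0))
print(tight_index, loose_index)
```

The tightly packed cluster yields the lower index.  When
every case lies at the same distance from the centroid, as
in the two-point cluster, the index equals that distance.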

Note the footnote at the bottom of this report section.  It
refers to "sparse" clusters and presents the definition and
limits used for determining which clusters are considered
"sparse".  Sparse clusters are generally populated by
outliers and are often dropped from further analysis after
their contents are reviewed.
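
The sparse cluster limit and the % OF POPULATION column can
be checked with simple arithmetic.  The Python sketch below
is only an illustration: the function names are
hypothetical, and the populations are copied from four rows
of the sample report (clusters 1, 10, 12 and 17) with its
total of 2,000 observations.

```python
def sparse_case_limit(total_obs, sparse_pct):
    """Convert the Sparse Cluster Limit percentage into a case count."""
    return total_obs * sparse_pct / 100

def summarize(populations, total_obs, sparse_pct):
    """Return percent of population and a sparse flag for each cluster."""
    limit = sparse_case_limit(total_obs, sparse_pct)
    return {
        cluster: {
            "pct": round(100 * count / total_obs, 2),
            "sparse": count < limit,   # sparse means below the limit
        }
        for cluster, count in populations.items()
    }

# Total Obs. for clusters 1, 10, 12 and 17 in the sample report.
populations = {1: 14, 10: 1541, 12: 9, 17: 11}
limit = sparse_case_limit(2000, 0.5)
summary = summarize(populations, 2000, 0.5)
print(limit, summary)
```

With a limit of 0.5% and 2,000 observations, the limit is
10 cases, so cluster 12 (9 cases) is flagged as sparse
while cluster 17 (11 cases) is not, which agrees with the
'*' markers in the report.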