3.3.3.1 Clustering Input Descriptive Statistics Report



The Descriptive Statistics Report summarizes the statistics
calculated for each cluster feature.  Figure 3-33 illustrates
a sample Descriptive Statistics Report.

                        Data Clustering Analysis
                  Clustering Input Descriptive Statistics
                        For: Thursday, June 19, 2003

      Summary for Total Sample:
  Number of Sample Observations:     2,000
 Feature    Minimum        Maximum        Average      Std. Dev.       CV
________ ____________   ____________   ____________   ____________   ______
JOBTCBTM   0:00:00.01     0:40:00.28     0:00:13.14     0:01:42.89     7.83
JOBEDASD         1.00   1,154,523.00       5,992.30      34,582.08     5.77

    Summary for Trimmed Sample:
  Number of Sample Observations:     1,919
 Feature    Minimum        Maximum        Average      Std. Dev.       CV
________ ____________   ____________   ____________   ____________   ______
JOBTCBTM   0:00:00.01     0:01:00.81     0:00:03.74     0:00:07.71     2.06
JOBEDASD         1.00      38,096.00       2,505.00       4,508.48     1.80


 Figure 3-33. Descriptive Statistics Report

The report has two sections.  The first summarizes the
original sample, taken at random from the input data.  The
second summarizes the same sample after trimming with the
Sample Trim Limit value that you specify on the Clustering
Execution Parameters Option screen (see Section 3.3.5.9).  In
Figure 3-33, trimming removes 81 of the 2,000 sample
observations.

The Descriptive Statistics Report contains the following
fields:

OBSERVATIONS: The number of observations selected for
              processing.

FEATURE:      The feature name.  The feature names that are
              listed in the report correspond to the features
              that are specified on the Data Clustering
              screen.  In Neugents technology, this is also
              called a "pattern".

MINIMUM:      The minimum value observed for the feature in
              the sample.

MAXIMUM:      The maximum value observed for the feature in
              the sample.

AVERAGE:      The average calculated for the feature
              observations in the sample.

STD DEV:      The standard deviation calculated for the
              feature observations in the sample.

CV:           The coefficient of variation.  The CV is
              calculated by dividing the standard deviation
              by the average.
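
The field definitions above can be illustrated with a short
Python sketch.  It is not the product's implementation: the
function names (parse_elapsed, coefficient_of_variation,
describe) are hypothetical, and the use of the sample (n-1)
form of the standard deviation is an assumption, since the
report does not say which form it uses.  The last lines
recompute the CV column from the Total Sample rows of
Figure 3-33.

```python
import statistics


def parse_elapsed(text):
    """Convert an h:mm:ss.ss value, as printed for JOBTCBTM, to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def coefficient_of_variation(std_dev, average):
    """CV as the report defines it: standard deviation divided by average."""
    return std_dev / average


def describe(values):
    """Return the report's per-feature statistics for a list of numbers."""
    average = statistics.fmean(values)
    # Assumption: sample (n-1) standard deviation; the report does not say.
    std_dev = statistics.stdev(values)
    return {
        "observations": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "average": average,
        "std_dev": std_dev,
        "cv": coefficient_of_variation(std_dev, average),
    }


# Std. Dev. and Average copied from the Total Sample rows of Figure 3-33.
jobtcbtm_cv = coefficient_of_variation(
    parse_elapsed("0:01:42.89"), parse_elapsed("0:00:13.14")
)
jobedasd_cv = coefficient_of_variation(34582.08, 5992.30)

print(f"JOBTCBTM CV: {jobtcbtm_cv:.2f}")  # 7.83, as in the report
print(f"JOBEDASD CV: {jobedasd_cv:.2f}")  # 5.77, as in the report
```

Time-valued features such as JOBTCBTM are converted to seconds
before dividing, so the CV is a unitless ratio that can be
compared across features.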