The following table shows average CPU utilizations collected
for a 3090-200J processor, and illustrates the problems that
often occur with historical observations:
Week      Observation   % CPU
Ending    Number        BUSY
=======   ===========   ======
31OCT97        1         71.0
07NOV97        2         72.0
14NOV97        3         72.2
21NOV97        4         73.8
28NOV97        5         62.5
05DEC97        6         74.0
12DEC97        7         75.2
19DEC97        8         75.0
26DEC97        9         53.7
02JAN98       10         61.0
09JAN98       11         76.4
16JAN98       12         78.0
Figure 7-4 shows a scatter plot of the data. A linear
regression model developed for this historical CPU
utilization data has the following parameters:
n   = 12,    the number of historical observations
b   = 62.70, the y intercept
m   = 1.17,  the slope of the line
r^2 = 0.25,  the coefficient of determination
F   = 0.02,  the F value
p   = 0.90,  the significance probability of the F value (a
             small p supports rejecting the hypothesis that
             the slope is zero)
s_e = 6.68,  the standard error
The predicted and residual values for the historical data
series are shown in the following table:
Week      Observation   % CPU    Est     Residual
Ending    Number        BUSY     % CPU   (error)
=======   ===========   ======   =====   ========
31OCT97        1         71.0    63.9       7.1
07NOV97        2         72.0    65.1       6.9
14NOV97        3         72.2    66.2       6.0
21NOV97        4         73.8    67.4       6.4
28NOV97        5         62.5    68.6      -6.1
05DEC97        6         74.0    69.8       4.2
12DEC97        7         75.2    70.9       4.3
19DEC97        8         75.0    72.1       2.9
26DEC97        9         53.7    73.3     -19.6
02JAN98       10         61.0    74.5     -13.5
09JAN98       11         76.4    75.6       0.8
16JAN98       12         78.0    76.8       1.2
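The residual column can be checked by hand or with a few lines of code. The following Python sketch assumes that a residual is the observed value minus the estimated value; the variable and function names are illustrative only and are not part of the product. It also lists the observations that fall below the fitted line.

```python
# Observed and estimated % CPU BUSY for observations 1 through 12,
# copied from the columns of the table.
observed = [71.0, 72.0, 72.2, 73.8, 62.5, 74.0,
            75.2, 75.0, 53.7, 61.0, 76.4, 78.0]
estimated = [63.9, 65.1, 66.2, 67.4, 68.6, 69.8,
             70.9, 72.1, 73.3, 74.5, 75.6, 76.8]


def residuals(actual, predicted):
    """Residual (error) for each observation: observed minus estimated.

    Assumption: values are rounded to one decimal place to match the
    layout of the table.
    """
    return [round(a - p, 1) for a, p in zip(actual, predicted)]


errors = residuals(observed, estimated)

# Observation numbers (1-based) whose observed value is below the line.
below_line = [obs for obs, err in enumerate(errors, start=1) if err < 0]

for obs, err in enumerate(errors, start=1):
    print(f"{obs:2d}  {err:6.1f}")
print("Below the fitted line:", below_line)
```

In this data the only observations below the fitted line are the ones with the large negative residuals, which is a quick way to spot candidates for investigation.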
As you can see in the model parameters and residual values in
this table, the proposed model fits the historical data very
poorly. In many cases, these problems are introduced by
poorly behaved historical data rather than by the type of
model selected by the analyst. In this example, three
observations in the historical data (28NOV97, 26DEC97, and
02JAN98) are significantly different from the remainder of
the historical data points. Investigation reveals that these
three weeks include holidays, which leaves two alternatives:
o Compensating the historical data points. For example, you
could attempt to compensate for the missing data by
multiplying by some constant. Unfortunately, such
constants are guesses made by the analyst. Therefore, we
do not recommend that you compensate historical data.
o Deleting the errant historical data points. Although this
reduces the number of points available for developing the
model, it does not introduce any of the analyst's biases
into the modeling process and is statistically defensible,
since these weeks really do represent a different category
of work for the processor.
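Deleting the errant points amounts to filtering the series before the model is fitted. The Python sketch below is one way to do this, assuming the holiday weeks have already been identified by investigation; the names `HOLIDAY_WEEKS` and `drop_weeks` are hypothetical and are not part of the product.

```python
# Weekly observations as (week ending, observation number, % CPU BUSY).
observations = [
    ("31OCT97", 1, 71.0), ("07NOV97", 2, 72.0), ("14NOV97", 3, 72.2),
    ("21NOV97", 4, 73.8), ("28NOV97", 5, 62.5), ("05DEC97", 6, 74.0),
    ("12DEC97", 7, 75.2), ("19DEC97", 8, 75.0), ("26DEC97", 9, 53.7),
    ("02JAN98", 10, 61.0), ("09JAN98", 11, 76.4), ("16JAN98", 12, 78.0),
]

# Weeks identified in the text as holiday weeks.
HOLIDAY_WEEKS = {"28NOV97", "26DEC97", "02JAN98"}


def drop_weeks(rows, excluded):
    """Return the rows whose week-ending date is not in `excluded`."""
    return [row for row in rows if row[0] not in excluded]


modeling_data = drop_weeks(observations, HOLIDAY_WEEKS)
print(len(modeling_data), "observations remain")
```

The remaining rows keep their original observation numbers, so the spacing of the points along the time axis is unchanged.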
Deleting the historical observations for the holiday weeks
results in a substantially better model, giving significantly
improved parameters. The parameters of the model are shown
below:
n   = 9,      the number of observations
b   = 70.78,  the y intercept
m   = 0.57,   the slope of the line
r^2 = 0.93,   the coefficient of determination
F   = 162,    the F value
p   = 0.0001, the significance probability of the F value (a
              small p supports rejecting the hypothesis that
              the slope is zero)
s_e = 0.64,   the standard error
The predicted and residual values for the model that is
developed from the historical series with the three holiday
weeks deleted are shown in the following table.
Week      Observation   % CPU    Est     Residual
Ending    Number        BUSY     % CPU   (error)
=======   ===========   ======   =====   ========
31OCT97        1         71.0    71.4      -0.4
07NOV97        2         72.0    71.9       0.1
14NOV97        3         72.2    72.5      -0.3
21NOV97        4         73.8    73.0       0.8
28NOV97        5          .      73.6        .
05DEC97        6         74.0    74.2      -0.2
12DEC97        7         75.2    74.7       0.5
19DEC97        8         75.0    75.3      -0.3
26DEC97        9          .      75.9        .
02JAN98       10          .      76.4        .
09JAN98       11         76.4    77.0      -0.6
16JAN98       12         78.0    77.6       0.4
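The intercept, slope, and F value of this model can be reproduced with an ordinary least squares fit of the nine remaining observations. The Python sketch below is a plain illustration of that calculation, not the product's own procedure; the function name `fit_line` is hypothetical, and small differences from printed values can arise from rounding.

```python
# Observation numbers and % CPU BUSY for the nine non-holiday weeks.
xs = [1, 2, 3, 4, 6, 7, 8, 11, 12]
ys = [71.0, 72.0, 72.2, 73.8, 74.0, 75.2, 75.0, 76.4, 78.0]


def fit_line(x, y):
    """Ordinary least squares fit of y = b + m * x.

    Returns (b, m, f_value), where f_value is the regression F
    statistic with 1 and n - 2 degrees of freedom.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    # Sum of squared residuals around the fitted line.
    sse = sum((yi - (b + m * xi)) ** 2 for xi, yi in zip(x, y))
    # Sum of squares explained by the regression.
    ssr = m * sxy
    f_value = ssr / (sse / (n - 2))
    return b, m, f_value


b, m, f_value = fit_line(xs, ys)
print(f"b = {b:.2f}, m = {m:.2f}, F = {f_value:.0f}")

# Evaluate the fitted line at the deleted holiday observations.
holiday_estimates = {obs: round(b + m * obs, 1) for obs in (5, 9, 10)}
print(holiday_estimates)
```

Evaluating the fitted line at observations 5, 9, and 10 gives the estimates shown for the deleted weeks, even though those weeks did not contribute to the fit.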
The model developed from the historical data series after the
three holiday weeks were deleted is significantly better than
the model developed before this deletion. This example shows
the value of deleting errant historical data points. Note
that the WEEKS timespan is probably more attractive for
building models since there are often too few monthly
observations for deletion to be an attractive alternative if
the MONTHS timespan is used.
[Scatter plot "HOLIDAY CPU DATA": % CPU BUSY (vertical axis, 54
to 81) plotted against OBSERVATION NUMBER (horizontal axis, 1
to 12)]
Figure 7-4. Weekly CPU Utilizations
Copyright © 2014 CA.
All rights reserved.