One problem that analysts often encounter when attempting to
forecast computer requirements is the high degree of
variability shown by the observations in an historical data
series. These variations are typically caused by random
elements (for example, test jobs) that often exist in a
system's workload. The following table, which shows the
monthly counts of test jobs processed by a moderately sized
MVS system, illustrates this problem.
Observation Test
Month Number Jobs
======= =========== ======
JAN98 1 2900
FEB98 2 3070
MAR98 3 2950
APR98 4 3080
MAY98 5 3200
JUN98 6 3150
Figure 7-5 shows a scatter plot of the data. A linear
regression model developed for this historical data series
has the following parameters:
n   = 6, the number of historical observations
b   = 2881, the y intercept
m   = 50.6, the slope of the line
r^2 = 0.68, the coefficient of determination
F   = 8.47, the F value
p   = 0.04, the significance probability (p-value) of the F value
s_e = 72.6, the standard error of the estimate
The predicted and residual values for the historical data
series are shown in the following table:
Observation Test Est. Residual
Month Number Jobs Jobs (error)
======= =========== ====== ====== ========
JAN98 1 2900 2932 -32
FEB98 2 3070 2982 88
MAR98 3 2950 3033 -83
APR98 4 3080 3084 -4
MAY98 5 3200 3134 66
JUN98 6 3150 3185 -35
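The regression parameters quoted above can be reproduced with a short least-squares calculation. The following sketch uses only plain Python (no statistics package is assumed) and the six observations from the first table; the variable names are illustrative, not part of any product.

```python
# Least-squares fit of test-job counts against observation number.
# Reproduces the regression parameters quoted in the text.

months = [1, 2, 3, 4, 5, 6]
jobs = [2900, 3070, 2950, 3080, 3200, 3150]

n = len(months)
x_bar = sum(months) / n
y_bar = sum(jobs) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in months)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, jobs))
s_yy = sum((y - y_bar) ** 2 for y in jobs)

m = s_xy / s_xx                  # slope of the line
b = y_bar - m * x_bar            # y intercept

ss_reg = m * s_xy                # variation explained by the line
ss_err = s_yy - ss_reg           # residual (unexplained) variation

r2 = ss_reg / s_yy               # coefficient of determination
f = ss_reg / (ss_err / (n - 2))  # F value (1 and n-2 degrees of freedom)
s_e = (ss_err / (n - 2)) ** 0.5  # standard error of the estimate

print(round(m, 1), round(b), round(r2, 2), round(f, 2))
# 50.6 2881 0.68 8.47
```

The standard error comes out at about 72.7 in this calculation; the 72.6 quoted in the text reflects a slightly different rounding of the intermediate values.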
Although one could argue that the model produced is
marginally acceptable, only 68% of the variability in the
historical data is accounted for by the model. You can treat
this type of apparent randomness in historical data through
the use of data smoothing techniques. Although a wide
variety of techniques are available, perhaps the simplest is
the geometric moving average (BAR77). A geometric moving
average (GMA) is attractive in that you only need to
sacrifice one historical data observation to calculate the
smoothed series. (Other techniques require you to sacrifice
many more historical data observations. For example, a
five-point moving average requires you to sacrifice the first
five observations.) You can calculate a geometric moving
average using the following equation.
x'(j) = alpha * x(j) + beta * x'(j-1),
for all j >= 2, with x'(1) = x(1) (Eqn 10)
where alpha + beta = 1.0 and x'(j) denotes the
smoothed value of observation j
Thus, you can use the first and second observation to
calculate a new value for the second observation, and so
forth. Another feature of the geometric moving average is
that you can select the degree of smoothing. Because alpha
weights the current observation, if you select a large value
for alpha (that is, 0.5 <= alpha < 1.0), the smoothed series
closely follows the variations between observations.
Conversely, if you select a small value for alpha (that is,
0.0 < alpha < 0.5), the smoothed series is less sensitive to
variations in the historical data series, and more of the
apparent randomness is removed.
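The effect of alpha can be checked numerically. This minimal sketch applies the Eqn 10 recurrence (alpha weighting the current observation, beta the previous smoothed value) to the six observations with a small and a large alpha, and compares the spread of the two smoothed series.

```python
# Geometric moving average per Eqn 10:
#   x'(j) = alpha * x(j) + beta * x'(j-1), with beta = 1 - alpha
# and the first observation used to seed the series.

def gma(series, alpha):
    beta = 1.0 - alpha
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + beta * smoothed[-1])
    return smoothed

jobs = [2900, 3070, 2950, 3080, 3200, 3150]

light = gma(jobs, 0.9)  # large alpha: follows the raw data closely
heavy = gma(jobs, 0.1)  # small alpha: most of the variation is removed

# Spread (max - min) of each smoothed series.
print(max(light) - min(light))
print(max(heavy) - min(heavy))
```

The heavily smoothed series (alpha = 0.1) has a much smaller spread than the lightly smoothed one (alpha = 0.9), which itself stays within the range of the raw data.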
For example, you can apply a geometric moving average to the
historical data observations shown in the previous table.
For this example, alpha equals 0.5. Hence, beta is equal to
0.5. The observation for January would be lost and the
observation for February would be computed as
FEB98 = 0.5 * 2900 + 0.5 * 3070 = 2985
The value for March would be computed based on the smoothed
observation for February and the actual value for March.
This procedure would be continued for the remainder of the
observations, resulting in the following table:
Observation GMA Test
Month Number Jobs
======= =========== ======
JAN98 1 .
FEB98 2 2985
MAR98 3 2968
APR98 4 3024
MAY98 5 3112
JUN98 6 3131
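The smoothed column above can be reproduced directly from Eqn 10. This sketch uses alpha = beta = 0.5, carries the unrounded smoothed value forward at each step, and rounds only for display.

```python
# Apply the geometric moving average (Eqn 10) with alpha = beta = 0.5.
# The first observation is sacrificed to seed the series, so JAN98
# has no smoothed value of its own.

jobs = [2900, 3070, 2950, 3080, 3200, 3150]
alpha, beta = 0.5, 0.5

smoothed = [None]          # placeholder for the sacrificed JAN98 value
prev = jobs[0]
for x in jobs[1:]:
    prev = alpha * x + beta * prev
    smoothed.append(prev)

print([s if s is None else round(s) for s in smoothed])
# [None, 2985, 2968, 3024, 3112, 3131] -- matches the GMA column
```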
Using the smoothed observations, a second linear regression
model was developed. The parameters for this model are shown
below:
n   = 5, the number of historical observations
b   = 2869, the y intercept
m   = 43.6, the slope of the line
r^2 = 0.88, the coefficient of determination
F   = 23.36, the F value
p   = 0.02, the significance probability (p-value) of the F value
s_e = 29.6, the standard error of the estimate
The following table shows the predicted and residual values
developed using this model:
Obs GMA Test Test Est. Residual
Month # Jobs Jobs Jobs (error)
======= === ====== ====== ====== ========
JAN98 1 . 2900 2914 .
FEB98 2 2985 3070 2957 28
MAR98 3 2968 2950 3000 -32
APR98 4 3024 3080 3043 -19
MAY98 5 3112 3200 3087 25
JUN98 6 3131 3150 3131 0
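The slope and intercept of the second model can be verified the same way as the first. This sketch refits the least-squares line to the unrounded smoothed series; only m and b are checked here, because the intermediate rounding of the smoothed values affects the later digits of the other statistics.

```python
# Refit the regression line to the smoothed (GMA) series. The first
# observation is lost to smoothing, so n = 5 and x runs from 2 to 6.

jobs = [2900, 3070, 2950, 3080, 3200, 3150]
smoothed, prev = [], jobs[0]
for x in jobs[1:]:
    prev = 0.5 * x + 0.5 * prev   # Eqn 10 with alpha = beta = 0.5
    smoothed.append(prev)

xs = [2, 3, 4, 5, 6]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(smoothed) / n

s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, smoothed))

m = s_xy / s_xx        # slope of the refitted line
b = y_bar - m * x_bar  # y intercept

print(round(m, 1), round(b))
# 43.6 2869
```

Extrapolating one month ahead (x = 7) would give roughly 2869 + 7 * 43.6, or about 3175 smoothed test jobs, which is how such a model would be used to forecast the next observation.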
The forecast developed using the smoothed data series is much
better behaved than the first forecast that was based on the
untreated historical data series. Data smoothing is a
powerful technique that you can use to minimize the effects
of apparently random variations in historical data.
[Scatter plot of monthly job counts (2900-3200) versus observation number (1-6)]
Figure 7-5. Monthly Job Counts
Copyright © 2014 CA.
All rights reserved.