Step function changes in the workload or other known changes
can often introduce problems into linear regression
forecasts. The following table illustrates this type of
problem by presenting monthly observations of the average
number of jobs per hour submitted by an engineering group.
In April, three more engineers were added to the group,
resulting in a step function increase of approximately 10
jobs per hour.
          Observation   Jobs/
Month       Number      Hour
=======   ===========   ======
JAN98          1          19
FEB98          2          25
MAR98          3          23
APR98          4          34
MAY98          5          37
JUN98          6          40
Figure 7-7 shows a scatter plot of the data. A linear
regression model developed for this historical data series
has the following parameters:
n = 6, the number of historical observations
b = 14.5, the y intercept
m = 4.3, the slope of the line
r-squared = 0.92, the coefficient of determination
F = 45.1, the F value
p = 0.003, the probability of obtaining an F value this
    large if the true slope were zero
s(e) = 2.7, the standard error of the estimate
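As a cross-check, the parameters listed above can be recomputed from the six observations with ordinary least squares. The following is a minimal sketch in plain Python, used here only for illustration; it is not the SAS code that the product runs, and the function name `fit_line` is hypothetical. The p value is omitted because it requires the F distribution from a statistics library.

```python
# Monthly observations from the table above (JAN98 through JUN98).
obs = [1, 2, 3, 4, 5, 6]
jobs = [19, 25, 23, 34, 37, 40]


def fit_line(x, y):
    """Ordinary least-squares fit of y = b + m*x with summary statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    m = sxy / sxx              # slope of the line
    b = mean_y - m * mean_x    # y intercept
    ss_reg = m * sxy           # variation explained by the line
    ss_err = syy - ss_reg      # residual (unexplained) variation
    mse = ss_err / (n - 2)     # residual mean square, n - 2 degrees of freedom
    return {
        "n": n,
        "b": b,
        "m": m,
        "r_squared": ss_reg / syy,  # coefficient of determination
        "F": ss_reg / mse,          # F value with (1, n - 2) degrees of freedom
        "s_e": mse ** 0.5,          # standard error of the estimate
    }


params = fit_line(obs, jobs)
for name, value in params.items():
    print(f"{name:>9} = {value:.2f}")
```

For these six observations the sketch should give approximately b = 14.5, m = 4.3, r-squared = 0.92, F = 45.1, and s(e) = 2.7.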
The following table shows the predicted and residual values
for the historical data series:
          Observation   Jobs/     Est.    Residual
Month       Number      Hour      Jobs    (Error)
=======   ===========   ======   ======   ========
JAN98          1          19      18.8       0.2
FEB98          2          25      23.2       1.8
MAR98          3          23      27.5      -4.5
APR98          4          34      31.8       2.2
MAY98          5          37      36.2       0.8
JUN98          6          40      40.5      -0.5
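Each residual is the observed value minus the estimated value, so the residual column can be checked directly against the fitted line. The sketch below assumes an ordinary least-squares fit and uses plain Python for illustration only; the variable names are hypothetical.

```python
obs = [1, 2, 3, 4, 5, 6]
jobs = [19, 25, 23, 34, 37, 40]

# Least-squares slope (m) and intercept (b) for jobs = b + m * obs.
n = len(obs)
mean_x = sum(obs) / n
mean_y = sum(jobs) / n
m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(obs, jobs))
     / sum((x - mean_x) ** 2 for x in obs))
b = mean_y - m * mean_x

# Estimated jobs per hour and residual (observed minus estimated).
estimates = [b + m * x for x in obs]
residuals = [y - est for y, est in zip(jobs, estimates)]

for x, y, est, res in zip(obs, jobs, estimates, residuals):
    print(f"{x:>3} {y:>5} {est:>7.1f} {res:>7.1f}")

# With an intercept in the model, least-squares residuals sum to zero
# (up to floating-point rounding), which is a quick sanity check.
print(f"sum of residuals = {sum(residuals):.6f}")
```

A residual column whose values do not sum to roughly zero is a sign that one or more entries were computed or transcribed incorrectly.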
Although the r-squared of 0.92 for the model is very good,
the results are somewhat suspicious due to the step function
increase that occurred in the workload. You might want to
modify the historical series to account for the step function
increase of 10 jobs per hour in April. There are two ways to
approach the problem:
o Subtracting 10 from the April through June observations.
o Adding 10 to the January through March observations.
You would probably want to modify the January through March
observations, since you would expect future observations to
be more like the April through June observations. Therefore,
you would transform the historical observations using the
following equation:
x'(j) = x(j) + 10, for j <= 3 (Eqn 13)
x'(j) = x(j), for j > 3
where x(j) is the observed value for observation j and x'(j)
is the transformed value.
Univariate Model Forecasting allows you to specify SAS
statements to accomplish these transformations. The
following table shows the result of this transformation:
          Observation   Transformed
Month       Number         Jobs
=======   ===========   ===========
JAN98          1            29
FEB98          2            35
MAR98          3            33
APR98          4            34
MAY98          5            37
JUN98          6            40
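The transformation in Eqn 13 is a simple conditional adjustment. The sketch below expresses it in plain Python for illustration only; it is not the SAS statement syntax that Univariate Model Forecasting accepts, and the names `STEP_SIZE`, `LAST_PRE_STEP_OBS`, and `transform` are hypothetical.

```python
STEP_SIZE = 10          # jobs per hour added by the April staffing change
LAST_PRE_STEP_OBS = 3   # MAR98 is the last observation before the step

months = ["JAN98", "FEB98", "MAR98", "APR98", "MAY98", "JUN98"]
jobs = [19, 25, 23, 34, 37, 40]


def transform(obs_number, jobs_per_hour):
    """Apply Eqn 13: raise pre-step observations to the post-step level."""
    if obs_number <= LAST_PRE_STEP_OBS:
        return jobs_per_hour + STEP_SIZE
    return jobs_per_hour


# Observation numbers start at 1, matching the tables.
transformed = [transform(j, x) for j, x in enumerate(jobs, start=1)]

for month, original, adjusted in zip(months, jobs, transformed):
    print(f"{month}  {original:>3}  {adjusted:>3}")
```

Keeping the step size and the cutoff observation as named constants makes it easy to apply the same adjustment to a different known change.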
Using the transformed observations, a second linear model was
developed. The parameters of this model are shown below:
n = 6, the number of historical observations
b = 28.5, the y intercept
m = 1.77, the slope of the line
r-squared = 0.79, the coefficient of determination
F = 15.23, the F value
p = 0.02, the probability of obtaining an F value this
    large if the true slope were zero
s(e) = 1.90, the standard error of the estimate
The following table shows the predicted and residual values
developed using this model:
                                   Est.
          Obs   Jobs/   Trans.    Jobs/   Residual
Month      #    Hour     Jobs     Hour    (Error)
=======   ===   =====   ======   ======   ========
JAN98      1     19       29      30.2      -1.2
FEB98      2     25       35      32.0       3.0
MAR98      3     23       33      33.8      -0.8
APR98      4     34       34      35.6      -1.6
MAY98      5     37       37      37.3      -0.3
JUN98      6     40       40      39.1       0.9
Note that the r-squared and F values for the resulting model,
while acceptable, are poorer than the values calculated for
the original historical data series. However, the
statistical model proposed is much more reasonable since it
accounts for a significant step that occurred in the data.
You should always examine each model to determine whether the
proposed model is reasonable for the data being modeled.
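One way to see why the second model is more reasonable is to compare what each model predicts for the next period. The sketch below fits both series by ordinary least squares and extrapolates one step ahead. Observation 7 is a hypothetical next month chosen only for illustration; it is not part of the historical data, and the Python names are assumptions.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b + m*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    m = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - m * mean_x, m


obs = [1, 2, 3, 4, 5, 6]
series = {
    "original": [19, 25, 23, 34, 37, 40],
    "transformed": [29, 35, 33, 34, 37, 40],
}

NEXT_OBS = 7  # hypothetical next month, not in the historical data

forecasts = {}
for label, values in series.items():
    b, m = fit_line(obs, values)
    forecasts[label] = b + m * NEXT_OBS
    print(f"{label:>11}: slope = {m:.2f}, forecast = {forecasts[label]:.1f}")
```

The original series treats the one-time step as ongoing growth, so its slope and its forecast are both noticeably higher than those from the transformed series.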
[Scatter plot: ENGINEERING JOBS PER HOUR. Vertical axis:
JOBS/HOUR, 17.5 to 40.0; horizontal axis: OBSERVATION NUMBER,
1 to 6.]
Figure 7-7. Engineering Jobs Per Hour
Copyright © 2014 CA.
All rights reserved.