2.3.2.1 An Introduction to the Concept of Summarization
A number of the members of the MICS.PARMS library and the
parameters therein exist to give you control over the CA MICS
process known as "summarization". Very simply stated,
summarization is the process of taking a lot of data and
reducing it to less data, without losing information required
for analysis and reporting purposes. For a given CA MICS
file, many records in the DETAIL timespan are consolidated
into fewer records in the DAYS timespan. Those records are
in turn consolidated into still fewer records in the WEEKS
and MONTHS timespans, which are consolidated into fewer
records again in the YEARS timespan.
Summarization is a necessity given the volume of measurement
data handled by the CA MICS system. No shop has the DASD
space to keep DETAIL-level data online indefinitely, nor
would it have the machine resources to process it if the data
could be kept.
Summarization is done by "key". In each CA MICS timespan
(DETAIL, DAYS, etc.), each CA MICS file (e.g., BATJOB,
SCPPGA) is in sequence by the values of a number of its
variables. Taken together, these variables are the key of
the file.
For example, the BATJOB file in the DETAIL timespan is
sequenced by the values of its SYSID, ACCTNOs (account number
fields), JOBGROUP, JOB (the jobname), YEAR, MONTH, DAY, and
ENDTS (ending time-stamp) variables. As in all CA MICS
files, there will exist one record in the file for each
unique combination of key variable values, and there will be
no duplicate keys. Thus at the DETAIL level there must be
one record for each job run on the system during the time
covered by the file (the ENDTS variable separates executions
of jobs with the same JOBname).
At the MONTHS level, the "key" of the BATJOB file is SYSID,
ACCTNO(s) (the accounting fields), JOBGROUP, YEAR, MONTH, and
ZONE (zone is a concept similar to "shift"). When a CYCLE in
the MONTHS level of the BATJOB file is created, there will
be one and only one record included for each unique
combination of these variables. Because CA MICS never
creates a record unless data is actually encountered, each
record holds information for at least one job. Of course,
summarization would not be serving its purpose unless, on
average, a good number of jobs were consolidated into each
record at the MONTHS level.
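The consolidation by key described above can be illustrated
with a minimal Python sketch. It is not CA MICS code. The
record layout, the single account field ACCTNO1, the CPUTM
measurement, and the JOBS counter are assumptions made up for
the illustration; only the key variable names come from the
text.

```python
# Hypothetical DETAIL-level job records. ACCTNO1, CPUTM and the sample
# values are illustrative assumptions, not actual CA MICS file contents.
detail = [
    {"SYSID": "SYSA", "ACCTNO1": "PAYR", "JOBGROUP": 1, "JOB": "PAYDAILY",
     "YEAR": 2024, "MONTH": 3, "DAY": 4, "ZONE": 1, "CPUTM": 12.0},
    {"SYSID": "SYSA", "ACCTNO1": "PAYR", "JOBGROUP": 1, "JOB": "PAYDAILY",
     "YEAR": 2024, "MONTH": 3, "DAY": 5, "ZONE": 1, "CPUTM": 10.0},
    {"SYSID": "SYSA", "ACCTNO1": "PAYR", "JOBGROUP": 1, "JOB": "PAYRPT",
     "YEAR": 2024, "MONTH": 3, "DAY": 5, "ZONE": 1, "CPUTM": 3.0},
    {"SYSID": "SYSA", "ACCTNO1": "INVT", "JOBGROUP": 2, "JOB": "INVLOAD",
     "YEAR": 2024, "MONTH": 3, "DAY": 5, "ZONE": 2, "CPUTM": 7.5},
]

# The MONTHS-level key named in the text (one account field assumed).
MONTHS_KEY = ("SYSID", "ACCTNO1", "JOBGROUP", "YEAR", "MONTH", "ZONE")


def summarize(records, key_fields, sum_fields=("CPUTM",)):
    """Consolidate all records sharing a key value into one record."""
    out = {}
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key not in out:
            # A summary record is created only when data is encountered.
            out[key] = dict(zip(key_fields, key))
            out[key].update({f: 0.0 for f in sum_fields})
            out[key]["JOBS"] = 0
        for f in sum_fields:
            out[key][f] += rec[f]
        out[key]["JOBS"] += 1
    # Return the records in key sequence.
    return [out[k] for k in sorted(out)]


months = summarize(detail, MONTHS_KEY)
for rec in months:
    print(rec)
```

In this sketch four DETAIL records collapse into two MONTHS
records, one per unique combination of the MONTHS key
variables, and each summary record reflects at least one job.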
Setting aside, for the moment, the problem of how information
from many records is put into one, we may observe that the
importance of certain key fields varies between the DETAIL
and MONTHS timespans. At the DETAIL level, the account
numbers and the JOBGROUP are not a significant part of the
key because there would be the same number of records in the
same order containing the same information even if these
variables were not part of the key--one for each unique
execution of a job. At the MONTHS level, on the other hand,
the account numbers and the JOBGROUP are very important parts
of the key. JOB(name) and ENDTS have disappeared, and
therefore the account numbers and the JOBGROUP mainly
determine what information will be present in each CYCLE of
the file and in what order.
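The difference in significance can be checked by counting
distinct key values with and without the account number and
JOBGROUP. The following Python sketch does this for a few
made-up job executions. The tuple layout and sample values
are assumptions, and YEAR, MONTH, and ZONE are treated as
constant to keep the example short.

```python
# Hypothetical job executions: (SYSID, ACCTNO, JOBGROUP, JOB, ENDTS).
jobs = [
    ("SYSA", "PAYR", 1, "PAYDAILY", "2024-03-04T02:10"),
    ("SYSA", "PAYR", 1, "PAYDAILY", "2024-03-05T02:12"),
    ("SYSA", "PAYR", 1, "PAYRPT", "2024-03-05T06:00"),
    ("SYSA", "INVT", 2, "INVLOAD", "2024-03-05T23:30"),
]


def count_keys(rows, positions):
    """Count distinct key values when the key uses the given columns."""
    return len({tuple(row[p] for p in positions) for row in rows})


# DETAIL level: JOB and ENDTS already separate every execution, so
# dropping ACCTNO and JOBGROUP leaves the record count unchanged.
detail_full = count_keys(jobs, (0, 1, 2, 3, 4))
detail_without_acct = count_keys(jobs, (0, 3, 4))

# MONTHS level: JOB and ENDTS are no longer in the key, so ACCTNO and
# JOBGROUP decide how many records exist.
months_full = count_keys(jobs, (0, 1, 2))
months_without_acct = count_keys(jobs, (0,))

print(detail_full, detail_without_acct)   # counts at the DETAIL level
print(months_full, months_without_acct)   # counts at the MONTHS level
```

With this sample data the DETAIL count is the same either
way, while the MONTHS count drops when the account number and
JOBGROUP are left out of the key.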
To give a concrete example of the use of these fields,
suppose in your organization there are three programming
groups, each with its own manager. Having heard of the
capabilities of the new CA MICS system you have installed,
the managers ask you to create an inquiry transaction that
any manager may run at any time to show, for each of his or
her programmers, the month-to-date cost of the batch jobs
that programmer submitted for each project. Further assume
that at your site the
programmers identify the project on whose behalf a job is run
by the first two characters of the JOBname they use. It is
clear that if the input to your report were CA MICS DETAIL
level data, you would not much care what the accounting
fields or JOBGROUP were set to: you have the JOBnames and
programmer name fields for every single job run during the
month. It is also clear that the managers would not much
care for the response time of the inquiry transaction you
gave them -- it would have to pass a record for every batch
job run in the entire month!
This is where summarization comes to your aid if you have set
up your CA MICS installation correctly. If, anticipating
such a request, you set up CA MICS so that the project
identifier and programmer identifier were carried in the
account number fields, then you could supply your management
with an inquiry which runs off of the (much smaller) MONTHS
BATJOB file (CYCLE 00 for month-to-date).
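As a rough illustration of this scenario, the Python sketch
below derives two account fields from each job (project from
the first two characters of the job name, plus the
programmer), keeps a month-to-date summary keyed by those
fields, and answers the manager's inquiry from the summary
alone. The function names, the cost values, and the data
layout are hypothetical. This is not how CA MICS account
routines are actually coded.

```python
from collections import defaultdict

# Hypothetical job records: (job name, programmer, cost of the run).
jobs = [
    ("PRCOMP01", "ALICE", 4.00),
    ("PRCOMP02", "ALICE", 6.00),
    ("IVLOAD01", "ALICE", 2.50),
    ("IVLOAD02", "BOB", 1.50),
    ("PRTEST01", "BOB", 3.00),
]


def account_fields(job_name, programmer):
    """Assumed site rule: project = first two characters of the job name."""
    return job_name[:2], programmer


# Month-to-date summary keyed by (project, programmer). Individual job
# names are no longer needed once the account fields carry the project.
month_to_date = defaultdict(float)
for job_name, programmer, cost in jobs:
    month_to_date[account_fields(job_name, programmer)] += cost


def inquiry(summary, programmers):
    """Cost per programmer and project, read from the summary only."""
    return sorted(
        (prog, proj, cost)
        for (proj, prog), cost in summary.items()
        if prog in programmers
    )


report = inquiry(month_to_date, {"ALICE"})
for line in report:
    print(line)
```

The inquiry reads one summary record per project and
programmer rather than one record per job, which is the
saving the MONTHS timespan offers when the account fields
have been set up to carry the identifiers you need.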
How does CA MICS know to assign the right values to variables
like the account numbers of the BATJOB file so you can meet
your reporting requirements? It doesn't. You must tell
CA MICS how to do so through parameters specified at CA MICS
generation time. In the specific case of the batch job
(BATJOB) file, the relevant parameters are stored in
MICS.PARMS(ACCOUNT) and MICS.PARMS(ACCTRTE), but there are
similar processes which must go on for all CA MICS files
which deal with individual units of work. For example, the
corresponding members for the IMS component are IMSACCT and
IMSACRT.
As you go through the many parameters in this chapter that
influence summarization, remember that your task is to tell
CA MICS how to consolidate your data so that you can meet
your reporting and analytical requirements without keeping
more data than you can afford to have online.