4.3.7.3 Duplicate Data Check Process
CA MICS provides two forms of duplicate data protection
checks:
o Duplicate data within the same input stream is
automatically dropped by each product's update
processor. This process does not involve the
checkpoint file.
o Duplicate input data occurs when data that was previously
processed by the database update is inadvertently input
to the update process again. If the input data falls
within the time range kept by ORGSYSID and product in a
checkpoint file, it is automatically dropped unless the
administrator takes specific action to force the data
through.
While the examples and discussion in this section refer
directly to the unit checkpoint file and the DAILY update
job, the concepts and processing description apply equally to
the individual product-specific incremental update
checkpoint data sets.
The date/time ranges maintained in the checkpoint file's
Database update time range records are matched by ORGSYSID
and component combination. For systems such as IMS and CICS,
the component check is extended to include a subsystem ID
(for example, IMSID or CICSID). Data is processed or dropped
based on the following conditions:
o If an input record timestamp is greater than the
checkpoint entry high timestamp and the date in the input
record timestamp is not greater than today's date, the
record is PROCESSED.
o If an input record timestamp is less than the checkpoint
entry high timestamp, the record is DROPPED because it is
considered duplicate data.
Data dropped in this way may be input to the update
process using the Force option.
o If the input record timestamp is less than the checkpoint
entry low timestamp, the record is DROPPED because it is
considered to have been generated before the product was
installed.
Data dropped in this way may be input to the update
process using the Force option.
o If the input record timestamp date is greater than today's
date, the record is DROPPED because it is considered to
have been generated with an invalid IPL date.
A warning is written to the log when "future" data is
found. Special care is required to bring this data into
the database. Contact CA MICS Product Support for
assistance.
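The four conditions above can be sketched as a small Python
function. This is an illustration only, not CA MICS code: the
names Outcome, FORCEABLE, and classify are hypothetical, and
treating a timestamp equal to the checkpoint high timestamp
as a duplicate is an assumption, because the conditions above
cover only the greater-than and less-than cases. A record
older than the low timestamp also satisfies the duplicate
test, so the sketch reports the more specific reason.

```python
from datetime import date, datetime
from enum import Enum


class Outcome(Enum):
    PROCESSED = "processed"
    DUPLICATE = "dropped: considered duplicate data"
    PRE_INSTALL = "dropped: generated before product install"
    FUTURE = "dropped: future data (invalid IPL date)"


# Outcomes that the text says can be overridden with the Force option.
FORCEABLE = {Outcome.DUPLICATE, Outcome.PRE_INSTALL}


def classify(ts: datetime, low: datetime, high: datetime, today: date) -> Outcome:
    """Apply the range conditions to one input record timestamp."""
    if ts.date() > today:
        return Outcome.FUTURE
    if ts < low:
        return Outcome.PRE_INSTALL
    if ts <= high:
        # Assumption: an equal timestamp is treated as a duplicate here.
        return Outcome.DUPLICATE
    return Outcome.PROCESSED
```

For instance, with a low timestamp of 01JAN01:00:00:02.05, a
high timestamp of 27JAN01:23:30:23.00, and today taken as
January 28, 2001, a record stamped 27JAN01:23:45:00.00
classifies as PROCESSED, while one stamped on January 29 is
reported as FUTURE and is not in the FORCEABLE set.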
The following figure illustrates unit checkpoint file
processing in the DAILY Database update job steps.
+-----------------------------------------------------------+
|                                                           |
|                 CA MICS Daily Job Stream                  |
|                                                           |
|   +---------------------------+                           |
|   |   Step DAYSMF in DAILY    |               +-------+   |
|   |  +---------------------+  |               |       |   |
|   |  | INPUTRDR's First    |<-+---------------|       |   |
|   |  | Action: Read in the |  |   Read Only   |       |   |
|   |  | unit checkpoint,    |  |               |       |   |
|   |  | and build SYSID-    |  |               |       |   |
|   |  | Component Time Range|  |               |       |   |
|   |  | Table.              |  |               |       |   |
|   |  +---------------------+  |               |   C   |   |
|   |  | Match time-stamps   |  |               |   H   |   |
|   |  | to SYSID-Component. |  |               |   E   |   |
|   |  | Drop data not       |  |               |   C   |   |
|   |  | passing range tests.|  |               |   K   |   |
|   |  +---------------------+  |               |   P   |   |
|   |                           |               |   O   |   |
|   |                           |               |   I   |   |
|   |                           |               |   N   |   |
|   |   Update Steps in Daily   |               |   T   |   |
|   |  +---------------------+  |               |       |   |
|   |  | Build SYSID/        |<-----------------|   F   |   |
|   |  | Component Time Range|  |   Read Only   |   I   |   |
|   |  | Table.              |  |               |   L   |   |
|   |  +---------------------+  |               |   E   |   |
|   |  | Update entries in   |  |               |       |   |
|   |  | the Time Range      |  |               |       |   |
|   |  | Table.              |  |               |       |   |
|   |  +---------------------+  |               |       |   |
|   |  | Write entire SYSID  |  |               |       |   |
|   |  | Component Table     |  |               |       |   |
|   |  | to unit checkpoint  |  |               |       |   |
|   |  | with new update     |<---------------->|       |   |
|   |  | time ranges.        |  |Update In-place|       |   |
|   |  +---------------------+  |               +-------+   |
|   +---------------------------+                           |
+-----------------------------------------------------------+
The following example illustrates the duplicate data check
process.
Sample unit checkpoint Database update time range records
for a system running one processor with the CA MICS SMF, RMF,
TSO, and DB2 products:
1...5...10...15...20...25...30...35...40...45...50...55
P033 RMF 01JAN01:00:00:10.03 27JAN01:23:59:46.00
P033 SMF 01JAN01:00:00:02.05 27JAN01:23:30:23.00
P033 TSO 01JAN01:00:00:01.11 27JAN01:23:58:12.00
P033 DB2 PROD 01JAN01:00:07:00.28 27JAN01:19:38:00.00
Using the above update time ranges, the CA MICS routine that
first reads the data source matches each record read against
its respective ORGSYSID and product time range to determine
if the record should be selected or dropped. This process is
performed by the INPUTRDR step (DAYSMF) and/or each of the
update processor steps (001-190).
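As a rough illustration of how the sample time range records
could be loaded into a lookup table, the Python sketch below
parses them into a dictionary keyed by ORGSYSID, product, and
optional subsystem ID. It is not CA MICS code. The names
TS_FORMAT, SAMPLE, and parse_time_ranges are hypothetical,
and the sketch assumes whitespace-delimited fields rather
than the fixed column positions of the real records.

```python
from datetime import datetime

# Timestamp layout used in the sample, e.g. 27JAN01:23:30:23.00
TS_FORMAT = "%d%b%y:%H:%M:%S.%f"

SAMPLE = """\
P033 RMF 01JAN01:00:00:10.03 27JAN01:23:59:46.00
P033 SMF 01JAN01:00:00:02.05 27JAN01:23:30:23.00
P033 TSO 01JAN01:00:00:01.11 27JAN01:23:58:12.00
P033 DB2 PROD 01JAN01:00:07:00.28 27JAN01:19:38:00.00
"""


def parse_time_ranges(text):
    """Return {(orgsysid, product, subsystem): (low, high)}."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The last two fields are the low and high timestamps.
        *key, low, high = fields
        orgsysid, product = key[0], key[1]
        subsystem = key[2] if len(key) > 2 else ""
        table[(orgsysid, product, subsystem)] = (
            datetime.strptime(low, TS_FORMAT),
            datetime.strptime(high, TS_FORMAT),
        )
    return table


ranges = parse_time_ranges(SAMPLE)
```

Keeping the subsystem ID in the key mirrors the extended
match described earlier for products that are tracked per
subsystem.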
The selection process adheres to the following rules:
1. Each record is matched with its corresponding ORGSYSID
and product in the table. If a new ORGSYSID-product
is encountered, it is automatically added to the table.
In this example, the ORGSYSID/product match is extended
to include the subsystem ID for the DB2 product (PROD in
this example).
2. If the input record's timestamp is greater than the
checkpoint high timestamp and NOT GREATER THAN TODAY'S
DATE, the record is selected for further processing.
In this example (assuming today is January 28, 2001),
SMF product input data records generated today on
January 28 would be accepted. In addition, data
generated at 11:45 p.m. on January 27 would be
processed because 27JAN01:23:45:00.00 (the input record
timestamp) is greater than the checkpoint entry high
timestamp (27JAN01:23:30:23.00).
3. Some CA MICS components must routinely process several
SMF records with the same timestamp because the 1/100th
of a second granularity is too coarse (for example, SMF
records for DB2). For these components, if the input
record's timestamp is equal to the checkpoint high
timestamp and NOT GREATER THAN TODAY'S DATE, the record
is selected for further processing.
In this example (assuming today is January 28, 2001),
SMF product input data records generated today on
January 28 for DB2 would be accepted. In addition,
data generated at 7:38 p.m. on January 27 would be
processed because 27JAN01:19:38:00.00 (the input record
timestamp) is equal to the checkpoint entry high
timestamp (27JAN01:19:38:00.00).
After the 27JAN01:19:38:00.00 record is selected, the
"further processing" ensures that no duplicate data is
retained.
4. If the SELECT option was specified and the input
record's timestamp is outside of the SELECT range, the
record is NOT selected for further processing.
5. Records that are not selected by the tests above are
dropped. Note that the Force option can be used to
allow processing of records dropped by the duplicate
data check.
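The selection rules can be sketched in Python using two
entries from the sample time range records. This is an
illustration under stated assumptions, not CA MICS code. The
names ts, table, EQUAL_OK, and select are hypothetical, the
subsystem ID is left out of the key for brevity, and the
Force option is not modeled. Selecting a record whose
ORGSYSID-product was just added to the table is an
assumption, as is checking the SELECT range before the
timestamp comparison.

```python
from datetime import date, datetime


def ts(text):
    """Parse a timestamp such as 27JAN01:23:30:23.00."""
    return datetime.strptime(text, "%d%b%y:%H:%M:%S.%f")


# (ORGSYSID, product) -> (low, high), taken from the sample records.
table = {
    ("P033", "SMF"): (ts("01JAN01:00:00:02.05"), ts("27JAN01:23:30:23.00")),
    ("P033", "DB2"): (ts("01JAN01:00:07:00.28"), ts("27JAN01:19:38:00.00")),
}

# Hypothetical set of products for which an equal timestamp is accepted.
EQUAL_OK = {"DB2"}


def select(orgsysid, product, stamp, today, select_range=None):
    """Return True if the record is selected for further processing."""
    if stamp.date() > today:
        return False  # rules 2 and 3: must not be later than today's date
    if select_range is not None:
        start, end = select_range
        if not (start <= stamp <= end):
            return False  # rule 4: outside the SELECT range
    # Rule 1: a new ORGSYSID-product is added to the table.
    entry = table.setdefault((orgsysid, product), None)
    if entry is None:
        return True  # assumption: no known range yet, so nothing to reject
    _low, high = entry
    if stamp > high:
        return True  # rule 2
    if stamp == high and product in EQUAL_OK:
        return True  # rule 3
    return False  # rule 5: everything else is dropped
```

With today taken as January 28, 2001, an SMF record stamped
27JAN01:23:45:00.00 is selected, an SMF record stamped
exactly 27JAN01:23:30:23.00 is dropped, and a DB2 record
stamped exactly 27JAN01:19:38:00.00 is selected.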