4.3.7.3 Duplicate Data Check Process



CA MICS provides two forms of duplicate data protection
checks:

o  Duplicate data within the same input stream is
   automatically dropped by each product's update
   processor.  This process does not involve the checkpoint
   file.

o  Duplicate input data occurs when data that was previously
   input to the database update is inadvertently input to
   the update process again.  If the input data falls within
   the time range kept by ORGSYSID and product in a
   checkpoint file, it is automatically dropped unless the
   administrator takes specific action to force the data
   through.

While the examples and discussion in this section refer
directly to the unit checkpoint file and the DAILY update
job, the concepts and processing description apply equally to
the individual product-specific incremental update checkpoint
data sets.

The date/time ranges maintained in the checkpoint file's
Database update time range records are matched by the
combination of ORGSYSID and component.  For systems such as
IMS and CICS, the component check is extended to include a
subsystem ID (e.g., IMSID or CICSID).  Data is processed or
dropped based on the following conditions:

o  If an input record timestamp is greater than the
   checkpoint entry high timestamp and the date in the input
   record timestamp is not greater than today's date, the
   record is PROCESSED.

o  If an input record timestamp is less than the checkpoint
   entry high timestamp but not less than the checkpoint
   entry low timestamp, the record is DROPPED because it is
   considered duplicate data.

   Data dropped in this way may be input to the update
   process using the Force option.

o  If the input record timestamp is less than the checkpoint
   entry low timestamp, the record is DROPPED because it is
   considered to have been generated before the product was
   installed.

   Data dropped in this way may be input to the update
   process using the Force option.

o  If the input record timestamp date is greater than today's
   date, the record is DROPPED because it is considered to
   have been generated with an invalid IPL date.

   A warning is written to the log when "future" data is
   found.  Bringing this data into the database requires
   special care.  Call CA MICS Product Support for
   assistance.
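
The four conditions above can be summarized in a short Python
sketch.  This is an illustration only, not CA MICS code.  The
function name classify_record and its return labels are
hypothetical.  The list above does not say what happens when
a timestamp is exactly equal to the checkpoint entry high
timestamp; the sketch assumes such a record is dropped as a
duplicate.

```python
from datetime import datetime, date

def classify_record(ts, low, high, today):
    """Return (action, reason) for one input record timestamp.

    ts, low, and high are datetime values; today is a date.
    """
    # "Future" data is dropped regardless of the checkpoint range.
    if ts.date() > today:
        return ("DROPPED", "future date")
    # Older than the checkpoint entry low timestamp.
    if ts < low:
        return ("DROPPED", "before checkpoint low timestamp")
    # Newer than anything already processed.
    if ts > high:
        return ("PROCESSED", "after checkpoint high timestamp")
    # Inside the range already processed.  Treating ts == high
    # as a duplicate is an assumption of this sketch.
    return ("DROPPED", "duplicate data")

# Hypothetical checkpoint range for one ORGSYSID/product pair.
low = datetime(2001, 1, 1, 0, 0, 2, 50000)
high = datetime(2001, 1, 27, 23, 30, 23)
today = date(2001, 1, 28)

new_record = classify_record(datetime(2001, 1, 27, 23, 45), low, high, today)
old_record = classify_record(datetime(2001, 1, 27, 12, 0), low, high, today)
future_record = classify_record(datetime(2001, 1, 29, 8, 0), low, high, today)
```

The future-date test comes first because, unlike the other
two drop reasons, the text does not offer the Force option
for "future" data.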

The following figure illustrates unit checkpoint file
processing in the DAILY Database update job steps.

+-----------------------------------------------------------+
|                                                           |
|  CA MICS Daily Job Stream                                 |
|                                                           |
|  +---------------------------+                            |
|  |   Step DAYSMF in DAILY    |               +-------+    |
|  |  +---------------------+  |               |       |    |
|  |  | INPUTRDR's First    |<-+---------------|       |    |
|  |  | Action: Read in the |  |  Read Only    |       |    |
|  |  | unit checkpoint,    |  |               |       |    |
|  |  | and build SYSID-    |  |               |       |    |
|  |  | Component Time Range|  |               |       |    |
|  |  | Table.              |  |               |       |    |
|  |  +---------------------+  |               |   C   |    |
|  |  | Match time-stamps   |  |               |   H   |    |
|  |  | to SYSID-Component. |  |               |   E   |    |
|  |  | Drop data not       |  |               |   C   |    |
|  |  | passing range tests.|  |               |   K   |    |
|  |  +---------------------+  |               |   P   |    |
|  |                           |               |   O   |    |
|  |                           |               |   I   |    |
|  |                           |               |   N   |    |
|  |   Update Steps in Daily   |               |   T   |    |
|  |  +---------------------+  |               |       |    |
|  |  | Build SYSID/        |<-----------------|   F   |    |
|  |  | Component Time Range|  |  Read Only    |   I   |    |
|  |  | Table.              |  |               |   L   |    |
|  |  +---------------------+  |               |   E   |    |
|  |  | Update entries in   |  |               |       |    |
|  |  | the Time Range      |  |               |       |    |
|  |  | Table.              |  |               |       |    |
|  |  +---------------------+  |               |       |    |
|  |  | Write entire SYSID  |  |               |       |    |
|  |  | Component Table     |  |               |       |    |
|  |  | to unit checkpoint  |  |               |       |    |
|  |  | with new update     |<---------------->|       |    |
|  |  | time ranges.        |  |Update In-place|       |    |
|  |  +---------------------+  |               +-------+    |
|  +---------------------------+                            |
+-----------------------------------------------------------+
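
The update-step portion of the figure (build the table, update
its entries, write the entire table back) might be modeled as
in the following Python sketch.  It is a simplified
illustration, not CA MICS code.  The function name
run_update_step is hypothetical, a dict stands in for the
unit checkpoint file, and the sketch assumes that an update
only advances the high timestamp and that a new entry starts
with low and high both set to its first timestamp.  How
CA MICS actually maintains the low timestamp is not described
in this section.

```python
from datetime import datetime

def run_update_step(checkpoint, processed):
    """Apply processed (key, timestamp) pairs to the checkpoint.

    checkpoint maps a key such as (ORGSYSID, product) to a
    (low, high) pair of datetime values.
    """
    # Build the in-memory time range table from the checkpoint.
    table = dict(checkpoint)
    # Update entries in the table; unknown keys are added.
    for key, ts in processed:
        low, high = table.get(key, (ts, ts))
        table[key] = (low, max(high, ts))
    # Write the entire table back with the new time ranges.
    checkpoint.clear()
    checkpoint.update(table)
    return checkpoint

smf_key = ("P033", "SMF")
new_key = ("P044", "SMF")   # hypothetical new ORGSYSID
checkpoint = {
    smf_key: (datetime(2001, 1, 1, 0, 0, 2, 50000),
              datetime(2001, 1, 27, 23, 30, 23)),
}
run_update_step(checkpoint, [
    (smf_key, datetime(2001, 1, 27, 23, 45)),
    (new_key, datetime(2001, 1, 28, 1, 0)),
])
```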

The following example illustrates the duplicate data check
process.

Sample unit checkpoint Database update time range records
for a system running one processor with the CA MICS SMF, RMF,
TSO, and DB2 products:

  1...5...10...15...20...25...30...35...40...45...50...55
  P033 RMF      01JAN01:00:00:10.03 27JAN01:23:59:46.00
  P033 SMF      01JAN01:00:00:02.05 27JAN01:23:30:23.00
  P033 TSO      01JAN01:00:00:01.11 27JAN01:23:58:12.00
  P033 DB2 PROD 01JAN01:00:07:00.28 27JAN01:19:38:00.00
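
For readers who want to experiment with these sample records,
the following Python sketch loads them into a lookup table
keyed by ORGSYSID, product, and subsystem ID.  It is an
illustration only.  The function name parse_time_ranges is
hypothetical, and splitting on whitespace is an assumption
that happens to work for this sample; the real records are
presumably laid out in fixed columns, as the ruler line
suggests.  Parsing the month abbreviation with %b also
assumes an English locale.

```python
from datetime import datetime

SAMPLE = """\
P033 RMF      01JAN01:00:00:10.03 27JAN01:23:59:46.00
P033 SMF      01JAN01:00:00:02.05 27JAN01:23:30:23.00
P033 TSO      01JAN01:00:00:01.11 27JAN01:23:58:12.00
P033 DB2 PROD 01JAN01:00:07:00.28 27JAN01:19:38:00.00
"""

# Day, month abbreviation, two-digit year, then time of day
# with fractional seconds.
TS_FORMAT = "%d%b%y:%H:%M:%S.%f"

def parse_time_ranges(text):
    """Map (ORGSYSID, product, subsystem) to (low, high)."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The last two fields are the low and high timestamps.
        low = datetime.strptime(fields[-2], TS_FORMAT)
        high = datetime.strptime(fields[-1], TS_FORMAT)
        # A fifth field, when present, is the subsystem ID.
        subsystem = fields[2] if len(fields) == 5 else ""
        table[(fields[0], fields[1], subsystem)] = (low, high)
    return table

ranges = parse_time_ranges(SAMPLE)
```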

Using the above update time ranges, the CA MICS routine that
first reads the data source matches each record read against
its respective ORGSYSID and product time range to determine
if the record should be selected or dropped.  This process is
performed by the INPUTRDR step (DAYSMF) and/or each of the
update processor steps (001-190).

The selection process adheres to the following rules:

  1.  Each record is matched with its corresponding ORGSYSID
      and product in the table.  If a new ORGSYSID-product
      is encountered, it is automatically added to the table.

      In this example, the ORGSYSID/product match is extended
      to include the subsystem ID for the DB2 product (PROD
      in this example).

  2.  If the input record's timestamp is greater than the
      checkpoint high timestamp and NOT GREATER THAN TODAY'S
      DATE, the record is selected for further processing.

      In this example (assuming today is January 28, 2001),
      SMF product input data records generated today on
      January 28 would be accepted.  In addition, data
      generated at 11:45 p.m. on January 27 would be
      processed because 27JAN01:23:45:00.00 (the input record
      timestamp) is greater than the checkpoint entry high
      timestamp (27JAN01:23:30:23.00).

  3.  Some CA MICS components must routinely process several
      SMF records with the same timestamp because the 1/100th
      of a second granularity is too coarse (for example, SMF
      records for DB2).  For these components, if the input
      record's timestamp is equal to the checkpoint high
      timestamp and NOT GREATER THAN TODAY'S DATE, the record
      is selected for further processing.

      In this example (assuming today is January 28, 2001),
      SMF product input data records generated today on
      January 28 for DB2 would be accepted.  In addition,
      data generated at 7:38 p.m. on January 27 would be
      processed because 27JAN01:19:38:00.00 (the input record
      timestamp) is equal to the checkpoint entry high
      timestamp (27JAN01:19:38:00.00).

      After the 27JAN01:19:38:00.00 record is selected, the
      "further processing" ensures that no duplicate data is
      retained.

  4.  If the SELECT option was specified and the input
      record's timestamp is outside of the SELECT range, the
      record is NOT selected for further processing.

  5.  Records that are not selected by the tests above are
      dropped.  Note that the Force option can be used to
      allow processing of records dropped by the duplicate
      data check.
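
Rules 2 through 5 can be expressed as a single predicate, as
in the Python sketch below.  This is an illustration under
stated assumptions, not CA MICS code.  The function name
select_record and its parameters are hypothetical, the SELECT
range is assumed to be inclusive at both ends, and the Force
option is deliberately left out because its interaction with
the other tests is not described here.

```python
from datetime import datetime, date

def select_record(ts, high, today, allow_equal=False,
                  select_range=None):
    """Return True if the record is selected for processing."""
    # Rules 2 and 3 both require a date no later than today.
    if ts.date() > today:
        return False
    # Rule 4: outside an optional SELECT range (assumed
    # inclusive) means not selected.
    if select_range is not None:
        start, end = select_range
        if not (start <= ts <= end):
            return False
    # Rule 2: newer than the checkpoint high timestamp.
    if ts > high:
        return True
    # Rule 3: equal timestamps, only for components that
    # allow them (for example, DB2 in the text above).
    if allow_equal and ts == high:
        return True
    # Rule 5: everything else is dropped.
    return False

today = date(2001, 1, 28)
smf_high = datetime(2001, 1, 27, 23, 30, 23)
db2_high = datetime(2001, 1, 27, 19, 38)

# The SMF example from rule 2 and the DB2 example from rule 3.
smf_selected = select_record(datetime(2001, 1, 27, 23, 45),
                             smf_high, today)
db2_selected = select_record(datetime(2001, 1, 27, 19, 38),
                             db2_high, today, allow_equal=True)
```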