4.3.8 System Restart and Recovery



This section discusses specific topics related to restarting
CA MICS operational jobs and recovering the CA MICS database.

Refer to the Operational Processes, Jobs, and Steps section
(4.3.3) of this guide for documentation on the operational
jobs mentioned in this section.


RESTARTING CA MICS DATABASE UPDATE JOBS

The Operational Status and Tracking RESTART command handles
restart for the DAILY, WEEKLY, MONTHLY, and YEARLY jobs.  See
the Operational Status and Tracking section (4.3.4).

You can also restart the DAILY, WEEKLY, MONTHLY, and YEARLY
jobs manually by specifying the job statement RESTART=
parameter.  The CA MICS operational jobs are generated with
RESTART=* on the job statement to indicate that processing
should begin with the first step of the job.  To restart at a
later step, do the following:

o  Edit prefix.MICS.RESTART.CNTL if processing was submitted
   by Operational Status and Tracking or the SCHEDULE job.

   or

   Edit prefix.MICS.CNTL(jjjjjjjj), where jjjjjjjj is
   DAILY/WEEKLY/MONTHLY/YEARLY, if processing was submitted
   manually or via your installation's batch scheduling
   facility.

o  Change RESTART=* to RESTART=(stepname.MICS) on the job
   statement, where stepname is the operational step (e.g.,
   DAY030) where processing should begin.  The CA MICS Run
   Status Report lists the correct RESTART= parameter.

o  If internal step restart is enabled for the batch job
   step where processing will be restarted, then processing
   will automatically resume at the last completed
   processing phase in this job step.

o  If you need to override automatic internal step restart
   and force the step to start from the beginning, specify
   SYSPARM=NORESTART on the JCL EXEC statement for this
   batch job step.

o  Submit the batch job and CANCEL the edit session.  DO NOT
   SAVE THE MODIFIED JCL.
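
As an illustration of the edit described above, the following
sketch shows a DAILY job statement after the change.  The
accounting field, programmer name, and CLASS/MSGCLASS values
are placeholders and are not taken from CA MICS generated JCL;
only the RESTART= parameter reflects the procedure above.  Use
the step name reported by the CA MICS Run Status Report in
place of DAY030.

```jcl
//DAILY    JOB (acct),'MICS DAILY',CLASS=A,MSGCLASS=X,
//         RESTART=(DAY030.MICS)
//*  Generated JCL specifies RESTART=* (begin at the first step).
//*  RESTART=(DAY030.MICS) begins at procedure step MICS of job
//*  step DAY030.  The other JOB parameters are placeholders.
```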

OVERRIDING WORK FILE DYNAMIC DATA SET ALLOCATION PARAMETERS

Internal step restart uses OS/390 dynamic allocation services
to create new data sets and to access existing data sets.
Data set allocation parameters are specified by product in
prefix.MICS.PARMS(cccOPS).  Permanent changes to these
parameters (e.g., to increase the space allocation for the
WORK data set) require both changing the cccOPS parameter and
executing the corresponding cccPGEN job.
However, in restart situations, such as when recovering from
a production job abend, you may temporarily override data set
allocation parameters for one or more dynamically allocated
data sets by using the //PARMOVRD facility.

For example,

o  If the SAS log indicates that the job failed due to a
   shortage of disk space on one of the WORKnn data sets
   (where nn is 01 - 99) or the cccXWORK data set (where ccc
   is the product associated with this database update step),

   -  Edit the operational job JCL for the step that failed
      and add a //PARMOVRD DD stream containing the WORK
      and/or RESTARTWORK parameters to temporarily override
      the data set allocation parameters for the failing data
      sets to increase the space allocation.  For example,

           //PARMOVRD DD *
            WORK   SPACE=(CYL,(50,50)) STORCLAS=MICSTEMP
            RESTARTWORK SPACE=(CYL,(50,50))
            RESTARTWORK STORCLAS=MICSTEMP

   -  Restart the database update job step from the
      beginning by specifying,
          SYSPARM=NORESTART
      on the JCL EXEC statement.

   -  After the job step completes successfully, remove the
      //PARMOVRD DD stream to resume using the data set
      allocation parameters you specified in
      prefix.MICS.PARMS(cccOPS).  If you believe that the
      temporary change to the data set allocation parameters
      should be made permanent, then increase the amount of
      space requested on the cccOPS WORK (for WORKnn data
      sets) or RESTARTWORK (for the cccXWORK data set)
      parameter and run cccPGEN.

See section 2.3.6 of this guide for more information on the
//PARMOVRD facility.
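
Putting the steps above together, a restarted step might look
like the following sketch.  The step name DAY030 and the
procedure name (shown as procname) are placeholders; keep the
EXEC statement from your generated job and add only the
SYSPARM=NORESTART operand and the //PARMOVRD DD stream.  The
space and storage class values repeat the example above and
are not sizing recommendations.

```jcl
//DAY030   EXEC procname,SYSPARM=NORESTART
//*  procname is a placeholder for the generated procedure name.
//*  SYSPARM=NORESTART forces the step to start from the beginning.
//PARMOVRD DD *
 WORK   SPACE=(CYL,(50,50)) STORCLAS=MICSTEMP
 RESTARTWORK SPACE=(CYL,(50,50))
 RESTARTWORK STORCLAS=MICSTEMP
/*
```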

RESTARTING CA MICS INCREMENTAL UPDATE JOBS

Operational Status and Tracking does NOT support the
incremental update INCRccc jobs.  You must restart INCRccc
processing manually.

The incremental update INCRccc jobs always begin with the
INCRnnn processing step, which is where you want to restart
processing.  Thus, you restart an INCRccc job by simply
resubmitting it for execution.  You can do this by editing
prefix.MICS.CNTL(INCRccc) and entering the SUBMIT command, or
you can use your installation's production batch job
scheduling facilities to resubmit the INCRccc job.

The internal step restart considerations discussed previously
apply equally to incremental update INCRccc jobs.  For
example, you can use the //PARMOVRD facility to temporarily
increase WORKnn space allocations for an INCRccc job just as
you can for a DAILY job DAYnnn step.

If you specified INCRDB TAPE or INCRDB DYNAM in
prefix.MICS.PARMS(cccOPS), then OS/390 dynamic allocation
services are used to create new IUDETAIL and IUDAYS data sets
and to access existing data sets.  Data set allocation
parameters are specified by product in
prefix.MICS.PARMS(cccOPS).  Permanent changes to these
parameters (e.g., to increase the space allocation for the
IUDETAIL data set) require both changing the cccOPS parameter
and executing the corresponding cccPGEN job.
However, in restart situations, such as when recovering from
a production job abend, you may temporarily override data set
allocation parameters for one or more dynamically allocated
data sets by using the //PARMOVRD facility.

For example, you might edit the INCRccc JCL and add a
//PARMOVRD DD stream containing the INCRDETAIL and/or
INCRDAYS parameters to temporarily override the data set
allocation parameters for the failing data sets to increase
the space allocation as follows:

     //PARMOVRD DD *
      INCRDETAIL  SPACE=(CYL,(50,50)) STORCLAS=MICSDASD
      INCRDAYS    SPACE=(CYL,(10,10)) STORCLAS=MICSDASD

After the job step completes successfully, remember to remove
the //PARMOVRD DD stream to resume using the data set
allocation parameters specified in prefix.MICS.PARMS(cccOPS).
If you believe that the temporary change to the data set
allocation parameters should be made permanent, then increase
the amount of space requested on the cccOPS INCRDETAIL or
INCRDAYS parameter and run cccPGEN.

See section 2.3.6 of this guide for more information on the
//PARMOVRD facility.

NOTE:  Suppose an INCRccc job fails with a U300 abend (fewer
       than 10 input records) while internal step restart is
       active for this product, and you plan to resolve the
       error condition through the checkpoint FORCE/SELECT
       facility (to force processing of data older than the
       current checkpoint high-ENDTS).  In that case, you
       must also specify SYSPARM=NORESTART on the EXEC
       statement in the INCRccc job JCL so that the
       incremental update checkpoint file is re-initialized
       with the unit checkpoint FORCE/SELECT specifications.


DAYSMF WORK FILE RECOVERY AND RESTART

The DAILY job normally uses work data sets to pass data from
the DAYSMF step to each subsequent update processor step that
takes input from SMF.  If fewer than two CA MICS
products take input from SMF, DAYSMF work files will not
exist and DAILY will NOT have a DAYSMF step.  Refer to
section 2.3.3.2.1.1 for more information on the COMPONENTS
and SMFRECORDING keywords of prefix.MICS.PARMS(JCLDEF).

If DAYSMF work files are used, these data sets are allocated
either permanently or temporarily based on the parameters
specified in prefix.MICS.PARMS(JCLDEF) using the keyword
DAYSMF.  The DAYSMF keyword is discussed in section
2.3.3.2.1.3.

Prior to restarting the DAILY job, verify that the DAYSMF
work files are still available.  The default data set names
are

    prefix.MICS.ccc.DATA

where ccc is the component identifier (e.g., RMF).  If the
DAYSMF work files are missing, you must recreate them before
restarting the DAILY job.

The Operational Status and Tracking RESTART command gives you
the option to include DAYSMF temporary work file recovery at
the front of the generated restart job.  Simply reply YES to
the prompt on the RESTART Unit Database panel.  If the
DAYSMF work files are defined as permanent (and have been
scratched), refer to the manual restart instructions below.
See the Operational Status and Tracking section (4.3.4) of
this guide for more information.

For a manual restart, you must recover the DAYSMF work files
manually before restarting DAILY processing.  For example, if
a DAILY restart is required and the DAYSMF work files have
been scratched, use the DAYSMFR job to rebuild the DAYSMF
work files; then restart the job.  To submit the DAYSMFR job,
enter the following command:

    SUB 'prefix.MICS.CNTL(DAYSMFR)'

NOTE:  If the DAYSMF work files have been defined as
       permanent, then you must first allocate the DAYSMF
       work files as explained in Section 3.5.6.2, "CA MICS
       Database and File Allocation", before executing
       DAYSMFR.  If the DAYSMF work files have been defined
       as temporary, then the allocation will be done by
       DAYSMFR.

NOTE:  The DAYSMFR job will abort if one or more incremental
       update checkpoint files indicate that incremental or
       daily update processing is in progress or failed.  In
       this situation, you may specify,

           SYSPARM=FORCE

       on the DAYSMF step EXEC statement (in the DAYSMFR job)
       to override the incremental update checkpoint status
       check and "force" DAYSMFR job execution.
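
A sketch of the override follows.  Only the step name DAYSMF
and the SYSPARM=FORCE operand come from the note above;
procname is a placeholder for whatever procedure the generated
DAYSMFR job already names on that EXEC statement.

```jcl
//DAYSMF   EXEC procname,SYSPARM=FORCE
//*  procname is a placeholder for the generated procedure name.
//*  SYSPARM=FORCE overrides the incremental update checkpoint
//*  status check so that DAYSMFR can run.
```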


SPLITSMF OUTPUT FILE RECOVERY AND RESTART

If the INCRSPLIT USE option is specified for one or more
products in the unit database (see cccOPS), then the SPLITSMF
job is used to create tailored input data sets for the
corresponding INCRccc jobs.  The SPLITSMF job dynamically
allocates, catalogs, and populates a prefix.MICS.ccc.IUSPLTDS
data set for each product marked as INCRSPLIT USE.  The
corresponding INCRccc jobs dynamically allocate and read the
corresponding SPLITSMF output data set.  At successful end of
job, the INCRccc job dynamically deletes its input data set.

If one or more prefix.MICS.ccc.IUSPLTDS data sets are damaged
or deleted prior to INCRccc execution, the INCRccc job will
fail and you will need to rerun the SPLITSMF job to recreate
the data sets before restarting INCRccc processing.  SPLITSMF
will recreate ALL of the prefix.MICS.ccc.IUSPLTDS data sets
as follows:

o  If the data set is not found, then a new data set is
   allocated and populated.

o  If the data set already exists, it will be allocated
   DISP=MOD and the "new" input data will be added to the end
   of the existing data.

The DISP=MOD processing of an existing data set is required
to ensure that no input data is lost in cases where SPLITSMF
is executed a second time prior to INCRccc execution.  For
example, if the availability of another incremental update
input data set triggered SPLITSMF job submission before all
INCRccc jobs completed processing the prior input data set,
data would be lost if the new data simply replaced the
existing data sets.

Standard CA MICS duplicate data elimination facilities shield
the database from any duplicate data that might be introduced
by SPLITSMF appending "new" data to an existing data set as
follows:

o  If SPLITSMF is run a second time before INCRccc processing
   completes, the input file will contain duplicate data.
   Standard SORT NODUPS processing drops these duplicates;
   this is the same processing that has always shielded the
   database from accidental inclusion of duplicate input
   data in a single DAILY database update.

o  If SPLITSMF is rerun using the same input SMF data stream
   after INCRccc processing for a product completes, then the
   next INCRccc job execution for this product will re-input
   data that has already been processed.  Standard CA MICS
   checkpoint processing will drop this duplicate data
   because it is "older" than the current checkpoint high end
   timestamp (ENDTS).

NOTE:  The SPLITSMF job will abort if one or more of the
       associated incremental update checkpoint files
       indicate that the INCRccc job is in progress or
       failed.  If an INCRccc job failed due to a missing
       input data set and you are rerunning SPLITSMF to
       recreate this missing data set, specify,

           SYSPARM=FORCE

       on the SPLITSMF job step EXEC statement to override
       the incremental update checkpoint status check and
       "force" SPLITSMF job execution.

ACCOUNTING AUDIT FILE RECOVERY AND RESTART

The DAY199 step of the DAILY operational job updates
Accounting and Chargeback month-to-date audit files.  You
must restore the Accounting and Chargeback ACTAUDIT DAY1 file
to a status synchronized with the online CA MICS database
prior to restarting the DAY199 step or after running the
RESTORE job.

If you restart DAY199 or restore the database without
restoring the ACTAUDIT DAY1 file, the ACTAUDIT files (DAY1
and DAY2) will contain duplicate data.  See the Accounting
and Chargeback User Guide for more information on the
ACTAUDIT files.

The Operational Status and Tracking RESTART command gives you
the option to include ACTAUDIT DAY1 file restore at the
front of the generated restart job.  Simply reply YES to the
prompt on the RESTART Unit Database panel.  See the
Operational Status and Tracking section (4.3.4) of this guide
for more information.

For a manual restart, you must restore the ACTAUDIT DAY1 file
manually before restarting DAY199 processing.  For example,
if a restart is required in step DAY199, use the ACTDAY1R job
to restore the ACTAUDIT DAY1 file; then restart the job.  To
submit the ACTDAY1R job, enter the following command:

    SUB 'prefix.MICS.CNTL(ACTDAY1R)'

You must submit the ACTDAY1R job manually after Database
RESTORE processing.  Refer to the Accounting and Chargeback
User Guide for more information.


RESTARTING OTHER JOBS

Restart the SPLITSMF job from the beginning.  Simply resubmit
prefix.MICS.CNTL(SPLITSMF) with the same JCL overrides (or
temporary changes).  If you are restarting SPLITSMF to
recreate a missing prefix.MICS.ccc.IUSPLTDS data set and
INCRccc failed due to the missing prefix.MICS.ccc.IUSPLTDS,
then specify SYSPARM=FORCE on the SPLITSMF step EXEC
statement to override the standard abort if an INCRccc job
failure is detected.

Restart the AUDIT job from the beginning.  Simply resubmit
prefix.MICS.CNTL(AUDIT) with the same JCL overrides (or
temporary changes).

Restart the HISTW job from the beginning.  Simply resubmit
prefix.MICS.CNTL(HISTW) with the same JCL overrides (or
temporary changes).

Restart the HISTM job from the beginning.  Simply resubmit
prefix.MICS.CNTL(HISTM) with the same JCL overrides (or
temporary changes).

Restart the RSTATUS job from the beginning.  Issue the
Operational Status and Tracking RSTATUS command again or
submit prefix.MICS.CNTL(RSTATUS).

Restart the BACKUP job from the beginning.  Issue the
Operational Status and Tracking BACKUP command again or
submit prefix.MICS.CNTL(BACKUP).

Restart the SCHEDULE job from the beginning.  Simply resubmit
prefix.MICS.CNTL(SCHEDULE) with the same JCL overrides (or
temporary changes).

Restart the RESTORE job from the beginning.  Issue the
Operational Status and Tracking RESTORE command again or
submit prefix.MICS.CNTL(RESTORE).

If RESTORE fails in the pre-restore BACKUP step, you can
bypass pre-restore backup processing.

o  Use the Operational Status and Tracking RESTORE command
   with the NOBACKUP parameter.  For example, to restore the
   P (PRIMARY) unit database without the pre-restore backup,
   enter:

      RESTORE P NOBACKUP

o  For a manual restore, edit prefix.MICS.CNTL(RESTORE) and
   specify RESTART=RSTR900.MICS on the job statement.  DO
   NOT SAVE THE MODIFIED JCL.

Refer to the Operational Processes, Jobs, and Steps section
(4.3.3) of this guide for more information on the RESTORE
operational job.

Restart the DAYSMFR job from the beginning.  Simply resubmit
prefix.MICS.CNTL(DAYSMFR) with the same JCL overrides (or
temporary changes).  If the DAYSMFR step fails with messages
indicating that one or more incremental or daily update steps
failed, then specify SYSPARM=FORCE on the DAYSMF step EXEC
statement (in the DAYSMFR job) to override the standard abort
if an incremental update failure is detected.

Restart the ACTDAY1R job from the beginning.  Simply resubmit
prefix.MICS.CNTL(ACTDAY1R) with the same JCL overrides (or
temporary changes).

Restart the RSTRTBLS job from the beginning.  Simply resubmit
prefix.MICS.CNTL(RSTRTBLS) with the same JCL overrides (or
temporary changes).

Restart the RSTRTLIB job from the beginning.  Simply resubmit
prefix.MICS.CNTL(RSTRTLIB) with the same JCL overrides (or
temporary changes).

Restart the IUDBINIT job from the beginning.  Simply resubmit
prefix.MICS.CNTL(IUDBINIT) with the same JCL overrides (or
temporary changes).

MANUAL RESTART EXAMPLE

The following is an example of manually restarting CA MICS
database update processing.

The SCHEDULE job determined that the WEEKLY process is due
and submitted a generated job for daily database update,
weekly cycle closeout, and database backup.  Processing
failed.

The CA MICS Run Status Report shows that processing failed in
step WEEK300 with an S001 abend.  Database aging is NOT in
progress, so restart is possible.  Further analysis shows an
I/O error on an archive tape.  The operations staff
determines that the problem is due to a hardware failure and
the failing tape drive is varied off the system.

You can now restart CA MICS processing.

o  Edit the SCHEDULE Restart File, which is generally named
   prefix.MICS.RESTART.CNTL.

o  Find the RESTART parameter on the job statement.

o  Change the RESTART parameter to read
   RESTART=(WEEK300.MICS).

o  Submit the job from within ISPF Edit, then CANCEL the edit
   session.  DO NOT SAVE THE MODIFIED JCL.