3.1.2 System Software Malfunction Summary Report


The System Software Malfunction Summary Report summarizes the
software failures occurring on each system by system and user
completion (abend) code.  Its objective is to present a
consolidated list of the software errors for each system.

Software failure data is summarized by job name, module name,
CSECT name, and functional recovery routine name.  An
indication is provided if the job or module has been
designated as critical to the installation.  This indication
is set based on what is specified in SRLOPS for the CMOD
control statement.

See Section 7.3.2 for more information on the CMOD control
statement.

In reviewing the System Software Malfunction Summary Report,
pay particular attention to failures that occur more than
once and to failures in jobs and modules known to be critical
to the operation of the system.

The following information is provided in the System Software
Malfunction Summary Report:

     SYSTEM -  the unique name of the system (SYSID) on
               which the error occurred.

     CPU SERIAL - the CPU or processor serial number being
               used when the error occurred.  This is
               provided to assist in isolating a failure that
               may be related to the processor being used.
               In some cases, a software system may be used
               on more than one CPU or processor.

     JOBNAME - the name of the job being executed when the
               error occurred.  The jobname may be blank if
               the error occurs in the system software and no
               specific job can be associated with the
               failure.  Jobs known to be critical to the
               system or to a critical application must be
               reviewed in more detail.  Critical jobs may
               have already been specified to CA MICS and, if
               so, are indicated by asterisks under CRITICAL
               JOB.

     CRITICAL JOB/MOD - indicates whether the job and/or
               module has been identified as being critical
               to the installation.  Asterisks may appear for
               the jobname, the module name, or both.

     MODULE NAME - the name of the module or routine being
               executed at the time of the failure.  Modules
               known to be critical to the system or to a
               critical application should be reviewed in
               more detail.  Critical modules may have
               already been specified to CA MICS and, if so,
               are indicated by asterisks under CRITICAL
               MODULE.

               See Section 7.3.2 for information on how to
               define a module as critical to CA MICS.

     CSECT NAME - the name of the control section (CSECT)
               being executed at the time of the failure.

     RECOVERY ROUTINE - the name of the functional recovery
               routine (FRR) used to assist in the recovery
               of the failure.

     COMPLETION CODE - provides the system and user
               completion or abend code for the software
               failure.  The system code generally indicates
               a failure occurred that was detected by the
               operating system or the processor hardware.
               The system abnormally ended the job or module
               being executed at the time of the error.  For
               a user code, the job or module being executed
               determined that an error had occurred and
               requested that the operating system terminate
               its processing.

               Important completion codes can be determined
               from the text under DESCRIPTION.  Examples
               are:

               222 for jobs cancelled by the operators
               913 for security violations

     DESCRIPTION - provides a text translation of the system
               completion code.  For a user code, the
               description will indicate that the program
               issued a user abend.

     FAILURE COUNT - the number of times the same failure
               occurred for a given combination of job name,
               CSECT name, and recovery routine name.  The
               count is a key indicator of recurring failures
               on the system.  Any large number of failures
               must be reviewed.
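
The way FAILURE COUNT is described above can be illustrated
with a minimal Python sketch.  It follows the description
literally and is not CA MICS code: the FailureEvent layout,
the failure_counts function, and the sample events are all
assumptions made for illustration (the names in the sample
are borrowed from Figure 3-3, but the events are made up).

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FailureEvent(NamedTuple):
    """One software failure occurrence (hypothetical layout)."""
    job: str
    csect: str
    recovery_routine: str


def failure_counts(events: Iterable[FailureEvent]) -> Counter:
    """Count occurrences per (job, CSECT, recovery routine) combination."""
    return Counter((e.job, e.csect, e.recovery_routine) for e in events)


# Made-up events for illustration only; not actual report input.
events = [
    FailureEvent("NONE-FRR", "IKTIMLU2", "IKTIOFRR"),
    FailureEvent("NONE-FRR", "IGC001", "IGC002"),
    FailureEvent("NONE-FRR", "IKTIMLU2", "IKTIOFRR"),
    FailureEvent("*MASTER*", "ILRTERMR", "TERMRFRR"),
]

counts = failure_counts(events)

# List the combinations, most frequent first.
for (job, csect, frr), n in counts.most_common():
    print(f"{job:<8} {csect:<8} {frr:<8} {n}")
```

Keying the counter on the whole combination means that two
failures in the same job but in different CSECTs are counted
separately, which is what makes a high count point at one
specific recurring problem.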

The standards and procedures in your installation play an
important role in interpreting the data on the report.  The
job naming convention, for example, may let you distinguish
jobs that are important to the system from those that are
not.
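
The review guidance in this section (recurring failures,
critical jobs and modules, and site naming conventions) could
be applied to report rows along the following lines.  This is
a sketch under stated assumptions, not part of CA MICS: the
ReportRow layout, the needs_review function, and the job name
prefixes are hypothetical, and the sample rows are taken from
Figure 3-3.

```python
from typing import NamedTuple


class ReportRow(NamedTuple):
    """One line of the report (hypothetical in-memory form)."""
    jobname: str
    module: str
    critical: bool        # True if flagged under CRITICAL JOB/MOD
    failure_count: int


# Hypothetical site convention: important job names start with these.
IMPORTANT_JOB_PREFIXES = ("JES", "PRD")


def needs_review(row: ReportRow) -> bool:
    """Flag recurring failures, critical entries, and important job names."""
    recurring = row.failure_count > 1
    important_name = row.jobname.startswith(IMPORTANT_JOB_PREFIXES)
    return recurring or row.critical or important_name


# Sample rows based on Figure 3-3.
rows = [
    ReportRow("AML585E3", "IEFW21SD", True, 1),
    ReportRow("EGX929", "IGC0002F", False, 1),
    ReportRow("JES2", "HASJES20", True, 5),
    ReportRow("NONE-FRR", "IKTIOM03", False, 4),
]

to_review = [r.jobname for r in rows if needs_review(r)]
print(to_review)
```

The prefix list stands in for whatever naming standard your
installation uses; adjust it to match your own conventions.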

INQUIRY ID:

     SRLLD2

DATA SOURCE (file/timespan):

     SRLSSM at the DETAIL timespan.

DATA ELEMENTS USED:

The data elements used for this inquiry are:

___________________________________________________________
|          |                                               |
|  FILE    |               DATA ELEMENTS                   |
|__________|_______________________________________________|
|          |                                               |
|  SRLSSM  | SSMJOB SSMCSECT SSMFRRTN SSMSCMPC SSMUCMPC    |
|          | SSMCMOD  SSMFCT                               |
|          |                                               |
|__________|_______________________________________________|


 CA                                                                                                                       PAGE 1  |
                                              CA MICS I/S MANAGEMENT SUPPORT SYSTEM                                               |
 SYSTEM (S008)                              SYSTEM SOFTWARE MALFUNCTION SUMMARY REPORT                      DATE: MON, MAY 5, 2008|
                                                                                                        RUN DATE: SUN, MAY 4, 2008|
----------------------------------------------------------------------------------------------------------------------------------+
 CPU    |          |CRITICAL|  MODULE  |  CSECT   | RECOVERY |COMPLETION CODE|                                          | FAILURE |
 SERIAL | JOBNAME  |JOB  MOD|   NAME   |   NAME   | ROUTINE  | SYSTEM   USER |               DESCRIPTION                |  COUNT  |
--------+----------+--------+----------+----------+----------+---------------+------------------------------------------+---------+
 029604 | AML585E3 |   **   | IEFW21SD | IEFAB4E8 |          |  222       -  | SYSTEM OPERATOR CANCELLED JOB/SESSION    |    1    |
 029604 | DWT11941 |   **   | IEFW21SD | IEFDB402 |          |  138       -  | ERROR DURING EXECUTION OF ENQ MACRO      |    1    |
 029604 | EGX929   |        | IGC0002F | IGC0002F |          |  33E       -  | DETACH MACRO ISSUED/SUBTASK NOT COMPLETE |    1    |
 029604 | EGX929   |        | IGC0013I | ICVDSD03 | ICVCME02 |  33E       -  | DETACH MACRO ISSUED/SUBTASK NOT COMPLETE |    1    |
 029604 | INIT     |        |          |          |          |  213       -  | ERROR DURING DIRECT ACCESS OPEN MACRO    |    1    |
 029604 | INIT     |   **   | IEFW21SD | IEFAB4DD |          |  0B0       -  | UNCORRECTABLE ERROR DETECTED BY SWA MGR  |    1    |
 029604 | JES2     | **  ** | HASJES20 |          |          |           800 | PROGRAM ISSUED USER ABEND                |    5    |
 029604 | NONE-FRR |        | IEAVSY50 | IGC001   | IGC002   |  402       -  | ERROR DURING EXECUTION OF POST MACRO     |    2    |
 029604 | NONE-FRR |        | IEEVSY50 | IGC001   | IGC002   |  0C4       -  | VIRTUAL ADDRESS TRANSLATION EXCEPTION    |    1    |
 029604 | NONE-FRR |        | IKTIOM03 | IKTIMLU2 | IKTIOFRR |  0AB       -  | VTIOC ENCOUNTERED AN ERROR FOR TSO/VTAM  |    4    |
 029604 | *MASTER* |   **   | ILRTERMR | ILRTERMR | TERMRFRR |  C0D       -  | ROUTINE ENCOUNTERED UNEXPECTED CONDITION |    2    |
----------------------------------------------------------------------------------------------------------------------------------+


 Figure 3-3.  System Software Malfunction Summary Report