What an I/O Error Means
An I/O error on a database file indicates that an attempt to read from or write to the file failed. The cause may be a hardware malfunction, such as a channel problem; once the malfunction is corrected, no recovery operation is needed. The cause may also be a physically damaged file or disk device; this type of error requires recovery of the file.
Identifying a Database File I/O Error
When CA IDMS/DB encounters an I/O error in a database file, the following events occur:
If Recovery is Successful
If the recovery process is successful, CA IDMS/DB continues processing. To fix the I/O error, you must follow these steps:
| Step | Action | Statement |
|---|---|---|
| 1 | Take the area(s) associated with the bad database file offline | DCMT VARY AREA with the OFFLINE option |
| 2 | Identify the problem and fix it. If the problem is not associated with the database file itself (for example, the problem is due to a bad channel), perform step 3 after the problem is corrected; if the problem is due to a damaged file, perform the steps outlined for an unsuccessful recovery. | |
| 3 | Bring the area(s) associated with the database file online | DCMT VARY AREA with the ONLINE option |
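The procedure above can be sketched as a small checklist generator. This is an illustration only: the function name, the sample area name, and the rendering of each command as `DCMT VARY AREA <area> <option>` are assumptions, so verify the exact operand syntax for your release before issuing anything it prints.

```python
def successful_recovery_commands(areas):
    """Return the ordered checklist for the case where recovery succeeded.

    `areas` lists the areas associated with the bad database file.
    The command text is an assumed rendering of "DCMT VARY AREA with the
    OFFLINE/ONLINE option"; it is not taken from a syntax reference.
    """
    steps = []
    # Step 1: take every affected area offline.
    for area in areas:
        steps.append(f"DCMT VARY AREA {area} OFFLINE")
    # Step 2: a manual step; the table gives no statement for it.
    steps.append("(manual) Identify the problem and fix it")
    # Step 3: bring the same areas back online once the problem is corrected.
    for area in areas:
        steps.append(f"DCMT VARY AREA {area} ONLINE")
    return steps


if __name__ == "__main__":
    # "SEG1.AREA1" is a hypothetical area name used only for the demonstration.
    for line in successful_recovery_commands(["SEG1.AREA1"]):
        print(line)
```

If step 2 shows that the file itself is damaged, stop after the first step and follow the procedure for an unsuccessful recovery instead.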
If the Recovery is Unsuccessful
If the recovery process is unsuccessful, CA IDMS/DB suspends the transaction and issues the following message:
DC205009 TRANSACTION SUSPENDED. TRANSACTION ID: transaction-id
When CA IDMS/DB issues this message, quiesce the area in which the problem occurred as quickly as possible to prevent additional transactions from readying the area. The following table identifies all the steps:
| Step | Action | Statement |
|---|---|---|
| 1 | Quiesce the affected area (see Considerations in this section) | DCMT VARY AREA with the TRANSIENT RETRIEVAL or OFFLINE options |
| 2 | Switch to a new journal file | DCMT VARY JOURNAL |
| 3 | De-allocate the file | DCMT VARY FILE with the DEALLOCATE option; use the FORCE option if the file cannot be closed (for example, because of a channel problem) |
| 4 | Restore a copy of the damaged file using the last backup tape as input. If the FORCE option was used in step 3, recreate the file with a new name | RESTORE with the FILE option |
| 5 | Roll forward the restored copy of the file using the archive journal files in the order they were created | Various. See 21.5, "Manual Recovery" |
| 6 | If the file was restored to a new location: | Operating system facilities |
| 7 | If the file was renamed in z/OS or z/VM, change its dataset name | DCMT VARY FILE with the DSNAME option |
| 8 | Make the new file available to the central version | DCMT VARY FILE with the ALLOCATE option |
| 9 | Re-activate the suspended transactions so they complete automatic recovery | DCMT VARY FILE with the ACTIVE option |
| 10 | Re-activate the area for update processing | |
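Because several of the steps above are conditional, it can help to see which ones apply in a given situation. The following is a minimal sketch, not a CA-supplied tool: the `Step` class, the function name, and the three flags are hypothetical, and the statements are quoted as descriptions from the table rather than as exact command syntax. Where the table gives no statement, the sketch records `None` instead of guessing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    number: int               # step number in the table above
    action: str
    statement: Optional[str]  # None where the table lists no statement


def unsuccessful_recovery_plan(used_force: bool = False,
                               new_location: bool = False,
                               renamed: bool = False) -> List[Step]:
    """Return the applicable steps, keeping the table's numbering."""
    # A forced de-allocation means the file must be recreated under a new name.
    renamed = renamed or used_force
    deallocate = "DCMT VARY FILE with the DEALLOCATE option"
    if used_force:
        deallocate += " and the FORCE option"
    rows = [
        (True, "Quiesce the affected area",
         "DCMT VARY AREA with the TRANSIENT RETRIEVAL or OFFLINE option"),
        (True, "Switch to a new journal file", "DCMT VARY JOURNAL"),
        (True, "De-allocate the file", deallocate),
        (True, "Restore a copy of the damaged file from the last backup",
         "RESTORE with the FILE option"),
        (True, "Roll forward the restored copy using the archive journal files",
         "Various (manual recovery)"),
        (new_location, "Handle the file's new location",
         "Operating system facilities"),
        # The table limits this step to files renamed in z/OS or z/VM.
        (renamed, "Change the dataset name",
         "DCMT VARY FILE with the DSNAME option"),
        (True, "Make the new file available to the central version",
         "DCMT VARY FILE with the ALLOCATE option"),
        (True, "Re-activate the suspended transactions",
         "DCMT VARY FILE with the ACTIVE option"),
        (True, "Re-activate the area for update processing", None),
    ]
    return [Step(n, action, statement)
            for n, (applies, action, statement) in enumerate(rows, start=1)
            if applies]


if __name__ == "__main__":
    for step in unsuccessful_recovery_plan(used_force=True):
        print(step.number, step.action, "->", step.statement or "(not listed)")
```

Keeping the original step numbers, rather than renumbering the filtered list, makes the output easy to check against the table.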
Considerations
Quiescing the Area
Quiesce the area by varying it offline or to transient retrieval. The differences are as follows:
If the area to be recovered is a system area, it may be necessary to terminate predefined system run units by issuing a DCMT VARY RUN UNIT ... OFFLINE command to quiesce activity to the area. It is advisable to vary the status of a system area to transient retrieval rather than offline.
In a data sharing environment, it is important to quiesce a shared area in all members of the data sharing group. The broadcast capability of DCMT commands can be used to do this easily.
Renaming the File
If you restored the file under a new name, you must make sure that the correct file is used the next time the system is started. If change tracking is in effect for the DC/UCF system, CA IDMS automatically ensures that the correct file is used when the system is restarted following an abnormal termination. However, if change tracking is not in use or if you shut down the system, you must do one of the following:
If you fail to do one of the above, CA IDMS/DB will attempt to access the wrong file the next time the system is started. This may have serious consequences if the original file still exists.
More Information
Use of Deallocate Force
If the damaged file was de-allocated using the FORCE option, the DC/UCF system marks the file as closed and de-allocated but does not actually issue the corresponding operating system requests. For this reason, you must restore the file under a different dataset name. When the DC/UCF system is eventually shut down, the shutdown will not complete successfully because the operating system will attempt to close the original file. The result is either an abend or a hung DC/UCF system. In either case, examine the messages produced on the log. If the following message appears, the database system has completed processing and no additional action is required:
DC200010 CA IDMS/DB Inactive
If this message does not appear, you should restart the system (after taking appropriate steps such as renaming the file) and then shut it down.
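The log check described above can be automated with a simple scan. This is a minimal sketch that assumes the log is available as plain text lines; the function names are hypothetical, and only the message ID DC200010 comes from this section.

```python
def database_shutdown_completed(log_lines):
    """Return True if message DC200010 appears anywhere in the log."""
    return any("DC200010" in line for line in log_lines)


def next_action(log_lines):
    """Map the log contents to the follow-up described in this section."""
    if database_shutdown_completed(log_lines):
        return "No additional action required"
    return ("Restart the system (after taking appropriate steps such as "
            "renaming the file), then shut it down")


if __name__ == "__main__":
    sample = ["DC200010 CA IDMS/DB Inactive"]
    print(next_action(sample))
```

Matching on the message ID rather than the full message text keeps the check tolerant of any prefix, such as a timestamp, that a log line may carry.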
Correcting the Lock Option of an Area and File
If the area associated with a damaged database file is in retrieval mode or offline and the file was restored with the area lock on, then the area status is incompatible with the file status. If you try to vary the area online, IDMS responds with an error. To correct this situation, issue a DCMT VARY AREA command with the UPDATE LOCKED option. This command allows IDMS to vary the area to an update mode even though the file is locked.
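The rule above reduces to a single decision, sketched here for illustration. The function name and the status strings are assumptions; the two returned options, ONLINE and UPDATE LOCKED, are the DCMT VARY AREA options named in this section.

```python
def vary_area_option(area_status: str, file_restored_with_lock: bool) -> str:
    """Choose the DCMT VARY AREA option for reactivating a recovered area.

    `area_status` is the current status of the area ("retrieval" or
    "offline"); these spellings are assumed for the sketch.
    """
    status = area_status.strip().lower()
    if status not in ("retrieval", "offline"):
        raise ValueError(f"unexpected area status: {area_status!r}")
    if file_restored_with_lock:
        # The area status and file status are incompatible, so a plain
        # vary online would be rejected with an error.
        return "UPDATE LOCKED"
    return "ONLINE"


if __name__ == "__main__":
    print(vary_area_option("offline", file_restored_with_lock=True))
```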
InDoubt Transaction Considerations
No special action regarding InDoubt transactions should be necessary, since they will complete once the file is varied active and resynchronization takes place with the coordinator.
Copyright © 2014 CA. All rights reserved.