The database administrator needs to plan for database recovery in case of system failures. Failures can be of many types, such as application crashes, hardware errors, and power failures.
Some simple approaches to database recovery are as follows −
Make periodic backup copies of important datasets, and retain all transactions posted against those datasets since the backup was taken.
If a dataset is damaged due to a system failure, the problem is corrected by restoring the backup copy. The accumulated transactions are then re-posted to the restored copy to bring it up to date.
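The restore-and-re-post idea can be sketched in a few lines of Python. This is a toy model, not IMS code: the dataset is a plain dictionary, each retained transaction is a hypothetical `(key, value)` pair, and the function name `restore_and_repost` is invented for illustration.

```python
import copy

def restore_and_repost(backup, retained_transactions):
    """Rebuild a damaged dataset from its backup copy plus retained transactions."""
    dataset = copy.deepcopy(backup)          # start from the backup copy
    for key, value in retained_transactions:
        dataset[key] = value                 # re-post each transaction in its original order
    return dataset

# Backup taken earlier, and the transactions posted since then.
backup = {"ACCT1": 100, "ACCT2": 250}
retained = [("ACCT1", 120), ("ACCT3", 75), ("ACCT1", 90)]

recovered = restore_and_repost(backup, retained)
print(recovered)
```

Every retained transaction has to be replayed in order, so the work grows with the number of transactions posted since the backup.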
The disadvantages of this simple approach to database recovery are as follows −
Re-posting the accumulated transactions consumes a lot of time.
All other applications must wait until the recovery is finished before they can execute.
Database recovery takes longer than file recovery when logical relationships and secondary indexes are involved.
A DL/I program does not crash in the same way as a standard program, because a standard program is executed directly by the operating system while a DL/I program is not. Through an abnormal termination routine, the system intervenes so that recovery can be carried out after an ABnormal END (ABEND). The abnormal termination routine performs the following actions −
The limitation of this routine is that it does not ensure that the data in use is accurate.
When an application program ABENDs, it is necessary to revert the changes made by the application program, correct the error, and re-run the program. This requires the DL/I log. Here are the key points about DL/I logging −
DL/I records all the changes made by an application program in a file known as the log file.
When the application program changes a segment, DL/I creates a before image and an after image of that segment.
These segment images can be used to restore the segments, in case the application program crashes.
DL/I uses a technique called write-ahead logging to record database changes. With write-ahead logging, a database change is written to the log dataset before it is written to the actual dataset.
As the log is always ahead of the database, the recovery utilities can determine the status of any database change.
When the program executes a call to change a database segment, DL/I takes care of the logging; the program does not write log records itself.
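The ordering rule behind write-ahead logging can be illustrated with a small Python model. It is a sketch under stated assumptions, not DL/I itself: the class `LoggedDatabase`, its `replace` method, and the record layout with `before` and `after` fields are all hypothetical, and both the "database" and the "log" are in-memory structures.

```python
class LoggedDatabase:
    """Toy model: every change is logged before it is applied."""

    def __init__(self):
        self.segments = {}   # stands in for the database dataset
        self.log = []        # stands in for the log dataset

    def replace(self, key, new_value):
        before = self.segments.get(key)   # before image (None if the segment is new)
        # Write-ahead rule: append the log record first ...
        self.log.append({"key": key, "before": before, "after": new_value})
        # ... and only then change the database itself.
        self.segments[key] = new_value

db = LoggedDatabase()
db.replace("PART001", "qty=10")
db.replace("PART001", "qty=7")
print(db.log)
```

Because the log record is appended before the segment is touched, a failure between the two steps leaves a log that describes a change the database may not yet contain, never the reverse.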
The two approaches of database recovery are −
Forward Recovery − DL/I uses the log file to store the change data. The accumulated transactions are re-posted using this log file.
Backward Recovery − Backward recovery is also known as backout recovery. The log records for the program are read backwards and their effects are reversed in the database. When the backout is complete, the databases are in the same state as they were before the failure, assuming that no other application program altered the database in the meantime.
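The two approaches can be contrasted with a toy Python model. The log record layout (`key`, `before`, `after`) and the function names `forward_recover` and `backout` are assumptions made for illustration; real DL/I recovery is performed by IMS utilities, not by application code like this.

```python
def forward_recover(image_copy, log):
    """Re-post logged changes, oldest first, onto a copy of the backup."""
    db = dict(image_copy)
    for rec in log:
        db[rec["key"]] = rec["after"]        # apply the after image
    return db

def backout(db, log):
    """Read the log backwards and reverse each change."""
    db = dict(db)
    for rec in reversed(log):
        if rec["before"] is None:
            db.pop(rec["key"], None)         # segment did not exist before the change
        else:
            db[rec["key"]] = rec["before"]   # restore the before image
    return db

image_copy = {"A": 1, "B": 2}
log = [
    {"key": "A", "before": 1, "after": 5},
    {"key": "C", "before": None, "after": 9},
    {"key": "A", "before": 5, "after": 6},
]

current = forward_recover(image_copy, log)
restored = backout(current, log)
print(current, restored)
```

Forward recovery relies only on after images and starts from a backup, whereas backout relies only on before images and starts from the database as it stood at the time of failure.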
A checkpoint is a stage where the database changes done by the application program are considered complete and accurate. Listed below are the points to note about a checkpoint −
Database changes made before the most recent checkpoint are not reversed by backward recovery.
Database changes logged after the most recent checkpoint are not applied to an image copy of the database during forward recovery.
With the checkpoint method, the database is restored to its condition at the most recent checkpoint when the recovery process completes.
For batch programs, the default checkpoint is the beginning of the program.
A checkpoint can be established using a checkpoint call (CHKP).
A checkpoint call causes a checkpoint record to be written on the DL/I log.
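How a checkpoint record limits backward recovery can be sketched in Python. This is a simplified model with an invented log layout: records carry a hypothetical `type` field of either `CHANGE` or `CHKP`, and `backout_to_checkpoint` is an illustrative name, not an IMS utility.

```python
def backout_to_checkpoint(db, log):
    """Reverse logged changes, newest first, stopping at the latest checkpoint record."""
    db = dict(db)
    for rec in reversed(log):
        if rec["type"] == "CHKP":
            break                            # changes before the checkpoint are kept
        db[rec["key"]] = rec["before"]       # restore the before image
    return db

log = [
    {"type": "CHANGE", "key": "A", "before": 1, "after": 2},
    {"type": "CHKP", "id": "CHKP0001"},
    {"type": "CHANGE", "key": "A", "before": 2, "after": 3},
    {"type": "CHANGE", "key": "B", "before": 10, "after": 11},
]
db_at_failure = {"A": 3, "B": 11}

restored = backout_to_checkpoint(db_at_failure, log)
print(restored)
```

The first change to `A` precedes the checkpoint, so it survives the backout; only the two changes logged after `CHKP0001` are reversed.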
Shown below is the syntax of a CHKP call −
CALL 'CBLTDLI' USING DLI-CHKP PCB-NAME CHECKPOINT-ID
There are two checkpoint methods −
Basic Checkpointing − It allows the programmer to issue checkpoint calls that the DL/I recovery utilities use during recovery processing.
Symbolic Checkpointing − It is an advanced form of checkpointing that is used in combination with the extended restart facility. Symbolic checkpointing and extended restart together let the application programmer code the programs so that they can resume processing at the point just after the checkpoint.
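The idea of resuming just after a checkpoint can be sketched in Python. This is only an analogy for symbolic checkpointing with extended restart: the function `run_batch`, the dictionary used as a checkpoint store, and the saved fields `next` and `total` are all hypothetical. In IMS, the program state is saved through the checkpoint call and restored through the extended restart (XRST) call.

```python
def run_batch(records, checkpoint_store, every=2, fail_at=None):
    """Process records, resuming just after the last saved checkpoint if one exists."""
    state = checkpoint_store.get("last", {"next": 0, "total": 0})
    i, total = state["next"], state["total"]
    while i < len(records):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated ABEND")
        total += records[i]
        i += 1
        if i % every == 0:
            # Save a checkpoint ID together with the program state needed to resume.
            checkpoint_store["last"] = {"id": f"CHKP{i:04d}", "next": i, "total": total}
    return total

store = {}
records = [5, 10, 15, 20, 25]

try:
    run_batch(records, store, fail_at=3)     # first run fails partway through
except RuntimeError:
    pass

resumed_total = run_batch(records, store)    # restart picks up after the last checkpoint
print(resumed_total)
```

The restarted run repeats the record that was processed after the last checkpoint but before the failure. In this toy that work was only held in memory, so nothing needs undoing; in a real system the corresponding database changes would first be backed out to the checkpoint.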