Periodic Diagnostics and Automatic Error Log Analysis are provided by the diagnostics.
Periodic diagnosis of the disk drives and battery is enabled by default. The disk diagnostics perform error log analysis on all disks. The battery diagnostics test the real-time clock and NV-RAM battery.
Periodic Diagnostics are performed in different ways depending on the diagnostic version.
Periodic diagnosis of the disk drives and battery is performed by entries in the root crontab file. One entry runs the disk diagnostics at 3:01 a.m. each day. Another entry runs the battery diagnostics at 4:01 a.m. each day.
The two diagnostics can be disabled by editing the root crontab file. The disk entry is /etc/lpp/diagnostics/bin/run_ela. The battery entry is /etc/lpp/diagnostics/bin/test_batt. Problems are reported by a message to the system console and logged in the error log. Diagnostics must be run for an SRN to be reported.
Running diagnostics in this mode is similar to using the diag -c -e -d device command.
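The crontab edit described above can be sketched as a small filter that comments out the two diagnostic entries. This is only an illustration: the function name disable_periodic_diag is hypothetical, and the sketch assumes each entry is a single uncommented line containing one of the two paths named in the text. The exact schedule fields and any redirections on the shipped entries may differ.

```shell
#!/bin/sh
# Sketch: comment out the periodic diagnostic entries in a crontab listing.
# Reads crontab text on standard input and writes the edited text to
# standard output. Lines that are already commented are left unchanged.
# disable_periodic_diag is a hypothetical name, not part of the diagnostics.
disable_periodic_diag() {
    sed -e '/^[^#].*\/etc\/lpp\/diagnostics\/bin\/run_ela/s/^/#/' \
        -e '/^[^#].*\/etc\/lpp\/diagnostics\/bin\/test_batt/s/^/#/'
}
```

As root, the filter could be applied with crontab -l | disable_periodic_diag > /tmp/root.cron followed by crontab /tmp/root.cron, after checking the contents of the temporary file. Removing the leading # from the two lines enables the diagnostics again.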
Periodic diagnostics in AIX Version 4 is controlled by the Periodic Diagnostic Service Aid. The Periodic Diagnostic Service Aid allows error log analysis to be run on hardware resources once a day. By default, the battery and all disk drives are enabled to run. The battery diagnostic is run at 4:00 a.m. each day, and error log analysis is performed on all the disk drives at 3:00 a.m. each day. Other devices can be added to the Periodic Diagnostic Device list, and error log analysis can be directed to run at different times.
Problems are reported by a message to the system console, and a mail message is sent to all members of the system group. The message contains the SRN.
Running diagnostics in this mode for base system devices is similar to using the diag -c -d device command. All other devices are invoked with the -e flag appended.
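The distinction above can be written out as a short helper that prints the roughly equivalent diag command for a device. This is a sketch only: periodic_diag_equivalent is a hypothetical name, the caller must state whether the device is a base system device (the sketch does not detect this), and the device names in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: print the diag command line that periodic diagnostics is
# described as being similar to. Only the flags named in the text are used.
#   $1  device name
#   $2  "yes" if the device is a base system device, anything else if not
periodic_diag_equivalent() {
    device=$1
    is_base=$2
    if [ "$is_base" = "yes" ]; then
        # Base system devices: no -e flag.
        echo "diag -c -d $device"
    else
        # All other devices: the -e flag is added.
        echo "diag -c -e -d $device"
    fi
}
```

For example, periodic_diag_equivalent dev0 yes prints diag -c -d dev0, and periodic_diag_equivalent dev1 no prints diag -c -e -d dev1.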
Automatic Error Log Analysis (diagela) provides the capability to do error log analysis whenever a permanent hardware error is logged. Automatic Error Log Analysis is disabled by default. Whenever a permanent hardware resource error is logged and the diagela program is enabled, the diagela program is invoked.
The diagela program determines whether the error should be analyzed by the diagnostics. If so, a diagnostic application is invoked to analyze the error. No testing is done. If the diagnostics determine that the error requires a service action, a message is sent to the system console and to all members of the system group. The message contains the SRN.
Running diagnostics in this mode is similar to using the diag -c -e -d device command.
Notification can also be customized by adding a stanza to the PDiagAtt object class. The following example illustrates how a customer's program can be invoked in place of the normal mail message:
PDiagAtt:
        DType = ""
        DSClass = ""
        attribute = "diag_notify"
        value = "/usr/bin/customer_notify_program $1 $2 $3 $4"
        rep = "s"
Once the above stanza is added to the ODM database, problems are displayed on the system console and the program specified in the value field of the diag_notify predefined attribute is invoked. The following keywords are expanded automatically as arguments to the notify program:
$1 | the keyword "diag_notify" |
$2 | the resource name that has the problem |
$3 | the Service Request Number |
$4 | the device type |
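A notify program that accepts these four arguments might look like the following. This is a minimal sketch, not the program the diagnostics ship with: the notify function, the NOTIFY_LOG variable, and the log file location are all assumptions made for illustration. It records each problem in a log file; a real program could page an operator or forward the SRN elsewhere.

```shell
#!/bin/sh
# Hypothetical stand-in for /usr/bin/customer_notify_program.
# Arguments, as listed in the table above:
#   $1 keyword "diag_notify"   $2 resource name
#   $3 Service Request Number  $4 device type
# NOTIFY_LOG is an assumed variable, not part of the diagnostics package.
notify() {
    log=${NOTIFY_LOG:-/tmp/diag_notify.log}
    # Reject calls that do not match the expected argument layout.
    if [ "$#" -ne 4 ] || [ "$1" != "diag_notify" ]; then
        echo "usage: $0 diag_notify resource srn devtype" >&2
        return 1
    fi
    # Append one line per reported problem.
    echo "resource=$2 srn=$3 type=$4" >> "$log"
}

# Run only when arguments are supplied, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    notify "$@"
fi
```

Called with the made-up values diag_notify hdisk0 123-456 disk, the sketch appends the line resource=hdisk0 srn=123-456 type=disk to the log file.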
To activate the Automatic Error Log Analysis feature, log in as root and type the following command:
/usr/lpp/diagnostics/bin/diagela ENABLE
To disable the Automatic Error Log Analysis feature, log in as root and type the following command:
/usr/lpp/diagnostics/bin/diagela DISABLE
In AIX Version 4, diagela can also be enabled and disabled using the Periodic Diagnostic Service Aid.