This chapter provides information about recording performance data on remote systems.
Monitoring of performance data via the network is important and extremely useful if you know when and what to monitor. Unfortunately, that is not always or even normally the case. Quite commonly, performance problems arise and are felt by end users without the system administrator knowing about it until it's too late to start a monitoring session.
Therefore, the xmservd daemon permits any system with the Agent component installed to record the activity on the system at all or selected times and for any set of performance statistics. This allows a system administrator to use the activity recording for an after-the-fact analysis of the performance problems. This capability is called the xmservd recording facility and is controlled through the xmservd recording configuration file.
When xmservd is configured to record the activity of the system where it is running, the daemon does not die as described in the "Life and Death of xmservd" section. The daemon considers itself to be configured for recording if a recording configuration file is present.
All recording files created by xmservd are placed in the directory /etc/perf. Recording file names are azizo.yymmdd, where the part after the period is built from the day the first record was written to the file. A recording for February 26, 1994 would thus be called /etc/perf/azizo.940226. The recording activity for any one day always goes to the same file, even when xmservd is stopped and started during the day. If a recording file for the day exists when xmservd starts, it appends additional activity to that file; otherwise it creates the file. For further details about how xmservd uses recording files, see the "Retain Line" section.
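The naming rule can be illustrated with a short Python sketch. The function name recording_file_name is hypothetical and is not part of the Agent component; only the directory and the azizo.yymmdd pattern come from the text.

```python
from datetime import date

RECORDING_DIR = "/etc/perf"  # directory named in the text


def recording_file_name(first_record_day: date) -> str:
    """Build the recording file path for the day the first record was written.

    Hypothetical helper for illustration only; xmservd does this internally.
    """
    return f"{RECORDING_DIR}/azizo.{first_record_day:%y%m%d}"


print(recording_file_name(date(1994, 2, 26)))  # /etc/perf/azizo.940226
```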
Recordings produced by xmservd have one or more statsets. One is created for each sampling interval defined in the recording configuration file. Each statset is assigned a number equal to the sampling interval divided by the minimum sampling interval of the xmservd daemon.
The recording configuration file must be supplied by the system administrator who configures a host. No recording configuration file is supplied as part of Performance Toolbox for AIX. The file is in ASCII format. When xmservd starts, it first tries to locate the recording configuration file as /etc/perf/xmservd.cf. If this file doesn't exist, xmservd looks for the recording configuration file as /usr/lpp/perfagent/xmservd.cf. If either file exists, xmservd considers itself configured for recording and parses the recording configuration file for instructions about when and what to record.
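The search order can be sketched in Python as follows. This is only an illustration of the lookup described above, not the daemon's own code; the function name find_recording_config is hypothetical.

```python
from pathlib import Path
from typing import Iterable, Optional

# Search order stated in the text.
DEFAULT_CANDIDATES = ("/etc/perf/xmservd.cf", "/usr/lpp/perfagent/xmservd.cf")


def find_recording_config(candidates: Iterable[str] = DEFAULT_CANDIDATES) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None if none does.

    Hypothetical helper: it mirrors the lookup order only.
    """
    for name in candidates:
        path = Path(name)
        if path.is_file():
            return path
    return None


config = find_recording_config()
print("configured for recording" if config else "not configured for recording")
```

A host where neither file exists reports that it is not configured for recording, which matches the rule that the presence of the file is what enables recording.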
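The recording configuration file must contain the following lines, in this order: a retain line, a frequency line, one or more metric lines, and one or more start-stop lines.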
The lines are described in the following sections. They must appear in the sequence shown and the keywords or metric names must begin in column one of each line. White space must separate individual entries on the lines. In addition to the required line types, the recording configuration file may contain blank lines and comment lines that begin with the character # (number sign).
The Agent component includes a program, xmscheck, that parses and analyzes a recording configuration file. This program allows you to check the validity of a recording configuration file before it is moved to the /etc/perf directory. The program is described in the "The xmscheck Preparser" section.
The primary purpose of the retain line is to specify how long recording files must be retained. It also defines how many days each recording file covers. The format of the retain line is:
retain days_to_keep [days_per_file]
Whenever the xmservd daemon is started, and whenever it is running and midnight is passed, it checks to see if any of the recording files in directory /etc/perf are old enough to be deleted. This is done by calculating a factor, rf, as the integer value:
rf = (days_to_keep + days_per_file - 1) / days_per_file
If the number d1 is the day number corresponding to the yymmdd part of the recording file name, and the current day number is d2, then the recording file is retained when the following expression is true; otherwise it is erased:
d2 - d1 < rf x days_per_file
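The retention arithmetic can be worked through with a small Python sketch. The function names are hypothetical. Two points are assumptions rather than statements from the text: the comparison is taken to be a strict less-than, and days_per_file is taken to default to 1 when it is omitted from the retain line.

```python
from datetime import date


def retention_factor(days_to_keep: int, days_per_file: int = 1) -> int:
    """rf = (days_to_keep + days_per_file - 1) / days_per_file, as an integer."""
    # ASSUMPTION: days_per_file defaults to 1 when omitted.
    return (days_to_keep + days_per_file - 1) // days_per_file


def is_retained(d1: int, d2: int, days_to_keep: int, days_per_file: int = 1) -> bool:
    """d1 is the day number from the file name; d2 is the current day number."""
    # ASSUMPTION: the retention test is a strict "less than".
    return d2 - d1 < retention_factor(days_to_keep, days_per_file) * days_per_file


# "retain 7 2": rf = (7 + 2 - 1) / 2 = 4, so the window is 4 x 2 = 8 days.
d1 = date(1994, 2, 26).toordinal()
d2 = date(1994, 3, 5).toordinal()  # seven days later
print(retention_factor(7, 2))
print(is_retained(d1, d2, 7, 2))
```

Rounding rf up in this way means that the retention window is always a whole number of files, so a file is never deleted while part of it is still within days_to_keep.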
If days_per_file is larger than one, xmservd looks for a file with a name that indicates it is less than days_per_file days old. If such a file exists, recording continues to that file. If not, a new file with a name generated from today's date is created.
When an existing recording file is opened by xmservd, the daemon checks the first (configuration) record in the file. This record carries the time and date of the last modification to the recording configuration file as of the time the recording file was created. If the recording configuration file has been modified since that time, xmservd begins the recording by appending a full set of control records to the file and adding the character "@" at the end of the file name. Most programs that process such a file only process the part of the file up to the second set of control records.
The frequency line sets the default sampling interval for metrics. This interval is used for all metrics for which you do not specify a different sampling interval on their metric lines. The line looks like this:
frequency interval
Recordings contain one set of statistics for each sampling interval you specify with this line type and on metric lines. It is recommended that no set of statistics ever has more than 256 metrics.
One metric line must be supplied for each metric you want recorded. The metric lines have the following format:
metric_name [interval]
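The relationship between metric lines, sampling intervals, and statsets can be sketched in Python. The function names are hypothetical, and the minimum sampling interval of the daemon is passed in as a parameter because its value is not given here; the value 500 used in the example is an assumption for illustration only.

```python
from collections import defaultdict
from typing import Dict, List

MAX_RECOMMENDED_METRICS = 256  # recommendation quoted in the text


def group_metric_lines(lines: List[str], default_interval: int) -> Dict[int, List[str]]:
    """Group lines of the form 'metric_name [interval]' by sampling interval."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment line
        interval = int(fields[1]) if len(fields) > 1 else default_interval
        groups[interval].append(fields[0])
    return dict(groups)


def statset_number(interval: int, min_interval: int) -> int:
    """Sampling interval divided by the daemon's minimum sampling interval."""
    return interval // min_interval


metric_lines = [
    "CPU/cpu0/user",
    "CPU/cpu0/kern",
    "IP/NetIF/tr0/ioctet 20000",
    "IP/NetIF/tr0/ooctet 20000",
]
for interval, names in group_metric_lines(metric_lines, default_interval=60000).items():
    # ASSUMPTION: 500 is a placeholder for the daemon's minimum interval.
    number = statset_number(interval, min_interval=500)
    flag = " (exceeds recommendation)" if len(names) > MAX_RECOMMENDED_METRICS else ""
    print(f"statset {number}: interval {interval}, {len(names)} metrics{flag}")
```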
Process contexts have a name consisting of the process ID, a ~ (tilde), and the name of the executing program. To reach a statistic for a specific process, you can specify the process context name as either the process ID followed by the tilde, or the name of the executing program. The example below shows how to specify a statistic for the wait pseudo process, which, on AIX Version 3.2, always has a process ID of 514. Both lines point to the same statistic.
Proc/514~/usercpu
Proc/wait/usercpu
If you specify a name of a program currently executing in more than one process, only the first one encountered is used. Generally, recording of process statistics from xmservd is discouraged except for processes that are expected to never die. If a process dies, it is deleted from the statset and is not added back, should the process be restarted later.
The start-stop lines specify when recordings shall start and stop. Multiple lines may be used. The format of a start-stop line is:
start dd hh mm dd hh mm
The first set of dd hh mm values specifies the time to start recording; the second set specifies the time to stop recording.
Exercise care when matching start and stop times, especially when using multiple start-stop lines. It can be difficult to do so without plotting the recording intervals on a time scale. Therefore, the program xmscheck is available to preparse a recording configuration file and help you evaluate the resulting recording intervals.
The following examples help you understand how recording intervals are defined. First, consider the following start-stop line, which causes recording to take place for 10 minutes every half hour between 9 am and 6 pm on all weekdays. Notice that the last time recording starts every day is at 17:30 (5:30 pm).
start 1-5 9-17 0,30 1-5 9-17 10,40
If another start-stop line was added, that line would augment the first one. This is done by laying the intervals out on a time scale where all start and stop points are marked. The time scale is then processed from the beginning, creating a final set of start and stop marks by eliminating all stop marks that fall at the same minute as a start mark. Assume we supply the following two start-stop lines:
start 1-5 9-17 0,30 1-5 9-17 10,40
start 5 18-19 0,30 5 18-19 10,40
This would cause recording to take place for 10 minutes every half hour between 9 am and 6 pm on the first four weekdays and between 9 am and 8 pm on Fridays. The same could have been specified with:
start 1-4 9-17 0,30 1-4 9-17 10,40
start 5 9-19 0,30 5 9-19 10,40
The time scale created by xmservd does not wrap to the next week. Therefore, if you want recording from 11.30 pm to 12.30 am every night of the week, you need two lines:
start 0-6 23 30 1-6 00 30
start 0 0 0 0 0 30
For continuous recording at all times, specify:
start 0 0 0 0 0 0
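The time-scale processing described above can be sketched in Python. This is an illustration under stated assumptions, not the parser used by xmservd. The function names are hypothetical. Each field is assumed to accept single values, ranges such as 1-5, and comma-separated lists such as 0,30, with start and stop marks generated for every combination of day, hour, and minute. The per-minute codes (0 inactive, 1 active, 2 stop, 3 start) are an assumption modeled on the codes that appear in the xmscheck time table.

```python
from itertools import product
from typing import List, Set

MINUTES_PER_WEEK = 7 * 24 * 60


def expand(field: str) -> List[int]:
    """Expand a field such as '1-5' or '0,30' into a list of integers."""
    values: List[int] = []
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            values.extend(range(int(low), int(high) + 1))
        else:
            values.append(int(part))
    return values


def marks(day: str, hour: str, minute: str) -> Set[int]:
    """Minute-of-week offsets for every day/hour/minute combination."""
    return {(d * 24 + h) * 60 + m
            for d, h, m in product(expand(day), expand(hour), expand(minute))}


def time_table(start_stop_lines: List[str]) -> List[int]:
    """Lay all marks on one weekly time scale and walk it from the beginning."""
    starts: Set[int] = set()
    stops: Set[int] = set()
    for line in start_stop_lines:
        keyword, *fields = line.split()
        if keyword != "start" or len(fields) != 6:
            raise ValueError(f"not a start-stop line: {line!r}")
        starts |= marks(*fields[:3])
        stops |= marks(*fields[3:])
    stops -= starts  # a stop at the same minute as a start is eliminated
    table: List[int] = []
    active = False
    for minute in range(MINUTES_PER_WEEK):  # the scale does not wrap
        if minute in starts:
            active = True
            table.append(3)  # ASSUMED code for a start mark
        elif minute in stops:
            active = False
            table.append(2)  # ASSUMED code for a stop mark
        else:
            table.append(1 if active else 0)
    return table


table = time_table(["start 1-5 8 30 1-5 17 0", "start 1-5 13 0 1-5 12 0"])
tuesday_8 = (2 * 24 + 8) * 60
print("".join(map(str, table[tuesday_8:tuesday_8 + 60])))
```

Because coinciding stop marks are removed before the walk, the line start 0 0 0 0 0 0 yields a single start mark and no stop mark, which is why it records continuously.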
# SAMPLE RECORDING CONFIGURATION FILE
# Keep files at least 7 days and let each file contain
# two day's recordings
retain 7 2
# Set default sampling interval to one minute
frequency 60000
# Give five statistics to record with default frequency
CPU/cpu0/user
CPU/cpu0/kern
Mem/Real/sysrepag
Mem/Virt/pagein
Mem/Virt/steal
# Two additional statistics are recorded every 20 seconds
IP/NetIF/tr0/ioctet 20000
IP/NetIF/tr0/ooctet 20000
# record every weekday from 8.30 am to 5 pm, except during
# the lunch hour from noon to 1 pm
start 1-5 8 30 1-5 17 0
start 1-5 13 0 1-5 12 0
When xmservd is started with the command line argument -v, its recording configuration file parser writes the result of the parsing to the log file. The output includes a copy of all lines in the recording configuration file, any error messages, and a map of the time scale with indication of when recording starts and stops.
While this is useful to document what is read from the recording configuration file, it is not a very useful tool for debugging a new or modified recording configuration file. Therefore, the program xmscheck is available to preparse a recording configuration file before you move it to the directory /etc/perf, where xmservd looks for the recording configuration file.
When xmscheck is started without any command line argument, it parses the file /etc/perf/xmservd.cf. This way, you can determine how the running daemon is configured for recording. If a file name is specified on the command line, that file is parsed.
Output from xmscheck goes to stdout. The parsing is done by the exact same module that does the parsing in xmservd. That module is linked in as part of both programs. The parsing checks that all statistics specified are valid and prints the time scale for starting and stopping recording in the form of a "time table."
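In the time table, each minute has a numeric code. The meaning of codes is as follows:
0 Recording is inactive.
1 Recording is active.
2 Recording stops at this minute.
3 Recording starts at this minute.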
The following sample shows how xmscheck formats the time table. Only the part of the table that covers Tuesday is shown. The example recording configuration file was used to produce this output.
Day 2, Hour 00: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 01: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 02: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 03: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 04: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 05: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 06: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 07: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 08: 000000000000000000000000000000311111111111111111111111111111
Day 2, Hour 09: 111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 10: 111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 11: 111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 12: 200000000000000000000000000000000000000000000000000000000000
Day 2, Hour 13: 311111111111111111111111111111111111111111111111111111111111
Day 2, Hour 14: 111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 15: 111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 16: 111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 17: 200000000000000000000000000000000000000000000000000000000000
Day 2, Hour 18: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 19: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 20: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 21: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 22: 000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 23: 000000000000000000000000000000000000000000000000000000000000
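A time table in this form can be long to read by eye, so a short Python sketch for extracting the start and stop points may help. It assumes that the output has one "Day d, Hour hh:" entry per line followed by 60 codes, and that code 3 marks a start and code 2 marks a stop, as the sample suggests. The function name recording_events is hypothetical and is not part of xmscheck.

```python
import re
from typing import Iterable, List, Tuple

# One table row: day number, hour, and 60 per-minute codes.
ROW = re.compile(r"Day (\d+), Hour (\d+): ([0-3]{60})$")


def recording_events(lines: Iterable[str]) -> List[Tuple[str, int, int, int]]:
    """Return (event, day, hour, minute) for each start or stop code found."""
    events: List[Tuple[str, int, int, int]] = []
    for line in lines:
        match = ROW.match(line.strip())
        if not match:
            continue  # skip anything that is not a time table row
        day, hour, codes = int(match[1]), int(match[2]), match[3]
        for minute, code in enumerate(codes):
            if code == "3":    # ASSUMED meaning: recording starts
                events.append(("start", day, hour, minute))
            elif code == "2":  # ASSUMED meaning: recording stops
                events.append(("stop", day, hour, minute))
    return events


rows = [
    "Day 2, Hour 08: " + "0" * 30 + "3" + "1" * 29,
    "Day 2, Hour 12: " + "2" + "0" * 59,
]
for event, day, hour, minute in recording_events(rows):
    print(f"{event:5} day {day} at {hour:02d}:{minute:02d}")
```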