Perhaps the best tool for an overall look at resource utilization while running a multiuser workload is the vmstat command. The vmstat command reports CPU and disk-I/O activity as well as memory utilization data. The command
$ vmstat 5
causes the vmstat command to begin writing a one-line summary report of system activity every 5 seconds. Since no count was specified following the interval, reporting continues until the command is cancelled.
The following vmstat report was made on a system running AIXwindows and several synthetic applications (some low-activity intervals have been removed):
procs    memory               page                faults           cpu
----- ------------ -------------------------- -------------- ---------------
 r  b    avm   fre  re  pi  po   fr    sr  cy   in   sy   cs  us  sy  id  wa
 0  0   8793    81   0   0   0    1     7   0  125   42   30   1   2  95   2
 0  0   8793    80   0   0   0    0     0   0  155  113   79  14   8  78   0
 0  0   8793    57   0   3   0    0     0   0  178   28   69   1  12  81   6
 0  0   9192    66   0   0  16   81   167   0  151   32   34   1   6  77  16
 0  0   9193    65   0   0   0    0     0   0  117   29   26   1   3  96   0
 0  0   9193    65   0   0   0    0     0   0  120   30   31   1   3  95   0
 0  0   9693    69   0   0  53  100   216   0  168   27   57   1   4  63  33
 0  0   9693    69   0   0   0    0     0   0  134   96   60  12   4  84   0
 0  0  10193    57   0   0   0    0     0   0  124   29   32   1   3  94   2
 0  0  11194    64   0   0  38  201  1080   0  168   29   57   2   8  62  29
 0  0  11194    63   0   0   0    0     0   0  141  111   65  12   7  81   0
 0  0   5480   755   3   1   0    0     0   0  154  107   71  13   8  78   2
 0  0   5467  5747   0   3   0    0     0   0  167   39   68   1  16  79   5
 0  1   4797  5821   0  21   0    0     0   0  191  192  125  20   5  42  33
 0  1   3778  6119   0  24   0    0     0   0  188  170   98   5   8  41  46
 0  0   3751  6139   0   0   0    0     0   0  145   24   54   1  10  89   0
The columns of interest for this initial assessment are pi and po in the page category and the four columns in the cpu category.
By "approaching its limits," we mean that some parts of the workload are already experiencing a slowdown due to the critical resource. The longer response times may not be subjectively significant yet, but an increase in that element of the workload will cause a rapid deterioration of performance.
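A long vmstat run can be scanned mechanically for the pi and po columns named above. The following is a minimal sketch rather than part of the original tooling: the function name flag_paging is hypothetical, and it assumes the 17-column report layout shown earlier, in which pi is field 6, po is field 7, and wa is field 17. Treating any nonzero pi or po value as worth flagging is also an assumption made for illustration.

```shell
#!/bin/sh
# flag_paging (hypothetical helper): read vmstat interval lines on stdin
# and print only the intervals with paging-space activity.
# Assumes the 17-column layout: pi = field 6, po = field 7, wa = field 17.
# Header and separator lines are skipped because their first field is
# not a number.
flag_paging() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 17 && ($6 > 0 || $7 > 0) {
        printf "pi=%d po=%d wa=%d%%\n", $6, $7, $17
    }'
}
```

It could be used as vmstat 5 12 | flag_paging, which would print one line for each 5-second interval in which pages moved to or from paging space.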
If vmstat indicates a significant amount of I/O wait time, an iostat will give more detailed information. The command
$ iostat 5 3
causes iostat to begin writing summary reports of I/O activity and CPU utilization every 5 seconds. Since a count of 3 was specified following the interval, reporting will stop after the third report.
The following iostat report was made on a system running the same workload as the vmstat reports above, but at a different time. The first report is for the cumulative activity since the preceding boot, while subsequent reports are for activity during the preceding 5-second interval:
tty:      tin      tout    cpu:   % user    % sys    % idle    %iowait
          0.0       4.3              0.2      0.6      98.8        0.4

Disks:    % tm_act    Kbps     tps    msps   Kb_read   Kb_wrtn
hdisk0       0.0       0.2     0.0              7993      4408
hdisk1       0.0       0.0     0.0              2179      1692
hdisk2       0.4       1.5     0.3             67548     59151
cd0          0.0       0.0     0.0                 0         0

tty:      tin      tout    cpu:   % user    % sys    % idle    %iowait
          0.0      30.3              8.8      7.2      83.9        0.2

Disks:    % tm_act    Kbps     tps    msps   Kb_read   Kb_wrtn
hdisk0       0.2       0.8     0.2                 4         0
hdisk1       0.0       0.0     0.0                 0         0
hdisk2       0.0       0.0     0.0                 0         0
cd0          0.0       0.0     0.0                 0         0

tty:      tin      tout    cpu:   % user    % sys    % idle    %iowait
          0.0       8.4              0.2      5.8       0.0       93.8

Disks:    % tm_act    Kbps     tps    msps   Kb_read   Kb_wrtn
hdisk0       0.0       0.0     0.0                 0         0
hdisk1       0.0       0.0     0.0                 0         0
hdisk2      98.4     575.6    61.9               396      2488
cd0          0.0       0.0     0.0                 0         0
The first report, which displays cumulative activity since the last boot, shows that the I/O on this system is unbalanced. Most of the I/O (86.9% of kilobytes read and 90.7% of kilobytes written) is to hdisk2, which contains both the operating system and the paging space. The cumulative since-boot CPU utilization statistic is usually meaningless unless the system is used consistently 24 hours a day.
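The percentages quoted for hdisk2 are each disk's share of the total Kb_read and Kb_wrtn columns. The sketch below reproduces that arithmetic. The function name io_share and its input format (one line per disk containing the name, kilobytes read, and kilobytes written) are assumptions made for illustration; the numbers would first have to be extracted from the iostat report.

```shell
#!/bin/sh
# io_share (hypothetical helper): read "disk Kb_read Kb_wrtn" lines on
# stdin and print each disk's percentage of all kilobytes read and of
# all kilobytes written.
io_share() {
    awk '{ name[NR] = $1; rd[NR] = $2; wr[NR] = $3; tr += $2; tw += $3 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%s %.1f %.1f\n", name[i],
                        (tr ? 100 * rd[i] / tr : 0),
                        (tw ? 100 * wr[i] / tw : 0)
         }'
}

# Cumulative figures from the first report above.
printf '%s\n' 'hdisk0 7993 4408' 'hdisk1 2179 1692' \
              'hdisk2 67548 59151' 'cd0 0 0' | io_share
```

For the figures in the first report, the hdisk2 line comes out as 86.9 and 90.7, matching the percentages in the text.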
The second report shows a small amount of disk activity reading from hdisk0, which contains a separate file system for the system's primary user. The CPU activity arises from two application programs and iostat itself. Although iostat's output is redirected to a file, the output is not voluminous, and the system is not sufficiently memory-constrained to force any output during this interval.
In the third report, we have artificially created a near-thrashing condition by running a program that allocates, and stores into, a large amount of memory (about 26MB in this example). hdisk2 is active 98.4% of the time, which results in 93.8% I/O wait. The fact that a single program that uses more than three-fourths of the system's memory (32MB) can cause the system to thrash reminds us of the limits of VMM memory load control. Even with a more homogeneous workload, we need to understand the memory requirements of the components.
If vmstat indicates that there is a significant amount of CPU idle time when the system seems subjectively to be running slowly, you may be experiencing delays due to kernel lock contention. In AIX Version 4, this possibility can be investigated with the lockstat command if the Performance Toolbox is installed on your system.
If you are the sole user of a system, you can get a general idea of whether a program is I/O or CPU dependent by using the time command as follows:
$ time cp foo.in foo.out
real    0m0.13s
user    0m0.01s
sys     0m0.02s
Note: Examples of the time command here and elsewhere in this guide use the version that is built into the Korn shell. The official time command (/usr/bin/time) reports with a lower precision and has other disadvantages.
In this example, the fact that the real, elapsed time for the execution of the cp command (0.13 seconds) is significantly greater than the sum (0.03 seconds) of the user and system CPU times indicates that the program is I/O bound. This occurs primarily because foo.in has not been read recently. Running the same command a few seconds later against the same file gives:
real    0m0.06s
user    0m0.01s
sys     0m0.03s
Most or all of the pages of foo.in are still in memory because there has been no intervening process to cause them to be reclaimed and because the file is small compared with the amount of RAM on the system. A small foo.out would also be buffered in memory, and a program using it as input would show little disk dependency.
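The comparison of elapsed time with CPU time in the two runs above can be reduced to a single figure. The following sketch computes the percentage of elapsed time that was spent on the CPU. The function name cpu_share is hypothetical, and the times are assumed to have been converted to plain seconds by hand.

```shell
#!/bin/sh
# cpu_share REAL USER SYS (hypothetical helper): print the percentage of
# elapsed (real) time accounted for by user plus system CPU time.
# All three arguments are in seconds.
cpu_share() {
    awk -v real="$1" -v user="$2" -v sys="$3" 'BEGIN {
        if (real <= 0) { print "n/a"; exit }
        printf "%.0f\n", 100 * (user + sys) / real
    }'
}

cpu_share 0.13 0.01 0.02   # first run: foo.in not yet in memory
cpu_share 0.06 0.01 0.03   # second run: foo.in still in memory
```

The first run yields 23 and the second 67. A low figure suggests that the program spent most of its elapsed time waiting, which on a single-user system usually means waiting for I/O.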
If you are trying to determine the disk dependency of a program, you have to be sure that its input is in an authentic state. That is, if the program will normally be run against a file that has not been accessed recently, you must make sure that the file used in measuring the program is not in memory. If, on the other hand, a program is usually run as part of a standard sequence in which it gets its input from the output of the preceding program, you should prime memory to ensure that the measurement is authentic. For example,
$ cp foo.in /dev/null
would have the effect of priming memory with the pages of foo.in.
The situation is more complex if the file is large compared to RAM. If the output of one program is the input of the next and the entire file won't fit in RAM, the second program will end up reading pages at the head of the file, which displace pages at the end. Although this situation is very hard to simulate authentically, it is nearly equivalent to one in which no disk caching takes place.
The case of a file that is (perhaps just slightly) larger than RAM is a special case of the RAM versus disk analysis discussed in the next section.
Just as a large fraction of real memory is available for buffering files, the system's page space is available as temporary storage for program working data that has been forced out of RAM. Suppose that you have a program that reads little or no data and yet shows the symptoms of being I/O dependent. Worse, the ratio of real time to user + system time does not improve with successive runs. The program is probably memory-limited, and its I/O is to, and possibly from, the paging space. A way to check on this possibility is shown in the following vmstatit shell script. The vmstatit script summarizes the voluminous vmstat -s report, which gives cumulative counts for a number of system activities since the system was started:
vmstat -s >temp.file                    # cumulative counts before the command
time $1                                 # command under test
vmstat -s >>temp.file                   # cumulative counts after execution
grep "pagi.*ins" temp.file >>results    # extract only the data
grep "pagi.*outs" temp.file >>results   # of interest
If the shell script is run as follows, the results file contains the output of the time command followed by the extracted paging counts:
$ vmstatit "cp file1 file2" 2>results
real    0m0.03s
user    0m0.01s
sys     0m0.02s
     2323 paging space page ins
     2323 paging space page ins
     4850 paging space page outs
     4850 paging space page outs
The fact that the before-and-after paging statistics are identical confirms our belief that the cp command is not paging bound. An extended variant of the vmstatit shell script can be used to show the true situation:
vmstat -s >temp.file
time $1
vmstat -s >>temp.file
echo "Ordinary Input:" >>results
grep "^[ 0-9]*page ins" temp.file >>results
echo "Ordinary Output:" >>results
grep "^[ 0-9]*page outs" temp.file >>results
echo "True Paging Output:" >>results
grep "pagi.*outs" temp.file >>results
echo "True Paging Input:" >>results
grep "pagi.*ins" temp.file >>results
Because all ordinary I/O in the AIX operating system is processed via the VMM, the vmstat -s command reports ordinary program I/O as page ins and page outs. When the above version of the vmstatit shell script was run against the cp command of a large file that had not been read recently, the result was:
real    0m2.09s
user    0m0.03s
sys     0m0.74s
Ordinary Input:
    46416 page ins
    47132 page ins
Ordinary Output:
   146483 page outs
   147012 page outs
True Paging Output:
     4854 paging space page outs
     4854 paging space page outs
True Paging Input:
     2527 paging space page ins
     2527 paging space page ins
The time command output confirms the existence of an I/O dependency. The increase in page ins shows the I/O necessary to satisfy the cp command. The increase in page outs indicates that the file is large enough to force the writing of dirty pages (not necessarily its own) from memory. The fact that there is no change in the cumulative paging-space-I/O counts confirms that the cp command does not build data structures large enough to overload the memory of the test machine.
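Because the script reports before-and-after pairs, the quantity of interest is the difference within each pair. The following sketch prints that difference directly. The function name count_delta is hypothetical, and the sketch assumes that the file holds exactly two vmstat -s snapshots (before and after), as temp.file does in the scripts above.

```shell
#!/bin/sh
# count_delta PATTERN FILE (hypothetical helper): select the lines of FILE
# that match PATTERN and print the last count minus the first count.
# Assumes FILE holds two "vmstat -s" snapshots, so each pattern matches
# one "before" line and one "after" line, with the count in field 1.
count_delta() {
    grep "$1" "$2" | awk '
        NR == 1 { before = $1 }
                { after = $1 }
        END     { if (NR) print after - before }'
}
```

With the same patterns the script uses, count_delta "^[ 0-9]*page ins" temp.file would report the ordinary page ins attributable to the measured command, and count_delta "pagi.*ins" temp.file the paging-space page ins. For the cp example above, the ordinary counts differ by 716 page ins and 529 page outs, while both paging-space differences are 0.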
The order in which this version of the vmstatit script reports I/O is intentional. Typical programs read file input and then write file output. Paging activity, on the other hand, typically begins with the writing out of a working-segment page that does not fit. The page is read back in only if the program tries to access it. The fact that the test system has experienced almost twice as many paging space page outs as paging space page ins since it was booted indicates that at least some of the programs that have been run on this system have stored data in memory that was not accessed again before the end of the program. "Memory-Limited Programs" provides more information. See also "Monitoring and Tuning Memory Use".
To show the effects of memory limitation on these statistics, the following example observes a given command in an environment of adequate memory (32MB) and then artificially shrinks the system using the rmss command (see "Assessing Memory Requirements via the rmss Command"). The command sequence
$ cc -c ed.c
$ vmstatit "cc -c ed.c" 2>results
first primes memory with the 7944-line source file and the executable file of the C compiler, then measures the I/O activity of the second execution:
real    0m7.76s
user    0m7.44s
sys     0m0.15s
Ordinary Input:
    57192 page ins
    57192 page ins
Ordinary Output:
   165516 page outs
   165553 page outs
True Paging Output:
    10846 paging space page outs
    10846 paging space page outs
True Paging Input:
     6409 paging space page ins
     6409 paging space page ins
Clearly, this is not I/O limited. There is not even any I/O necessary to read the source code. If we then issue the command:
# rmss -c 8
to change the effective size of the machine to 8MB, and perform the same sequence of commands, we get:
real    0m9.87s
user    0m7.70s
sys     0m0.18s
Ordinary Input:
    57625 page ins
    57809 page ins
Ordinary Output:
   165811 page outs
   165882 page outs
True Paging Output:
    11010 paging space page outs
    11061 paging space page outs
True Paging Input:
     6623 paging space page ins
     6701 paging space page ins
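The symptoms of I/O dependency are present: the elapsed time (9.87 seconds) is noticeably longer than the total CPU time (7.88 seconds), and the ordinary page-in count rose by 184 even though this was a repeat execution of the command.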
The fact that the elapsed time is longer than in the memory-unconstrained situation, and the existence of significant amounts of paging-space I/O, make it clear that the compiler is being hampered by insufficient memory.
Note: This example illustrates the effects of memory constraint. No effort was made to minimize the use of memory by other processes, so the absolute size at which the compiler was forced to page in this environment does not constitute a meaningful measurement.
To avoid working with an artificially shrunken machine until the next restart, run
# rmss -r
to release back to the operating system the memory that the rmss command had sequestered, thus restoring the system to its normal capacity.