Performance tuning is primarily a matter of resource management and proper system parameter setting. Tuning the workload and the system for efficient resource use consists of the following steps:
It is essential that all of the work performed by the system be identified. Especially in LAN-connected systems, a complex set of cross-mounted file systems can easily develop with only informal agreement among the users of the systems. These cross-mounted file systems, and the work they generate, must be identified and taken into account as part of any tuning activity.
With multiuser workloads, the analyst must quantify both the typical and peak request rates. It's also important to be realistic about the proportion of the time that a user is actually interacting with the terminal.
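As a rough illustration of how user counts and terminal interaction time translate into request rates, the sketch below applies the interactive response-time law, X = N / (R + Z), where N is the number of active users, R the response time, and Z the think time. It assumes a closed workload in which each user waits for a response before thinking about the next request. The function name and all sample figures are hypothetical, not measurements from any system.

```python
def request_rate(active_users: int, think_time_s: float, response_time_s: float) -> float:
    """Requests per second from users who alternate between thinking and waiting.

    Assumes a closed interactive workload: each user issues one request,
    waits response_time_s for the answer, then thinks for think_time_s.
    """
    cycle_s = think_time_s + response_time_s
    if cycle_s <= 0:
        raise ValueError("think time plus response time must be positive")
    return active_users / cycle_s


# Hypothetical figures: a typical hour versus a peak period.
typical_rate = request_rate(active_users=40, think_time_s=20.0, response_time_s=1.0)
peak_rate = request_rate(active_users=100, think_time_s=10.0, response_time_s=1.0)

print(f"typical: {typical_rate:.2f} requests/s, peak: {peak_rate:.2f} requests/s")
```

Overstating how busy each user is (that is, underestimating think time) inflates the estimated rate proportionally, which is why a realistic think time matters.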
An important element of this stage is determining whether the measurement and tuning activity has to be done on the production system or can be accomplished on another system (or off-shift) with a simulated version of the actual workload. The analyst must weigh the greater authenticity of results from a production environment against the flexibility of the nonproduction environment, where the analyst can perform experiments that risk performance degradation or worse.
Objectives must be set in terms of measurable quantities, yet the actual desired result is often subjective, such as "satisfactory" response time. Further, the analyst must resist the temptation to tune what is measurable rather than what is important. If no system-provided measurement corresponds to the desired improvement, one must be devised.
The most valuable aspect of quantifying the objectives is not selecting numbers to be achieved, but making a public decision about the relative importance of (usually) multiple objectives. Unless these priorities are set in advance, and understood by all concerned, the analyst cannot make trade-off decisions without incessant consultation and is apt to be surprised by the reaction of users or management to aspects of performance that have been ignored. If the support and use of the system crosses organizational boundaries, a written service-level agreement between the providers and the users may be needed to ensure that there is a clear common understanding of the performance objectives and priorities.
In general, the performance of a given workload is determined by the availability and speed of one or two critical system resources. The analyst must identify those resources correctly or risk falling into an endless trial-and-error operation.
Systems have both real and logical resources. Critical real resources are generally easier to identify, since more system performance tools are available to assess the utilization of real resources. The real resources that most often affect performance are:
Logical resources are less readily identified. They are generally programming abstractions that partition real resources so that the real resources can be shared and managed.
Some examples of real resources and the logical resources built on them are:
It is important to be aware of logical resources as well as real resources. Threads can be blocked by a lack of logical resources just as they can by a lack of real resources, and expanding the underlying real resource does not necessarily ensure that additional logical resources will be created. For example, consider the NFS block I/O daemon (biod, see "NFS Tuning"). A biod on the client is required to handle each pending NFS remote I/O request. The number of biods therefore limits the number of NFS I/O operations that can be in progress simultaneously. When a shortage of biods exists, system instrumentation may indicate that the CPU and communications links are only slightly utilized. You may have the false impression that your system is underutilized (and slow), when in fact a shortage of biods is constraining the rest of the resources. A biod uses processor cycles and memory, but you cannot fix this problem simply by adding real memory or converting to a faster CPU. The solution is to create more of the logical resource (biods).
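The way a fixed pool of daemons caps throughput can be sketched with Little's law: if each in-flight request occupies one daemon for its whole round trip, the request rate cannot exceed the number of daemons divided by the round-trip time. The function names and numbers below are hypothetical, and the model assumes a constant round-trip time, which a real NFS client would not exhibit.

```python
import math


def max_requests_per_second(daemons: int, round_trip_ms: float) -> float:
    """Upper bound on request rate when each request holds one daemon."""
    return daemons * 1000 / round_trip_ms


def daemons_needed(target_per_second: float, round_trip_ms: float) -> int:
    """Smallest daemon count whose upper bound reaches the target rate."""
    return math.ceil(target_per_second * round_trip_ms / 1000)


# Hypothetical client: 6 daemons, 20 ms per remote I/O round trip.
current_limit = max_requests_per_second(daemons=6, round_trip_ms=20)
required = daemons_needed(target_per_second=400, round_trip_ms=20)

print(f"limit with 6 daemons: {current_limit:.0f} requests/s")
print(f"daemons needed for 400 requests/s: {required}")
```

In this model the limit depends only on the daemon count and the round-trip time, so a faster CPU or more memory leaves it unchanged, which matches the point made above.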
Logical resources and bottlenecks can be created inadvertently during application development. A method of passing data or controlling a device may, in effect, create a logical resource. When such resources are created by accident, there are generally no tools to monitor their use and no interface to control their allocation. Their existence may not be appreciated until a specific performance problem highlights their importance.
The decision to use one resource over another should be made consciously and with specific goals in mind. An example of a resource choice during application development would be a trade-off of increased memory consumption for reduced CPU consumption. A common system configuration decision that demonstrates resource choice is whether to place files locally on an individual workstation or remotely on a server.
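A minimal sketch of the memory-for-CPU trade-off is result caching: the program keeps earlier results in memory so that repeated requests skip the computation. The example uses Python's standard `functools.lru_cache`; the function `sum_of_squares` and the cache size are invented for illustration and stand in for any expensive, repeatable computation.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive body actually runs


@lru_cache(maxsize=1024)  # memory spent: up to 1024 cached results
def sum_of_squares(n: int) -> int:
    """Stand-in for an expensive computation with a repeatable result."""
    global call_count
    call_count += 1
    return sum(i * i for i in range(n))


# Five identical requests: only the first one consumes CPU for the loop.
for _ in range(5):
    sum_of_squares(10_000)

print(f"computed {call_count} time(s), cache stats: {sum_of_squares.cache_info()}")
```

The `maxsize` bound is the conscious part of the decision: it states how much memory the application is willing to give up in exchange for the saved CPU time.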
For locally developed applications, the programs can be reviewed for ways to perform the same function more efficiently or to remove unnecessary function. At a system-management level, low-priority workloads that are contending for the critical resource can be moved to other systems or run at other times.
Because workloads require multiple system resources to run, take advantage of the fact that the resources are separate and can be consumed in parallel. For example, the AIX system read-ahead algorithm detects that a program is accessing a file sequentially and schedules additional sequential reads to be done in parallel with the application's processing of the previous data. Parallelism applies to system management as well. For example, if an application accesses two or more files at the same time, adding a disk drive may improve the disk-I/O rate if the files that are accessed at the same time are placed on different drives.
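The benefit of overlapping reads with processing can be estimated with a simple two-stage pipeline model. The sketch below is not the AIX read-ahead algorithm; it only compares elapsed time when each block is read and then processed in turn against the case where the next read proceeds while the current block is being processed. It assumes constant per-block times, and the figures are hypothetical.

```python
def serial_time_ms(blocks: int, read_ms: int, process_ms: int) -> int:
    """Elapsed time when each block is read, then processed, one after another."""
    return blocks * (read_ms + process_ms)


def overlapped_time_ms(blocks: int, read_ms: int, process_ms: int) -> int:
    """Elapsed time when reading block i+1 overlaps processing of block i.

    The first read and the last processing step cannot be overlapped;
    in between, the slower of the two stages sets the pace.
    """
    if blocks == 0:
        return 0
    return read_ms + (blocks - 1) * max(read_ms, process_ms) + process_ms


# Hypothetical workload: 100 blocks, 4 ms to read and 6 ms to process each.
serial = serial_time_ms(100, 4, 6)
overlapped = overlapped_time_ms(100, 4, 6)

print(f"serial: {serial} ms, overlapped: {overlapped} ms")
```

With these numbers the disk time is almost entirely hidden behind processing. If reads were the slower stage instead, the elapsed time would approach the total read time.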
AIX provides a number of ways of prioritizing activities. Some, such as disk pacing, are set at the system level. Others, such as process priority, can be set by individual users to reflect the importance they attach to a specific task.
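As an illustration of user-set process priority, the sketch below starts a command through the POSIX `nice` utility so that it runs at a less favorable scheduling priority. It assumes `nice` is on the PATH; the helper name `run_at_lower_priority` and the increment are hypothetical. The child command simply reports its own nice value so the effect is visible.

```python
import os
import subprocess
import sys


def run_at_lower_priority(command: list[str], increment: int = 10) -> subprocess.CompletedProcess:
    """Run command with its nice value raised by increment (lower priority)."""
    return subprocess.run(
        ["nice", "-n", str(increment), *command],
        capture_output=True,
        text=True,
        check=True,
    )


# The child prints its own nice value; os.nice(0) reads it without changing it.
result = run_at_lower_priority(
    [sys.executable, "-c", "import os; print(os.nice(0))"],
    increment=5,
)
child_nice = int(result.stdout.strip())

print(f"parent nice value: {os.nice(0)}, child nice value: {child_nice}")
```

An unprivileged user can normally only lower the priority of their own work this way. Raising it above the default generally requires administrative authority.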
A truism of performance analysis is that "there is always a next bottleneck." Reducing the use of one resource means that another resource limits throughput or response time. Suppose, for example, we have a system in which the utilization levels are:
CPU: 90%   Disk: 70%   Memory: 60%
This workload is CPU-bound. If we successfully tune the workload so that the CPU load is reduced from 90% to 45%, we might expect a two-fold improvement in performance. Unfortunately, the workload is now I/O-limited, with utilizations of about:
CPU: 45%   Disk: 90%   Memory: 60%
The reduced CPU demand allows the programs to submit disk requests sooner, but then we hit the ceiling imposed by the disk drive's capacity: disk utilization can rise only from 70% to about 90%, a factor of roughly 1.3. The performance improvement is therefore perhaps 30% instead of the 100% we had envisioned.
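The arithmetic behind this estimate can be sketched as follows. The model assumes that the utilization of each rate-limited resource grows in proportion to throughput and that 90% is the practical ceiling, as the figures above suggest. Memory is left out because it is a capacity rather than a rate. The function name is hypothetical.

```python
def throughput_gain(utilization: dict[str, float], ceiling: float = 0.90) -> float:
    """Factor by which throughput can grow before the busiest resource hits ceiling.

    utilization holds each resource's busy fraction at the current throughput;
    the model assumes utilization scales linearly with throughput.
    """
    busiest = max(utilization.values())
    return ceiling / busiest


# Utilization at the original throughput, after CPU cost per request is halved.
after_cpu_tuning = {"cpu": 0.45, "disk": 0.70}

expected = 0.90 / 0.45  # naive expectation based on the CPU alone
actual = throughput_gain(after_cpu_tuning)

print(f"expected gain: {(expected - 1) * 100:.0f}%")
print(f"gain before the disk saturates: {(actual - 1) * 100:.0f}%")
```

The disk, not the CPU, determines the result, because it is the first resource to reach the ceiling as throughput rises.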
There is always a new critical resource. The important question is whether we have met the performance objectives with the resources at hand.
If, after all of the preceding approaches have been exhausted, the performance of the system still does not meet its objectives, the critical resource must be enhanced or expanded. If the critical resource is logical and the underlying real resource is adequate, the logical resource can be expanded at no additional cost. If the critical resource is real, the analyst must investigate some additional questions: