AIX Versions 3.2 and 4 Performance Tuning Guide

Controlling Contention for the CPU

Controlling the Priority of User Processes

User-process priorities can be manipulated using the nice or renice command or the setpri subroutine, and displayed with ps. An overview of priority is given in "Process and Thread Priority".

Running a Command at a Nonstandard Priority with nice

Any user can run a command at a lower than normal priority by using nice. Only root can use nice to run commands at higher than normal priority.

With nice, the user specifies a value to be added to or subtracted from the standard nice value. The modified nice value is used for the process that runs the specified command. The priority of the process is still non-fixed. That is, the priority value is still recalculated periodically based on the CPU usage, nice value, and minimum user-process-priority value.

The standard nice value of a foreground process is 20; the standard nice value of a background process is 24. The nice value is added to the minimum user-process-priority level (40) to obtain the initial priority value of the process. For example, the command:

$ nice -5 vmstat 10 3 >vmstat.out

causes the vmstat command to be run with a nice value of 25 (instead of 20), resulting in a base priority value of 65 (before any additions for recent CPU use).

If we were root, we could have run vmstat at a higher priority with:

# nice --5 vmstat 10 3 >vmstat.out

If we were not root and issued that nice, the vmstat command would still be run, but at the standard nice value of 20, and nice would not issue any error message.
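The arithmetic in this section can be summarized in a short C sketch. The function and macro names are illustrative only and are not part of AIX; the sketch assumes the behavior described above and does not model any limit on the range of nice values.

```c
/* Illustrative sketch of the nice arithmetic described in the text.
   Names are hypothetical; range limits on nice values are not modeled. */
#define MIN_USER_PRIORITY 40   /* minimum user-process-priority level */
#define STANDARD_NICE     20   /* standard nice value, foreground process */

/* nice value a command receives from "nice -<increment>".
   A negative increment from a non-root user is silently ignored. */
int effective_nice(int increment, int is_root)
{
    if (increment < 0 && !is_root)
        return STANDARD_NICE;
    return STANDARD_NICE + increment;
}

/* Initial priority value, before any addition for recent CPU use. */
int initial_priority(int nice_value)
{
    return MIN_USER_PRIORITY + nice_value;
}
```

Under these assumptions, initial_priority(effective_nice(5, 0)) evaluates to 65, matching the nice -5 example, and effective_nice(-5, 0) stays at 20 for a non-root user.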

Setting a Fixed Priority with the setpri Subroutine

An application that runs under the root userid can use the setpri subroutine to set its own priority or that of another process. For example:

  retcode = setpri(0,59);

would give the current process a fixed priority of 59. If setpri fails, it returns -1.

The following program accepts a priority value and a list of process IDs and sets the priority of all of the processes to the specified value.

/*
   fixprocpri.c
   Usage: fixprocpri priority PID . . . 
*/
    
#include <sys/types.h>   /* pid_t */
#include <sys/sched.h>
#include <stdio.h>
#include <stdlib.h>      /* atoi, exit */
#include <sys/errno.h>
    
int main(int argc,char **argv)
{
   pid_t ProcessID;
   int Priority,ReturnP;
    
   if( argc < 3 ) {
      printf(" usage - fixprocpri priority pid(s) \n");
      exit(1);
   }
    
   argv++;                       /* skip the program name */
   Priority=atoi(*argv++);
   if ( Priority < 50 ) {
      printf(" Priority must be >= 50 \n");
      exit(1);
   }
    
   /* The remaining arguments are process IDs */
   while (*argv) {
      ProcessID=atoi(*argv++);
      ReturnP = setpri(ProcessID, Priority);
      if ( ReturnP != -1 )       /* setpri returns -1 on failure */
          printf("pid=%d new pri=%d  old pri=%d\n",
            (int)ProcessID,Priority,ReturnP);
      else {
          perror(" setpri failed ");
          exit(1);
      }
   }
   return 0;
}

Displaying Process Priority with ps

The -l (lower-case L) flag of the ps command displays the nice values and current priority values of the specified processes. For example, we can display the priorities of all of the processes owned by a given user with:

# ps -lu waters
       F S UID  PID PPID   C PRI NI ADDR    SZ    WCHAN    TTY  TIME CMD
  241801 S 200 7032 7287   0  60 20 1b4c   108           pts/2  0:00 ksh
  200801 S 200 7569 7032   0  65 25 2310    88  5910a58  pts/2  0:00 vmstat
  241801 S 200 8544 6495   0  60 20 154b   108           pts/0  0:00 ksh

The output shows the result of the nice -5 command described earlier. Process 7569 has an effective priority of 65. (The ps command was run by a separate session in superuser mode, hence the presence of two TTYs.)

If one of the processes had used the setpri subroutine to give itself a fixed priority, the ps -l output format would be:

       F S UID   PID  PPID   C PRI NI ADDR    SZ    WCHAN    TTY  TIME CMD
  200903 S   0 10759 10500   0  59 -- 3438    40  4f91f98  pts/0  0:00 fixpri
  

Modifying the Priority of a Running Process with renice

Note: In the following discussion, the AIX Version 3 renice syntax is used. The next section discusses AIX Version 3 and 4 nice and renice syntax.

renice alters the nice value, and thus the priority, of one or more processes that are already running. The processes are identified either by process ID, process group ID, or the name of the user who owns the processes. renice cannot be used on fixed-priority processes.

To continue our example, we will renice the vmstat process that we started with nice.

# renice -5 7569
7569: old priority 5, new priority -5
# ps -lu waters
       F S UID  PID PPID   C PRI NI ADDR    SZ    WCHAN    TTY  TIME CMD
  241801 S 200 7032 7287   0  60 20 1b4c   108           pts/2  0:00 ksh
  200801 S 200 7569 7032   0  55 15 2310    92  5910a58  pts/2  0:00 vmstat
  241801 S 200 8544 6495   0  60 20 154b   108           pts/0  0:00 ksh

Now the process is running at a higher priority than the other foreground processes. Observe that renice does not add or subtract the specified amount from the old nice value. It replaces the old nice value with a new one. To undo the effects of all this playing around, we could issue:

# renice -0 7569
7569: old priority -5, new priority 0
# ps -lu waters
       F S UID  PID PPID   C PRI NI ADDR    SZ    WCHAN    TTY  TIME CMD
  241801 S 200 7032 7287   0  60 20 1b4c   108           pts/2  0:00 ksh
  200801 S 200 7569 7032   1  60 20 2310    92  5910a58  pts/2  0:00 vmstat
  241801 S 200 8544 6495   0  60 20 154b   108           pts/0  0:00 ksh
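The replace-not-add behavior shown in these two renice runs can be sketched in C as follows. The names are hypothetical, and the sketch assumes only what the output above shows: the AIX Version 3 renice argument is an offset from the standard nice value of 20.

```c
/* Illustrative sketch; names are hypothetical, not an AIX interface. */
#define MIN_USER_PRIORITY 40
#define STANDARD_NICE     20

/* AIX Version 3 style renice: the argument replaces the current offset
   from the standard nice value; it is not added to the old value. */
int renice_v3(int current_nice, int new_offset)
{
    (void)current_nice;          /* old value is discarded, not adjusted */
    return STANDARD_NICE + new_offset;
}

/* Priority value before any addition for recent CPU use. */
int base_priority(int nice_value)
{
    return MIN_USER_PRIORITY + nice_value;
}
```

Starting from the nice value of 25, renice_v3(25, -5) gives 15 (priority 55) rather than 20, and renice_v3(15, 0) then restores 20 (priority 60), as in the ps output.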

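In these examples, renice was run by root. When run by an ordinary userid, there are two major limitations to the use of renice: only processes owned by that userid can be specified, and the priority of a process cannot be increased, not even to return the process to its default priority after it has been lowered with renice.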

Clarification of nice/renice Syntax

AIX Version 3

The nice and renice commands have different ways of specifying the amount that is to be added to the standard nice value of 20.

With nice, the initial minus sign is required to identify the value, which is assumed to be positive. Specifying a negative value requires a second minus sign (with no intervening space).

With renice, the parameter following the command name is assumed to be the value, and it can be a signed or unsigned (positive) number. Thus the following pairs of commands are equivalent:

                       Resulting    Resulting
                       nice Value   Priority Value
nice -5    renice 5    25           65
nice -5    renice +5   25           65
nice --5   renice -5   15           55

AIX Version 4

For AIX Version 4, the syntax of renice has been changed to complement the alternative syntax of nice, which uses the -n flag to identify the nice-value increment. The following table is the AIX Version 4 equivalent of the table in the preceding section:

                           Resulting    Resulting
                           nice Value   Priority Value
nice -n 5    renice -n 5   25           65
nice -n +5   renice -n +5  25           65
nice -n -5   renice -n -5  15           55

Tuning the Process-Priority-Value Calculation with schedtune

A recent enhancement of schedtune and the AIX CPU scheduler permits changes to the parameters used to calculate the priority value for each process. This enhancement is part of AIX Version 4 and is available in a PTF for AIX Version 3.2.5. See "Process and Thread Priority" for background information on priority.

Briefly, the formula for calculating the priority value is:

   priority value = base priority + nice value + (CPU penalty based on recent CPU usage)

The recent CPU usage value of a given process is incremented by 1 each time that process is in control of the CPU when the timer interrupt occurs (every 10 milliseconds). The recent CPU usage value is displayed as the "C" column in ps command output. The maximum value of recent CPU usage is 120.

The current algorithm calculates the CPU penalty by dividing recent CPU usage by 2. The CPU-penalty-to-recent-CPU-usage ratio is therefore .5. We will call this value R.

Once a second, the current algorithm divides the recent CPU usage value of every process by 2. The recent-CPU-usage-decay factor is therefore .5. We will call this value D.
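As a rough illustration, the default calculation can be written as a few C functions. The names are hypothetical, and the use of truncating integer division is an assumption; the text gives only the ratios.

```c
/* Illustrative sketch of the default priority calculation (R = D = 0.5).
   Names are hypothetical; integer truncation is an assumption. */
#define MIN_USER_PRIORITY 40    /* base priority for user processes */
#define MAX_RECENT_CPU    120   /* cap on the recent CPU usage value */

/* Charged to the running process at each 10 ms timer interrupt. */
int charge_tick(int recent_cpu)
{
    return recent_cpu < MAX_RECENT_CPU ? recent_cpu + 1 : MAX_RECENT_CPU;
}

/* CPU penalty: recent CPU usage divided by 2. */
int cpu_penalty(int recent_cpu)
{
    return recent_cpu / 2;
}

/* Once-a-second decay: recent CPU usage divided by 2. */
int decay(int recent_cpu)
{
    return recent_cpu / 2;
}

int priority_value(int nice_value, int recent_cpu)
{
    return MIN_USER_PRIORITY + nice_value + cpu_penalty(recent_cpu);
}
```

With these definitions, a foreground process (nice value 20) that has been charged 8 ticks has a priority value of 64, the same as a background process (nice value 24) that has not yet run.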

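For some users, the existing algorithm does not allow enough distinction between foreground and background processes. For example, ignoring other activity, if a system were running two compute-intensive user processes that started at the same time, one foreground (nice value = 20) and one background (nice value = 24), the foreground process would be dispatched first. After 8 time slices (80 milliseconds), its CPU penalty would be 4, making its priority value equal to that of the background process. From then on the two processes would alternate, receiving nearly equal shares of the CPU.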

Even if the background process had been started with nice -20, the distinction between foreground and background would be only slightly clearer. Although the scheduler stops counting time slices used after 120, this permits the CPU penalty to level off at 60--more than enough to offset the maximum nice value difference of 40.

To allow greater flexibility in prioritizing processes, the new feature permits user tuning of the ratio of CPU penalty to recent CPU usage (R) and the recent-CPU-usage-decay rate (D). The tuning is accomplished through two new options of the schedtune command: -r and -d. Each option specifies a parameter that is an integer from 0 through 32. The parameters are applied by multiplying the recent CPU usage value by the parameter value and then dividing by 32 (shift right 5). The default r and d values are 16, which yields the same behavior as the original algorithm (D=R=16/32=.5). The new range of values permits a far wider spectrum of behaviors. For example:

# schedtune -r 0 

(R=0, D=.5) would mean that the CPU penalty was always 0, making priority absolute. No background process would get any CPU time unless there were no dispatchable foreground processes at all. The priority values of the processes would effectively be constant, although they would not technically be fixed-priority processes.

# schedtune -r 5

(R=.15625, D=.5) would mean that a foreground process would never have to compete with a background process started with nice -20. The limit of 120 CPU time slices accumulated would mean that the maximum CPU penalty for the foreground process would be 18.

# schedtune -r 6 -d 16

(R=.1875, D=.5) would mean that, if the background process were started with nice -20, it would be at least one second before the background process began to receive any CPU time. Foreground processes, however, would still be distinguishable on the basis of CPU usage. Long-running foreground processes that should probably be in the background would ultimately accumulate enough CPU usage to keep them from interfering with the true foreground.
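The figures in the -r 5 and -r 6 examples can be checked with a small C sketch. The function name is hypothetical, and the gap of 20 assumes a foreground nice value of 20 competing with a background process at nice value 40 (started with nice -20).

```c
/* Illustrative sketch; the function name is hypothetical. */
#define MAX_RECENT_CPU 120

/* Smallest recent-CPU-usage count at which the CPU penalty
   (count * r / 32) reaches `gap`. Returns -1 if the penalty
   cannot reach `gap` before the cap of 120 is hit. */
int ticks_to_close_gap(int r, int gap)
{
    int c;

    for (c = 0; c <= MAX_RECENT_CPU; c++)
        if (((c * r) >> 5) >= gap)
            return c;
    return -1;
}
```

Here ticks_to_close_gap(6, 20) is 107, that is, 107 timer interrupts of 10 milliseconds, or slightly more than one second of CPU time, while ticks_to_close_gap(5, 20) is -1 because the penalty tops out at 18. The once-a-second decay is ignored in this sketch; applying it would only lengthen the wait.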

# schedtune -r 32 -d 32

(R=1, D=1) would mean that long-running processes would reach a C value of 120 and stay there, contending on the basis of their nice values. New processes would have priority, regardless of their nice value, until they had accumulated enough time slices to bring them within the priority value range of the existing processes.
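The way the -r and -d values enter the calculation in the four examples above can be sketched in C. The function names are hypothetical; the multiply-then-shift form follows the description of the parameters given earlier.

```c
/* Illustrative sketch; names are hypothetical.
   r and d are the schedtune -r and -d values, 0 through 32. */

/* CPU penalty: recent CPU usage * r / 32 (shift right 5). */
int tuned_penalty(int recent_cpu, int r)
{
    return (recent_cpu * r) >> 5;
}

/* Recent CPU usage after the once-a-second decay: usage * d / 32. */
int tuned_decay(int recent_cpu, int d)
{
    return (recent_cpu * d) >> 5;
}
```

At the usage cap of 120, tuned_penalty gives 60 for the default r of 16, 0 for r of 0, 18 for r of 5, and 120 for r of 32. With d of 32, tuned_decay returns its input unchanged, which is why the C value stays at 120 in the last example.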

If you conclude that one or both parameters need to be modified to accommodate your workload, you can enter the schedtune command while logged on as root. The changed values will persist until the next schedtune that modifies them or until the next system boot. Values can be reset to their defaults with schedtune -D, but remember that all schedtune parameters are reset by that command, including VMM memory load control parameters. To make a change to the parameters that will persist across boots, you need to add an appropriate line at the end of the /etc/inittab file.

