Linux CPU Bottleneck

CPU Bottlenecks

Whether the CPU is a performance bottleneck is largely a subjective judgment, because it depends on the performance you want from the system. A user who wants rapid response time will apply different criteria than a user who is more interested in supporting a large number of non-interactive jobs.

If the CPU spends little time idle (less than 15 percent), runnable threads are more likely to be forced to wait before they are put into execution. In general, if the CPU spends more than 70 percent of its time in user mode, the application load might need some balancing. In most cases, 30 percent is a good high-water mark for the amount of time that should be spent in system mode.
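The rule-of-thumb thresholds above can be written down as a small check. This is only a sketch: the function name and the idea of passing in the three percentages by hand are assumptions for illustration, and the percentages would come from whatever monitoring tool you already use.

```python
def check_cpu_split(user_pct, system_pct, idle_pct):
    """Return warnings for the rule-of-thumb CPU thresholds.

    The limits (15% idle, 70% user, 30% system) are the guideline
    figures from the text, not hard limits.
    """
    warnings = []
    if idle_pct < 15:
        warnings.append("idle below 15%: runnable threads are likely to wait")
    if user_pct > 70:
        warnings.append("user above 70%: application load may need balancing")
    if system_pct > 30:
        warnings.append("system above 30%: past the system-mode high-water mark")
    return warnings


if __name__ == "__main__":
    # Hypothetical sample: a busy machine dominated by user-mode work.
    for line in check_cpu_split(user_pct=75, system_pct=20, idle_pct=5):
        print(line)
```

An empty list means none of the guideline thresholds was crossed; it does not prove the CPU is not a bottleneck for your workload.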

Runnable threads waiting for a CPU are placed on the dispatch queue (or run queue). If threads are consistently being forced to wait on the dispatch queue before being put into execution, then the CPU is impeding performance to some degree. For a uniprocessor, more than two threads waiting on the dispatch queue might signal a CPU bottleneck. A run queue more than four times the number of CPUs indicates processes are waiting too long for a slice of CPU time. If this is the case, adding more CPU power to the system would be beneficial.
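The run queue guidelines above can be sketched as follows. The function name, the three labels, and the choice to read "consistently" as "in every sample" are assumptions made for this example; the run queue lengths themselves would be collected with your usual monitoring tool.

```python
def classify_run_queue(samples, ncpus):
    """Classify run queue pressure from a series of queue-length samples.

    Uses the smallest sample so that a single spike does not count:
    the queue must stay above the limit in every sample.
    """
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    if not samples:
        raise ValueError("need at least one sample")

    floor = min(samples)  # the queue never dropped below this length
    if floor > 4 * ncpus:
        return "likely"    # more than four times the number of CPUs
    if ncpus == 1 and floor > 2:
        return "possible"  # uniprocessor with more than two waiting threads
    return "ok"


if __name__ == "__main__":
    # Hypothetical samples taken a few seconds apart on a 2-CPU system.
    print(classify_run_queue([9, 10, 12], ncpus=2))
```

Using the minimum is deliberately conservative. A percentile or a moving average would be a reasonable alternative if occasional quiet samples should not hide sustained queuing.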

A slower response time from applications might also indicate that the CPU is a bottleneck. These applications may be waiting for access to the CPU for longer periods of time than normal. Consider locking and cache effects if no other cause for the performance problem can be found.

CPU Bottleneck Solutions

The priocntl and nice commands can be used to change a thread’s priority. Because threads are selected for execution based on priority, threads that need faster access to the CPU can have their priority raised. You can also use processor sets to guarantee or limit access to CPUs.
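As an illustration of the nice mechanism only, the sketch below starts a command with a raised nice value (that is, a lower priority) using Python's standard os.nice and subprocess modules. The helper name run_niced is hypothetical, the example assumes a POSIX system, and it does not cover priocntl or processor sets. Raising priority with a negative increment normally requires privileges.

```python
import os
import subprocess
import sys


def run_niced(argv, increment=5):
    """Run argv with its nice value increased by `increment`.

    A positive increment lowers the priority, which leaves more CPU
    for other threads. A negative increment usually needs privileges.
    """
    return subprocess.run(
        argv,
        preexec_fn=lambda: os.nice(increment),  # runs in the child before exec
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # The child simply reports its own nice value.
    child = [sys.executable, "-c", "import os; print(os.nice(0))"]
    result = run_niced(child, increment=5)
    print("child nice value:", result.stdout.strip())
```

Lowering the priority of batch work is often the easier direction in practice, since it needs no special privileges and has the same relative effect on the threads that need the CPU sooner.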

The dispatch parameter tables can be modified to favor threads that are CPU intensive or I/O bound. They could also be modified to favor threads whose priorities fall within a specific range of priorities.

The CPU bottleneck might be the result of some other problem on the system. For example, a memory shortfall will cause the page daemon to execute more often. Other lower priority threads will not be able to run while the page daemon is in execution. Checking for system daemons which are using up an abnormally large amount of CPU time may indicate where the real problem lies.

Use vmstat -i to determine whether a device is interrupting abnormally often. Each time a device interrupts, the kernel must run the interrupt handler for that device, and interrupt threads run at an extremely high priority, so a noisy device takes CPU time away from everything else.
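One way to judge whether a device interrupts "abnormally" often is to compare two snapshots of the per-device interrupt totals and look at the rate. The sketch below assumes the totals have already been collected into dictionaries; the function names, device names, and threshold are all hypothetical, and no particular vmstat output format is assumed.

```python
def interrupt_rates(before, after, interval_s):
    """Return interrupts per second for each device between two snapshots."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return {
        dev: (total - before.get(dev, 0)) / interval_s
        for dev, total in after.items()
    }


def noisy_devices(rates, threshold_per_s):
    """Return devices above the threshold, busiest first."""
    flagged = [dev for dev, rate in rates.items() if rate > threshold_per_s]
    return sorted(flagged, key=lambda dev: rates[dev], reverse=True)


if __name__ == "__main__":
    # Hypothetical cumulative totals taken 10 seconds apart.
    before = {"clock": 1_000_000, "disk0": 50_000, "net0": 200_000}
    after = {"clock": 1_001_000, "disk0": 50_200, "net0": 290_000}
    rates = interrupt_rates(before, after, interval_s=10)
    print(noisy_devices(rates, threshold_per_s=1000))
```

What counts as abnormal depends on the device, so the threshold is best taken from a baseline measured while the system is behaving well.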

Lowering the process load by forcing some applications to run at different times might help. Lowering max_nprocs will reduce the total number of processes that are allowed to exist on the system at one time.

Any custom device drivers or third-party software should be checked for possible inefficiencies, especially unnecessary programmed I/O (PIO) usage with faster CPUs.

If all else fails, adding more CPUs might alleviate some of the CPU load.