Contributed by jason on from the load-of-you-know-what dept.
Ariane van der Steldt (ariane@) posted a reply to the OpenBSD misc mailing list last month that offered some valuable insight into how load is calculated in the BSD kernel. This is a topic that comes up routinely but remains largely misunderstood by the average user.
Read on for Ariane's explanation and comparison to Linux load...
Load on Linux and load on BSD are two completely different things. On Linux I recall load being the number of processes running or blocking, or something based on that.
On BSD, load is the number of processes which have (wanted to) run at least once in the most recent 5-second window, with a degradation over time. So, if you have a process that wakes up every 5 seconds and prints the time on your console, you have a load average of 1. Load is not the number of cpu cycles used.
A high load is just that: high. It means you have a lot of processes that sometimes run. High load does not mean your performance is going down or whatever: I ran a test today which generated a load of 200, but only used 10% of the cpu and was very responsive.
You can't compare load on Linux with load on BSD; I'd really appreciate it if people stopped comparing apples and oranges. :P
If you are interested in the internals of the system: load is the black magic that keeps the scheduling fair compared to the number of processes.
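The description above (a count taken every 5 seconds, decaying over time) can be sketched as an exponentially weighted moving average. This is only an illustration under stated assumptions, not the kernel code: the function names are hypothetical, floating point is used where a kernel would use fixed-point arithmetic, and the per-window count of processes that wanted to run is simply passed in as input.

```python
import math

SAMPLE_INTERVAL = 5.0            # seconds between samples (the 5-second window)
PERIODS = (60.0, 300.0, 900.0)   # the 1, 5 and 15 minute averages


def decay_factor(period, interval=SAMPLE_INTERVAL):
    """Weight kept from the old average at each sample: exp(-interval/period)."""
    return math.exp(-interval / period)


def update_load(averages, runnable, periods=PERIODS):
    """Fold one 5-second sample into the three running averages.

    `runnable` is the number of processes that wanted to run in this window
    (an input to this sketch; it is not measured here).
    """
    updated = []
    for avg, period in zip(averages, periods):
        d = decay_factor(period)
        updated.append(avg * d + runnable * (1.0 - d))
    return tuple(updated)


def simulate(runnable_per_sample, start=(0.0, 0.0, 0.0)):
    """Apply a sequence of per-window counts, starting from `start`."""
    averages = start
    for runnable in runnable_per_sample:
        averages = update_load(averages, runnable)
    return averages


if __name__ == "__main__":
    # One process that wakes up in every 5-second window, for one hour.
    one, five, fifteen = simulate([1] * 720)
    print(f"{one:.2f} {five:.2f} {fifteen:.2f}")
```

Feeding in a count of 1 for every window drives all three averages toward 1, which matches the "prints the time every 5 seconds" example. Nothing in the calculation looks at how much CPU time each process used, so 200 processes that each wake briefly per window would push the result toward 200 while the CPU stays mostly idle.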
I had a chance to discuss this with Bob Beck (beck@). He agreed with Ariane's explanation and added his own thoughts.
Yes.
More generally, on most unixen load average is some measure of the size of the run queue, or the number of runnable processes over a set period. The above is essentially how OpenBSD calculates it. It will be slightly different everywhere else. Because it is the number of runnable processes over a time period, it can be quite deceiving if you're not aware of the basic idea behind it. It is not a measure of CPU usage.
(use vmstat 1 and look at user + system for a more accurate representation of cpu usage)
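To watch the load numbers next to the vmstat output, the three averages can be read from a script. The following is a minimal sketch, assuming a Unix-like system where Python's os.getloadavg() (a wrapper around getloadavg(3)) is available; the helper name load_snapshot is hypothetical.

```python
import os


def load_snapshot():
    """Return the 1, 5 and 15 minute load averages, or None if unavailable.

    os.getloadavg() raises OSError when the system cannot supply the values.
    """
    try:
        one, five, fifteen = os.getloadavg()
    except OSError:
        return None
    return {"1min": one, "5min": five, "15min": fifteen}


if __name__ == "__main__":
    snapshot = load_snapshot()
    if snapshot is None:
        print("load average not available")
    else:
        # These describe runnable processes, not the percentage of CPU in use.
        print(" ".join(f"{key}={value:.2f}" for key, value in snapshot.items()))
```

The values say nothing about CPU percentage; for that, the user and system columns of vmstat remain the thing to look at.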
Thanks to Ariane and Bob for their insights.
(Comments are closed)
By vasek (vasek) on
To me, the problem seems to lie in the documentation, in how the load average is defined there. The man page of w(1) says:
Based on Ariane's explanation (and I trust her), the number you get is not what is written in the man page. It is not an average number of jobs in the run queue in the past 1, 5 and 15 minutes. The length of the run queue is not taken into account. It is rather "a number of jobs that had been placed in the run queue during a 5-second window, averaged over windows in the last 1, 5 and 15 minutes". To me, the definition in the man page is not exact; it is almost misleading.
As for the comparison of OpenBSD and Linux load average, look at how both systems describe it in their man pages of getloadavg(3) and watch the difference.
OpenBSD here.
Linux, for example, at linux.die.net or at ubuntu.com.
How many differences did you spot? Yes, there are none. Both DESCRIPTIONs are identical.
While I really hate other people (and companies and government agencies too) blindly supposing you use the same software and hardware platform as they do and demanding that you and your system behave according to their expectations ("now click with your mouse on the taskbar"), in this case, based on the two identical DESCRIPTIONs of load average above, the expectations of Linux people with respect to load average seem to be legitimate. There is a fault on the side of Linux, on the side of OpenBSD, or on both. It is in the documentation. Both systems try to approximate the same number (number of runnable processes, "load"), each in its own way. Hence the difference and the recurring topic.
By Anonymous Coward (208.48.231.12) on
By sthen (85.158.44.149) on
By looking at the CPU use statistics. You can just watch top, or monitor "vmstat -w <number of seconds>" for a text-based scrolling display (last 3 columns are user, system, idle cpu%), or if you prefer something graphical, symon/syweb (in ports) are pretty good for this.
By Krunch (91.106.147.37) on