OpenBSD Journal

BSD load demystified

Contributed by jason from the load-of-you-know-what dept.

Ariane van der Steldt (ariane@) posted a reply to the OpenBSD misc mailing list last month that offered some valuable insight into how load is calculated in the BSD kernel. This is a topic that comes up routinely but remains largely misunderstood by the average user.

Read on for Ariane's explanation and comparison to Linux load...

Load on Linux and load on BSD are two completely different things. On Linux, I recall load being the number of processes running or blocking, or something based on that.

On BSD, load is the number of processes which have (wanted to) run at least once in the most recent 5-second window, with a degradation over time. So, if you have a process that wakes up every 5 seconds and prints the time on your console, you have a load average of 1. Load is not the number of cpu cycles used.

A high load is just that: high. It means you have a lot of processes that sometimes run. High load does not mean your performance is going down or whatever: I ran a test today which generated a load of 200, but only used 10% of the cpu and was very responsive.

You can't compare load on Linux with load on BSD. I'd really appreciate it if people stopped comparing apples and oranges. :P

If you are interested in the internals of the system: load is the black magic that keeps the scheduling fair compared to the number of processes.
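To make the "5-second window with a degradation over time" idea concrete, here is a minimal Python sketch of a damped average. It is an illustration under stated assumptions, not the kernel's code: the function names are made up for this example, each sample is taken to be the number of processes that wanted to run during the window, and the decay factors exp(-5/60), exp(-5/300) and exp(-5/900) are assumed as the usual choice for 1, 5 and 15 minute averages.

```python
import math

SAMPLE_INTERVAL = 5.0            # seconds per window (from the text above)
PERIODS = (60.0, 300.0, 900.0)   # 1, 5 and 15 minutes, in seconds


def decay_factors(interval=SAMPLE_INTERVAL, periods=PERIODS):
    """Per-sample decay factor for each averaging period (assumed form)."""
    return tuple(math.exp(-interval / p) for p in periods)


def update_load(loads, nrun, factors):
    """Fold one sample (nrun processes wanted to run) into the averages."""
    return tuple(l * f + nrun * (1.0 - f) for l, f in zip(loads, factors))


def simulate(samples):
    """Start from an idle system and apply one sample per 5-second window."""
    factors = decay_factors()
    loads = (0.0, 0.0, 0.0)
    for nrun in samples:
        loads = update_load(loads, nrun, factors)
    return loads


if __name__ == "__main__":
    # One process that wakes up in every window, observed for one hour.
    print(simulate([1] * 720))
```

With one process waking in every window, the 1-minute figure settles at effectively 1 after an hour of samples. Because the update is linear, 200 such processes would settle near 200, no matter how little CPU each wakeup costs.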

I had a chance to discuss this with Bob Beck (beck@). He agreed with Ariane's explanation and added his own thoughts.

Yes.

More generally, on most unixen the load average is some measure of the size of the run queue, or the number of runnable processes over a set period. The above is essentially how OpenBSD calculates it. It will be slightly different everywhere else. Because it is the number of runnable processes over a time period, it can be quite deceiving if you're not aware of the basic idea behind it. It is not a measure of CPU usage.

(use vmstat 1 and look at user + system for a more accurate representation of cpu usage)

Thanks to Ariane and Bob for their insights.



  1. By vasek (vasek)

    To me, the problem seems to lie in how the documentation defines the load average. The man page of w(1) says:

         The load average numbers give the number of jobs in the run queue
         averaged over 1, 5 and 15 minutes
    

    Based on Ariane's explanation (and I trust her), the number you get is not what is written in the man page. It is not the average number of jobs in the run queue over the past 1, 5 and 15 minutes. The length of the run queue is not taken into account. It is rather "the number of jobs that were placed in the run queue during a 5-second window, averaged over the windows in the last 1, 5 and 15 minutes". To me, the definition in the man page is not exact; it is almost misleading.

    As for the comparison of OpenBSD and Linux load average, look at how both systems describe it in their man pages of getloadavg(3) and note the difference.

    OpenBSD here.

    DESCRIPTION
         The getloadavg() function returns the number of processes in the system
         run queue averaged over various periods of time.  Up to nelem samples are
         retrieved and assigned to successive elements of loadavg[].  The system
         imposes a maximum of 3 samples, representing averages over the last 1, 5,
         and 15 minutes, respectively.
    

    Linux, for example, at linux.die.net or at ubuntu.com.

    DESCRIPTION
         The getloadavg() function returns the number of processes in the system
         run queue averaged over various periods of time.  Up to  nelem  samples
         are  retrieved  and  assigned to successive elements of loadavg[].  The
         system imposes a maximum of 3 samples, representing averages  over  the
         last 1, 5, and 15 minutes, respectively.
    

    How many differences did you spot? Yes, there are none. Both DESCRIPTIONs are identical.

    While I really hate it when other people (and companies and government agencies too) blindly assume you use the same software and hardware platform as they do, and demand that you and your system behave according to their expectations ("now click with your mouse on the taskbar"), in this case, given the two identical DESCRIPTIONs of load average above, the expectations of Linux people with respect to load average seem legitimate. The fault lies with Linux, with OpenBSD, or with both, and it is in the documentation. Both systems try to approximate the same number (the number of runnable processes, the "load"), each in its own way. Hence the difference and the recurring topic.
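For readers who want to see the three numbers that getloadavg(3) hands back, here is a small Python sketch. It relies on the standard library's os.getloadavg(), which is available on Unix-like systems and raises OSError when the load average cannot be obtained. The helper names read_load_averages and format_load are made up for this example. As the thread points out, what the numbers mean depends on the kernel that produced them.

```python
import os


def read_load_averages():
    """Return the (1, 5, 15 minute) load averages reported by the system."""
    return os.getloadavg()


def format_load(loads):
    """Format three load figures in the style used by uptime(1) and w(1)."""
    one, five, fifteen = loads
    return f"load averages: {one:.2f}, {five:.2f}, {fifteen:.2f}"


if __name__ == "__main__":
    try:
        print(format_load(read_load_averages()))
    except OSError:
        print("load average not obtainable on this system")
```

The same call returns three floats on both OpenBSD and Linux, which is exactly why the figures are so easy to compare by mistake.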


  2. By Anonymous Coward (208.48.231.12)

    So, on Linux and BSD, how can you tell when a server could use a CPU upgrade, besides benchmarking?

    1. By sthen (85.158.44.149)

      > So, on Linux and BSD, how can you tell when a server could use a CPU upgrade, besides benchmarking?

      By looking at the CPU use statistics. You can just watch top, or monitor "vmstat -w <number of seconds>" for a text-based scrolling display (last 3 columns are user, system, idle cpu%), or if you prefer something graphical, symon/syweb (in ports) are pretty good for this.
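The user, system and idle percentages mentioned above come down to differences between cumulative tick counters taken at two points in time. The following Python sketch shows that arithmetic only. The counter names and sample values are hypothetical, and it does not read any real kernel interface.

```python
def cpu_percentages(before, after):
    """Turn two snapshots of cumulative CPU ticks into percentages.

    Both arguments map a state name (for example "user", "system",
    "idle") to the tick count seen at sample time.
    """
    deltas = {state: after[state] - before[state] for state in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("no ticks elapsed between the two samples")
    return {state: 100.0 * d / total for state, d in deltas.items()}


if __name__ == "__main__":
    # Hypothetical counters sampled a few seconds apart.
    first = {"user": 100, "system": 50, "idle": 850}
    second = {"user": 130, "system": 60, "idle": 1060}
    print(cpu_percentages(first, second))
```

In this made-up sample the machine is 84% idle. A box that is mostly idle by this measure has CPU to spare, whatever its load average says.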

  3. By Krunch (91.106.147.37)

    For the record, on Linux the load average is based on the number of tasks (or threads, since Linux has a 1-1 thread model) in D (uninterruptible sleep) or R (running or runnable) state. When you see a very large load average while the system is still responsive, it often means processes are stuck waiting for I/O because of a broken driver.
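As a rough illustration of that definition, the Python sketch below counts the tasks that would contribute to one Linux load sample. The state letters follow the ps(1) convention described above. The function name and the input list are hypothetical, and nothing here reads real process state.

```python
def linux_active_tasks(states):
    """Count tasks whose state starts with R (runnable) or D (uninterruptible)."""
    return sum(1 for s in states if s[:1] in ("R", "D"))


if __name__ == "__main__":
    # Hypothetical snapshot: one running task, three stuck on I/O, two asleep.
    print(linux_active_tasks(["R", "D", "D", "D", "S", "Ss"]))
```

The three tasks in D state raise the count by three even though none of them is using the CPU, which matches the "high load but responsive" case described above.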
