OpenBSD Journal
Developer blog: cnst@: fixing make
Contributed by deanna on Tue Jun 19 10:42:41 2007 (GMT)
from the that's-a-lot-of-j dept.

Constantine A. Murenin (cnst@) writes:

A 10-year-old pointer-arithmetic bug in make(1) is now gone, thanks to malloc.conf and some debugging. The bug kept people from running make -j on OpenBSD for fear that it would loop forever on one of the CPUs without actually building anything.

Personally, I encountered this bug a few times after todd@ told me about the -j option at c2k7. I immediately noticed a performance improvement when building the kernel with `make -j4`, which was obvious from the fact that both CPUs were now at close to 0% idle time for the duration of the build. However, I also noticed that shortly after being started with a big -j number, like 16 or 24, the make process would become mysteriously quiet and start consuming 100% of one of the CPUs (without actually running any jobs) every time I ran it.

Independently of the above make(1) bug (or maybe under its influence), one day I decided to give the malloc options a try, running `ln -s 'AFGHJPRX' /etc/malloc.conf` to set them up. After setting these malloc options, I noticed that make was crashing quite often when used with the -j option.

To make a long story short, I discovered that the problem had to do with pointer arithmetic. A number of bytes was added as an offset to a pointer to fd_set, a type which has a size of 128 bytes. Because pointer arithmetic scales by the size of the pointed-to type, the byte offset was erroneously multiplied by a factor of 128, so the memset was done on an unallocated piece of memory, leaving the allocated part uninitialised.
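The scaling mistake described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the actual make(1) code: the function names `clear_tail_buggy`, `clear_tail` and `buggy_offset_bytes`, their parameters, and the idea of a byte-sized buffer that grows are all hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/select.h>

/*
 * Hypothetical sketch, not the real make(1) source.  `set` points at a
 * buffer whose size is tracked in bytes; after it grows from old_bytes
 * to new_bytes, the new tail must be zeroed.
 */

/*
 * Buggy: `set + old_bytes` advances old_bytes * sizeof(fd_set) bytes,
 * so memset() lands far past the allocation.  Shown only for contrast;
 * never call it.
 */
void
clear_tail_buggy(fd_set *set, size_t old_bytes, size_t new_bytes)
{
	memset(set + old_bytes, 0, new_bytes - old_bytes);
}

/* Fixed: do the arithmetic on a char pointer, so the offset is in bytes. */
void
clear_tail(fd_set *set, size_t old_bytes, size_t new_bytes)
{
	memset((char *)set + old_bytes, 0, new_bytes - old_bytes);
}

/* How many bytes the buggy expression actually skips. */
size_t
buggy_offset_bytes(size_t old_bytes)
{
	return old_bytes * sizeof(fd_set);
}
```

On a system where sizeof(fd_set) is 128, as in the post, an intended offset of 4 bytes becomes 512 bytes in the buggy version.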

Afterwards, millert@ also noticed that the original code from 1997 seems to have a potential memory leak due to incorrect realloc usage, so we fixed that problem too, and a complete patch was committed to 4.1-current.
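The post does not show the realloc misuse that was found. A common way for realloc to leak, assumed here purely for illustration, is assigning its result straight back to the only pointer to the buffer: if realloc fails it returns NULL, and the original block, which is still allocated, can no longer be freed. The helper name `grow_buffer` below is hypothetical.

```c
#include <stdlib.h>

/*
 * Hypothetical helper, not taken from make(1).
 *
 * The leaky pattern would be:
 *     buf = realloc(buf, new_size);    // old block lost if this fails
 *
 * Safer: keep the old pointer until realloc() has succeeded.
 * Returns 0 on success, -1 on failure; on failure *bufp is left
 * untouched, so the caller can still use or free it.
 */
int
grow_buffer(void **bufp, size_t new_size)
{
	void *tmp = realloc(*bufp, new_size);

	if (tmp == NULL)
		return -1;
	*bufp = tmp;
	return 0;
}
```

A caller that gets -1 back still owns the old buffer and decides whether to free it or carry on at the old size.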

Happy hacking!



  Re: Developer blog: cnst@: fixing make (mod 10/56)
by Anonymous Coward (193.63.217.208) on Tue Jun 19 11:48:18 2007 (GMT)
  Excellent, thanks. Will this be back-ported to -stable and 4.0 as a reliability fix?

  Re: Developer blog: cnst@: fixing make (mod 5/45)
by Anonymous Coward (24.37.242.64) on Tue Jun 19 12:13:03 2007 (GMT)
  Awesome! Thank you!

  Re: Developer blog: cnst@: fixing make (mod 10/48)
by Anonymous Coward (88.64.133.128) on Tue Jun 19 16:15:22 2007 (GMT)
  Damn it!

I thought that the bug of "looping on one the CPUs forever, without actually building anything" that I encountered on many ports I compiled with my distcc port was related to distcc!

Now I have to reevaluate and rebenchmark distcc for ports. I hope this works out and faster port building with many "crap" machines on my home network becomes reality.

Thank you very much, great work!

  Re: Developer blog: cnst@: fixing make (mod 7/53)
by Anonymous Coward (149.169.206.225) on Tue Jun 19 19:28:48 2007 (GMT)
  How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?

  Re: Developer blog: cnst@: fixing make (mod -5/41)
by sean (139.142.208.98) on Tue Jun 19 20:32:27 2007 (GMT)
  > How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?

Gain, if any, would be less than double. Think about it for a bit. Performance does not scale linearly with each additional CPU.

  speed increase due to -j16 as opposed to default -B (mod 7/47)
by cnst (cnst) on Tue Jun 19 20:54:45 2007 (GMT)
http://cnst.livejournal.com/24068.html#chart
  > How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?

The difference is substantial -- on Intel Core 2 Duo E4300, the time it takes to build OpenBSD's kernel with -B is 4:45, but with -j16 it's only 2:36, i.e. just under 3 minutes, and almost a two-fold decrease in elapsed time. See the link under my name above for some simple benchmarking that I've performed.

  Re: Developer blog: cnst@: fixing make (mod 2/42)
by art (213.0.113.90) on Wed Jun 20 07:07:12 2007 (GMT)
  > > How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?
>
> Gain, if any, would be less than double. Think about it for a bit. Performance does not scale linearly with each additional CPU.
>

Actually, when you really think about it for a bit, the gain can sometimes be larger than double because compilations (depending on what's compiled of course) are often waiting for disk I/O.

  Re: Developer blog: cnst@: fixing make (mod 1/45)
by Anonymous Coward (88.64.159.34) on Wed Jun 20 07:19:52 2007 (GMT)
  Now the only thing left to do is making the system build make -j safe, but I guess this would be a huge amount of work.

  Re: speed increase due to -j16 as opposed to default -B (mod -1/41)
by Anonymous Coward (68.98.37.61) on Wed Jun 20 12:50:43 2007 (GMT)
  > See the link under my name above for some simple benchmarking that I've performed.

Thanks for the chart. Does this make even using say -j4 (since -j2 looks like it has issues) on a non-mp system useful or does it block so much it's not worth it? It's kinda scary that you can fork bomb that easy with make...

  Re: speed increase due to -j16 as opposed to default -B (mod 0/36)
by Anonymous Coward (84.135.78.21) on Wed Jun 20 13:36:27 2007 (GMT)
  > > How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?
>
> The difference is substantial -- on Intel Core 2 Duo E4300, the time it takes to build OpenBSD's kernel with -B is 4:45

4:45? O_O
I've got a T5500 and it takes around 30 minutes to compile.

  Re: speed increase due to -j16 as opposed to default -B (mod 0/40)
by Anonymous Coward (207.59.237.99) on Wed Jun 20 13:40:11 2007 (GMT)
  > > See the link under my name above for some simple benchmarking that I've performed.
>
> Thanks for the chart. Does this make even using say -j4 (since -j2 looks like it has issues) on a non-mp system useful or does it block so much it's not worth it? It's kinda scary that you can fork bomb that easy with make...

Why not time kernel compiles w/differing values of -j against a standard build? It's not like it's beyond your abilities to do so...

  Re: Developer blog: cnst@: fixing make (mod -2/40)
by Anonymous Coward (72.11.69.215) on Wed Jun 20 17:00:55 2007 (GMT)
  > Excellent, thanks. Will this be back-ported to -stable and 4.0 as a reliability fix?

Is this a bug in GNU Make? Will this be patched upstream?

  Re: Developer blog: cnst@: fixing make (mod 4/42)
by Brad (brad) on Wed Jun 20 18:17:21 2007 (GMT)
  > > Excellent, thanks. Will this be back-ported to -stable and 4.0 as a reliability fix?
>
> Is this a bug in GNU Make? Will this be patched upstream?

No. OpenBSD uses BSD Make.

  Re: speed increase due to -j16 as opposed to default -B (mod 4/44)
by mk (130.225.243.71) on Wed Jun 20 19:07:29 2007 (GMT)
  > > > How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?
> >
> > The difference is substantial -- on Intel Core 2 Duo E4300, the time it takes to build OpenBSD's kernel with -B is 4:45
>
> 4:45? O_O
> I've got a T5500 and it takes around 30 minutes to compile.

Slow disk? Broken interrupt routing? Very little memory? etc.

My X60 compiles a kernel in about 5:30 on GENERIC. I think it's about half of that using -j 12.

  Re: speed increase due to -j16 as opposed to default -B (mod -5/37)
by Anonymous Coward (217.112.38.94) on Wed Jun 20 20:47:05 2007 (GMT)
  > > > > How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?
> > >
> > > The difference is substantial -- on Intel Core 2 Duo E4300, the time it takes to build OpenBSD's kernel with -B is 4:45
> >
> > 4:45? O_O
> > I've got a T5500 and it takes around 30 minutes to compile.
>
> Slow disk? Broken interrupt routing? Very little memory? etc.
>
> My X60 compiles a kernel in about 5:30 on GENERIC. I think it's about half of that using -j 12.

maybe he is talking about the whole system not just the kernel?...

  Re: speed increase due to -j16 as opposed to default -B (mod -4/38)
by Anonymous Coward (198.175.14.5) on Wed Jun 20 21:14:42 2007 (GMT)
  >
> It's kinda scary that you can fork bomb that easy with make...

Not really. It doesn't affect the whole kernel, just your user. Up your login.conf limits or ulimit if that's a problem.

  Re: Developer blog: cnst@: fixing make (mod 11/51)
by Steve Shockley (68.80.137.106) on Wed Jun 20 23:50:48 2007 (GMT)
  malloc.conf kicks ass!

  Re: speed increase due to -j16 as opposed to default -B (mod -4/42)
by raw (84.135.70.128) on Thu Jun 21 17:29:07 2007 (GMT)
  > > > > > How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?
> > > >
> > > > The difference is substantial -- on Intel Core 2 Duo E4300, the time it takes to build OpenBSD's kernel with -B is 4:45
> > >
> > > 4:45? O_O
> > > I've got a T5500 and it takes around 30 minutes to compile.
> >
> > Slow disk? Broken interrupt routing? Very little memory? etc.
> >
> > My X60 compiles a kernel in about 5:30 on GENERIC. I think it's about half of that using -j 12.
>
> maybe he is talking about the whole system not just the kernel?...

No, I'm talking about the _kernel_. This machine has 512MB of RAM. The disk is some Hitachi SATA drive; as for broken interrupt routing, I don't know.

  Re: Developer blog: cnst@: fixing make (mod -2/40)
by Anonymous Coward (139.142.208.98) on Thu Jun 21 18:32:59 2007 (GMT)
  > Actually, when you really think about it for a bit, the gain can sometimes be larger than double because compilations (depending on what's compiled of course) are often waiting for disk I/O.

I can see that, though wouldn't a system without SMP still see those benefits with the -j option, since the jobs waiting on I/O will be in biosleep and won't be running, so the other make processes will chew up the savings?

  Re: speed increase due to -j16 as opposed to default -B (mod -1/37)
by scot bontrager (216.62.11.163) on Fri Jun 22 00:40:04 2007 (GMT)
  > > How much of a speed increase is this? Are we talking like a double in compile speed because you can multithread?
>
> The difference is substantial -- on Intel Core 2 Duo E4300, the time it takes to build OpenBSD's kernel with -B is 4:45, but with -j16 it's only 2:36, i.e. just under 3 minutes, and almost a two-fold decrease in elapsed time. See the link under my name above for some simple benchmarking that I've performed.

time make -B
289.080u 59.050s 5:43.48 101.3% 0+0k 208+2226io 416pf+0w

time make -j4
291.050u 66.180s 3:27.57 172.1% 0+0k 1632+5575io 7112pf+0w

amd64 2x 1.6ghz
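
The elapsed times quoted in this thread can be turned into speedup factors with a little arithmetic. A minimal sketch in C; the function names `to_seconds` and `speedup` are made up for illustration:

```c
/* Convert a minutes:seconds wall-clock reading to seconds. */
double
to_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Speedup factor: serial elapsed time divided by parallel elapsed time. */
double
speedup(double serial_s, double parallel_s)
{
	return serial_s / parallel_s;
}
```

For the Core 2 Duo figures (4:45 with -B, 2:36 with -j16), speedup(to_seconds(4, 45), to_seconds(2, 36)) is 285/156, roughly 1.83. For the amd64 run above (5:43.48 with -B, 3:27.57 with -j4) it is 343.48/207.57, roughly 1.65.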


Copyright © 2004-2008 Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to April 2nd 2004 as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. Some icons from slashdot.org used with permission from Kathleen. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. Search engine is ht://Dig. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]