OpenBSD Journal

String cleanup results

Contributed by jose on from the catching-bugs dept.

an anonymous reader writes:



"List:     openbsd-misc
Subject:  string cleanup results
From:     Theo de Raadt

Date:     2003-04-19 19:54:53


Some of you might have noticed that we have been doing a 'string
cleaning'.  (I should probably apologize for this northern-hemisphere
specific pun).

That means that we have been going through the tree cleaning out all
calls to sprintf(), strcpy(), and strcat().  Instead, these things are
being rewritten to use asprintf(), snprintf(), strlcpy(), and
strlcat().

As a result, I can tell you that we have removed at approximately 2000
occurances of these functions.  I cannot provide exact figures since
that would require hand-counting the actual diffs from 8 weeks ago ;)
Source code grep'ing is not always accurate.

The goal is to remove potential overflows, by always calculating what
the bounds of an operation are.  In doing so, some code is now
converted from an overflow to a truncation.  In followup work, we wish
to continue investigating all cases of truncation in the tree, and see
what we can do better in those cases where truncation is undesirable.
Sometimes you want to handle it, sometimes it is ok, sometimes it is
really nasty.  Obviously, that's an even bigger job than what we have
done here, since the tree was already full of truncations.  Current
theory held by most is that most of these are innocious, but who
knows..  I want us to clean them (but I do not want us to go insane
either).

Obviously there is the possibility that this huge effort has
introduced a minor bug here or there, but it much more likely that we
have squished many much more major bugs.

I would estimate that more than 20 developers have been involved in
this process which we started in earnest about 6 weeks ago.

We are obviously not doing this to some parts of the tree which we
borrow from other projects.  In particular, the gnu part of the tree
might remain largely dirty.  However, we have done this in httpd and
openssl, and will maintain those tree components ourself if need be.
We have also convinced the bind people to move in this direction, and
they have incorporated these functions into bind -- that work is not
finished however.  Since sendmail already has renamed versions of
strlcpy and strlcat, we have tried convincing them to move completely
in this direction as well, and I think that they are doing that work
now.

Of the programs that are in our base set, and under consideration, we
have a few that still call these functions:

	routed, dump, restore, awk, indent, lam,
	make, xlint, bind, openssl, ppp, rtadvd

Pretty small list eh.

There might be a few more, but I am sure you can comprehend the
direction.  Of course, we have completely purified our libraries of
these functions as well, except that libc still provides them (in
accordance with standards and... well, expected behaviour).

The remaining ones are very difficult.  afs, gcc, binutils, etc.

I must also say that in the last week we've fixed some that are
incredibly difficult.  Some of them were definate buffer overflows.
Sometimes we hit a piece of code in a program that just about made
us cry.  A few changes involved 10 or more people before we settled
on a correct change.  Crazy stuff.

In a week or so there will also be changes made to ensure that the
kernel is completely clear of these functions too, but we are waiting
for the i386 ELF W^X work to settle into the tree.



(original message) "
This is what it is all about, folks. Fix bugs, and you fix vulnerabilities. Migrate to the correct, checked behavior, and you'll typically end up with safe code. And, just as importantly, push the fixes back to the original developers (such as Sendmail and Apache) so that everyone can benefit in the long run. Some bugs probably lurk in these fixes, so test them out and see what you find. This is a pretty impressive effort.

(Comments are closed)


Comments
  1. By Darren () darren@dazdaz.NOSPAM.org on mailto:darren@dazdaz.NOSPAM.org

    Any chance of changing that font with the email, it's too small to really read without my Patrick Moore monacle.

  2. By Jeffrey Neitzel () on

    In the couple weeks I've been running my primary system with these new changes I've not noticed any oddities in daily usage. I did my last update on 2003-04-16, and I've still seen no problems.

    Granted, this is still a.out .. not ELF. I'll probably have to wait a week before updating to the new ELF.

  3. By Anonymous Coward () on

    "libc still provides them"

    It would be cool if you put in compile-time warnings when someone uses these functions.

    #warning HEY DONT USE THIS FUNCTION MORON

    Comments
    1. By Anonymous Coward () on

      ever consider I would *WANT* to write insecure code?

    2. By pravus () on

      there are times when the "non-safe" versions are perfectly safe. in fact, using them in certain situations can offer a fair amount of efficiency. making a blanket "you are a moron if you use this" type of statement shows lack of thought... even Theo admitted that, in certain situations, the "non-safe" versions work fine (though i'd have to look up the article and i'm lazy...).

      i can say, however, that for a *huge* project such as OpenBSD, standardizing on one particular methodology or another does seem to make sense. it ensures consistent behavior and allows for faster development (you don't have to think about the consequences... they are already known).

    3. By Anonymous Coward () on

      Some OSes, like Digital UNIX, do not provide snprintf().

      Comments
      1. By ben () mouring@eviladmin.org on mailto:mouring@eviladmin.org

        There are some very nice, simple and portable [v]snprintf() libraries out there under BSD friendly licenses.

        Lot of projects have to work around broken snprintf(). So it is not a big deal.

      2. By Michael van der Westhuizen () on

        DU (well, OSF1/Tru64 UNIX) has snprintf and friends in the libc from 4.0g (but not in the header files), and the functions are fully present in 5.1a and above (and I suppose 5.0x, but I've never even seen that).

    4. By almeida () on

      A friendlier approach would be to use warnings similar to those generated at compile time by mktemp:

      warning: mktemp() possibly used unsafely; consider using mkstemp()


      Consider something like this for the string functions:

      warning: strcat() possibly used unsafely; consider using strlcat()

  4. By Gord () on

    Theo says, "Pretty small list eh."

    Canada, represent!

  5. By zil0g () on

    In the words of C. Montgomery Burns, "Eeexcellent"

    In other news today, seems troll-to-signal level on popular website deadly.org has gone down, is it for real, or just temporary?
    only time will tell.

    Thanks for putting up with this 'community' guys, hope it'll work.

    Comments
    1. By Kurt () on

      RE: Trolls... Note the IP is being displayed for all to see now. ;-)

Credits

Copyright © - Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]