Contributed by Niall O'Higgins from the RCS-RCS-RCS dept.
Now that you're familiar with the RCS file format, I thought I'd write
a bit about revision numbers and some of the issues surrounding them.
Usually, revision numbers are of the form `number.number', e.g. 1.1,
1.117, 2.1, 3.50. Numbers in this form are said to be on the trunk or
main branch, and they are linked in descending order. E.g. 1.30 points
to 1.29 which points to 1.28. Of course, these revision numbers can
be arbitrary, as long as each is greater than its predecessor. So 1.30
could point to 1.3, which could point to 1.1. Then there is the notion
of the HEAD revision, which should always point to the highest of such
number pairs. Typically, HEAD is the most recent revision. Things
become more complicated with default branches and "magic" branch
numbers and all this kind of stuff - I'm not going to write about that
right now.
In RCS, many command line options accept revision
numbers as optional arguments, e.g. ci -l[rev] -f[rev] -i[rev]
-j[rev] -k[rev] -u[rev]; then you have co -f[rev] -I[rev]
-p[rev] -M[rev] ... you get the picture. Clearly, revision numbers
constitute very important inputs to the RCS tools. Therefore, how we
handle these revision numbers should be a well-tested area of our
implementation.
One approach to testing software is to automatically generate
"interesting" values as inputs for functions. For example, if a
function accepts a character string as an input, run it with
zero-length strings, exceptionally long strings, strings filled with
randomly generated characters, etc. - and see how it behaves.
Similarly, if a function accepts integer values, run it with very
large numbers (see how it deals with integer overflows), negative
numbers, zero, one, etc. In some languages, it is possible to do
automatic static verification - e.g. Java - and this can be done at
the source code level. See for example this
paper or this
paper for more information on this area. Unfortunately, C has some
qualities which make this impractical. However, since this is UNIX,
if we take a higher level view we can abstract away from literal
functions in the source code to treating the program itself as a
function. From this perspective, the command (e.g. ci) is a
function which transforms various inputs (standard input, some files,
command line options) into some output (standard output/error, some
files, error code).
Using this approach, we can automatically test a
large quantity of boundary cases which would not normally be tested by
humans - and we find some interesting bugs!
In our case, we could compare the behaviour of our RCS implementation
with the existing GNU implementation. During testing, we found some
erroneous assumptions in our code. For example, we simply weren't
expecting the values zero or one to be passed as a revision, and so we
hadn't added proper handling for them. Furthermore, our revision number
handling API did not cope well with very large revision numbers -
resulting in integer overflows.
This has certainly demonstrated to me that fully automated testing of
this nature can expose bugs which might otherwise go unnoticed. It
also aids greatly in pointing out where we might differ from the
reference implementation in subtle ways (exit codes, standard
output/error).
Comments
By tedu (69.12.168.114) on
another paper which may be interesting, about auto-generation of C test cases:
http://www.stanford.edu/~engler/spin05.pdf
By Nate (65.94.97.106) on
Hey tedu, since you work there, do you know why OpenVPN, OpenLDAP and FreeBSD are on scan.coverity.com, but OpenBSD isn't?
By dlg (220.245.180.133) loki@animata.net on
i think they have to be able to build the software on the platform their test software runs on. getting openbsd to build anywhere but on openbsd is not fun.
By Amir Mesry (66.23.227.241) starkiller@web-illusions.net on
Might be because it hasn't found any in it.
By tedu (69.12.168.114) on
mostly, anything missing is not there because of the feasibility (or lack) of building it in our current setup. the project is not done, however.
By niallo (83.147.128.114) on
Very interesting paper indeed. Their tool looks most promising, I'd love to have a look at their CIL transformations.
By SH (82.182.103.172) on
This has certainly demonstrated to me that fully automated testing of this nature can expose bugs which might otherwise go unnoticed.
Subversion has a big automated test suite that is very useful. Takes some time to complete, as anyone that has done a "make regress" in the svn port has noticed.
By Corentin (81.56.152.193) on
> During testing, we found some erroneous assumptions in our code.
What is your opinion (other OpenBSD developers and users are of course encouraged to reply to this question as well) regarding the correct use of assertions to detect erroneous assumptions? I am a strong advocate of their use (only when appropriate, of course; i.e. not to do user error handling but to catch developer mistakes) so I am interested in knowing what other careful developers do think of them.
By Otto Moerbeek (213.84.84.111) otto@drijf.net on
Assertions can sometimes help when developing code; but they have no place in production code. Error handling (for ANY error) should be an integral part of the code, including resource cleanup, messages, recovery, whatever is needed. Assertions let you chicken out; they give you an excuse not to handle errors properly.
Another BIG drawback of assertions is that it's easy to introduce side effects that either hide bugs or introduce bugs, making the behaviour of the programs compiled with and without assertions not the same.
By Marco Peereboom (67.64.89.177) marco@peereboom.us on http://www.peereboom.us
Assertions suck. It usually makes for lousy coding practices. If for some reason a piece of code needs an assertion it also means it needs that code during production. The fact that assertions don't fire during development is no guarantee that they won't ever fire. Therefore it is a bad coding practice that should be quelled.
By niallo (83.147.128.114) on
I have actually never considered using assert(). In OpenCVS/RCS we do however make use of fatal() which is related I guess. However, as in OpenSSH, it's almost exclusively used in the context of memory management (xmalloc, xfree, xrealloc, etc).
I would agree with Otto and Marco, it's crucial to handle errors properly in the first place.
By niallo (83.147.128.114) on
Oops, I meant to write:
In OpenCVS/RCS we do however make use of fatal(), as in OpenSSH, which is related I guess. However, it's mostly used in the context of memory management (xmalloc, xfree, xrealloc, etc).
Also it's important to note that fatal() is always going to work the same way, it won't get compiled out in production builds like assert().
By Corentin (81.56.152.193) on
Well, I think you are completely right (Otto and Marco are, too): assertions suck at error handling... because they are not an error handling mechanism at all; I know there are so many people using them for error handling because they are perhaps too lazy to write correct error handling code instead. But they are really meant to prove the correctness of a piece of code and are great IMHO to easily catch a few subtle "must never happen" *bugs* (not the exceptions that can and will happen such as user/system errors, bad inputs, etc.)
Anyway, thanks all for replying! It is always great to know what you do think about coding practices like that (I was really curious about that one, because I noticed they were not in heavy use in the OpenBSD source tree and so I thought you probably had a good reason to avoid them).