Contributed by Ray Lai on from the dept.
Today we'll be talking about the dangers of multiplication integer overflows in C, the type of bug responsible for the OpenSSH hole of 2002.
Integer overflows occur when an operation causes a number grow or shrink out of bounds, resulting in a much smaller or much larger number than expected. For example, a char can only hold 256 values. If two chars are multiplied together, there is a possibility of overflow:
char a, b; a = b = 127; a = a * b; /* overflow */
Because arithmetic is done in code all over the place, adding checks can be very tedious and is often ignored. A common idiom to allocate memory is:
size_t num, size; char *ptr; ... if ((ptr = malloc(num * size)) == NULL) /* overflow */ return (NULL);
The multiplication can create a number larger than a size_t can hold, causing an integer overflow. The integer overflow in this case causes less memory to be allocated than the program expects. The program could then take user input and write past the actual small allocation into memory it thinks it owns, causing problems like a generic buffer overflow. This was the idiom used in the OpenSSH hole of 2002, demonstrating the severity of these bugs.
We recommend using calloc(3) whenever allocating multiple objects of the same size. The above code would be written as:
size_t num, size; char *ptr; ... if ((ptr = calloc(num, size)) == NULL) return (NULL);
calloc(3) has checks to make sure the multiplication does not overflow.
Unfortunately there is no safe replacement for realloc(3). To prevent multiplication overflow, a check must be added:
size_t newsize, num, size; char *newptr, *ptr; ... if (num && size && SIZE_T_MAX / num < size) { errno = ENOMEM; return (NULL); } newsize = num * size; if ((newptr = realloc(ptr, newsize)) == NULL) return (NULL); ptr = newptr;
This is a hassle, but it is important to add these safeguards, especially when doing multiplication, which can quickly overflow.
To lessen developer work and simplify code, an xrealloc() function was added to ssh. xrealloc() has a similar API to calloc(3), taking both a num parameter and a size parameter. The above example can be written using xrealloc() as:
size_t num, size; char *newptr, *ptr; ... if ((newptr = xrealloc(ptr, num, size)) == NULL) return (NULL); ptr = newptr;
Due to its elegance, this API was also imported for OpenCVS.
For more information, please read a -current malloc(3) man page, which was recently updated with better usage examples.
(Comments are closed)
By Janne Johansson (130.237.95.193) jj@inet6.se on
inside malloc() since the mul is done (and overflowed) before it is called.
I compiled a simple version of your malloc overflow and the resulting
x86-asm for this [ptr=malloc(a*b);] looks like this just before the call:
movl -28(%ebp), %eax
imull -32(%ebp), %eax
pushl %eax
call malloc
If you could somehow shove a "check for integer overflow" (probably very
cpu-dependant but at least some cpus have such) between the imull and the
pushl+call you could (in theory) find and catch there kinds of errors in
very specific cases.
No, I have no such patch for gcc. 8-(
Comments
By Otto Moerbeek (213.84.84.111) otto@drijf.net on
Who is saying the multiplication is done right before the call to malloc?
What if some other, overflowing but harmless computation is done before the call to malloc.
etc etc etc
Comments
By Igor Sobrado (156.35.26.1) on
Janne Johansson has observed that a compiler could add some checks to the assembler code in a processor-dependant manner. Ok, perhaps it is not easy to implement, nor portable and, indeed, there is a risk that a previous arithmetic operation can set the overflow flag (OF) on the flags register (again, x86-centric); but it is a good approach even if it is not perfect.
That is not an "stupid" idea, at most it can be a not perfect one. But at least he proposed an idea, an interesting one.
Comments
By Anonymous Coward (128.171.90.200) on
It is next to impossible for a compiler to be able to tell when an integer overflow is going to take place, there are so many corner cases which the compiler would not know about and so many instances where a legitimate overflow could be prevented from happening.
Comments
By Igor Sobrado (156.35.192.2) on
It is not only impossible for a compiler to be able to detect if these events will happen but also to know the consequences of these events. An overflow can be a legitimate (but certainly ugly) event used by programmers---we cannot consider it an error in all cases. The only example I remind now is the ability of an arithmetic overflow to time wrap a Unix machine when its security level does not allow a legitimate time backward, but certainly it can be useful when coding real algorithms too.
An overflow just before a memory allocation process is not a valid reason for believing that there is a security problem related with the call to malloc()---I agree with Otto Moerbeek on this point. Even worse, an overflow that provokes a wrong memory allocation can happen hundreds (thousands!) instructions before the call to malloc() and remain undetected by the compiler.
I was only saying that, perhaps, the way Mr. Moerbeek replied to the post of Janne Johansson can make the author of the first post to get offended.
He did a fine observation about the impossibility of detecting all arithmetic overflows that can make a memory allocation to fail and he mentioned that some overflows happening just before a memory allocation can be harmless ones. Sometimes, arithmetic overflows are legal operations used by programmers too.
Cheers.
Comments
By Janne Johansson (130.237.95.193) jj@inet6.se on
I wasn't. I just thought out loud. It was an observation I made that it would be theoretically possible in this miniexample to find a spot (not necessarily _the_ spot) where such a test could be made.
By Stefan (131.188.24.2) on http://www.rommel.stw.uni-erlangen.de/~guebbbi/
int a=mul__size_t_size_t(b*c);
if (errno==OVERFLOW) {
//error handling
}
else {
//alloc
}
But since you cannot overload function or even overload operators this is a hassle, too.
Comments
By Anonymous Coward (70.179.123.124) on
#undef foo
foo()
{
/* function body */
}
#endif
Or am I totally wrong here?
Comments
By Anonymous Coward (72.138.211.134) on
By Roman (169.200.215.15) on
Someone please correct me if I am wrong.
By Grégoire (81.188.2.90) on
By veins (81.65.210.118) veins@skreel.org on http://lab.skreel.org/
I don't know if that's the bug, but there was an integer overflow a few years ago in OpenSSH hidden (caused in some sort) by ... a goto statement. It was hard to catch and very subtile, and in my opinion could be talked about as people tend to use gotos without too much thinking.
I like developers blog entries, I used to read undeadly once in a while (once every two/three months) and now I'm checking it many times a day, it makes the portal look more ... alive :-)
By Bernhard Leiner (128.130.39.94) on
Comments
By Simon (84.57.71.15) on
By Anonymous Coward (134.58.253.130) on
Comments
By Anonymous Coward (62.252.32.11) on
Comments
By Anonymous Coward (193.252.148.11) on
Comments
By Anonymous Coward (128.171.90.200) on
Is there a particular reason it didn't make it into libc in the first place ?
By Anonymous Coward (128.151.253.249) on
So... If libc and others use an "xrealloc" which takes 3 args and some other software uses one with 2 args... You can see where that might be problematic.
Comments
By Anonymous Coward (128.171.90.200) on
Comments
By Anonymous Coward (70.31.84.121) on
By Anonymous Coward (216.220.225.229) on
Comments
By Anonymous Coward (84.188.234.181) on
By Anonymous Coward (70.124.65.113) on
This point of view is unpopular with those who think it's a sign of weakness as a programmer, but you only have to read the bugtraq archives to see how well that philosophy works.
Comments
By Paladdin (213.97.233.52) on
Not to be interpreted to be a flame -please!- but I also consider that learning low level programming is a must for any wannabe programmer. It helps to develop machine-like concepts that makes coders more ease on resources an so on. This new trend of let's-go-code-like-we-speak, with incredible layers of abstraction, suposses a big expend on resources and, albeit new computer are powerful, that doesn't mean we have to be bad coders -yes, I make a difference between a bad programmer, and a bad coder-. Maybe I say this because I'm tired of look for .NET platform bugs these days... but I'm really dissapointed with modern heavily overengineered languages.
By Matthias Kilian (84.134.29.123) on
Java? Not free, not portable, incredibly bloated.
Haskell? Doesn't (yet) run on enough platforms. And it's a little bit bloated, too.
Perl, Python, Ruby, Lua etc.? May be, but they're scripted languages, are loosely type checked, and I want to be able to create BLOBS ;-)
Lisp or Scheme? Hmm. As with the above-mentioned scripted languages, there's static type checking missing.
YMMV.
Comments
By Anonymous Coward (70.124.65.113) on
By Anonymous Coward (71.255.101.25) on
By Ray (192.193.221.141) on http://cyth.net/~ray/
It’s not just a problem with C, though. There’s a certain mindset that is required to code securely. A person needs to think of the limitations of the system and figure out if there are any exceptional cases that need to be taken care of.
This code (translated to Java or whatever language) has bugs no matter what language you use. Is it a limitation of the language that the programmer didn't know about negative numbers?
Comments
By Anonymous Coward (70.124.65.113) on
But it's just a fact that some languages completely eliminate certain classes of errors (barring, of course, language implementation errors), and often provide better semantics for handling the errors that do occur. The most common result when something bad or unexpected happens in C is undefined behavior: the program will likely just keep running, only in an incorrect state. And the C standard is chock full of such corner cases, so many that even veteran C programmers slip up now and then (often with type conversions or pointer arithmetic).
For example, in Lisp an integer overflow attack is just plain impossible.
will either produce the proper sum, or signal an error if the operands have the wrong type (i.e. they're not numbers). In no case will you simply get the wrong answer. In the most extreme cases there's not enough memory to represent the sum, and again you get an error signaled.By Anonymous Coward (128.171.90.200) on
Comments
By Fábio Olivé Leite (15.227.249.72) on
Comments
By Anonymous Coward (128.171.90.200) on
If you had a loop that created a billion objects, you would be more likely hit an OutOfMemoryException and exit.
This is of course, unless you use JNI.
Comments
By Anonymous Coward (128.171.90.200) on
By Ray (199.67.140.83) ray@ on http://cyth.net/~ray/
So you’re saying that, in Java, running out of memory when a negative number is input is not a bug?
Comments
By Anonymous Coward (128.171.90.200) on
It is worth pointing out that by default, you will run out of memory in a Java aplication within 64Mb, the JVM will then kill the application, so the you can get a denial of service for the application. The possibility for exploiting this functionality is greatly reduced by comparison to applications that run natively.
By Anonymous Coward (71.201.247.141) on
By djm@ (203.217.30.86) on
We write operating system components, therefore we use C.
Comments
By Anonymous Coward (70.124.65.113) on
As to bugtraq, you only need to google for buffer/integer overflows, there's no shortage of examples. "PHP sucks too" isn't exactly a compelling defense of C's failings.
I don't know whether to find it amusing or sad that the same problems will probably still be around when 32-bit time_t overflows (how's that for irony).
Comments
By Anonymous Coward (70.179.123.124) on
Hell, what are other languages written in? I do believe that it's C. If people can't write C correctly, they can't create a "safe" language.
The argument you're making is pretty self-defeating; unless you can make perl/python/ruby/Visual Basic into a working interpreter.
Comments
By Anonymous Coward (128.171.90.200) on
By Anonymous Coward (70.31.84.121) on
Comments
By Anonymous Coward (128.171.90.200) on
Comments
By tedu (69.12.168.114) on
Comments
By Anonymous Coward (128.171.90.200) on
By Anonymous Coward (128.171.90.200) on
By Krunch (139.165.82.240) on http://krunch.be/
http://www.dwheeler.com/secure-programs/
By Anonymous Coward (128.151.92.148) on
size_t len = strlen(str);
char *buf = malloc(len+1);
if(len != SIZE_T_MAX) strcpy(buf, str); // this should be safe.
Comments
By Anonymous Coward (84.166.98.208) on
By tedu (71.139.182.193) on
Comments
By Anonymous Coward (128.151.92.148) on
By tedu (69.12.168.114) on
http://www.amazon.com/gp/product/0321335724/sr=8-1/qid=1144103973/ref=sr_1_1/002-7492060-9752839?%5Fencoding=UTF8
Comments
By Robert C. Seacord (209.195.148.248) rcs at cert dot org on http://www.cert.org/books/secure-coding/
>
> http://www.amazon.com/gp/product/0321335724/sr=8-1/qid=1144103973/ref=sr_1_1/002-7492060-9752839?%5Fencoding=UTF8
Thanks, it is probably the best book I've ever written. 8^)
BTW, if you are interested in this integer stuff the Integer Chapter from the book is available as a free download from http://www.cert.org/books/secure-coding/
and elsewhere.
We are also starting an effort to create a secure coding standard for C and C++. If you check out http://www.cert.org/secure-coding/ after September 1st we should have our Wiki up with some initial rules and recommendations.
rCs
By Siju Oommen George (59.93.43.15) sgeorge.ml@gmail.com on
Of course reading code alone doesn't help me at this point but explanations like this do real help. I am looking forward to find more of these type of articles on Undeadly :-) Once again Thanks a million:-) and Good Luck :-)
By Corentin (81.56.152.193) on
Comments
By Anonymous Coward (213.196.253.42) on
the signed/unsigned char types).
Comments
By Otto Moerbeek (213.84.84.111) otto@drijf.net on Otto Moerbeek
<p>
Taking shortcuts in this is dangarous.
By Anonymous Coward (63.237.125.201) on
Hey, do I smell a new project on the way?
OpenC?
lib-openc?
occ?
Comments
By Nate (65.95.124.5) on
The C language has already been extended with several OpenBSD-created functions, most of them are in-place within real Unix systems, only glibc ignores these enhancements.
kencc may some day replace the gcc used in OpenBSD, but that is a ways away unless you've got some giant corporate sponsorship up your sleeve.
By Anonymous Coward (67.149.82.140) on
Comments
By Sam Chill (68.53.209.134) on
By Anonymous Coward (128.171.90.200) on
By Anonymous Coward (70.179.123.124) on
There are too many (no pun intended) variables to make it a standard part of C; better to become aware of and avoid the problem as a programmer.
In a perfect world, yes a compiler could do so.
In a perfect world, the only function I would ever have to call is do_shit(stuff); and everything would be taken care of.
By Corentin (81.56.152.193) on
I do not think it is very easy to detect such a class of run-time errors statically (at compile time). It should be easier using dynamic analysis (with a tool such as Valgrind[1]).
[1] http://www.valgrind.orgComments
By Anonymous Coward (128.171.90.200) on
Comments
By Anonymous Coward (128.151.92.148) on
Valgrind is great. Back when I used Linux, it helped me out of numerous binds and bizarre memory corruption bugs. I'd like to see it on OpenBSD. Last time I checked it doesn't run on it; am I right that that is still the case?
Comments
By Corentin (81.56.152.193) on
Porting Valgrind to a new operating system and/or hardware architecture is a complex task.
By Anonymous Coward (70.31.84.121) on
By Andy (82.44.215.41) on
It would need to wrap specific operations that you were checking, hypothetically as shown below.. note it is not safe to plug in overflow detection using __asm__ following a multiplication operation because you cannot guarantee the flag state (because you can't guarantee what code is generated) between your C statement and the __asm__ block.. This would have to take the form of a patch to gcc and maybe binutils to specifically generate the overflow checks integrated with the compiler's object generation.
/* This feature does NOT exist in gcc (or any other known C compiler) so don't bother trying it */
#pragma OVERFLOW_HANDLER_ON
a *= b;
#pragma OVERFLOW_HANDLER_OFF
On x86 this could generate the JO (Jump on Overflow) instruction... potentially like this..
(a *= b;)
MUL EBX
JO OverFlowHandler
most architectures do support overflow checks by a flag in this manner. This, however, would add a lot of additional and stuff to the C implementation that has not been ratified as a standard so is quite undesirable from that point of view. - after all if the feature doesn't exist do you forget about the necessary checks on all the systems that do not support it?? I think not - this ends up fixing the issue on some systems and not on others and an issue fixed on the primary system can be forgotten but nevertheless become a vulnerability and source of vector for other systems. - think like a 'cracker' your best birthday present is a vulnerability that people think is fixed but keeps cropping up because of implementation specifics.
Furthermore implementing overflow detection as a default on such operations is definitely a no-no because some code specifically requires overflow conditions and discards overflowed information. Some checksumming functions for example.
Comments
By Anonymous Coward (128.171.90.200) on
Secondly it IS possible to do this in a portable way, we don't need to rely on a processor instruction, as we are wrapping, a bit of programming logic will suffice. This is how ProPolice works.
Comments
By Andy (82.44.215.41) on
Maybe you missed my major objection to this. If our developers originate a project and deal with potential exploits in this manner it buries the vulnerability.
Comments
By Anonymous Coward (128.171.90.200) on
Anyway ..
Would it bury the vulnerability ? or would it highlight it ?
By Anonymous Coward (68.238.226.49) on
Comments
By Andy (82.44.215.41) on
signed short safemul(signed short x, signed short y)
{
signed short r;
signed short rr,xx,yy;
r=x*y;
/* Make negatives positive for overflow check */
if (0>x) xx=-x; else xx=x;
if (0>y) yy=-y; else yy=y;
rr=xx*yy;
if ( rr < yy || rr < xx ) {
printf("Overflow occured\n");
} else {
printf("No overflow occured\n");
}
return(r);
}
Comments
By Andy (82.44.215.41) on
By Anonymous Coward (72.154.140.130) on
signed short safemul(signed short x, signed short y) {
signed short r = x * y;
if(r / x != y || r / y != x) {
puts("Overflow occured");
} else {
puts("No overflow occured");
}
return r;
}
Comments
By Anonymous Coward (128.171.90.200) on
I think the best you can do is issue a warning.
By Willy Wombat (195.8.190.164) on
if (num && size && SIZE_T_MAX / num < size) {
errno = ENOMEM;
return (NULL);
}
num has to be tested to prevent possible division by zero.
Is size checked for zero just to avoid the (relatively) expensive division and comparison (since if size is zero, the multiplication won't overflow)?
Comments
By Anonymous Coward (195.68.31.231) on
if num equals zero, the if condition is false.
By tedu (71.139.182.193) on
Comments
By Otto Moerbeek (62.140.137.125) otto@drijf.net on Otto Moerbeek
POSIX marks this as undefined (either return NULL ora unique pointer), but I think it is wise to have malloc and calloc act the same when trying to allocate 0 bytes.
Comments
By Otto Moerbeek (213.84.84.111) otto@drijf.net on
By djm@ (203.217.30.86) on
Comments
By Anonymous Coward (66.11.66.41) on
By paldium (84.129.228.47) on
Although I have a question ...
If I understand this issue the problem arises when the multiplicator (e.g. size) gets out of control. So is there a problem with a code that takes for example the size_t'd variable, increments, and calls realloc then (i.e. step by step allocating)? Running such a program would nearly allocate 4 gb of memory (if I am correct) and that's much more my system could handle (also the default ulimit would prevent that).
An example of this would be strfile (src/games/fortune/strfile/strfile.c), which uses a variable of type long for the counter (signed, but I would have to use more bytes I could allocate with malloc). Is this approach "safe" enough?
Comments
By Ray (199.67.140.83) ray@cyth.net on http://cyth.net/~ray/
The malloc operations in strfile.c look pretty bad. Please, nobody code like that!
Incremental allocation as you described would be very inefficient and certainly doesn’t solve any problems.
Comments
By Ray (199.67.138.83) ray@cyth.net on http://cyth.net/~ray/
To further explain, incremental allocation won’t solve anything because if the space you are trying to allocate is too big (due to a miscalculation) it will still run out of memory. It will just happen much slower because you keep calling realloc. (Be aware that if realloc cannot expand the currently allocated space to the new size, realloc will allocate a new block of memory and copy the old contents there. In addition to being slow, I think memory gets fragmented over time (though now that our malloc uses mmap, that may not be the case). Now imagine trying to allocate 4 GB of memory, in 1 KB increments.)
If the space you are trying to allocate is too small (due to an overflow) and you successfully allocate it, you’re still screwed the moment you try to access the memory you thought you allocated.
Comments
By paldium (84.129.196.237) on
Thank you very much for your responses (and your article of course)!
By Andrew Reilly (141.168.4.160) on
The main problem is that multiplication naturally requires type expansion, but C doesn't work that way. Many processors do allow expanding multiplies, though, and their C compiler will recognize the idiom (for example) int64_t result = (int64_t) a * (int64_t) b; where a and b are int32_t, for example, and generate the single 32 x 32 -> 64 multiply instruction that will produce the right result.
Doing an expanding multiply and then checking to see if the result will still fit into a size_t will almost certainly be faster than the division-based test in the example.
There is less complete coverage for 64 x 64 --> 128 bit multiplies in most processors, but a bit of code to do that using 32 x 32 --> 64 multiplies will still probably be faster than a division.
Of course, in the case of something like OpenSSL or OpenSSH, you have to ask if there weren't, perhaps, low-cost sanity checks that could be made on the arguments before the multiply? I doubt that either really has to deal with a vast number of memory-busting objects, such that there would be any fear of this overflow even being approached.
Just another note: C defines unsigned types (size_t, usually) to wrap around on overflow, but signed types have undefined behaviour on overflow, and some processors (most) allow signed integer overflows to be trapped or tested. That doesn't help you much in C, but other languages, like Ada, let you catch those.
Comments
By tedu (71.139.182.193) on