Contributed by marco on from the ulrich-drepper-likes-memcpy dept.
Today ray@ will be talking about proper string buffer handling with strlcpy(3), strlcat(3), and snprintf(3). Many people now know to avoid the unbounded string functions (strcpy(3), strcat(3), and sprintf(3)), which are prone to buffer overflows:
char newstr[9];
const char *oldstr = "rm -rf /home/ray/tmp/*";

/* Buffer overflow in all cases. */
strcpy(newstr, oldstr);
strcat(newstr, oldstr);
sprintf(newstr, oldstr);
The bounded string functions take a buffer size argument and never write past that boundary, preventing buffer overflows. For arrays, the buffer size can be calculated with the sizeof operator. For malloc(3) allocated buffers, the size parameter given to malloc(3) can be used as the buffer size. The following examples use the sizeof idiom, but each sizeof instance can be replaced with the buffer size.
Note: the sizeof operator does not work with arrays passed as function arguments:
char real_array[BUFSIZ];

void
function(char array[], char array2[BUFSIZ], char *ptr)
{
	/* sizeof(real_array) != sizeof(array) */
	/* sizeof(real_array) != sizeof(array2) */
	/* sizeof(real_array) != sizeof(ptr) */
}

int
main(int argc, char *argv[])
{
	function(real_array, real_array, real_array);
	return (0);
}
In the example above, a pointer to the array is passed around, not the array. Thus the sizeof operator returns the pointer size instead of the array size.
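A minimal sketch of the usual workaround, assuming hypothetical helper names (size_seen_by_callee, blank_buffer, and new_blank_buffer are illustrations, not part of any library): the caller passes the buffer size next to the pointer, using sizeof for a real array and the malloc(3) size for a heap buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char real_array[BUFSIZ];

/* Inside the callee the parameter is a pointer, so sizeof yields the pointer size. */
static size_t
size_seen_by_callee(char *array)
{
	return (sizeof(array));
}

/* Hypothetical helper: the caller supplies the real buffer size explicitly. */
static void
blank_buffer(char *buf, size_t bufsize)
{
	assert(buf != NULL && bufsize > 0);
	memset(buf, ' ', bufsize - 1);
	buf[bufsize - 1] = '\0';
}

/* For a malloc(3) buffer, reuse the size that was given to malloc(3). */
static char *
new_blank_buffer(size_t bufsize)
{
	char *buf;

	assert(bufsize > 0);
	if ((buf = malloc(bufsize)) == NULL)
		return (NULL);
	blank_buffer(buf, bufsize);
	return (buf);
}
```

A caller that owns the array would write blank_buffer(real_array, sizeof(real_array)); a caller that owns a heap buffer passes the same number it handed to malloc(3).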
The bounded variants of the previous three string functions are strlcpy(3), strlcat(3), and snprintf(3). These functions truncate the resulting string as necessary, preventing overflow:
char newstr[9];
const char *oldstr = "rm -rf /home/ray/tmp/*";

/* Truncation in all cases. */
strlcpy(newstr, oldstr, sizeof(newstr));
strlcat(newstr, oldstr, sizeof(newstr));
snprintf(newstr, sizeof(newstr), oldstr);
strncpy(3) and strncat(3) are also bounded functions, but they suffer from an unwieldy API. This is how to properly use strncpy(3) and strncat(3) to always create NUL-terminated strings and to never overflow:
char newstr[9];
const char *oldstr = "rm -rf /home/ray/tmp/*";

/* Truncation in all cases. */
strncpy(newstr, oldstr, sizeof(newstr) - 1);
newstr[sizeof(newstr) - 1] = '\0';
strncat(newstr, oldstr, sizeof(newstr) - 1 - strlen(newstr));
Compare the above to strlcpy(3) and strlcat(3). The strncpy(3) and strncat(3) functions are complicated, error-prone, and strongly discouraged.
Now buffer overflows are prevented, but a new problem arises: truncation. Undetected truncation can be deadly: do you really want to execute the truncated string produced above? To aid truncation detection, these functions return the resulting string length as if truncation did not occur. If this value is greater than or equal to the destination buffer size, truncation has occurred:
char newstr[9];
const char *oldstr = "rm -rf /home/ray/tmp/*";

/* Detected truncation in all cases. */
if (strlcpy(newstr, oldstr, sizeof(newstr)) >= sizeof(newstr))
	warnx("truncation");
if (strlcat(newstr, oldstr, sizeof(newstr)) >= sizeof(newstr))
	warnx("truncation");
if (snprintf(newstr, sizeof(newstr), oldstr) >= sizeof(newstr))
	warnx("truncation");
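Warning about truncation is only half the job; the truncated string should not be used afterwards. A rough sketch of one way to enforce that, assuming a hypothetical wrapper copy_or_reject and a local stand-in sketch_strlcpy (written here only so the example builds on systems whose libc lacks strlcpy(3); it follows the documented return convention):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in with strlcpy(3)'s contract: copy at most dstsize - 1 bytes,
 * always NUL-terminate when dstsize > 0, and return strlen(src).
 */
static size_t
sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen = strlen(src);

	if (dstsize > 0) {
		size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return (srclen);
}

/*
 * Hypothetical wrapper: returns 0 on success, -1 if the copy was truncated.
 * On truncation dst is emptied so a partial string can never be acted on.
 */
static int
copy_or_reject(char *dst, size_t dstsize, const char *src)
{
	assert(dst != NULL && dstsize > 0);
	if (sketch_strlcpy(dst, src, dstsize) >= dstsize) {
		dst[0] = '\0';
		return (-1);
	}
	return (0);
}
```

With the 9-byte buffer from the examples, the long rm command is rejected outright, while an 8-character string still fits together with its terminating NUL.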
snprintf(3) comes from the printf(3) family and inherits all its quirks. Because of this inheritance snprintf(3) needs more than just truncation checks. For example, it interprets oldstr as a format string. Percentage signs (%) in the string can affect the result:
char newstr[9];
const char *oldstr = "100%off!!!";

/* Format string error. */
if (snprintf(newstr, sizeof(newstr), oldstr) >= sizeof(newstr))
	warnx("truncation");
To prevent this, never pass a variable string as the format. Use "%s" as the format and pass the string as its argument. This rule applies to all printf(3) functions:
char newstr[9];
const char *oldstr = "100%off!!!";

/* The string is now data, not a format. */
if (snprintf(newstr, sizeof(newstr), "%s", oldstr) >= sizeof(newstr))
	warnx("truncation");
Additionally, the printf(3) family will return a negative value if there is an error and must be tested:
int i;
char newstr[9];
const char *oldstr = "100%off!!!";

/* Detect errors and truncation. */
i = snprintf(newstr, sizeof(newstr), "%s", oldstr);
if (i < 0 || i >= sizeof(newstr))
	warnx("snprintf");
The above example demonstrates snprintf(3)'s last quirk: it takes a size_t for the buffer size but returns an int. This means that truncation detection involves comparing an int to a size_t. This can be avoided by using strlcpy(3) and strlcat(3), if your format string does nothing but string concatenation.
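When snprintf(3) is really needed, the int versus size_t comparison can be kept in one place. A minimal sketch, assuming a hypothetical wrapper named checked_snprintf built on the standard vsnprintf(3): the negative case is handled first, so the cast to size_t is safe.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical wrapper: returns 0 on success, 1 if the output was
 * truncated, and -1 if vsnprintf(3) reported an error.
 */
static int
checked_snprintf(char *dst, size_t dstsize, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(dst != NULL && dstsize > 0);
	va_start(ap, fmt);
	n = vsnprintf(dst, dstsize, fmt, ap);
	va_end(ap);

	if (n < 0)
		return (-1);
	/* n is known to be non-negative here, so the cast cannot wrap. */
	if ((size_t)n >= dstsize)
		return (1);
	return (0);
}
```

A caller would write checked_snprintf(newstr, sizeof(newstr), "%s", oldstr) and treat any non-zero result as a failure.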
Here's a recap of correct string usage:
int i;
char newstr[9];
const char *oldstr = "100%off!!!";

/* Detect all errors. */
if (strlcpy(newstr, oldstr, sizeof(newstr)) >= sizeof(newstr))
	warnx("truncation");
if (strlcat(newstr, oldstr, sizeof(newstr)) >= sizeof(newstr))
	warnx("truncation");
i = snprintf(newstr, sizeof(newstr), "%s", oldstr);
if (i < 0 || i >= sizeof(newstr))
	warnx("snprintf");
I hope that this has been helpful, but before submitting any patches please be sure they are correct and solve actual problems. Thanks!
By Anonymous Coward (82.43.92.127) on
By Niall O'Higgins (83.147.128.114) niallo@openbsd.org on
By rmg (208.181.115.2) on
Use of strlcpy and friends is merely convention, and *BSD specific convention at that, I believe.
By Nate (65.95.124.5) on
By rmg (208.181.115.2) on
My point was really that they aren't part of the standard C library (talking standard here, not implementation).
By Anonymous Coward (70.27.15.123) on
By m0rf (68.104.1.58) on
By Anonymous Coward (71.255.98.251) on
By rmg (208.181.115.2) on
By Anonymous Coward (70.27.15.123) on
By rmg (208.181.115.2) on
By Anonymous Coward (70.27.15.123) on
By rmg (208.181.115.2) on
if they didn't care, they would not have released the portable version, and the software would be non-portable. but they did, so...?
my comments make a lot more sense if you don't assume they are an attack.
By Anonymous Coward (66.11.66.41) on
By Nate (65.95.124.5) on
By Anonymous Coward (67.64.89.177) on
Heil Drepper!
By Anonymous Coward (66.39.191.42) on
By m0rf (68.104.1.58) on
http://lists.debian.org/debian-devel/2002/03/msg00295.html
of course application developers can add the 3 files needed to get the functions on gnu/microsoft systems or anywhere else they aren't supported (like OpenSSH-portable does).
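Such a fallback is small. As a rough sketch of the strlcat half, written from the documented behaviour rather than copied from OpenSSH-portable or OpenBSD (the name compat_strlcat is an assumption, chosen so it cannot clash with a libc that already provides strlcat):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Append src to the string in dst, never writing past dst[dstsize - 1].
 * Returns the length the result would have had without truncation, so a
 * return value >= dstsize means the result was truncated.
 */
static size_t
compat_strlcat(char *dst, const char *src, size_t dstsize)
{
	size_t srclen, dstlen = 0, room, n;

	assert(dst != NULL && src != NULL);
	srclen = strlen(src);

	/* Look for the existing NUL, but only inside the buffer. */
	while (dstlen < dstsize && dst[dstlen] != '\0')
		dstlen++;
	if (dstlen == dstsize)
		return (dstsize + srclen);	/* dst was not terminated */

	room = dstsize - dstlen - 1;		/* bytes left before the NUL */
	n = srclen < room ? srclen : room;
	memcpy(dst + dstlen, src, n);
	dst[dstlen + n] = '\0';
	return (dstlen + srclen);
}
```

A project would typically compile such a file only when its build system finds that the platform libc lacks the function.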
By Nony mouse Coward (128.171.90.200) on
The post following that suggests correct string handling does not mean checking return values, or that the poster hasn't looked at the API.
None of the arguments they presented make any sense.
Life sure is scary in GNU land.
By Anonymous Coward (67.64.89.177) on
By Anonymous Coward (24.34.57.27) on
By Anonymous Coward (67.64.89.177) on
By Chris (80.176.91.102) chriswareham@chriswareham.demon.co.uk on www.chriswareham.demon.co.uk
By Anonymous Coward (64.80.197.181) on
Thanks, Ulrich. You sure make me glad I don't use Linux.
By Anonymous Coward (128.171.90.200) on
*((char *) mempcpy (dst, src, n)) = '\0';
By veins (193.251.36.113) veins@skreel.org on http://lab.skreel.org/
> *((char *) mempcpy (dst, src, n)) = '\0';
>
Well, I may be very tired, but this looks like the following code written in a more understandable way:
{
char *p;
p = memcpy(dst, src, n);
*P = '\0';
}
and since memcpy returns the address of dst, it looks like his example actually achieves this:
{
*dst = '\0';
}
So ... either it's me, or Drepper's one-liner example of how people should nul-terminate a string instead of using inefficient BSD crap, is broken.
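For reference when reading this exchange: the quoted line calls mempcpy, not memcpy. GNU mempcpy performs the same copy but returns a pointer to the byte after the last one written (dst + n) instead of dst. A small sketch, using a portable stand-in named sketch_mempcpy (an assumption, so the example does not depend on glibc), shows where the NUL ends up:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Portable stand-in for GNU mempcpy: same copy as memcpy(3), but the
 * return value is dst + n rather than dst.
 */
static void *
sketch_mempcpy(void *dst, const void *src, size_t n)
{
	assert(dst != NULL && src != NULL);
	return ((char *)memcpy(dst, src, n) + n);
}

/*
 * The quoted one-liner: copy n bytes, then store the NUL at dst[n].
 * dst must have room for n + 1 bytes; nothing here checks that.
 */
static void
copy_and_terminate(char *dst, const char *src, size_t n)
{
	*((char *)sketch_mempcpy(dst, src, n)) = '\0';
}
```

So the one-liner does terminate the copied bytes, but the caller has to work out n and the buffer size by hand, and truncation is not reported.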
By veins (193.251.36.113) veins@skreel.org on http://lab.skreel.org/
> *p = '\0';
By Sebastian (84.172.38.148) on
By Sebastian (84.172.38.148) on
By veins (193.251.36.113) veins@skreel.org on http://lab.skreel.org/
/me puts two fingers in throat out of disgust
By veins (193.251.36.113) veins@skreel.org on http://lab.skreel.org/
Anyway, just reading this makes it obvious he does not know what he is talking about; had he read the paper, he'd know how you can catch truncations.
By Anonymous Coward (128.171.90.200) on
By Anonymous Coward (70.27.15.123) on
By rmg (208.181.115.2) on
And, to avoid being modded down again for not stating the obvious:
- strl* are better than strn*
- strl* is not part of standard C
- strl* SHOULD be part of standard C
- if you don't have strl* in your libc, add it to your project
...and I thought adding to the article was better than repeating it, sheesh..
By m0rf (68.104.1.58) on
By Anonymous Coward (210.233.106.4) on
By Anonymous Coward (68.104.1.58) on
By Couderc (212.234.204.97) on
They have already provided a draft (N1135) that contains what should become the future ISO C versions of those safe functions:
"Programming languages, their environments and system software interfaces — Specification for Safer, More Secure C Library Functions"
From the scope section:
"This Technical Report specifies a series of extensions of the programming language C, specified by International Standard ISO/IEC 9899:1999."
I have already started to implement some of them just for fun :)
By Anonymous Coward (62.49.147.46) on
By Anonymous Coward (64.62.167.198) on
And anybody who thinks that, if and when these debut and become widely available ten years from now, anybody will bother going over old code to do a simple search and replace (which is the only redeeming quality they have) is an idiot.
strlcpy() and strlcat() have their problems, but they have a unique mix of compatibility, convention, and sane (if not elegant) semantics.
We need more interfaces like strlcpy()/strlcat(). gets_s() (from N1135) is too little, and all the over-the-top libraries which implement a string "object" in C (bstr, postfix "records", etc.) go way too far and are unwieldy.
By Couderc (212.234.204.97) on
> This specification is next to useless. They take most of the functions that almost nobody uses or should use, and tweak them the slightest bit so your average programmer has almost no incentive to use them.
Maybe you're right, but where were you when those extensions were discussed by the working group? At the time, I alerted millert@ about the glibc evangelists who were spreading their ideas there.
> And anybody who thinks that, if and when these debut and become widely available ten years from now, anybody will bother going over old code to do a simple search and replace (which is the only redeeming quality they have) is an idiot.
Yes, but future development will surely follow the C standard for the sake of portability, rather than functions that are not part of it.
> strlcpy() and strlcat() have their problems, but they have a unique mix of compatibility, convention, and sane (if not elegant) semantics.
You do not need to convert me; I have been a convert for years. I use those functions in my projects, but to do so I have to provide a compatibility layer for the non-standard functions.
> We need more interfaces like strlcpy()/strlcat(). gets_s() (from N1135) is too little, and all the over-the-top libraries which implement a string "object" in C (bstr, postfix "records", etc.) go way too far and are unwieldy.
Yes, but gets_s is part of the future standard even if you do not like it. There are systems other than BSD in the real world, you know, and even people who believe in portable code across Unix systems.
By Darren Tucker (203.217.17.96) dtucker@zip.com.au on
Actually that's not quite true, but if you know that then you should also know when :-)
(Hint: for example, utmp and wtmp structs are nul-padded but not nul-terminated.)
By Anonymous Coward (66.11.66.41) on
By Anonymous Coward (128.151.92.52) on
By Mr Strncpy (83.70.176.191) on
I agree in general, though: strncpy is misnamed, and in the new world order it should be renamed str2buf. After strncpy you are not guaranteed to end up with a string since, I think, by definition a string is a NUL-terminated character array.
char key[16];
printf("key \"%.*s\"\n", (int)sizeof(key), strncpy(key, optarg, sizeof(key)));
By tedu (69.12.168.114) on
By Anonymous Coward (71.140.143.217) on
char buf[100];
strncpy(buf, src, sizeof(buf));
Note sizeof(buf) instead of sizeof(buf) - 1. If src is at least as long as buf, buf won't be properly NUL-terminated, because strncpy will use up all the bytes if needed. Contrast this with what fgets() and strlcpy() do: when told that the buffer size is n bytes, they both store at most n-1 bytes and then NUL-terminate the buffer.
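The one place where the unterminated result is wanted is a NUL-padded, fixed-width field like the utmp case mentioned earlier. A minimal sketch, assuming a made-up 8-byte field (FIELD_LEN, set_field, and field_to_string are hypothetical names): strncpy fills the field, and a "%.*s" precision reads it back without relying on a terminator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FIELD_LEN 8	/* hypothetical fixed-width record field */

/*
 * Fill a NUL-padded field. The field is NOT NUL-terminated when src is
 * FIELD_LEN bytes or longer, which is fine for a field but not a string.
 */
static void
set_field(char *field, const char *src)
{
	strncpy(field, src, FIELD_LEN);
}

/*
 * Turn the field back into a real string. The precision limits the read
 * to FIELD_LEN bytes, so no terminator is needed in the field itself.
 * Returns the snprintf(3) result.
 */
static int
field_to_string(char *dst, size_t dstsize, const char *field)
{
	assert(dst != NULL && dstsize > FIELD_LEN);
	return (snprintf(dst, dstsize, "%.*s", FIELD_LEN, field));
}
```

Anything that treats the raw field as a C string, for example by passing it to strlen(3), may read past its end.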
By Anonymous Coward (71.140.143.217) on
By Niall O'Higgins (193.1.172.166) niallo@openbsd.org on
By Anonymous Coward (70.74.75.200) on
This reminds me of this caution about the mistakes made using sizeof.
http://marc.theaimsgroup.com/?t=110859539500001&r=1&w=2
By Matthias Kilian (84.134.9.253) on
Ugly. Disgusting. Frustrating.
By Anonymous Coward (141.100.40.69) on
By Anonymous Coward (141.100.40.69) on
By tedu (69.12.168.114) on
so some committee had to come up with _s[hitty] versions of every libc function? waste of time.
By Anonymous Coward (128.151.253.249) on
It appears the difference between fopen_s() and fopen() is that it checks for NULL pointers. But the document itself says that fopen() with NULL arguments is "undefined" behavior. Which means that a libc implementation of fopen() can do the NULL check and still be conforming.
And what's with returning errno_t everywhere?
By Anonymous Coward (128.151.253.249) on
By rmg (208.181.115.2) on
By Anonymous Coward (84.172.38.148) on
Thanks for your reply!
By rmg (208.181.115.2) on
here's the first hit from google (errno reentrance):
http://docs.sun.com/app/docs/doc/805-5080/6j4q7emi5?a=view
enjoy :-)
By Anonymous Coward (74.62.215.125) on
strlcpy is a replacement for strNcpy (N! see it?), not for strcpy. His claim that strlcpy will encourage people to make poor fixes for strcpy code is wrong, since glibc already allows even *worse* fixes by changing the code to use strncpy. And it is blatantly obvious that in 100% of use cases strlcpy is better than strncpy.
He also repeatedly proposes snprintf(dst,size,"%s",src) as a replacement that is somehow "superior". How is it superior, when it is EXACTLY the same in the results and return value, except for being about 10 times slower, and harder to read?
I am completely flabbergasted that somebody with this level of cluelessness can get into this position of influence in GNU.
PS: I am a Linux software developer and don't actually use BSD at all, but I can certainly tell when somebody is being an idiot.
By Anonymous Coward (76.234.103.166) on
Even worse, some versions of snprintf()--glibc included I believe--can fail at run time (w/ -1, or SIZE_MAX, depending on how you handle it). Some snprintf's use dynamically allocated internal buffers, or allow dynamic type specifiers, either of which can make things blow up unexpectedly, no matter that your format string and arguments are perfectly valid.
So, not only is snprintf() slower, it requires more error logic (in addition to Win32 work-arounds, if you're into that).