Contributed by Ray Lai on from the line-after-line dept.
char buf[1024]; fgets(buf, sizeof(buf), fp); /* Nuke newline. */ buf[strlen(buf) - 1] = '\0';
fgets(3) includes a newline character from each line, but only if one was found. This is why most code truncates the string after calling fgets(3). There are three things wrong with this example:
1. fgets(3) can fail and return NULL.
2. buf[strlen(buf) - 1] might not be a newline,
truncating a valid character.
This can happen if the file did not end in a newline,
if the buffer was too small to store the line, or
3. buf might contain binary data and start with a NUL character,
causing strlen(buf) to be zero.
This, in turn, causes an out-of-bounds write.
The following example, copied directly from the man page, checks for all three errors:
char buf[1024]; if (fgets(buf, sizeof(buf), fp) != NULL) { if (buf[0] != '\0' && buf[strlen(buf) - 1] == '\n') buf[strlen(buf) - 1] = '\0'; }
Note that lines containing NUL characters cannot be read reliably with fgets(3), since we are using strlen(3) to check the line length.
Some of these issues can be avoided by using fgetln(3). fgetln(3) allows for arbitrarily long lines and returns the number of characters read. Like fgets(3), newline characters must be removed manually; while this is easier to do with fgetln(3) because the length is returned, a check must still be performed.
Here is an example, copied directly from the man page:
char *buf, *lbuf; size_t len; lbuf = NULL; while ((buf = fgetln(fp, &len))) { if (buf[len - 1] == '\n') buf[len - 1] = '\0'; else { /* EOF without EOL, copy and add the NUL */ if ((lbuf = malloc(len + 1)) == NULL) err(1, NULL); memcpy(lbuf, buf, len); lbuf[len] = '\0'; buf = lbuf; } printf("%s\n", buf); } free(lbuf);
A few things to note here:
1. There is no strlen(3) call,
so the only thing affected by NUL characters is printf(3).
2. fgetln(3) cannot return successfully with a zero-length string,
so the buffer cannot be accessed with a negative index.
3. The free(3) is correctly placed,
since it is only possible to call malloc(3) in the last iteration.
For a higher level of abstraction and more knobs, check out fparseln(3), which takes care of newline characters automatically and can recognize escape characters, comment characters, and continuation characters. Unfortunately, fparseln(3) has more overhead, is less portable, and each line must be freed afterwards. It is also impossible to check if the file ends in a newline; to reliably read every character in a file, use fgetln(3).
For more information, read the man pages: fgets(3), fgetln(3), and fparseln(3). Pay attention to the CAVEATS sections for fgets(3) and fgetln(3).
(Comments are closed)
By Anonymous Coward (84.9.187.181) on
Should no doubt be:
...while this is easier to do with fgetln(3)...
Comments
By Ray Lai (85.25.4.93) ray@ on http://cyth.net/~ray/
>
> Should no doubt be:
>
> ...while this is easier to do with fgetln(3)...
Yes, all the fget* typing confused me.
By Tony (12.38.8.233) on
Or do I just misunderstand what fgetln is doing behind the scenes?
Comments
By Tony (12.38.8.233) on
>
> Or do I just misunderstand what fgetln is doing behind the scenes?
OK, I'm an idiot, ignore me. Just re-read the man page for fgetln. For those wondering, fgetln hands you a memory location from the stream you've passed it. Closing that stream will free that memory, so no need to free buf. In fact, doing so would probably cause bad things to happen.
By tedu (69.12.168.114) on
By Anonymous Coward (70.49.119.203) on
Why not just start using something better? Write better libraries. Even if they're not standard, so long as it's in the base system, who cares? Software with very impressive security histories (i.e. Postfix) don't expend the trouble trying to be extra careful about these shakey foundations -- they build new ones, better ones.
If OpenBSD really cared about proper usage, why wouldn't they go about introducing abstractions and interfaces that are better. The strl* functions don't count. They were a good, very trivial first step, but there's so much more that could be done. Why not go further?
Comments
By Anonymous Coward (69.207.171.114) on
By Anonymous Coward (68.104.220.48) on
>
> If OpenBSD really cared about proper usage, why wouldn't they go about introducing abstractions and interfaces that are better. The strl* functions don't count. They were a good, very trivial first step, but there's so much more that could be done. Why not go further?
Good question. Why don't you start and submit some code?
By Ray Lai (83.125.32.107) ray@ on http://cyth.net/~ray/
I wrote this because people are still using these functions and doing so incorrectly, not to proclaim great joy and self-fulfillment.
> Why not just start using something better? Write better libraries. Even if they're not standard, so long as it's in the base system, who cares? Software with very impressive security histories (i.e. Postfix) don't expend the trouble trying to be extra careful about these shakey foundations -- they build new ones, better ones.
> If OpenBSD really cared about proper usage, why wouldn't they go about introducing abstractions and interfaces that are better. The strl* functions don't count. They were a good, very trivial first step, but there's so much more that could be done. Why not go further?
I agree that nicer, safer interfaces would be nice. The reason they do not exist is not because I prefer functions full of caveats.
By Anonymous Coward (69.207.171.114) on
Comments
By Eric Eells (67.10.75.125) eric_eells@yahoo.com on
Comments
By Eric Eells (67.10.75.125) on
By Anonymous Coward (66.219.146.142) on
>
> char buf[SIZE];
> FILE *f = whatever ;
>
> if (fgets(buf, sizeof(buf), f))
> {
> char *p;
> if (p=strchr(buf, '\n'))
> *p = 0;
> }
>
> Unless there is something I'm not seeing here, this should work fine and seems pretty reasonable to me. It is shorter and IMO more intuitive than the examples given.
strchr assumes a nil-terminated string; if there are embedded nil's, it will get fucked up.