Question on Unsafe functions

0 Clark Alaan · April 13, 2015
As the title says,

I am aware that a lot of string functions are actually unsafe and can cause buffer overflows. While puts() is simply a function that prints a string to the screen, is it at least as safe as printf()?

Also, malloc(). What will happen if you try to allocate more memory than the heap can provide? Is there any way to prevent this from ever happening?

Thanks.




On a question unrelated to the title (not necessarily looking for a reply):
In tutorial 47, Bucky cast the result of malloc to (int *). Is this actually necessary in that case, since we know that we're using int? I think I remember reading somewhere that there are cases where you don't necessarily have to cast the result of malloc. Or is it simply good practice to do so while we're just learning C?

Again, thanks.


Replies

0 http://coding.developer.se/   · April 20, 2015
Happy to be of service.
0 Clark Alaan · April 19, 2015
Thank you for an amazingly detailed reply, coding.developer.se. Unfortunately, I still lack the background to fully understand the information you have given, but I am sure that this will be helpful. It might take a while to study (I'll have to go through it a couple more times, I guess), but I can see how informative the reply is. Again, thank you.
+1 http://coding.developer.se/   · April 16, 2015
Your program should never crash. Period. Crashing is a bad sign. Please read the entirety of this post, because it seems like I'm repeating myself a lot. I've been programming in C for over ten years now, and it'd be wise to listen to my advice.

Yes, user input is an unreliable device. The user might enter something non-numeric when scanf is told to expect numeric data. That's what the return value for scanf is for. If you ignore the return value for scanf (as I'm sure I've told you not to), then your program might crash. Don't ignore the return value for scanf.

As for super long strings, there are strategies to deal with those, too. If we want to put a limit on the number of characters in a string, we can use a field width in scanf, or we can use fgets. If the user enters input that's too long, we can warn the user that the input is too long and has been truncated, which we can do using a function like this:

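/* Discard the rest of the current line on f, including the '\n'. */
void seek_end_of_line(FILE *f) {
    fscanf(f, "%*[^\n]"); /* skip everything up to the newline, if there is anything */
    fgetc(f);             /* consume the newline itself */
}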



We can use the same "truncation" technique for invalid integer input. For example:


/* Needs <stdio.h>, <stdlib.h> and <string.h>. */
int x;
if (scanf("%d", &x) != 1) {
    /* x is still uninitialised here, so it must not be used. */
    puts("Invalid integer input. Discarding the remainder of the line.");
    seek_end_of_line(stdin);
}

char a_really_long_word[128];
if (scanf("%127s", a_really_long_word) != 1) { /* field width leaves room for '\0' */
    puts("EOF encountered... Exiting.");
    exit(0);
}
seek_end_of_line(stdin); // scanf leaves the '\n' on the stream.

char a_really_long_line[4096];
if (fgets(a_really_long_line, sizeof a_really_long_line, stdin) == NULL) {
    puts("EOF encountered... Exiting.");
    exit(0);
}
/* No '\n' in the buffer means fgets stopped before the end of the line. */
if (a_really_long_line[strcspn(a_really_long_line, "\n")] == '\0') {
    puts("User input too long... Truncating.");
    seek_end_of_line(stdin);
}


Note the lack of overflows, and the impossibility for uninitialised variables to be used. I haven't tested this code, but I've written code following this pattern enough times to be confident that this code is stable.

As user input is unreliable, we may desire to handle things differently than just exiting. This kind of pattern allows us to do so.
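
One way to handle it differently, sketched under a few assumptions: read_int_retry and discard_line are names made up for this example (they're not standard functions), and the message text is arbitrary. It asks again after bad input and only gives up at EOF. Keep in mind that %d doesn't have defined behaviour for out-of-range numbers, so strtol would probably be a sturdier choice for that case.

```c
#include <stdio.h>

/* Discard the rest of the current line, including the '\n'. */
static void discard_line(FILE *in) {
    int c;
    while ((c = fgetc(in)) != EOF && c != '\n') {
        /* keep reading */
    }
}

/* Hypothetical helper: keeps reading until an int is parsed.
 * Returns 1 on success, 0 if EOF or a read error comes first. */
int read_int_retry(FILE *in, int *out) {
    for (;;) {
        int rc = fscanf(in, "%d", out);
        if (rc == 1) {
            discard_line(in); /* drop whatever follows the number */
            return 1;
        }
        if (rc == EOF) {
            return 0;
        }
        fputs("Invalid integer input. Please try again.\n", stderr);
        discard_line(in);     /* drop the offending line, then retry */
    }
}
```

A caller would use it as `if (!read_int_retry(stdin, &x)) { /* handle EOF */ }`, and x is only touched when the function reports success.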

Supposing you want the user to be able to enter strings that are longer, but you're not sure how long... Indeed, they could enter something of infinite size for all you know. This complicates things quite a bit, but that doesn't mean your program needs to be necessarily unstable, nor does it mean you need to use a ridiculously sized buffer for small inputs.

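You can use realloc to incrementally resize your buffer... Growing by powers of two is intuitive, because you don't need to store the total capacity of your array. All you need to know is that when the element count reaches the current capacity, you need to resize. Powers of two are: 1, 2, 4, 8, 16, 32, 64, ... The function below uses capacities that are one less than a power of two (1, 3, 7, 15, ...), so it resizes when the count is 0, 1, 3, 7, 15, ...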

At this stage I would develop a function to realloc based on powers of two, and then develop an fgets alternative which reallocates the buffer using that function.

void *b2_realloc(void *p, size_t element_count, size_t element_size) {
    return ((element_count & (element_count + 1)) == 0) // Is count one less than a power of two (0, 1, 3, 7, ...)?
        ? realloc(p, element_size * (element_count * 2 + 1)) // grow to the next such capacity if it is
        : p; // leave it otherwise
}

char *b2_fgets(FILE *f) {
    char *buffer = NULL;
    size_t buffer_size = 0;

    for (;;) {
        // Grow first, so there is always room for one more character (or the '\0').
        void *temp = b2_realloc(buffer, buffer_size, 1);
        if (temp == NULL) {
            puts("Allocation failure... Exiting.");
            exit(EXIT_FAILURE);
        }
        buffer = temp;

        int c = fgetc(f); // read from f, not unconditionally from stdin
        if (c == EOF) {
            puts("EOF encountered... Exiting.");
            exit(0);
        }

        if (c == '\n') {
            break;
        }
        buffer[buffer_size++] = c;
    }

    buffer[buffer_size] = '\0';
    return buffer; // the caller is responsible for calling free
}


Again, I haven't tested this code, but I do use something along these lines quite often in my line of work... Feel free to use it in yours. Hopefully your programs will become more stable.
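
For comparison, here's a variant of the same idea as a rough sketch, not a drop-in replacement: it stores the capacity explicitly, doubles it when full, and returns NULL instead of exiting so the caller can decide what to do. The name read_line_dyn and the starting capacity of 16 are assumptions made for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line of any length from `in`.
 * Returns a malloc'd string without the '\n' (the caller frees it),
 * or NULL on allocation failure or on EOF with nothing read. */
char *read_line_dyn(FILE *in) {
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        int c = fgetc(in);
        if (c == EOF) {
            if (len == 0) {          /* nothing left to return */
                free(buf);
                return NULL;
            }
            break;                   /* last line had no '\n' */
        }
        if (c == '\n') {
            break;
        }
        if (len + 1 == cap) {        /* keep one byte free for '\0' */
            if (cap > SIZE_MAX / 2) {
                free(buf);
                return NULL;
            }
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);           /* realloc leaves the old block valid */
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';
    return buf;
}
```

Going through a temporary pointer for realloc matters here: assigning the result straight to buf would lose the old block if the call failed.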
0 Dol Lod · April 15, 2015
Sure, and if we happen to accidentally drink some potassium lye (drain cleaner), there's no point stopping since who knows what else could happen? Just keep chuggin' that poison! Otherwise, might not hallucinate and vomit as much 

At the end of the day, the user could enter integers, super long strings, etc. To clarify: make it so only the user can mess up the program. Your program should never crash because of code you wrote, given valid input. This is also how finite-state machines (automata) work: if unexpected input happens, just crash or let the machine fail. If the user fails to supply good input, the program should fail or crash. I am not saying to avoid fixing bugs in the code that could happen given valid input, merely to not bother fixing bugs given invalid input.
0 http://coding.developer.se/   · April 14, 2015

I am aware that a lot of string functions are actually unsafe and can cause buffer overflows. While puts() is simply a function that prints a string to the screen, is it at least as safe as printf()?



An electrical appliance might be considered safe until it's dropped into a bathtub. None of the functions you've mentioned are unsafe. It's the way that people (mis)use them that makes them unsafe.

puts expects that the argument supplied points to a string. So long as you use it for what it's designed for, it will be safe. Similarly, a toaster expects to be operated in a dry area. If you try to supply something that doesn't point to a string to puts, or you try to operate a toaster in the bathtub, strange things may happen.
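
As a rough illustration of "points to a string": puts needs the terminating '\0'. If all you have is a byte count and no terminator, a precision in the format limits how many bytes get read. The name print_bytes is made up for this example, and the sketch assumes the count fits in an int.

```c
#include <stdio.h>

/* Hypothetical helper: prints the first n bytes of a buffer that may
 * not be '\0'-terminated, followed by a newline. */
void print_bytes(FILE *out, const char *bytes, size_t n) {
    /* The precision caps how many bytes fprintf reads, so no terminator
     * is required. The cast assumes n fits in an int. */
    fprintf(out, "%.*s\n", (int)n, bytes);
}
```

Passing the same unterminated buffer to puts would be the toaster-in-the-bathtub case.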

There have been reports of crazy couples holding frayed electrical wire connected (and live) to the mains and kissing to experience an "electrical kiss". I guess undefined behaviour can be pleasant, but it's not wise to rely upon that.

Also, malloc(). What will happen if you try to allocate more memory than the heap can provide? Is there any way to prevent this from ever happening?



If malloc fails, it is contractually obliged to return NULL. If that makes malloc unsafe, then your GPS is unsafe because when it fails you'll see nothing. Bugs aside; they're not in the contract, remember... They usually get fixed.

Unless you have some kind of computer with infinite resources, you can't prevent this from happening, no. Generally it doesn't become a problem for those of us who are seasoned, because we design software to use as few resources as possible rather than trying to allocate and duplicate a universe of data.
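
What you can do is check for the failure and pass it on. A minimal sketch, assuming a made-up helper called alloc_filled: it refuses sizes whose byte count would overflow, and it reports failure by returning NULL, just as malloc does.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n ints, each set to value.
 * Returns NULL if the size would overflow or malloc fails. */
int *alloc_filled(size_t n, int value) {
    if (n > SIZE_MAX / sizeof(int)) {
        return NULL;              /* n * sizeof(int) would wrap around */
    }
    int *p = malloc(n * sizeof *p);
    if (p == NULL) {
        return NULL;              /* out of memory: let the caller decide */
    }
    for (size_t i = 0; i < n; i++) {
        p[i] = value;
    }
    return p;
}
```

The caller checks the result before using it and calls free when done.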

In tutorial 47, Bucky cast the result of malloc to (int *). Is this actually necessary in that case, since we know that we're using int? I think I remember reading somewhere that there are cases where you don't necessarily have to cast the result of malloc. Or is it simply good practice to do so while we're just learning C?



No; you don't cast the result, since: ... http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc/605858#605858
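
In C, void * converts to any object pointer type implicitly, so the call reads fine without a cast. A small sketch, with copy_ints being a name invented for the example; treating a zero count as failure is also just a choice made here.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of n ints from src,
 * or NULL if n is 0, the size would overflow, or malloc fails. */
int *copy_ints(const int *src, size_t n) {
    if (n == 0 || n > SIZE_MAX / sizeof *src) {
        return NULL;
    }
    int *dst = malloc(n * sizeof *dst); /* no cast needed in C */
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, n * sizeof *dst);
    return dst;
}
```

Writing sizeof *dst rather than sizeof(int) means the line stays correct if the element type changes later.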

puts() only outputs a string to stdout whereas printf() has a vulnerability with the %n format. the buffer overflow you're thinking about is concerned with writing data into memory such as gets(), scanf() and printf() with %n.



The only vulnerable function there is gets. As far as the others go, it's all down to how you use them. If I attempt to weld a submarine together with the wrong equipment and processes, the submarine will be vulnerable... Especially if I'm the one doing the welding.
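
To illustrate with printf: trouble with %n generally comes from passing text you don't control as the format string. A sketch of the safer usage, with print_user_text being a hypothetical name for the example:

```c
#include <stdio.h>

/* Hypothetical helper: prints text supplied by the user, then a newline. */
void print_user_text(FILE *out, const char *text) {
    /* Misuse would be fprintf(out, text): any '%' in text would then be
     * treated as a conversion specification. With a fixed format, text
     * is only ever data. */
    fprintf(out, "%s\n", text);
}
```

Used this way, a string such as "50%n off" is printed literally and nothing is written through %n.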

Don't even try to fix it and if there is undefined behaviour, let it be undefined since who knows what else could happen.



Sure, and if we happen to accidentally drink some potassium lye (drain cleaner), there's no point stopping since who knows what else could happen? Just keep chuggin' that poison! Otherwise, might not hallucinate and vomit as much :(

Make sure to allocate a huge array of characters to input data into.



Actually, I would prefer to allocate the precise number of characters necessary. For user input, that number might not be known in advance. There are other algorithms that don't involve allocating huge chunks of memory unnecessarily...

it would simply return NULL so when you dereference it, you would get an error.



Not all null pointer dereferences result in an obvious error. Some are far more devious, with buffer-overflow-like potential.
0 Dol Lod · April 13, 2015
Fundamentally, there is no way to handle insane input. 

Temporary workarounds you could issue are to either 

1) Don't even try to fix it and if there is undefined behaviour, let it be undefined since who knows what else could happen.

2) Make sure to allocate a huge array of characters to input data into. Then grab as many characters as you want. Ex. if you want 12 characters, maybe allocate an array of 512 characters instead and then read the data in. 

Regarding what happens with malloc, if there isn't enough memory, it would simply return NULL so when you dereference it, you would get an error. 
0 c student · April 13, 2015
puts() only outputs a string to stdout whereas printf() has a vulnerability with the %n format. the buffer overflow you're thinking about is concerned with writing data into memory such as gets(), scanf() and printf() with %n.

using malloc() resulting in the heap clashing with normal program data might simply cause a failed call to malloc() or it might crash your program.  i'm not entirely sure but i'm assuming it would also be concerned with the os.

as for typecasting the result of malloc() to (int *), it is redundant.

you might like to read the man page for malloc() found here: http://linux.die.net/man/3/malloc