I am looking to learn best practices for working with strings. Two issues that I have with strncat are that it is difficult to calculate how much space to allow, and that there may not be a null character in the string after the procedure.

So if I have two strings of unknown length, would this be the best way to concatenate them? Are there better ways?

strncat(str1, str2, sizeof(str1)-strlen(str2));
if( strlen(str1)==sizeof(str1) )
    str1[sizeof(str1)-1]='\0';

Thanks!

I am looking to learn best practices for working with strings. Two issues that I have with strncat are that it is difficult to calculate how much space to allow,

Why? strlen(str1) + n + 1 is exactly how much space is needed.

and that there may not be a null character in the string after the procedure.

Better read up on strncat() again. You are wrong.

Why? strlen(str1) + n + 1 is exactly how much space is needed.

What is n? Is it strlen(str2)?

The issue is that I cannot be certain that the length of str1 is enough to hold both str1 and str2. Such as:

char str1[16]="Have a very ";
char str2[16]="nice day!";

Better read up on strncat() again. You are wrong.

From this page:
http://www.cprogramming.com/tutorial/secure.html
"You should be aware that strncpy will not automatically append a null terminator"

What is n? Is it strlen(str2)?

strncat(str1,str2,n) -- definition

The issue is that I cannot be certain that the length of str1 is enough to hold both str1 and str2. Such as:

char str1[16]="Have a very ";
char str2[16]="nice day!";

What would sizeof(str1) give you?
What would strlen(str1) give you?
What's left?

From this page:
http://www.cprogramming.com/tutorial/secure.html
"You should be aware that strncpy will not automatically append a null terminator"

From http://www.cplusplus.com/reference/clibrary/cstring/strncat/:
Appends the first num characters of source to destination, plus a terminating null-character.

http://en.wikipedia.org/wiki/Strcat#strncat
The most common bounded variant, strncat, only appends a specified number of bytes, plus a NULL byte.

http://www.elook.org/programming/c/strncat.html
The function strncat() concatenates at most count characters of str2 onto str1, adding a null termination.

So your source seems to be wrong. What was the result of your test to see which is correct?
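One possible reason for the disagreement is that the cprogramming.com quote is about strncpy while the other three are about strncat. A minimal sketch of a test that shows the difference follows; the function names are made up for illustration and the buffer sizes are arbitrary.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf[0..size-1] contains a null character, 0 otherwise. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* strncpy: the source is longer than n, so no null character is written. */
int strncpy_terminates(void)
{
    char dest[8];
    memset(dest, 'x', sizeof dest);            /* fill with non-null bytes */
    strncpy(dest, "0123456789", sizeof dest);  /* copies 8 chars, no '\0' */
    return is_terminated(dest, sizeof dest);
}

/* strncat: appends at most n chars, then always writes a '\0'. */
int strncat_terminates(void)
{
    char dest[8] = "abc";                      /* 3 used, 4 free, 1 for '\0' */
    strncat(dest, "0123456789", sizeof dest - strlen(dest) - 1);
    return is_terminated(dest, sizeof dest);
}
```

Note that strncat_terminates() stays inside the array only because n is computed from the space that is left; with a larger n the null character would still be written, just past the end of dest.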

Comments
thanks

>>The issue is that I cannot be certain that the length of str1 is enough to hold both str1 and str2.

And there is no way to do that in C language. C language lets the programmer do a lot of things that are potentially harmful to the program, and buffer overruns are just one of them. There is no way to determine how much memory was allocated to a pointer, or if any memory was allocated at all. There are several ways to crash strcat(); insufficient memory for the destination string is just one of them. Another is to pass a string literal as the destination string: strcat("Hello","World"); is undefined behavior and a reliable way to trash memory. And here are a couple of slightly more subtle ways to trash memory:

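char* destination = "Hello";   /* points at a string literal, which must not be modified */
strcat(destination,"World");   /* undefined behavior: writes into the literal */

void foo(char* dest, char*source)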
{
   strcat(dest,source);
}

int main()
{
   foo("Hello","World");
}
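For contrast, here is a sketch of the same calls with a destination that is actually writable and large enough. The name build_greeting and the size 11 are assumptions chosen for this example, not anything from the posts above.

```c
#include <string.h>

/* Same shape as foo() above; dest must be a writable buffer with room
   for both strings plus the terminating null character. */
void foo(char *dest, const char *source)
{
    strcat(dest, source);
}

/* Hypothetical demo: writes "HelloWorld" into out (at least 11 bytes). */
void build_greeting(char *out)
{
    char destination[11] = "Hello";   /* 5 + 5 + 1 bytes, writable */
    foo(destination, "World");
    strcpy(out, destination);
}
```

The array initializer copies "Hello" into writable storage, so strcat is no longer writing into a literal.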

Edited 5 Years Ago by Ancient Dragon: n/a

Comments
thank you, I will remember your word when I get to pointers

strncat(str1,str2,n) -- definition

I see what you mean now. It seems that my question was not clear. I'm not looking to calculate how long an array to assign to str1. I am assuming that str1 is already allocated and I'm trying to handle the general case of concatenating str2 to the end of str1 without overflowing str1.

So it is in fact n that I need to calculate, based on sizeof(str1), strlen(str1), and strlen(str2).

What would sizeof(str1) give you?

The number of characters that str1 can hold, including the terminating null character.

What would strlen(str1) give you?

The number of characters already stored in it.

What's left?

How many I can assign: n! (Minus one, to leave room for the terminating null character.) For some not-enough-coffee reason, the code in the OP was doing sizeof(str1)-strlen(str2) instead. Why, I do not know!
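Put together, the calculation might look like the sketch below. It uses the two arrays from earlier in the thread; the function name concat_example and the out parameter are only there so the result can be inspected, and it assumes str1 is a real array in scope so that sizeof gives its full size.

```c
#include <stddef.h>
#include <string.h>

/* Copies the result of the bounded concatenation into out (at least 16 bytes). */
void concat_example(char *out)
{
    char str1[16] = "Have a very ";   /* 12 characters used */
    char str2[16] = "nice day!";      /* 9 characters, only 3 will fit */

    /* Space left in str1, keeping one byte for the null character:
       16 - 12 - 1 = 3 */
    size_t n = sizeof(str1) - strlen(str1) - 1;

    strncat(str1, str2, n);           /* appends "nic" and then '\0' */
    strcpy(out, str1);                /* out now holds "Have a very nic" */
}
```

The string is cut short, but everything stays inside str1 and the result is null-terminated.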

From http://www.cplusplus.com/reference/clibrary/cstring/strncat/:
Appends the first num characters of source to destination, plus a terminating null-character.

http://en.wikipedia.org/wiki/Strcat#strncat
The most common bounded variant, strncat, only appends a specified number of bytes, plus a NULL byte.

http://www.elook.org/programming/c/strncat.html
The function strncat() concatenates at most count characters of str2 onto str1, adding a null termination.

So your source seems to be wrong.

That is a partial answer. The function strncat does in fact add a null character to the end if str1 did not overflow. If str1 did overflow, then the null character is not added.

What was the result of your test to see which is correct?

I'm still trying to see how I test for that. I'm playing around in gcc on Ubuntu and with cl in a Windows virtual machine. The behaviours that I am seeing are not consistent and I'm still learning the idiosyncrasies of each.

Edited 5 Years Ago by dotancohen: n/a

And there is no way to do that in C language. C language lets the programmer do a lot of things that are potentially harmful to the program, and buffer overruns are just one of them. There is no way to determine how much memory was allocated to a pointer, or if any memory was allocated at all.

Actually, I'm only dealing with character arrays at this point, though I will heed your warnings about working with pointers to them when I get to that!

Surely there exist coding best practices to avoid buffer overflows. Exactly that is what I am trying to learn.

Edited 5 Years Ago by dotancohen: n/a

>>Surely there exist coding best practices to avoid buffer overflows.
C language will let you shoot yourself in the foot and has been the cause of untold hours of debugging. I suppose the best practice is to allocate all new memory for the destination buffer. This may introduce memory leaks if the calling function fails to deallocate the pointer.

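#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated concatenation; the caller must free() it. */
char* foo(const char* str1, const char* str2)
{
   char* str3 = malloc( strlen(str1) + strlen(str2) + 1);
   if (str3 == NULL)
      return NULL;   /* allocation failed */
   strcpy(str3,str1);
   strcat(str3,str2);
   return str3;
}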

sizeof(str1) gives you the size of the character buffer only where str1 is visible as an array, for example when it is declared inside the same function. If str1 was passed in as a function parameter, it is a pointer, and sizeof(str1) only gives you the size of a pointer -- 4 with 32-bit compilers.
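A small sketch of that difference, with made-up function names. The pointer size depends on the platform, so the example compares against sizeof(char *) rather than a fixed number.

```c
#include <stddef.h>

/* Inside the callee, buf is just a pointer; the array size is lost. */
size_t size_seen_by_callee(char *buf)
{
    return sizeof(buf);               /* sizeof(char *), e.g. 4 or 8 */
}

/* Where the array itself is declared, sizeof sees the whole buffer. */
size_t size_seen_in_scope(void)
{
    char str1[16] = "Have a very ";
    return sizeof(str1);              /* 16 */
}
```

This is why functions that write into a caller's buffer usually take the buffer size as a separate argument.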

Edited 5 Years Ago by Ancient Dragon: n/a

C language will let you shoot yourself in the foot and has been the cause of untold hours of debugging.

This is quite the reason that I want to get a handle on this now, in the beginning!

I suppose the best practice is to allocate all new memory for the destination buffer. This may introduce memory leaks if the calling function fails to deallocate the pointer.

I do agree that is one way, if we are using pointers to character arrays. However, at this time I am working with the character arrays themselves, so I believe that to be unnecessary. I do need to learn to use pointers to the arrays, and I will get to that. However, in the scope of what I am learning now (character arrays), is there no safe way to concatenate two strings without introducing a security vulnerability?
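One common approach, sketched here under assumptions, is to pass the size of the destination explicitly and let snprintf do the bounded write. The helper name append_checked and its return convention are hypothetical, not a standard function.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: appends src to dest without writing more than
   dest_size bytes in total. Returns 0 if everything fit, -1 if the
   result was truncated or dest was not terminated to begin with. */
int append_checked(char *dest, size_t dest_size, const char *src)
{
    size_t used = 0;

    /* Find the current end of dest without reading past the buffer. */
    while (used < dest_size && dest[used] != '\0')
        used++;
    if (used == dest_size)
        return -1;                    /* no terminator found, leave dest alone */

    /* snprintf writes at most (dest_size - used) bytes including '\0'
       and returns the length it would have needed. */
    int needed = snprintf(dest + used, dest_size - used, "%s", src);
    if (needed < 0 || (size_t)needed >= dest_size - used)
        return -1;                    /* truncated, but still terminated */
    return 0;
}
```

With a character array in scope, the call would be append_checked(str1, sizeof(str1), str2), and the return value tells the caller whether anything was cut off.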

That is a partial answer. The function strncat does in fact add a null character to the end if str1 did not overflow. If str1 did overflow, then the null character is not added.

Again, you are wrong. The NULL is added, just not where you want it to be.

As I implied, write a test program:

#include <stdio.h> 
#include <string.h> 
int main() 
{ 
    int so, sl, n;
    char  str1a[20] = "abcdefghij";
    char  str1b[20] = "abcdefghijklmn";
    char  str1c[20] = "abcdefghijklmnopq";
    char  str22[20] = "1234567890";

    strncat(str1a, str22, 7);    printf("<%s>\n", str1a);
    strncat(str1b, str22, 7);    printf("<%s>\n", str1b);
    strncat(str1c, str22, 7);    printf("<%s>\n", str1c);

    printf("12345678901234567890\n\n");

    printf("<%s>\n", str1a);
    printf("<%s>\n", str1b);
    printf("<%s>\n", str1c);

    return 0;
}

Now why did the last 3 printf calls display what they did? Think about it... ;)

Edited 5 Years Ago by WaltP: n/a

Comments
Thanks!

Again, you are wrong. The NULL is added, just not where you want it to be.

Indeed, I suspected that the NULL would be added anyway, but after the end of the character array. That might mean in the memory of something else.

Now why did the last 3 printf calls display what they did? Think about it... ;)

Nice test. I moved the 12345678901234567890 over one character so that I could count characters (I assume this is the reason it was there anyway):

✈demios:code$ ./a.out 
<abcdefghij1234567>
<abcdefghijklmn1234567>
<abcdefghijklmnopq1234567>
 12345678901234567890

<7>
<4567>
<abcdefghijklmnopq1234567>
✈demios:code$

That first 7 is what spilled past the end of str1b, and 4567 is what spilled past the end of str1c. It looks like the memory overflowed into the next character array, but in the opposite direction from what I had expected. As if str1a, str1b, and str1c are laid out in memory in reverse order, with the end of str1c coming right before the beginning of str1b, and the end of str1b coming right before the beginning of str1a. Would I see that result on other compilers as well? These results are on the Kubuntu box compiled with gcc. I'm fixing the Windows VM, so I cannot test in there at the moment.

Did I mention how much I appreciate you taking the time to teach me this stuff? I really do appreciate it. Thanks.

Indeed, I suspected that the NULL would be added anyway, but after the end of the character array. That might mean in the memory of something else.

Exactly

Nice test. I moved the 12345678901234567890 over one character so that I could count characters (I assume this is the reason it was there anyway):

Yes. That was my faux pas... (or in English- fough paugh)

That first 7 is what spilled past the end of str1b, and 4567 is what spilled past the end of str1c. It looks like the memory overflowed into the next character array, but in the opposite direction from what I had expected. As if str1a, str1b, and str1c are laid out in memory in reverse order, with the end of str1c coming right before the beginning of str1b, and the end of str1b coming right before the beginning of str1a.

Excellent synopsis.

Would I see that result on other compilers as well? These results are on the Kubuntu box compiled with gcc. I'm fixing the Windows VM, so I cannot test in there at the moment.

Should be similar. Maybe a compiler does not reverse the order of the variables in memory, so the result will be different, but still wrong. Maybe the program crashes because you overwrite some memory that's protected. Lots of different things can happen, none of them good.

Did I mention how much I appreciate you taking the time to teach me this stuff? I really do appreciate it. Thanks.

Welcome. It's actually a pleasure to help someone that thinks, then posts, rather than posting instead of thinking.

Next time you have a question like this "what will happen?", now you can write your own test program to find out. And don't rely on just an explanation (like you did with cprogramming.com), also check the descriptions based on the definition of the functions (like I did, a simple Google search) and make sure the explanation matches.

Edited 5 Years Ago by WaltP: n/a

Yes. That was my faux pax... (or in English- fough paugh)

Neither English nor French is my native tongue. Is this what you mean?
http://en.wikipedia.org/wiki/Faux_pas

Maybe a compiler does not reverse the order of the variables in memory, so the result will be different, but still wrong. Maybe the program crashes because you overwrite some memory that's protected. Lots of different things can happen, none of them good.

That's what I figured. I'm going to break lots of things testing this where I can :)

Next time you have a question like this "what will happen?", now you can write your own test program to find out. And don't rely on just an explanation (like you did with cprogramming.com), also check the descriptions based on the definition of the functions (like I did, a simple Google search) and make sure the explanation matches.

That's quite the problem: I did in fact try to write a test program but I did not know in which ways things would break, so I did not think to assign several character arrays and then to overflow from one to the other. I am confident, though, as my experience progresses I'll start to learn how to poke things to get the reactions that I can learn from.

Thanks!

By the way, regarding your .sig, I had once heard a similar definition of Congress. It goes something along the lines of Pro being the opposite of Con, therefore what would one call the opposite of Progress?

Neither English nor French is my native tongue. Is this what you mean?
http://en.wikipedia.org/wiki/Faux_pas

Yes


That's what I figured. I'm going to break lots of things testing this where I can :)

Watch out, guys! I've created a Monster! :)

By the way, regarding your .sig, I had once heard a similar definition of Congress. It goes something along the lines of Pro being the opposite of Con, therefore what would one call the opposite of Progress?

I believe I had that as a sig years ago. It's another one of my favorites. The way I've used it is
"If pro is the opposite of con, then what's the opposite of progress?"

Edited 5 Years Ago by WaltP: n/a
