main()
{
char *p="dcis";
while(*p++!='\0');
printf("%s",p);
}

Why does this print %s on my Unix box with gcc 4.6.1?

What were you hoping it would print?

If "dcis" is located at 0x1000, its characters occupy 0x1000 to 0x1003 and the null terminator is at 0x1004. The loop increments p once more after reading the terminator, so p ends up at 0x1005, PAST the character array. On my machine "%s" is stored in the next available memory, 0x1005, which is exactly where p now points, so that's what got printed. I don't know if it's guaranteed to happen that way or if it's undefined behavior, but that's what your program did when I ran it.

It totally depends on how the compiler organizes memory. In gcc the string literal "%s" is stored after the string literal "dcis" in memory, so overrunning the string prints "%s". You can test it like this:

#include <stdio.h>

int main(void)
{
    const char *before = "before";
    const char *p = "dcis";
    const char *after = "after";
    
    while (*p++ != '\0');
        
    printf("%s", p);
    
    return 0;
}

In gcc it'll print "after", but another compiler might print "before", or even garbage. Overrunning a string like that is undefined behavior.

Perhaps this will help explain what the layout has to do with the output. If you compile the code to asm using gcc -S file.c then you might see the following output:

.file   "file.c"
    .section    .rodata
.LC0:
    .string "dcis"
.LC1:
    .string "%s"
    .text

Notice in the read-only section of the file ( .rodata ) the string %s is located directly after dcis . Since your pointer is just iterating sequentially through memory it naturally hits the next location.

Did you actually mean this? It produces quite different results:

main()
{
char *p="dcis";
while(*p++!='\0')
printf("%s",p);
}

I am asking about these just to prepare for interviews.

But.

main()
{
char *p="dcis";
while(*p++!='\0')
printf("%s\n",p);
}

what about this..?

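That's fine. With the \n it prints "cis", "is" and "s" on separate lines, followed by an empty line (without the \n it would be "cisiss"). Now the printf() is inside the loop, so it only ever reads within the string and overrun doesn't happen.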

OK, but sorry, I forgot to put the ; at the end of the while loop. This is what I really meant:

main()
{
char *p="dcis";
while(*p++!='\0');
printf("%s\n",p);
}

Then why is this not printing %s?

Why would it print "%s"? You're asking it to print the characters that p points to. Where does p point to? Somewhere off the end of the character array, into memory that isn't yours and that you never set. Those characters could be anything. They could be "%s". They could be "fnighirw". They could be "don't try to print characters from memory that isn't yours". The program could segfault and crash.

Please tell me that you do understand that your while loop

while(*p++!='\0');

contains no code, and all you are doing is changing where p points so that it points after the text "dcis". Just so that I can be sure you understand that.

Yes, I understand. I am intentionally moving my pointer past the end of the string and checking the behavior. My question is why it printed %s in the case of printf ("%s", p); but nothing in the case of printf ("%s\n", p);, setting aside whether what I am accessing is in my address space or not.

Anyhow, I understand that this behavior is undefined. If anyone finds a valid reason for it (maybe something specific to gcc), please share it.

It's just how gcc turns it into object code.

Here's some of the assembler from the printf ("%s\n", p); version:

.section	.rodata
.LC0:
	.string	"dcis"
	.text
.globl main
	.type	main, @function

and here is some from the printf ("%s", p); version.

.section	.rodata
.LC0:
	.string	"dcis"
.LC1:
	.string	"%s"
	.text
.globl main

It's clear that in one version, the text %s is right after the dcis.

It's how the gcc compiler builds it. As to why it builds it like that... you'd have to ask the compiler writers, I suppose :) When I open up the hex of the first version, the string %s doesn't appear in it at all, which is interesting.

I think I understand why -- printf("%s\n", s) can be optimized to puts(s).

That said, this is EXACTLY why you shouldn't ever rely on undefined behavior -- because it does weird things you can't predict and is affected by apparently unrelated changes in ways you can't anticipate. Any attempt to make sense of what happens after UB has been invoked is an exercise in futility.

Having poked a little deeper into the assembler, Trentacle is exactly right - in the \n version, the call to printf is replaced with a call to puts.
