Member Avatar for zegarnek

Hi,

I do not think this is a problem as such, but it is interesting behaviour to watch.

I have a little piece of C (to do with trying out Unicode, Polish characters and text files later).

I was trying to find the position of a Polish character in the alphabet string. I had some problems,
so I devised this simple temporary position display.

See the comment in the code starting: // observe the lines below. There I have three simple printf() calls that print two
lines acting as a ruler, with the alphabet string underneath:

000000000111 and so on up to 7
123456789012 and so on up to 0 - so the ruler runs to the 70th position
AaĄąCcĆćDdEe and the rest of the Polish alphabet

So the lines must be one under the other. The first might be: printf("000000000111\n"); (note the \n),
then printf("123456789012\n"); (again with the \n) and printf("AaĄąCcĆćDdEe\n");

All was fine until I wanted to pause execution just after the first printf() to see
what was printed. I put a getchar(); right after the printf() and pressed just ENTER.
I get an empty line like this:

000000000111
_
123456789012
AaĄąCcĆćDdEe

etc.

The getchar() function (the cursor) waits ON THE new line, and the ruler is split by it as shown above.
So, to test the first printf() and still have the lines stack properly,
I have to remove the \n from that printf() call. Then getchar() waits
at the end of the line just printed, grabs the ENTER key (which acts like the \n in the printf()) and the rest prints OK.
I just found it interesting and a little confusing, BUT NORMAL behaviour.

=====================

    /*
      This routine will operate within
      the Unicode area of characters, especially
      the Polish 'ogonki', so it requires
      the wide-character headers
      and wide-character functions. At the moment
    */

    #include <wchar.h>
    #include <locale.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>


    int main(void)
    {
        // not used yet: kept for the later text-file work
        FILE *fp_in, *fp_out;
        int c;

        // the lines below create a kind of measuring ruler
        // as standard (narrow) strings; a string literal is already
        // NUL-terminated, so no explicit \0 is needed
        const char *top_ndx = "0000000001111111111222222222233333333334444444444555555555566666666667";
        const char *bot_ndx = "1234567890123456789012345678901234567890123456789012345678901234567890";

        // as above but as wide-character strings
        const wchar_t *topw_ndx = L"0000000001111111111222222222233333333334444444444555555555566666666667";
        const wchar_t *botw_ndx = L"1234567890123456789012345678901234567890123456789012345678901234567890";
        const wchar_t *pl_alfa_wd = L"AaĄąCcĆćDdEeĘęFeGgHhIiJjKkLlłŁMmNnOoÓóPpQqRrSsŚśTtUuWwXxYyZzŹźŻż<#>.,";

        // set the locale for Polish UTF-8; %ls needs it to convert
        // the non-ASCII letters for output
        if (setlocale(LC_ALL, "pl_PL.utf8") == NULL)
            fprintf(stderr, "warning: locale pl_PL.utf8 is not available\n");

        // getchar();
        // observe the lines below printf() and getchar()
        printf("STD CHARS - \n%s", top_ndx);   // no \n: the echoed ENTER ends this line
        fflush(stdout);                         // show the ruler before waiting
        getchar();
        printf("%s\n", bot_ndx);
        // getchar();
        printf("%ls\n", pl_alfa_wd);


        printf("\nNOW LONG STR CHARS LLL\n\n");
        printf("%ls\n", topw_ndx);
        printf("%ls\n", botw_ndx);
        printf("%ls\n", pl_alfa_wd);

        exit(0);
    }


=====================

I just found it as an interesting a little confusing BUT NORMAL behaviour.

Yes, it's normal. I can also see it being confusing because there are three issues involved:

  1. The console shell will likely be in a state that echoes any input.
  2. Standard input functions use whatever the shell state is. There's no standard way to take raw input or input with no echo to the exclusion of the shell state.
  3. Due to the way the shell works, input is line oriented and requires a newline to send data from the shell to the running program.

The result is that if you print a newline, then take input that requires a newline to send to the program, that newline will be echoed and you'll have two adjacent newlines (i.e. a blank line in the output).

Such is the way when your input source and output destination are one and the same.

Member Avatar for zegarnek

Hi,

Thanks for the response. I thought it might be useful in case someone like me (me_new_be) ends up confused and puzzled. Fiddling with raw input is not really worth it in this case, since it is for testing only.

I am playing with Unicode text files (trying to break down an open door :-) ). It is another quite confusing and not that easy subject in C (maybe C++ is easier). So far I have failed to find a way to identify the character set of a 'simple' text file - I did find out that there are no longer any simple text files.
ps.
Like your signature ...
