Ok. As the title says, I have tried many times to write a simple program that rewrites a file to remove null characters. I confirmed with a hex editor that the file in question has tons of null characters; on average about 1 of every 2 characters is null. So here is my last attempt:

#include <stdlib.h>
#include <stdio.h>

int main()
{
    FILE *f = fopen("main2.c","r");
    FILE *t = fopen("temp","w");
    int c;
    int count = 0;
    while((c = fgetc(f))!=EOF)
    {
        if(c)
        {
            fputc(c,t);
        }
        else
        {
            printf("null found\n");
        }
    }
    fclose(f);
    fclose(t);
    FILE *n = fopen("main2.c","w");
    FILE *w = fopen("temp","r");
    while((c=fgetc(w))!=EOF)
    {
        fputc(c,n);
    }
    fclose(n);
    fclose(w);
}

I thought that, if nothing else, this should work perfectly. Instead it spits out a file full of Chinese characters.

Sorry for being so stupid, but I'm feeling impatient right now. I would like to know what I did wrong and to see a correct example, maybe a more concise and faster one. Thanks.

Lines 27 - 30 can be replaced by deleting the original file and then renaming the temp file. It's not necessary to rewrite the data again.

As for your specific question, can't answer that unless I have a copy of the file you are trying to work with.

on average about 1 of every 2 characters

Sounds like it's a UNICODE file, not a text file.

Last time I tried to have 2 streams (read and write) open on the same file, all I got was an empty file as a result. The temp file is supposed to be a workaround. I also had trouble trying to store the data in an array, so I didn't want to go that route.

My specific question is how do I write a C program that will edit a file to remove specific characters. The file I'm trying to edit is for the most part irrelevant to the issue. If I really needed to, I could screw around with the hex editor or simply rewrite it so it wasn't an issue. This just seemed like a nice exercise that turned frustrating. Also, I need to understand why the characters became corrupted in the first place.

You do indeed need two files, one for read and temp for writing as you have done in the first part of your program. What you don't need is to rewrite temp back into the original file like you did at the end of the program. Just delete the original file and rename temp to the name of the original.
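The delete-and-rename step described above could look roughly like this. It is only a sketch: the function name `replace_with_temp` is made up for illustration, and it assumes the temp file has already been written and closed.

```c
#include <stdio.h>

/* Replace `original` with the already written and closed `temp` file.
   Returns 0 on success, -1 on failure.
   `replace_with_temp` is a hypothetical helper name, not a library call. */
int replace_with_temp(const char *original, const char *temp)
{
    /* Delete the original first: on some platforms (Windows, for one)
       rename() fails when the target name already exists. */
    if (remove(original) != 0)
        return -1;

    /* Give the temp file the original's name. No data is copied again. */
    if (rename(temp, original) != 0)
        return -1;

    return 0;
}
```

With the program from the first post, calling `replace_with_temp("main2.c", "temp")` after the first two `fclose` calls would take the place of the second copy loop.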

OK, here is a portion of the hex dump:

FF FE 23 00 69 00 6E 00 63 00 6C 00 75 00 64 00 65 00 20 00 3C 00 73 00 74 00 64 00 6C 00 69 00 62 00 2E 00 68 00 3E 00 0D 00 0A 00 23 00 69 00 6E 00 63 00 6C 00 75 00 64 00 65 00 20 00 3C 00 61 00 6C 00 6C 00 65 00 67 00 72 00 6F 00 2E 00 68 00 3E 00 0D 00 0A 00 23 00 69 00 6E 00 63 00 6C 00 75 00 64 00 65 00 20 00 22 00 6D 00 6F 00

Just as I suspected, that is not a plain ASCII text file. The first two bytes (FF FE) are the byte order mark of UTF-16 little-endian, a UNICODE encoding in which each character occupies two or four bytes. All you have to do is fread it back in as UNICODE strings.
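One way to read the file back in as UNICODE rather than as single bytes is to decode the 16-bit units and write them out as UTF-8, which is plain ASCII wherever the input only used ASCII characters. The sketch below assumes the input is UTF-16 little-endian, as the FF FE at the start of the hex dump suggests; the function names are made up for illustration, and both streams are assumed to be opened in binary mode.

```c
#include <stdio.h>

/* Write one Unicode code point to `out` as UTF-8. */
static void put_utf8(unsigned long cp, FILE *out)
{
    if (cp < 0x80) {
        fputc((int)cp, out);
    } else if (cp < 0x800) {
        fputc((int)(0xC0 | (cp >> 6)), out);
        fputc((int)(0x80 | (cp & 0x3F)), out);
    } else if (cp < 0x10000) {
        fputc((int)(0xE0 | (cp >> 12)), out);
        fputc((int)(0x80 | ((cp >> 6) & 0x3F)), out);
        fputc((int)(0x80 | (cp & 0x3F)), out);
    } else {
        fputc((int)(0xF0 | (cp >> 18)), out);
        fputc((int)(0x80 | ((cp >> 12) & 0x3F)), out);
        fputc((int)(0x80 | ((cp >> 6) & 0x3F)), out);
        fputc((int)(0x80 | (cp & 0x3F)), out);
    }
}

/* Read one little-endian 16-bit unit.
   Returns -1 at end of file or when only a single trailing byte is left. */
static long read_unit(FILE *in)
{
    int lo = fgetc(in);
    int hi = fgetc(in);
    if (lo == EOF || hi == EOF)
        return -1;
    return (long)lo | ((long)hi << 8);
}

/* Convert a UTF-16LE stream to UTF-8, dropping a leading byte order mark.
   Returns 0 on success, -1 if a surrogate pair is malformed.
   (Hypothetical helper; both streams must be in binary mode.) */
int utf16le_to_utf8(FILE *in, FILE *out)
{
    long u = read_unit(in);
    if (u == 0xFEFF)                 /* bytes FF FE on disk: skip the BOM */
        u = read_unit(in);

    while (u >= 0) {
        unsigned long cp = (unsigned long)u;
        if (u >= 0xD800 && u <= 0xDBFF) {        /* high surrogate */
            long low = read_unit(in);
            if (low < 0xDC00 || low > 0xDFFF)
                return -1;
            cp = 0x10000UL + (((unsigned long)u - 0xD800) << 10)
                           + ((unsigned long)low - 0xDC00);
        } else if (u >= 0xDC00 && u <= 0xDFFF) { /* stray low surrogate */
            return -1;
        }
        put_utf8(cp, out);
        u = read_unit(in);
    }
    return 0;
}
```

Unlike simply dropping zero bytes, this also handles characters outside the ASCII range, at the cost of a little more code.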

Did you try binary mode?
http://www.cplusplus.com/reference/clibrary/cstdio/fread/
It should work very well if you work with bytes.
For example: read the whole file into a buffer, then run through all the bytes and, if a byte is not 0, copy it to the result buffer.
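The buffer approach could be sketched as follows. The helper names are made up for illustration. One detail is an addition to the suggestion above: the two bytes FF FE at the start of the hex dump are not zero, so a plain zero-byte filter would keep them, and an editor may then still treat the result as UTF-16, which is a plausible source of the Chinese characters. The sketch therefore drops a leading FF FE as well. Stripping zero bytes is only safe when every character in the file is in the ASCII range.

```c
#include <stdio.h>
#include <stdlib.h>

/* Copy the non-zero bytes of src[0..n) into dst and return how many were kept.
   A leading FF FE byte order mark is dropped too.
   dst may be the same buffer as src, since output never overtakes input. */
size_t strip_zero_bytes(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t i = 0, kept = 0;

    if (n >= 2 && src[0] == 0xFF && src[1] == 0xFE)
        i = 2;                       /* skip the UTF-16LE byte order mark */

    for (; i < n; i++) {
        if (src[i] != 0)
            dst[kept++] = src[i];
    }
    return kept;
}

/* Read the whole file at `path` in binary mode into a malloc'd buffer.
   Stores the size in *len; the caller frees the buffer.
   Returns NULL on failure. */
unsigned char *read_whole_file(const char *path, size_t *len)
{
    FILE *f = fopen(path, "rb");     /* "rb": no newline or EOF translation */
    if (f == NULL)
        return NULL;

    size_t cap = 4096, used = 0, got;
    unsigned char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(f);
        return NULL;
    }

    while ((got = fread(buf + used, 1, cap - used, f)) > 0) {
        used += got;
        if (used == cap) {           /* buffer full: double it */
            unsigned char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                fclose(f);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }

    int failed = ferror(f);
    fclose(f);
    if (failed) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

After filtering, the kept bytes can be written out with a single `fwrite` to a file opened with "wb".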

The whole purpose of this was to reformat the text file so my gcc compiler wouldn't spit out a million warnings. Unless you can specify the text encoding somehow?

Also, thread solved.
