The other day, I got an idea for a compression program, and decided to write up a function that compresses a file into "filename.compressed".

The compression function works fine, but I get a nasty assertion failure when main() returns, after the file has been compressed. The assertion is on line 1017 of dbgheap.c, and the expression being asserted is _BLOCK_TYPE_IS_VALID(pHead->nBlockUse).

Here's my code:

#include <stdio.h>
#include <malloc.h>
#include <string.h>
#include <deque>
using namespace std;
void addstrings(char **first,char *second) {
	int size = strlen(*first)+strlen(second);
	char *temp = new char[size+1];
	strcpy(temp,*first);
	strcpy(temp+strlen(*first),second);
	delete [] *first;
	*first=temp;
}

int compress(char *file,int level,int *additional=NULL) {
	int ret=0;
	FILE *thefile=fopen(file,"r");
	if (thefile) {
		int filesize=0;
		while (fgetc(thefile)!=EOF) filesize++;
		if (level<=0 || filesize%level) { // also rejects level 0, which would divide by zero
			ret=2;
			if (additional) *additional=filesize;
		} else {
			printf("Size of file: %d\n",filesize);
			rewind(thefile);
			char *temp = new char[filesize];
			fread(temp,filesize,1,thefile);
			int temp2;
			int temp3;
			deque<char*> table;
			char *tempstring=NULL;
			bool found=false;
			
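			for (temp2=0;temp2<filesize;temp2+=level) {
				found=false; // reset for every chunk, otherwise the first match blocks all later inserts
				tempstring = new char[level+1];
				for (temp3=0;temp3<level;temp3++)
					tempstring[temp3]=temp[temp2+temp3];
				tempstring[temp3]=0;
				for (temp3=0;temp3<table.size();temp3++)
					if (strcmp(table[temp3],tempstring)==0) {
						found=true;
						break;
					}
				if (!found) table.push_back(tempstring);
				else delete [] tempstring; // duplicate chunk: the table already holds its own copy
			}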
			char *tempfilestring = new char[strlen(file)+1];
			strcpy(tempfilestring,file);
			addstrings(&file,".compressed");
			fclose(thefile);
			thefile = fopen(file,"wb+");
			if (thefile) {
				fprintf(thefile,"%d\n",(int)table.size()); // size() is unsigned, so cast to match %d
				for (temp2=0;temp2<table.size();temp2++) 
					fprintf(thefile,"%d%s\n",temp2,table[temp2]);
				for (temp2=0;temp2<filesize;temp2+=level) {
					tempstring = new char[level+1];
					for (temp3=0;temp3<level;temp3++)
						tempstring[temp3]=temp[temp2+temp3];
					tempstring[temp3]=0;
					for (temp3=0;temp3<table.size();temp3++)
						if (strcmp(tempstring,table[temp3])==0) {
							fprintf(thefile,"%d",temp3);
							break;
						}
					delete [] tempstring; // scratch copy is only needed for the lookup
				}
				fclose(thefile);
			}
			else ret = 3;
			delete [] temp;
			for (temp2=0;temp2<table.size();temp2++)
				delete [] table[temp2];
			table.clear();
		}
	} else ret = 1;
	return ret;
}
int main(void) {
	char *buffer = new char[256];
	int compression = 0;
	int result=0;
	while (compression<256) buffer[compression++]=0;
	printf("What file?\n");
	fgets(buffer,255,stdin);
	buffer[strlen(buffer)-1]=0;
	printf("Compression level?\n");
	scanf("%d",&compression);
	switch (compress(buffer,compression,&result)) {//Get file and level of compression from above, and call compress function
	case 0: printf("File compressed.\n");break;
	case 1: printf("Couldn't open file.\n");break;
	case 2: printf("Invalid compression level. Level must divide evenly into file size. Filesize was %d\n",result);break;
	case 3: printf("Error creating new file to put compressed information in. File name was %s.compressed.",buffer);break;
	}
	delete [] buffer;
	return 0;
}

I started out learning C, so if the code's a bit C-like, it's because I'm stuck with my old C ways.

Note that I've yet to write a decompression routine, but the compression routine seems to compress a .txt with "hello" in it to a file 2 characters larger, although the "hello" portion seems to be garbled.

The compression is pretty simple. It loads the file into memory, splits it into strings of length 'level', and checks whether each string differs from all the strings found before it; if so, it adds that string to a table. When it's done, it will have found all the unique strings, and it then writes them to the compressed file in this fashion:

0firststring
1secondstring
2thirdstring
Then, after the table, it writes the number corresponding to each string. So this in the compressed file:

012012012

will be this in the uncompressed file:
firststringsecondstringthirdstringfirststringsecondstringthirdstringfirststringsecondstringthirdstring

And I thought of it myself. :)
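
The scheme described above can be sketched roughly as follows. This is only a minimal illustration, not the posted code: the names Encoded and encode are made up for the example, it works on an in-memory std::string rather than a file, and it assumes the data length is a multiple of level.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch (not part of the posted code).
struct Encoded {
    std::vector<std::string> table;    // unique chunks, in order of first appearance
    std::vector<std::size_t> indices;  // one table index per chunk of the input
};

// Splits data into chunks of `level` characters, collects the unique chunks
// in a table, and records which table entry each chunk maps to.
// Assumption: data.size() is a multiple of level; a level of 0 yields an empty result.
Encoded encode(const std::string &data, std::size_t level) {
    Encoded out;
    if (level == 0) return out;
    for (std::size_t pos = 0; pos + level <= data.size(); pos += level) {
        std::string chunk = data.substr(pos, level);
        // Linear search for an identical chunk that was seen earlier.
        std::size_t i = 0;
        while (i < out.table.size() && out.table[i] != chunk) ++i;
        if (i == out.table.size()) out.table.push_back(chunk);  // first occurrence
        out.indices.push_back(i);
    }
    return out;
}
```

One thing the sketch makes visible: if the indices are written back to back as plain decimal digits, as in 012012012, they can no longer be told apart once the table has more than ten entries, so a separator or a fixed-width field would probably be needed.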

It has a few bugs. The three main ones for now: it has trouble loading the file into memory, so it's compressing garbage; it hits an assertion failure at the end; and the compression level must divide evenly into the file size.

So can anyone help?

(edit) Replaced first code with current code. Still gives an assertion failure after the return statement in main().

All 2 Replies

Hmm, I just noticed that I forgot to load the file into memory, and to rewind() the file after I found how big it was.

I fixed that, and now it seems to get the unique strings. All that's left is to fix some other stuff.

Still has an assertion failure, though.

(edit) Pretty much fixed everything in the compression code, except for the assertion failure. I haven't been able to find anything that causes that, and when I run my debugger on it, it happens after the return statement on main().

I haven't looked through your code yet. Maybe you could post your latest version, since you've obviously updated it with some fixes since you first posted it. The assertion failure you're seeing is often caused by accessing memory that doesn't belong to you while the program is executing. This can show up in many different ways: sometimes as a crash or some other observable behaviour during execution, but sometimes it goes unnoticed until the program terminates, just as you describe.

If you can post your code it might be possible to spot something. Maybe check that you don't write outside array bounds anywhere or abuse your pointers.
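
One candidate worth checking in the code as posted: compress() receives main()'s buffer pointer by value as file, and addstrings(&file,".compressed") runs delete [] *first on it. main() never sees the replacement pointer, so its later delete [] buffer frees the same block a second time, which is the kind of heap misuse this debug assertion tends to report. Below is a minimal sketch of one way around it, assuming a hypothetical helper called compressed_name (not in the original code) that builds the new name without touching the caller's buffer:

```cpp
#include <string>

// Hypothetical helper (not in the original post): builds "<file>.compressed"
// in a fresh std::string and leaves the caller's buffer alone, so the caller
// stays the only owner of that buffer and frees it exactly once.
std::string compressed_name(const char *file) {
    return std::string(file) + ".compressed";
}
```

Inside compress(), the output file could then be opened with something like fopen(compressed_name(file).c_str(),"wb"), and main() keeps its single delete [] buffer.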
