Need some help on text file compression....
Text File Compression.

The text file compression must be able to perform the following functions:

• Able to compress a text file and generate an index file
• Able to decompress the file
• Able to show the time taken for compressing and decompressing
• Able to show the file size before and after compression


Task:
1) Replace repeated sentences with an index number
2) Decompress into the original text file
3) Use a class
4) Replace repeated words with an index number
5) Use any 2 concepts of OOD (e.g. inheritance, polymorphism)
6) Comment methods and variables appropriately
7) Generate an index file that stores each index number and its corresponding word/sentence
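
As a starting point only, the word-replacement part of the task (items 2, 4 and 7) could be sketched roughly as below. This is a minimal sketch in Python, not a full answer: the function names `compress` and `decompress` are made up for illustration, every piece of text is indexed (not only the repeated ones) to keep the code short, and the class structure, sentence handling, and file I/O are left out.

```python
import re

def compress(text):
    """Return (index, tokens): index is a list of unique pieces,
    tokens is the text rewritten as positions in that list."""
    # Split into words and the runs of non-word characters between them,
    # keeping both so the original text can be rebuilt exactly.
    pieces = re.findall(r"\w+|\W+", text)
    index = []    # index number -> piece (what an index file would store)
    lookup = {}   # piece -> index number, for fast repeat detection
    tokens = []
    for piece in pieces:
        if piece not in lookup:
            lookup[piece] = len(index)
            index.append(piece)
        tokens.append(lookup[piece])
    return index, tokens

def decompress(index, tokens):
    """Rebuild the original text by looking each token up in the index."""
    return "".join(index[t] for t in tokens)

text = "the cat and the dog and the bird\n"
index, tokens = compress(text)
assert decompress(index, tokens) == text
```

Timing could be measured around these calls with `time.perf_counter()`, and file sizes read with `os.path.getsize()` once the index and tokens are written to disk.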

All 9 Replies

Can you give me a sample? Thanks. Or help me with it... I will learn it myself, or you can write comments in the code so I can learn from it.

You know, I've been wondering about this for awhile now myself.

Honestly you can probably get away with making some kind of regex or key to "compress" files with given values.

For example, let's say you have a text document and it is set up something like this--

// myWord.txt

ssssss      
ss

-- what is listed is 6 s's, then 6 space characters, then a newline character and then 2 s's. After those s's, the end-of-file character is present (not in the above example, but pretend it is there for the sake of this example @_@ ).

How would you compress this? Right now the data in this file comes to (6 * 1 + 6 * 1 + 1 + 2 * 1 + (1 or 0)), or 15-16 bytes (one more if the newline is stored as two characters), yet the file can occupy up to 4096 bytes (4 KB) on disk, because many file systems allocate space in fixed-size blocks. If you want to compress this file to take up less space, you can form a special key that detects multiple values of a particular character and places them in an index.

For example, there are 6 s's. You can make a .compress file that holds pairs of numbers (like 1 and 6), where the first number is an index and the second is a repeat count. A parser that you write then takes the index file of the same name as the .compress file and maps each index number to its corresponding character in the index file.

Your index file might look something like this--

s//[s char]
// [newline char] //[space char]s//[s char]

--and if you have a smart parser, you can do repetition-detection for a particular character such that you don't need to index the same character more than once. Your index file would then look like--

s//[s char]
// [newline char] //[space char]

--the job of your compress file is simply to match a number (or a key) with a value. For a given position of the cursor, the appropriate number of copies of that value is appended, in order, starting from the cursor's location.
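
The repetition idea described above is essentially run-length encoding. Here is a minimal sketch of it in Python, assuming a simplified layout: each run is stored as a (character, count) pair directly, without the separate index file, and the names `rle_encode` and `rle_decode` are only illustrative.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Expand each (char, count) pair back into a run of characters."""
    return "".join(ch * count for ch, count in pairs)

# The myWord.txt example: 6 s's, 6 spaces, a newline, then 2 s's.
sample = "ssssss      \nss"
pairs = rle_encode(sample)
print(pairs)  # [('s', 6), (' ', 6), ('\n', 1), ('s', 2)]
assert rle_decode(pairs) == sample
```

Swapping the character in each pair for its position in an index file would give the number-pair format described in the post.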

The only down-side to this technique is that if you have too many individual data values in your file (that are non-recurring or not consecutive), you may face an expansion instead of a compression when you try to compress the file.

This is just a theory, but hopefully it helps.

-Alex

In the case that you only had to use ASCII values 32-126, you could then share a single byte for two values. You could even use a truth table that points out the positions of a nibble that is really its opposite extended version 128-255.

Oh, thanks guys... but I still can't do it. I need some starting code. As you know, I'm still a beginner. Hope you can help me, thanks!!

commented: Stop begging and start coding -2

You are NOT going to get a code solution. Either give up trying, or post what you think you should be doing in words, pseudo-code or some other way of showing what you think you have to do. Then we can help you from there.

@everyone else
Hmm, this makes interesting reading, thanks.

Chris

If I knew how to do it, do you think I would post here? Just give me a starting point... I really don't know how to do it, which is why I posted here.

Tell me, can you even write a program to copy a file one character at a time?

Or write a program which can, say, count the number of 'a' characters in a file?

Or even produce a histogram showing the distribution of letters in a file?
See, having done this, you're half-way there to fulfilling one of your program requirements.
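
To give an idea of the scale of that exercise, a letter histogram could be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `letter_histogram` and `print_histogram` are made up, the text is passed in as a string where a real program would read it from a file, and upper and lower case are counted together.

```python
from collections import Counter

def letter_histogram(text):
    """Count how often each letter occurs, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

def print_histogram(counts):
    """Print one row per letter, with one star per occurrence."""
    for letter in sorted(counts):
        print(f"{letter} {'*' * counts[letter]}")

counts = letter_histogram("Banana")
print_histogram(counts)
```

The same counts show which characters or words repeat most often, which is the information a compressor needs when deciding what to put in its index.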

Because the same old plz help, gimme the codez refrain isn't doing you any favours at all.

a) we're not going to give you the complete answer
b) we're not going to give you much help until YOU post some code. When you post code, we know where to start with the explanations.

commented: well said! +1