Need sample code for Text File Compression!!!
Need some help on text file compression....
Text File Compression.
The text file compression program must be able to perform the following functions:
• Compress a text file and generate an index file
• Decompress the file
• Show the time taken for compressing and decompressing
• Show the file size before and after compression
Task:
1) Replace repeated sentences with an index number
2) Decompress into the original text file
3) Use classes
4) Replace repeated words with an index number
5) Use any 2 concepts of OOD (e.g. inheritance, polymorphism)
6) Comment methods and variables appropriately
7) Generate an index file that stores each index number and its corresponding word/sentence
Last edited by SlayerX; Nov 26th, 2008 at 4:24 am.
911: http://www.daniweb.com/forums/announcement8-2.html
Sample code needed? Google is your friend!..
You know, I've been wondering about this for awhile now myself.
Honestly you can probably get away with making some kind of regex or key to "compress" files with given values.
For example, let's say you have a text document and it is set up something like this--
// myWord.txt
ssssss      
ss
-- what is listed is 6 s's, then 6 space characters, then a newline character and then 2 s's. After those s's, the end-of-file character is present (not in the above example, but pretend it is there for the sake of this example @_@ ).
How would you compress this? Right now the file holds 6 + 6 + 1 + 2 = 15 bytes of data (16 if the newline is stored as CR+LF), but on disk it can occupy a full allocation block, often 4096 bytes (4 KB), because many file systems hand out space in fixed-size blocks. If you want to compress this file to take up less space, you can form a special key that detects runs of a particular character and places them in an index.
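Counting the bytes of the sample as described (6 s's, 6 spaces, one newline, 2 s's) and rounding up to a whole block can be sketched as below. This is only an illustration: `rawSize` and `sizeOnDisk` are made-up names, the text is assumed to be plain ASCII with a single-byte newline, and the 4096-byte block size is an assumption that varies between file systems.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to store the text as-is, one byte per character
// (plain ASCII assumed).
std::size_t rawSize(const std::string& text) {
    return text.size();
}

// Space taken on disk if the file system hands out whole blocks.
// blockSize = 4096 is an assumption; real file systems differ, and some
// store very small files without using a full block.
std::size_t sizeOnDisk(std::size_t bytes, std::size_t blockSize = 4096) {
    if (bytes == 0) return 0;
    return ((bytes + blockSize - 1) / blockSize) * blockSize;
}
```

Calling `rawSize` on the sample text gives the data size, and `sizeOnDisk` on that result gives the rounded-up figure, which is why a tiny file rarely looks any smaller on disk after compression.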
For example, there are 6 s's. You can make a .compress file that holds pairs of numbers (like 1 and 6). A parser that you write reads each pair, opens the index file with the same name as the .compress file, and maps the numbers to their corresponding characters in the index file, so 1 and 6 would mean index entry 1, repeated 6 times.
Your index file might look something like this--
s//[s char]
 // [newline char]
 //[space char]
s//[s char]
--and if you have a smart parser, you can do repetition-detection for a particular character such that you don't need to index the same character more than once. Your index file would then look like--
s//[s char]
 // [newline char]
 //[space char]
--the job of your compress file is simply to match a number (or a key) with a value; at the cursor's current position, the decompressor appends that value the given number of times and moves on.
The only down-side to this technique is that if you have too many individual data values in your file (that are non-recurring or not consecutive), you may face an expansion instead of a compression when you try to compress the file.
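One way to read the scheme above is as run-length encoding with a character index. The sketch below is an interpretation, not a tested design: the `Compressed` struct and the names `compressRuns` and `decompressRuns` are invented for illustration, index positions are assumed to start at 1, and everything is kept in memory rather than written to .compress and index files.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// The index holds each distinct character once, in order of first
// appearance. Each run is a (1-based index position, repeat count) pair.
struct Compressed {
    std::string index;
    std::vector<std::pair<std::size_t, std::size_t>> runs;
};

Compressed compressRuns(const std::string& text) {
    Compressed out;
    std::size_t i = 0;
    while (i < text.size()) {
        const char c = text[i];
        // Measure how long the current run of identical characters is.
        std::size_t count = 1;
        while (i + count < text.size() && text[i + count] == c) ++count;

        // Reuse the index entry if this character was seen before.
        std::size_t pos = out.index.find(c);
        if (pos == std::string::npos) {
            out.index.push_back(c);
            pos = out.index.size() - 1;
        }
        out.runs.emplace_back(pos + 1, count);
        i += count;
    }
    return out;
}

std::string decompressRuns(const Compressed& in) {
    std::string text;
    for (const auto& run : in.runs) {
        // Append the indexed character 'count' times.
        text.append(run.second, in.index[run.first - 1]);
    }
    return text;
}
```

For the myWord.txt sample this gives an index of three characters (s, space, newline) and the pairs (1,6), (2,6), (3,1), (1,2). A text with no repeats, such as "abc", produces one pair per character, which is the expansion case.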
This is just a theory, but hopefully it helps.
-Alex
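If you only had to use ASCII values 32-126, you could pack characters more tightly than one per byte. Those 95 values fit in 7 bits, so 8 characters fit in 7 bytes; two values can share a single byte only if the text uses at most 16 distinct symbols (one nibble each). You could even use a truth table that marks the positions where a value really stands for its extended counterpart in the 128-255 range.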
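Since printable ASCII (32-126) is 95 values and fits in 7 bits, one way to act on this idea is to pack 8 characters into 7 bytes. The following is a rough sketch under that assumption; `pack7` and `unpack7` are hypothetical names, the original character count has to be stored separately, and any character above 127 would be silently truncated.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Pack 7-bit characters back to back, most significant bit first.
std::vector<std::uint8_t> pack7(const std::string& text) {
    std::vector<std::uint8_t> out;
    std::uint32_t buffer = 0;  // bits waiting to be written
    int bits = 0;              // how many low bits of buffer are pending
    for (unsigned char c : text) {
        buffer = (buffer << 7) | (c & 0x7Fu);
        bits += 7;
        while (bits >= 8) {
            out.push_back(static_cast<std::uint8_t>((buffer >> (bits - 8)) & 0xFFu));
            bits -= 8;
        }
    }
    if (bits > 0) {
        // Pad the last byte with zero bits on the right.
        out.push_back(static_cast<std::uint8_t>((buffer << (8 - bits)) & 0xFFu));
    }
    return out;
}

// Reverse of pack7. 'count' is the number of characters originally packed.
std::string unpack7(const std::vector<std::uint8_t>& packed, std::size_t count) {
    std::string text;
    std::uint32_t buffer = 0;
    int bits = 0;
    for (std::uint8_t byte : packed) {
        buffer = (buffer << 8) | byte;
        bits += 8;
        while (bits >= 7 && text.size() < count) {
            text.push_back(static_cast<char>((buffer >> (bits - 7)) & 0x7Fu));
            bits -= 7;
        }
    }
    return text;
}
```

The saving is fixed at one bit in eight (12.5%) no matter what the text contains, so this is more of a complement to the index approach than a replacement for it.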
"In any case, I am convinced that the Old One does not play dice."
"I became very sensitive to what will happen to all this and all of us." -Two geniuses named Albert
Oh, thanks guys, but I still can't do it. I need some starter code; as you know, I'm still a beginner. Hope you can help me, thanks!!
@everyone else
Hmm this makes interesting reading thanks
Chris
Knowledge is power -- But experience is everything
Tell me, can you even write a program to copy a file one character at a time?
Or write a program which can say count the number of 'a' characters in a file?
Or even produce a histogram showing the distribution of letters in a file?
See, having done this, you're half-way there to fulfilling one of your program requirements.
Because the same old plz help, gimme the codez refrain isn't doing you any favours at all.
a) we're not going to give you the complete answer
b) we're not going to give you much help until YOU post some code. When you post code, we know where to start with the explanations.