How about using BufferedOutputStream? You could allocate a byte array of whatever size you want and then write it out through a BufferedOutputStream one chunk at a time. For example, if your byte array is 4 KB, you would write out 4 KB each time instead of 1 character at a time.
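A minimal sketch of that idea, assuming plain byte streams. The class name `ChunkedCopy` and the method `copy` are made up for illustration, and the in-memory streams in `main` merely stand in for real files:

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class ChunkedCopy {
    // Copies everything from in to out, bufferSize bytes at a time.
    // Returns the total number of bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        try {
            BufferedOutputStream buffered = new BufferedOutputStream(out, bufferSize);
            while ((n = in.read(buffer)) != -1) {
                // Only the first n bytes of the array hold fresh data.
                buffered.write(buffer, 0, n);
                total += n;
            }
            buffered.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink, 4 * 1024);
        System.out.println(copied + " bytes copied");
    }
}
```

With real files you would pass a FileInputStream and a FileOutputStream instead of the in-memory streams.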
Taywin
Posting Virtuoso
1,727 posts since Apr 2010
Reputation Points: 229
Solved Threads: 239
I just looked at the API: you can read in and write out using a char array. Adapting that to your approach, you could read a fixed-size chunk into a char array, encrypt every char in the array, and then write the whole chunk out to the file. Then reuse the array and repeat. Would that work better for you to reduce the HD access?
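Roughly, that loop could look like the sketch below. The `encrypt` method is a hypothetical stand-in (it just shifts each char by one), and `InPlaceChunks` is a made-up name; swap in your own cipher:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class InPlaceChunks {
    // Hypothetical stand-in cipher: shifts every char up by one.
    static char encrypt(char c) {
        return (char) (c + 1);
    }

    // Reads a chunk, encrypts it in place, writes it out, and repeats.
    static void process(Reader reader, Writer writer, int bufferSize) {
        char[] cbuf = new char[bufferSize];
        int len;
        try {
            while ((len = reader.read(cbuf)) != -1) {
                for (int i = 0; i < len; i++) {
                    cbuf[i] = encrypt(cbuf[i]);
                }
                // Write only the len chars that were just read.
                writer.write(cbuf, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        process(new StringReader("HAL"), out, 2);
        System.out.println(out); // prints "IBM"
    }
}
```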
Taywin
My main concern is that like this, the harddisk is constantly being accessed for every single character - I'm sure it's highly inefficient.
Yes, it sure is.
Please note that when I used 'readLine()' instead of 'read()', I got the Java heap error, so I'm guessing I have to read a character at a time.
Something's fishy here. The default buffer size of the Buffered streams is, AFAIK, 8 KB. Plus, if you are looping over the input stream reading the bytes, encrypting them and writing them to a file, garbage collection should ensure that the bytes previously read are collected before an OOME is thrown. Are you sure you are not keeping references to previously read data?
Actually it has to use the BufferedReader/Writer strictly.
Again fishy. The encryption algorithm doesn't know/shouldn't know the kind of content it is encrypting, and hence it should have been BufferedInputStream and BufferedOutputStream instead of their Reader/Writer counterparts, unless you are using some sort of special encryption algorithm which operates strictly on textual data. :-)
~s.o.s~
Failure as a human
11,938 posts since Jun 2006
Reputation Points: 3,281
Solved Threads: 734
Yes, it's purely textual data - just for an assignment ;) With the current code it takes around 8 minutes to encrypt a 600 MB text file using a simple substitution cipher. What do you think, is it decent?
Nope; because for a simple substitution cipher like Caesar cipher, I can encrypt a 600 MB file in 40 seconds. :-)
Are you still reading the file character by character since that would explain your timings?
I think you are getting an OutOfMemoryError when using readLine() because your sample 600 MB file does not contain any newlines, and hence the Reader tries to read the entire file content into a single String. The most effective solution here would be to use FileReader/FileWriter and implement your own buffering (a 32 KB buffer would be a good one).
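To see why a file without newlines hurts, here is a small sketch (the class and method names are made up) showing that readLine() hands back everything up to the first line break as one String. Scaled up to a 600 MB file with no line breaks, that single String would have to hold the whole file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

class ReadLineDemo {
    // Returns the length of the first line readLine() produces, or 0 if there is none.
    static int firstLineLength(String content) {
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            String line = reader.readLine();
            return line == null ? 0 : line.length();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        char[] chars = new char[100_000];
        Arrays.fill(chars, 'x');
        String noNewline = new String(chars);

        // No newline anywhere: the "line" is the entire content.
        System.out.println(firstLineLength(noNewline)); // prints 100000
        // With a newline, only the text before it is returned.
        System.out.println(firstLineLength("ab\ncd")); // prints 2
    }
}
```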
~s.o.s~
Is that a single character being read? If yes, that method is painfully slow. As already mentioned, if readLine() throws an OOME, it is possible that your entire file consists of a single line. In that case, just use the read(char[]) method to read a fixed-size chunk of characters rather than an entire line. An 8K-character buffer would be a good start.
~s.o.s~
A sample snippet:
// needs java.io.Reader and java.io.Writer
private void process(final Reader reader, final Writer writer) {
    try {
        final char[] cbuf = new char[8 * 1024];
        int len = -1;
        while ((len = reader.read(cbuf)) != -1) {
            // translate is your method which takes a string and translates/encrypts it
            writer.write(translate(new String(cbuf, 0, len)));
        }
    } catch (final Exception e) {
        throw new RuntimeException(e);
    }
}
Given that the buffering is done by the method, you need not even use a Buffered reader/writer; a File reader/writer should suffice.
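For completeness, here is one way the pieces could fit together with plain file readers/writers. This is only a sketch: `FileCipher`, `encryptFile`, the Caesar-style `translate` and the shift of 3 are assumptions made for illustration, not part of the assignment. Because the cipher works on one character at a time, it does not matter where the chunk boundaries fall.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileCipher {
    static final int SHIFT = 3; // hypothetical key

    // Hypothetical translate(): Caesar shift on ASCII letters, everything else untouched.
    static String translate(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 'a' && c <= 'z') {
                c = (char) ('a' + (c - 'a' + SHIFT) % 26);
            } else if (c >= 'A' && c <= 'Z') {
                c = (char) ('A' + (c - 'A' + SHIFT) % 26);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Same loop as the snippet above: read a chunk, translate it, write it.
    static void process(Reader reader, Writer writer) {
        try {
            char[] cbuf = new char[8 * 1024];
            int len;
            while ((len = reader.read(cbuf)) != -1) {
                writer.write(translate(new String(cbuf, 0, len)));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens unbuffered file reader/writer; process() does the chunking itself.
    static void encryptFile(String inPath, String outPath) {
        try (Reader r = new FileReader(inPath); Writer w = new FileWriter(outPath)) {
            process(r, w);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("plain", ".txt");
        Path out = Files.createTempFile("cipher", ".txt");
        Files.write(in, "Hello, World!".getBytes(StandardCharsets.US_ASCII));
        encryptFile(in.toString(), out.toString());
        // prints "Khoor, Zruog!"
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.US_ASCII));
    }
}
```

Note that FileReader and FileWriter use the platform default charset here, which is fine for ASCII test data but worth checking for anything else.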
~s.o.s~
I'm not sure what your issue here is because the snippet I posted would work *out of the box* without any modifications as far as the reading and writing part is concerned. I'd recommend reading the Javadocs for the read() method and writing small snippets to understand how it actually works.
~s.o.s~
I'm not sure why you use the word "flush".
It goes like this: you create a char array (which initially contains all '\0' characters) and pass the array to the read method. This method fills the char array with the characters read and returns the number of characters (n) read. You then use only the first n characters of that same array, i.e. the ones which have just been read. Rinse and repeat with the same array; the next read() call simply overwrites the old data; there is no flushing. Simple, no? :-)
~s.o.s~
How can we make sure those old characters are not there?
You don't need to; read my previous post again. read() returns an int which represents the number of characters read. So even when doing your last read if your buffer isn't full, it really doesn't matter since you know "which portion" of the array contains the newly read values. If you'll look at the original snippet which I posted, I use a String constructor which creates a String object based on the "valid slice" of the array using this same return value of read() method.
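A small sketch to make that concrete (the class `ReadSliceDemo` and its log format are made up for the demo). It uses a StringReader, which always fills as much of the array as it can; other readers may return fewer characters per call, which is one more reason to trust only the return value:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadSliceDemo {
    // For every read() call, logs "n:validSlice|wholeArray".
    static List<String> readAll(Reader reader, int bufferSize) {
        char[] cbuf = new char[bufferSize];
        List<String> log = new ArrayList<>();
        int n;
        try {
            while ((n = reader.read(cbuf)) != -1) {
                String valid = new String(cbuf, 0, n); // only the fresh characters
                String whole = new String(cbuf);       // may contain a stale tail
                log.add(n + ":" + valid + "|" + whole);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String entry : readAll(new StringReader("abcdefg"), 4)) {
            System.out.println(entry);
        }
        // prints:
        // 4:abcd|abcd
        // 3:efg|efgd
    }
}
```

The second line shows the leftover 'd' from the first read still sitting at the end of the array; it is harmless because only the first 3 characters are used.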
~s.o.s~
Are you talking about re-using the character array for something else? If not, you've got me all lost there; post some sample code/pseudocode as to what that *something else* is and what you are doing right now.
~s.o.s~