Reading in a text file -- efficiently
I had a problem with efficiency: building up a String by concatenation was very slow when reading a whole large text file into it. I've since thought about it, and I guess this is because Strings are immutable, so every time a concatenation is done, a new String object has to be created and the underlying char array has to be copied? After some research, I came across StringBuilder and String___ (I forget the rest of the name of the class). My question is, how can a text file be read in as efficiently as possible? I don't really care what type it's read into. Also, a lot of sources I found said StringBuilder was the best way; is that true?
BestJewSinceJC
Posting Maven
2,772 posts since Sep 2008
Reputation Points: 874
Solved Threads: 354
I think StringBuffer might be the other String__ class you were referring to. Both the StringBuilder and StringBuffer classes are better at concatenation than the base String class. StringBuilder will out-perform StringBuffer (although both are highly efficient at concatenation), but note that StringBuilder does not guarantee synchronisation, meaning that you should only use it if it is being accessed by a single Thread. Multi-threaded applications that share one instance between threads need to use StringBuffer, or control the synchronisation themselves.
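A minimal sketch of the StringBuilder approach for the single-threaded case, assuming Java 7 or later and a UTF-8 encoded file. The class name BuilderFileReader, the method name readAll and the 8192-character chunk size are made up for illustration and do not come from this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; the class and method names are illustrative only.
final class BuilderFileReader {

    // Reads the whole file into one String by appending chunks
    // to a single mutable StringBuilder (no intermediate Strings).
    static String readAll(String fileName) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] chunk = new char[8192]; // arbitrary chunk size (assumption)
        // UTF-8 is an assumption; pass the charset your file really uses.
        try (BufferedReader in =
                 Files.newBufferedReader(Paths.get(fileName), StandardCharsets.UTF_8)) {
            int n;
            while ((n = in.read(chunk)) != -1) { // -1 signals end of stream
                sb.append(chunk, 0, n);          // append only the chars actually read
            }
        }
        return sb.toString();
    }
}
```

Calling BuilderFileReader.readAll("filename") returns the file contents with line terminators preserved, because the file is read in character chunks rather than line by line.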
darkagn
Veteran Poster
1,197 posts since Aug 2007
Reputation Points: 404
Solved Threads: 200
Thanks. The app I am currently working on is single-threaded, so I'll just use whichever looks easier, I guess.
BestJewSinceJC
Terms like efficiency make little sense when used in an absolute context. To find the most efficient way to solve a problem, you first need to track down your application's usage pattern, i.e. what kind of file reading does your application need? Is the data read promptly consumed, or is it stored for future use? Profiling your application is by far the best approach to selecting an efficient solution.
BTW, StringBuilder & StringBuffer are mutable companion classes to the immutable class String; the former is not thread-safe and the latter is.
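A rough sketch of how the immutable and mutable variants could be compared before committing to one. The class ConcatDemo and its methods are hypothetical names for illustration. The timing helper is only a crude stopwatch (no JIT warm-up, no repeated runs), so treat its numbers as a hint and use a real profiler for decisions.

```java
// Hypothetical demo class; none of these names come from the thread.
final class ConcatDemo {

    // Immutable String: each += builds a new String and copies the old contents.
    static String withString(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += "line " + i + "\n";
        }
        return s;
    }

    // Mutable StringBuilder: appends into one growing internal buffer.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append("line ").append(i).append('\n');
        }
        return sb.toString();
    }

    // Crude wall-clock measurement of a task, in nanoseconds.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }
}
```

For example, ConcatDemo.timeNanos(() -> ConcatDemo.withString(20000)) can be compared against the same call using withBuilder; both methods produce identical text, so only the elapsed time differs.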
~s.o.s~
Failure as a human
11,938 posts since Jun 2006
Reputation Points: 3,281
Solved Threads: 734
Well, if you're simply going to be reading the entire file into a single String without any kind of work on the lines read until after the entire file is read, I wouldn't use either of those, and wouldn't use a Reader at all. I would do it like this:
File f = new File("filename");
FileInputStream fis = new FileInputStream(f);
byte[] b = new byte[f.length];
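int read = 0;
while (read < b.length) {
    int n = fis.read(b, read, b.length - read);
    if (n < 0) break; // read() returns -1 at end of stream; stop instead of looping forever
    read += n;
}
fis.close();
String text = new String(b, 0, read); // decode only the bytes actually read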
This is, of course, missing all error handling. That's for you. This also assumes that encoding won't be a problem.
You are not forced to use a Reader simply because what you're reading is a text file.
masijade
Industrious Poster
4,253 posts since Feb 2006
Reputation Points: 1,471
Solved Threads: 494
> This also assumes that encoding won't be a problem.
If this is what he actually wants, encoding shouldn't be a problem as long as an appropriate character set is specified when creating the string:
String text = new String(bytes, "UTF-8");
> byte[] b = new byte[f.length];
Array bites? ;-) File.length() is a method, not a field, and it returns a long, so the line also needs a cast:
byte[] bytes = new byte[(int) f.length()];
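Putting the corrections above together, a minimal sketch of the byte-array approach could look like the following, assuming Java 7 or later, a UTF-8 encoded file, and a file small enough to fit in one array. The class name WholeFileReader and the method name readAll are illustrative only.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the class and method names are illustrative only.
final class WholeFileReader {

    static String readAll(String fileName) throws IOException {
        File f = new File(fileName);
        long size = f.length(); // a method returning long, not an array-style field
        if (size > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single byte array: " + fileName);
        }
        byte[] bytes = new byte[(int) size];
        int read = 0;
        // try-with-resources closes the stream even if read() throws
        try (FileInputStream fis = new FileInputStream(f)) {
            while (read < bytes.length) {
                int n = fis.read(bytes, read, bytes.length - read);
                if (n < 0) {
                    break; // end of stream reached earlier than expected
                }
                read += n;
            }
        }
        // UTF-8 is an assumption; use the charset the file was written in.
        return new String(bytes, 0, read, StandardCharsets.UTF_8);
    }
}
```

On Java 7 and later, Files.readAllBytes(Paths.get(fileName)) from java.nio.file performs the same whole-file read in one call, so the manual loop is mainly useful for seeing what happens underneath.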
~s.o.s~
Yeah, that's great. Thanks, guys. Leave the error handling to me. Lol. :)
BestJewSinceJC
> > byte[] b = new byte[f.length];
> Array bites? ;-) byte[] bytes = new byte[f.length()];
:icon_redface: :)
masijade