I have a generic text file that can contain both text and numerical values, like an article. It can have multiple paragraphs/lines and various types of delimiters. My goal is to tokenize this text file into a string array. I am quite confused about how to handle this in Java: should I use Scanner or BufferedReader?

Discussion span: 6 years. Last post by JamesCherrill.

Scanner is useful when you know what kind of data to expect before you read it. From your description it sounds like you should use a BufferedReader, read one complete line at a time, then parse it for delimiters etc yourself.
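To make the BufferedReader suggestion concrete, here is a minimal sketch of reading one line at a time and splitting each line on delimiters. The class name LineTokenizer, the tokenize method, and the delimiter set (whitespace plus a few punctuation marks) are assumptions made for illustration; the delimiters in your file may well differ.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads text line by line and splits it into tokens.
class LineTokenizer {
    // Assumed delimiter set: runs of whitespace, comma, semicolon, colon, '!' or '?'.
    // The period is deliberately left out so that a value like 3.14 stays in one piece.
    static final String DELIMITERS = "[\\s,;:!?]+";

    static String[] tokenize(Reader source) throws IOException {
        List<String> tokens = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() returns null at end of input.
            while ((line = reader.readLine()) != null) {
                for (String token : line.split(DELIMITERS)) {
                    // split() yields an empty first token when a line starts with a delimiter.
                    if (!token.isEmpty()) {
                        tokens.add(token);
                    }
                }
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) throws IOException {
        String sample = "Hello, world! 42\nPi is 3.14; done";
        String[] tokens = tokenize(new StringReader(sample));
        System.out.println(String.join("|", tokens)); // prints Hello|world|42|Pi|is|3.14|done
    }
}
```

For a real file you could pass something like Files.newBufferedReader(Paths.get("article.txt"), StandardCharsets.UTF_8) as the source, where the file name and the encoding are again assumptions.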


In addition to BufferedReader, you also have the option of using the read(byte[]) method of InputStream. It is relatively faster than BufferedReader.


Do you have any authoritative source for that claim?
Do you want to consider the difference between a character reading stream and byte-oriented reads in terms of localisation and non-English character sets?
Do you want to explain how to do character-oriented parsing on a byte array?
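
To illustrate why these questions matter, here is a small sketch of what can go wrong when text is handled as raw bytes. It assumes the text is encoded as UTF-8; the class name BytesVersusChars and its methods are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo: bytes and characters are not the same thing in UTF-8.
class BytesVersusChars {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Decodes only the first 'count' bytes, as if a fixed-size byte[] read had stopped there.
    static String decodePrefix(byte[] raw, int count) {
        return new String(Arrays.copyOf(raw, count), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the last letter
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);

        // 4 characters, but 5 bytes: the accented letter takes two bytes in UTF-8.
        System.out.println(text.length() + " chars, " + raw.length + " bytes");

        // Cutting the buffer after 4 bytes splits the accented letter in half,
        // so the decoded text no longer matches the original.
        String broken = decodePrefix(raw, 4);
        System.out.println(broken.equals(text)); // prints false
    }
}
```

A character stream such as BufferedReader wrapped around an InputStreamReader with an explicit charset takes care of this decoding, which is the work you would otherwise have to do yourself on a byte array.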

