I'm trying to read in a text file character by character. However, when I try to read it using the readChar method in DataInputStream, it gives me an IOException. Here's what I have so far:

FileInputStream stream;
try
{
    // Open an input stream
    stream = new FileInputStream("test");

    // Read a single char
    DataInputStream input = new DataInputStream(stream);
    try     // tries to read input file
    {
        c = input.readChar();
        s = "char" + c;
    }
    catch (NullPointerException e)
    {
        System.out.print("");
    }
    // Close our input stream
    stream.close();
}
// Catches any error conditions
catch (IOException e)   // This is the exception that gets thrown
{
    System.err.println("Unable to read from file");
    System.exit(-1);
}
System.out.println(s);

My text file "test" simply has "i" in it.

Hope someone can tell me why this exception is thrown. Do I have to do something differently from the readLine()?

Have you tried providing a .txt extension in the String name of test?

// Open an input stream
stream = new FileInputStream("test.txt");

Yeah. I can access the file fine, because when I do readLine it works. It's just readChar that throws the exception.

It might possibly have something to do with the fact that the readChar method returns the Unicode value of the character, not the actual character it is reading. I found this documented at

http://java.sun.com/j2se/1.4.2/docs/api/java/io/DataInput.html#readChar()

I am assuming that c is of type char and s is of type String. This could possibly cause issues because you are adding the Unicode value, which is an int, to the string. Hope this helps some.

Yeah, I definitely think that's the case. But how can I get the character value out of the Unicode value? I tried casting it, but it doesn't change it. Also, it's curious, because the while loop that controls whether or not to keep reading compares the character that was read in to 'd', and it works fine. In other words, when it reads a 'd' it stops. You'd think it wouldn't, since it was only the Unicode value. Or maybe the '==' operator has a special case to compare the character and its Unicode value.

Casting the int to a char should work. I don't know why it wouldn't in this case. To cast it, you should have something like this:

The Unicode value is stored in n.
char c is the character variable you want to store it in.

c = (char) n;

I think that's right.
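For what it's worth, here is a minimal sketch of that cast. The class name CastDemo and the helper toChar are made up for illustration. It also shows why comparing a char with '==' works without any special case: in Java a char is itself a 16-bit numeric type.

```java
class CastDemo {
    // Hypothetical helper: converts a UTF-16 code unit held in an int to a char.
    static char toChar(int n) {
        return (char) n;
    }

    public static void main(String[] args) {
        int n = 105;                       // numeric value of 'i'
        char c = toChar(n);
        System.out.println("char " + c);   // prints "char i"
        System.out.println(c == 105);      // prints "true": char compares as a number
    }
}
```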

Note that IOException is a superclass of many classes (you can see a list in its documentation). If you put e.printStackTrace(); in your catch block, you'll see much more detail about the exception being thrown than just your "Unable to read from file" message. In fact, you should see that an EOFException is being thrown. Why? Because, as Koldsoul said, DataInputStream#readChar reads in Unicode characters. If you look at that method's documentation, you'll see that a Unicode character consists of two bytes (or two chars; a char is a byte). Since you only have one character in your input file, there is not enough data for it to read, so it throws an EOFException (EOF stands for "end of file").
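To see this without touching the file system, here is a rough sketch that feeds readChar a single byte from memory. The class name ReadCharEofDemo and the helper exceptionFor are invented for this example, and the one-byte array only stands in for the one-character file "test".

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ReadCharEofDemo {
    // Returns the simple class name of the exception readChar throws, or "none".
    static String exceptionFor(byte[] data) {
        try (DataInputStream input = new DataInputStream(new ByteArrayInputStream(data))) {
            input.readChar();               // needs two bytes
            return "none";
        } catch (IOException e) {           // also catches subclasses such as EOFException
            e.printStackTrace();            // the trace names the concrete exception class
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        byte[] oneByte = { 0x69 };          // the single character 'i'
        System.out.println(exceptionFor(oneByte));   // prints "EOFException"
    }
}
```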

With that said, readChar is probably not the method you want, unless you encoded your file in the Unicode format (if you didn't specifically do this, it's probably UTF-8). Try putting more characters in your file and see what happens when you run it. You'll probably see Asian characters (if you didn't put them there, you're clearly trying to read the wrong encoding). If you want to continue using the DataInputStream class (though I would recommend the Scanner class instead) to read from your file, use the readByte method instead, and cast the result to a char (remember, byte == char).
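A rough sketch of the readByte approach, assuming the data is in a single-byte encoding such as ASCII or ISO-8859-1 (it would mangle multi-byte UTF-8 sequences). The class name ReadByteDemo and the helper readAll are hypothetical, and an in-memory array stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

class ReadByteDemo {
    // Reads one byte at a time and treats each byte as one character.
    static String readAll(byte[] data) {
        StringBuilder sb = new StringBuilder();
        try (DataInputStream input = new DataInputStream(new ByteArrayInputStream(data))) {
            while (true) {
                // Mask to 0..255 so bytes above 127 do not become negative values
                sb.append((char) (input.readByte() & 0xFF));
            }
        } catch (EOFException end) {
            // Normal termination: readByte signals the end of the stream this way
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(readAll(new byte[] { 0x68, 0x69 }));   // prints "hi"
    }
}
```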

Unicode is merely a character encoding. If you read the readChar documentation again you'll see that, indeed, a char is returned, not an integer. If casting was the problem, the code would not have compiled.

Hope this helps!

Actually, if you read the documentation of DataInputStream's readChar method, you'll notice that it attempts to read not one but two bytes. Yet instead of throwing an EOF error you're getting an IOException, which seems confusing.

My guess is that there's only the character 'i' (1 byte of information) in the DataInputStream and no end-of-file character is returned, so an exception is thrown because there aren't 2 bytes to read, but only 1.

Edit: Yes, this seems to be the case because I ran the test with two i's next to each other instead of one, and the character ? was returned and no exceptions were thrown.
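A small sketch of that two-byte test, using an in-memory array in place of the file (the class name TwoBytesDemo and the helper readPair are made up here). The two 'i' bytes, 0x69 and 0x69, are combined into the single char U+6969, which falls in the CJK ideograph range. A console that cannot display it will typically print '?' instead.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class TwoBytesDemo {
    // Reads one char, which DataInputStream builds from two bytes (high byte first).
    static char readPair(byte[] data) {
        try (DataInputStream input = new DataInputStream(new ByteArrayInputStream(data))) {
            return input.readChar();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        char c = readPair(new byte[] { 0x69, 0x69 });   // the text "ii"
        System.out.println(Integer.toHexString(c));     // prints "6969"
    }
}
```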

Beat you to it. :)

And to unconfuse you, read my post above and you'll see that an EOFException is an IOException.

Yeah, I just saw your post XD

Yeah, but I forgot one thing.

IOException is a superclass of EOFException, so it's possible that the EOFException did occur.

Yup, I edited my post a minute before you posted this.

Hey, hey! I've already given you a +1 on rep!

Enough! >_<

=P

You sir are my hero!

Thank you very much to everyone else as well. Turns out I needed to use readByte, because readChar only works if the text file was written by something that stores each character as a two-byte Unicode value.

> I'm trying to read in a text file character by character. However, when I try to read it in it
> using the readChar method in DataInputStream it gives me a IOException. Heres what I
> have so far

You are using the wrong approach IMO. Even though you have found a solution which makes your problem *seem* to go away, it fails to address the actual issue: why use DataInputStream?

From the API docs:

The DataInput interface provides for reading bytes from a binary stream and reconstructing from them data in any of the Java primitive types.

Because of this, you end up encountering an EOFException even though your file contains valid character data. The approach of reading in bytes using DataInputStream would fail miserably for multi-lingual character data since you would have to convert from bytes to actual characters. Use a Reader for reading character data since you get to specify an encoding which you want to use [which luckily also takes care of reading in bytes and converting them to character data of the given encoding].
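As a sketch of the Reader approach, assuming the text is UTF-8: the class name ReaderDemo and the helper decode are invented for this example, and the ByteArrayInputStream stands in for something like new FileInputStream("test").

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReaderDemo {
    // Decodes bytes as UTF-8 and collects the text one character at a time.
    static String decode(byte[] data) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int ch;
            while ((ch = reader.read()) != -1) {   // read() returns -1 at end of stream
                sb.append((char) ch);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'i' followed by the two-byte UTF-8 sequence for U+00E9
        byte[] data = { 0x69, (byte) 0xC3, (byte) 0xA9 };
        System.out.println(decode(data).length());   // prints 2: three bytes, two characters
    }
}
```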

Also reading in data character by character is pretty inefficient even though your program logic demands character by character processing. Read in a sizable chunk of characters and process them in memory to avoid a lot of disk reads.
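One way to do that, sketched with a hypothetical ChunkedReadDemo class: read into a char[] buffer and loop over the buffer in memory. The 8192 buffer size is an arbitrary choice, and wrapping the source in a BufferedReader would be another reasonable option.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ChunkedReadDemo {
    // Counts occurrences of target, pulling characters from the reader in chunks.
    static int countChar(Reader source, char target) {
        int count = 0;
        char[] buffer = new char[8192];
        try (Reader reader = source) {
            int n;
            while ((n = reader.read(buffer)) != -1) {   // n = chars actually read
                for (int i = 0; i < n; i++) {
                    if (buffer[i] == target) {
                        count++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // StringReader stands in for a file-backed Reader
        System.out.println(countChar(new StringReader("did it"), 'd'));   // prints 2
    }
}
```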

> if you didn't specifically do this, it's probably UTF-8

No it isn't; it's modified UTF-8.

> you're clearly trying to read the wrong encoding

You don't have an option, it has to be modified UTF-8. From the docs:

Data input streams and data output streams represent Unicode strings in a format that is a slight modification of UTF-8.

> If you read the readChar documentation again you'll see that, indeed,
> a char is returned, not an integer

Because it actually *is* intended for reading in a `char' primitive data type unlike other Streams which return an `int'.

> a char is a byte

No, it isn't.

>> if you didn't specifically do this, it's probably UTF-8
> No it isn't; it's modified UTF-8.

How do you know what format the OP saved his file in?

>> you're clearly trying to read the wrong encoding
> You don't have an option, it has to be modified UTF-8.

What? My point was that if you read in data from a file and get unexpected results (such as Asian characters, in this case) then you are trying to read in data with the wrong encoding.

>> If you read the readChar documentation again you'll see that, indeed,
>> a char is returned, not an integer
> Because it actually *is* intended for reading in a `char' primitive data type unlike other Streams which return an `int'.

What's your point?

>> a char is a byte
> No, it isn't.

You're right. In C, a char is usually an 8-bit type which is what tripped me up.

> How do you know what format the OP saved his file in?

Because he is using DataInputStream. So either the file is in modified UTF-8 or the OP is doing it wrong.

> What? My point was that if you read in data from a file and get unexpected results
> (such as Asian characters, in this case) then you are trying to read in data with the
> wrong encoding.

Generally speaking, yes, but given the current scenario, it has to be modified UTF-8.

> What's your point?

My point being that a readXXX method of DataInputStream always returns XXX, so it should be no surprise that readChar returns a char and not an int like the read() method of other Streams.
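To illustrate, here is a sketch of a round trip through the matching DataOutputStream methods: a char written with writeChar comes back from readChar, and a String written with writeUTF (the modified UTF-8 format) comes back from readUTF. The class name RoundTripDemo and the helper roundTrip are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RoundTripDemo {
    // Writes a char and a String, reads them back, and returns them joined together.
    static String roundTrip(char c, String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeChar(c);    // two bytes, high byte first
                out.writeUTF(s);     // length prefix plus modified UTF-8
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                char readBack = in.readChar();   // returns a char, not an int
                String text = in.readUTF();
                return readBack + text;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip('i', "test"));   // prints "itest"
    }
}
```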

>> How do you know what format the OP saved his file in?
> Because he is using DataInputStream. So either the file is in modified UTF-8 or the OP is doing it wrong.

Chances are he's doing it wrong.

>> What? My point was that if you read in data from a file and get unexpected results
>> (such as Asian characters, in this case) then you are trying to read in data with the
>> wrong encoding.
> Generally speaking, yes, but given the current scenario, it has to be modified UTF-8.

Yes, in this case it does, but that's irrelevant to my point.

>> What's your point?
> My point being that a readXXX method of DataInputStream always returns XXX, so it should be no surprise that readChar returns a char and not an int like the read() method of other Streams.

Right, but I was clarifying for the other posters since they thought that casting to a char would fix the problem.
