StreamReader and Position

 
 

I'm revisiting a topic previously discussed in this thread: http://www.daniweb.com/techtalkforums/thread23030.html

The basic issue is this: when using a StreamReader, and ReadLine, to process a text file, you cannot determine where you are "at" within a file.

This is because StreamReader doesn't actually read from a FILE, it reads from a BUFFER. You can know the actual file position, by looking at the .BaseStream.Position property. However, if you want to know the position, in the file, of the record you just read with StreamReader.ReadLine(), you cannot know.
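To make the buffering effect concrete, here is a minimal sketch. The class and method names are made up for illustration, and it assumes ASCII text and a reader with the default buffer size. It reads a single line and then reports how far the underlying stream has advanced.

```csharp
using System.IO;
using System.Text;

public static class BufferDemo
{
    // Reads one line through a StreamReader, then returns the position of
    // the underlying stream. Because the reader fills its buffer in large
    // chunks, the result is normally well past the end of the first line.
    public static long PositionAfterFirstLine(Stream stream)
    {
        StreamReader reader = new StreamReader(stream, Encoding.ASCII);
        reader.ReadLine();
        return reader.BaseStream.Position;
    }
}
```

For a small input such as "first\nsecond\nthird\n", the returned position is typically the end of the stream rather than 6, which is why BaseStream.Position cannot be used as the position of the record just read.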

There has to be an elegant solution to this. I haven't found it.

Background:

I'm processing extremely large (4GB+) text files. They are actually PostScript printstreams. I have a number of operations to perform on these files. In fact, I need to perform random file I/O. As an example, the printstream might mix invoices and credit memos. I need to extract all credit memos to a separate file. Imagine they look exactly alike, only the string "CREDIT MEMO" appears in the middle of page 1 of a credit memo.

I know when a "document" begins. I know when one ends. I know when a document I'm currently "reading" is a credit memo. I need to be able to REPOSITION the stream back to the starting record of the document, and extract until I reach the end of the document.

Specifically, I need to note the byte-position of a particular record, so that I can BaseStream.Seek() back to it.
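Once the start and end byte offsets of a document are known, the extraction step itself is simple. The following is only a sketch under that assumption. RangeCopier and CopyRange are hypothetical names, and the 64 KB chunk size is an arbitrary choice.

```csharp
using System;
using System.IO;

public static class RangeCopier
{
    // Copies the bytes in the half-open range [start, end) from source
    // to destination. The source stream must support seeking.
    public static void CopyRange(Stream source, long start, long end, Stream destination)
    {
        byte[] buffer = new byte[64 * 1024];
        long remaining = end - start;
        source.Seek(start, SeekOrigin.Begin);
        while (remaining > 0)
        {
            int toRead = (int)Math.Min(buffer.Length, remaining);
            int read = source.Read(buffer, 0, toRead);
            if (read == 0) break; // reached end of file before 'end'
            destination.Write(buffer, 0, read);
            remaining -= read;
        }
    }
}
```

The offsets are long values, so the same code works on files larger than 4 GB.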

I'm using a StreamReader because I am indeed reading a text file, and I do need the speed that buffering supplies. However, the buffer prevents me from knowing exactly where a particular record is within the file.

One idea is to add each record's Length to a counter. The problem is line-termination characters. Does the file use 1 or 2 bytes? How can you know?
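One way to answer the 1-or-2-bytes question is to look at the raw bytes of the first line before processing starts. This is a sketch only. LineEndingProbe is a made-up helper, and it assumes an ASCII-compatible encoding (ANSI or UTF-8) and a file that uses one terminator style throughout.

```csharp
using System.IO;

public static class LineEndingProbe
{
    // Scans forward from the current stream position to the first line
    // terminator. Returns 2 for "\r\n", 1 for a lone "\r" or "\n", and
    // 0 if the stream contains no terminator at all.
    public static int TerminatorLength(Stream stream)
    {
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (b == '\n') return 1;
            if (b == '\r') return stream.ReadByte() == '\n' ? 2 : 1;
        }
        return 0;
    }
}
```

The probe moves the stream position, so the caller would seek back to the start before reading records.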

Any other ideas?

 
 

C'mon... someone out there must appreciate a challenge. What was suggested on another forum was to derive my own class from StreamReader, and then override the ReadLine() method to count actual bytes read, including any line termination characters. Then, expose a public property that returns the actual bytes read.

I've done quite a bit of C# coding, but haven't attempted to derive any classes from a base class. Anyone care to lend a hand with this?

The source code for the StreamReader class can be viewed here:

http://www.123aspx.com/rotor/rotorsrc.aspx?rot=42055

I've managed to get this far:

using System;

namespace streamOR
{
	/// <summary>
	/// Summary description for StreamReader2.
	/// </summary>
	public class StreamReader2 : System.IO.StreamReader
	{
		public StreamReader2(String path) : base(path) 
		{
		}
	}
}

That derives a new class from StreamReader. It also defines a constructor that chains to the particular base constructor I wish to use (constructors are not inherited, so each one I need has to be declared like this).

Next step is to override ReadLine(). This is where I need help. The StreamReader.ReadLine() method uses a lot of private members, so I cannot simply cut-and-paste the base code and expect it to work. What is the proper technique for overriding a method?

Here is the ReadLine() method from the StreamReader class. How would I override ReadLine() to maintain all this functionality, and yet add my own code?

public override String ReadLine() {
    if (stream == null)
        __Error.ReaderClosed();

    if (charPos == charLen) {
        if (ReadBuffer() == 0) return null;
    }
    StringBuilder sb = null;
    do {
        int i = charPos;
        do {
            char ch = charBuffer[i];
            // Note the following common line feed chars:
            // \n - UNIX   \r\n - DOS   \r - Mac
            if (ch == '\r' || ch == '\n') {
                String s;
                if (sb != null) {
                    sb.Append(charBuffer, charPos, i - charPos);
                    s = sb.ToString();
                }
                else {
                    s = new String(charBuffer, charPos, i - charPos);
                }
                charPos = i + 1;
                if (ch == '\r' && (charPos < charLen || ReadBuffer() > 0)) {
                    if (charBuffer[charPos] == '\n') charPos++;
                }
                return s;
            }
            i++;
        } while (i < charLen);
        i = charLen - charPos;
        if (sb == null) sb = new StringBuilder(i + 80);
        sb.Append(charBuffer, charPos, i);
    } while (ReadBuffer() > 0);
    return sb.ToString();
}
 
 

Hi, what about a derived class like this one (it's very basic, but it might be what you are searching for):

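public class MyStream : StreamReader
{
	// Number of characters consumed so far, line terminators included.
	// This equals the byte offset only for single-byte encodings with no
	// byte-order mark. A long is used so files over 4GB do not overflow.
	private long _pos;

	public MyStream(string path, System.Text.Encoding enc) : base(path, enc)
	{
	}

	public override string ReadLine()
	{
		int i = base.Read();
		if (i == -1)
			return null;	// end of file

		System.Text.StringBuilder line = new System.Text.StringBuilder();
		while (i != -1)
		{
			_pos++;
			if (i == '\r')
			{
				// Treat "\r\n" as a single terminator.
				if (base.Peek() == '\n')
				{
					base.Read();
					_pos++;
				}
				break;
			}
			if (i == '\n')
				break;

			line.Append((char)i);
			i = base.Read();
		}

		// An empty line returns "", so blank lines are not swallowed.
		return line.ToString();
	}

	public long Pos
	{
		get { return _pos; }
	}
}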
 
 

The problem with deriving a class from StreamReader itself, is that you don't have access to the private members. ReadLine() leans heavily on private members, such as ReadBuffer().

What I ended up doing, successfully, is to derive a class from TextReader, just as StreamReader itself does. This means, in effect, I created my own StreamReader class, by copying the StreamReader code entirely, rather than deriving from it.

Then, I created a private member named _bytesRead, and exposed it as a public read-only property, BytesRead. I modified ReadLine() to keep a running total of the bytes read, including the line-termination characters.

Now I have a "myStreamReader" class with all the functionality of a regular StreamReader, plus a BytesRead property suitable for passing to .BaseStream.Seek().

The "gotchas" involve code that the native StreamReader can access from the System.IO namespace that I cannot. This was mainly the error-event code, which I had to replace with my own. Another was Buffer.InternalBlockCopy(), which was easily enough replaced with Buffer.BlockCopy().

An invaluable resource if you ever find yourself in a similar situation is
http://www.123aspx.com/rotor/.

 
 

Here's my solution, for anyone interested. This is basically a hacked version of the .NET StreamReader code. It adds two public properties, LineLength and BytesRead. LineLength returns the actual length of the current record, inclusive of line-termination characters. BytesRead returns the running total actually consumed by ReadLine() (regardless of the buffer mechanism). For single-byte text such as my printstreams, this value is suitable for passing into .BaseStream.Seek().

It could be better. The real StreamReader code uses some methods internal to some of the core .NET namespaces, so I did the best I could with those. If anyone would like to post improvements, they are welcome to do so.

using System;
using System.Text;
using System.Runtime.InteropServices;
using System.IO;

namespace TGREER
{
	[Serializable()]
	public class myStreamReader : TextReader
	{
		public new static readonly myStreamReader Null = new NullmyStreamReader();

		internal const int DefaultBufferSize = 1024;  // Byte buffer size
		private const int DefaultFileStreamBufferSize = 4096;
		private const int MinBufferSize = 128;
		private Stream stream;
		private Encoding encoding;
		private Decoder decoder;
		private byte[] byteBuffer;
		private char[] charBuffer;
		private byte[] _preamble;
		private int charPos;
		private int charLen;
		private int byteLen;
		private int _maxCharsPerBuffer;
		private bool _detectEncoding;
		private bool _checkPreamble;
		private bool _isBlocked;

 		private int _lineLength;
		public int LineLength
		{
			get { return _lineLength; }
		}

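		// Running total of characters consumed by ReadLine(), terminators included.
		// This equals the byte offset only for single-byte encodings with no preamble.
		// A long is used because an int overflows on files larger than 2GB.
		private long _bytesRead;
		public long BytesRead
		{
			get { return _bytesRead; }
		}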

		internal myStreamReader()
		{
		}

		public myStreamReader(Stream stream)
			: this(stream, Encoding.UTF8, true, DefaultBufferSize)
		{
		}

		public myStreamReader(Stream stream, bool detectEncodingFromByteOrderMarks)
			: this(stream, Encoding.UTF8, detectEncodingFromByteOrderMarks, DefaultBufferSize)
		{
		}

		public myStreamReader(Stream stream, Encoding encoding)
			: this(stream, encoding, true, DefaultBufferSize)
		{
		}

		public myStreamReader(Stream stream, Encoding encoding, bool detectEncodingFromByteOrderMarks)
			: this(stream, encoding, detectEncodingFromByteOrderMarks, DefaultBufferSize)
		{
		}

		public myStreamReader(Stream stream, Encoding encoding, bool detectEncodingFromByteOrderMarks, int bufferSize)
		{
			if (stream == null || encoding == null)
				throw new ArgumentNullException((stream == null ? "stream" : "encoding"));
			if (!stream.CanRead)
				throw new ArgumentException(Environment.GetEnvironmentVariable("Argument_StreamNotReadable"));
			if (bufferSize <= 0)
				throw new ArgumentOutOfRangeException("bufferSize", Environment.GetEnvironmentVariable("ArgumentOutOfRange_NeedPosNum"));

			Init(stream, encoding, detectEncodingFromByteOrderMarks, bufferSize);
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.myStreamReader4"]/*' />
		public myStreamReader(String path)
			: this(path, Encoding.UTF8, true, DefaultBufferSize)
		{
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.myStreamReader9"]/*' />
		public myStreamReader(String path, bool detectEncodingFromByteOrderMarks)
			: this(path, Encoding.UTF8, detectEncodingFromByteOrderMarks, DefaultBufferSize)
		{
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.myStreamReader5"]/*' />
		public myStreamReader(String path, Encoding encoding)
			: this(path, encoding, true, DefaultBufferSize)
		{
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.myStreamReader6"]/*' />
		public myStreamReader(String path, Encoding encoding, bool detectEncodingFromByteOrderMarks)
			: this(path, encoding, detectEncodingFromByteOrderMarks, DefaultBufferSize)
		{
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.myStreamReader7"]/*' />
		public myStreamReader(String path, Encoding encoding, bool detectEncodingFromByteOrderMarks, int bufferSize)
		{
			// Don't open a Stream before checking for invalid arguments,
			// or we'll create a FileStream on disk and we won't close it until
			// the finalizer runs, causing problems for applications.
			if (path == null || encoding == null)
				throw new ArgumentNullException((path == null ? "path" : "encoding"));
			if (path.Length == 0)
				throw new ArgumentException(Environment.GetEnvironmentVariable("Argument_EmptyPath"));
			if (bufferSize <= 0)
				throw new ArgumentOutOfRangeException("bufferSize", Environment.GetEnvironmentVariable("ArgumentOutOfRange_NeedPosNum"));

			Stream stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, DefaultFileStreamBufferSize);
			Init(stream, encoding, detectEncodingFromByteOrderMarks, bufferSize);
		}

		private void Init(Stream stream, Encoding encoding, bool detectEncodingFromByteOrderMarks, int bufferSize)
		{
			this.stream = stream;
			this.encoding = encoding;
			decoder = encoding.GetDecoder();
			if (bufferSize < MinBufferSize) bufferSize = MinBufferSize;
			byteBuffer = new byte[bufferSize];
			_maxCharsPerBuffer = encoding.GetMaxCharCount(bufferSize);
			charBuffer = new char[_maxCharsPerBuffer];
			byteLen = 0;
			_detectEncoding = detectEncodingFromByteOrderMarks;
			_preamble = encoding.GetPreamble();
			_checkPreamble = (_preamble.Length > 0);
			_isBlocked = false;
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.Close"]/*' />
		public override void Close()
		{
			Dispose(true);
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.Dispose"]/*' />
		protected override void Dispose(bool disposing)
		{
			if (disposing)
			{
				if (stream != null)
					stream.Close();
			}
			if (stream != null)
			{
				stream = null;
				encoding = null;
				decoder = null;
				byteBuffer = null;
				charBuffer = null;
				charPos = 0;
				charLen = 0;
			}
			base.Dispose(disposing);
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.CurrentEncoding"]/*' />
		public virtual Encoding CurrentEncoding
		{
			get { return encoding; }
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.BaseStream"]/*' />
		public virtual Stream BaseStream
		{
			get { return stream; }
		}

		// DiscardBufferedData tells myStreamReader to throw away its internal
		// buffer contents.  This is useful if the user needs to seek on the
		// underlying stream to a known location then wants the myStreamReader
		// to start reading from this new point.  This method should be called
		// very sparingly, if ever, since it can lead to very poor performance.
		// However, it may be the only way of handling some scenarios where 
		// users need to re-read the contents of a myStreamReader a second time.
		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.DiscardBufferedData"]/*' />
		public void DiscardBufferedData()
		{
			byteLen = 0;
			charLen = 0;
			charPos = 0;
			decoder = encoding.GetDecoder();
			_isBlocked = false;
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.Peek"]/*' />
		public override int Peek()
		{
			//if (stream == null)
			//__Error.ReaderClosed();

			if (charPos == charLen)
			{
				if (_isBlocked || ReadBuffer() == 0) return -1;
			}
			return charBuffer[charPos];
		}

		public override int Read()
		{
			//if (stream == null)
			//__Error.ReaderClosed();

			if (charPos == charLen)
			{
				if (ReadBuffer() == 0) return -1;
			}
			return charBuffer[charPos++];
		}

		public override int Read([In, Out] char[] buffer, int index, int count)
		{
			//if (stream == null)
			//__Error.ReaderClosed();
			if (buffer == null)
				throw new ArgumentNullException("buffer", Environment.GetEnvironmentVariable("ArgumentNull_Buffer"));
			if (index < 0 || count < 0)
				throw new ArgumentOutOfRangeException((index < 0 ? "index" : "count"), Environment.GetEnvironmentVariable("ArgumentOutOfRange_NeedNonNegNum"));
			if (buffer.Length - index < count)
				throw new ArgumentException(Environment.GetEnvironmentVariable("Argument_InvalidOffLen"));

			int charsRead = 0;
			// As a perf optimization, if we had exactly one buffer's worth of 
			// data read in, let's try writing directly to the user's buffer.
			bool readToUserBuffer = false;
			while (count > 0)
			{
				int n = charLen - charPos;
				if (n == 0) n = ReadBuffer(buffer, index + charsRead, count, out readToUserBuffer);
				if (n == 0) break;  // We're at EOF
				if (n > count) n = count;
				if (!readToUserBuffer)
				{
					Buffer.BlockCopy(charBuffer, charPos * 2, buffer, (index + charsRead) * 2, n * 2);
					charPos += n;
				}
				charsRead += n;
				count -= n;
				// This function shouldn't block for an indefinite amount of time,
				// or reading from a network stream won't work right.  If we got
				// fewer bytes than we requested, then we want to break right here.
				if (_isBlocked)
					break;
			}
			return charsRead;
		}

		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.ReadToEnd"]/*' />
		public override String ReadToEnd()
		{
			//if (stream == null)
			//__Error.ReaderClosed();

			// For performance, call Read(char[], int, int) with a buffer
			// as big as the myStreamReader's internal buffer, to get the 
			// readToUserBuffer optimization.
			char[] chars = new char[charBuffer.Length];
			int len;
			StringBuilder sb = new StringBuilder(charBuffer.Length);
			while ((len = Read(chars, 0, chars.Length)) != 0)
			{
				sb.Append(chars, 0, len);
			}
			return sb.ToString();
		}

		// Trims n bytes from the front of the buffer.
		private void CompressBuffer(int n)
		{
			Buffer.BlockCopy(byteBuffer, n, byteBuffer, 0, byteLen - n);
			byteLen -= n;
		}

		// returns whether the first array starts with the second array.
		private static bool BytesMatch(byte[] buffer, byte[] compareTo)
		{
			for (int i = 0; i < compareTo.Length; i++)
				if (buffer[i] != compareTo[i])
					return false;
			return true;
		}

		private void DetectEncoding()
		{
			if (byteLen < 2)
				return;
			_detectEncoding = false;
			bool changedEncoding = false;
			if (byteBuffer[0] == 0xFE && byteBuffer[1] == 0xFF)
			{
				// Big Endian Unicode
				encoding = new UnicodeEncoding(true, true);
				decoder = encoding.GetDecoder();
				CompressBuffer(2);
				changedEncoding = true;
			}
			else if (byteBuffer[0] == 0xFF && byteBuffer[1] == 0xFE)
			{
				// Little Endian Unicode
				encoding = new UnicodeEncoding(false, true);
				decoder = encoding.GetDecoder();
				CompressBuffer(2);
				changedEncoding = true;
			}
			else if (byteLen >= 3 && byteBuffer[0] == 0xEF && byteBuffer[1] == 0xBB && byteBuffer[2] == 0xBF)
			{
				// UTF-8
				encoding = Encoding.UTF8;
				decoder = encoding.GetDecoder();
				CompressBuffer(3);
				changedEncoding = true;
			}
			else if (byteLen == 2)
				_detectEncoding = true;
			// Note: in the future, if we change this algorithm significantly,
			// we can support checking for the preamble of the given encoding.

			if (changedEncoding)
			{
				_maxCharsPerBuffer = encoding.GetMaxCharCount(byteBuffer.Length);
				charBuffer = new char[_maxCharsPerBuffer];
			}
		}


		private int ReadBuffer()
		{
			charLen = 0;
			byteLen = 0;
			charPos = 0;
			do
			{
				byteLen = stream.Read(byteBuffer, 0, byteBuffer.Length);

				if (byteLen == 0)  // We're at EOF
					return charLen;

				// _isBlocked == whether we read fewer bytes than we asked for.
				// Note we must check it here because CompressBuffer or 
				// DetectEncoding will screw with byteLen.
				_isBlocked = (byteLen < byteBuffer.Length);

				if (_checkPreamble && byteLen >= _preamble.Length)
				{
					_checkPreamble = false;
					if (BytesMatch(byteBuffer, _preamble))
					{
						_detectEncoding = false;
						CompressBuffer(_preamble.Length);
					}
				}

				// If we're supposed to detect the encoding and haven't done so yet,
				// do it.  Note this may need to be called more than once.
				if (_detectEncoding && byteLen >= 2)
					DetectEncoding();

				charLen += decoder.GetChars(byteBuffer, 0, byteLen, charBuffer, charLen);
			} while (charLen == 0);
			//Console.WriteLine("ReadBuffer called.  chars: "+charLen);
			return charLen;
		}


		// This version has a perf optimization to decode data DIRECTLY into the 
		// user's buffer, bypassing StreamWriter's own buffer.
		// This gives a > 20% perf improvement for our encodings across the board,
		// but only when asking for at least the number of characters that one
		// buffer's worth of bytes could produce.
		// This optimization, if run, will break SwitchEncoding, so we must not do 
		// this on the first call to ReadBuffer.  
		private int ReadBuffer(char[] userBuffer, int userOffset, int desiredChars, out bool readToUserBuffer)
		{
			charLen = 0;
			byteLen = 0;
			charPos = 0;
			int charsRead = 0;

			// As a perf optimization, we can decode characters DIRECTLY into a
			// user's char[].  We absolutely must not write more characters 
			// into the user's buffer than they asked for.  Calculating 
			// encoding.GetMaxCharCount(byteLen) each time is potentially very 
			// expensive - instead, cache the number of chars a full buffer's 
			// worth of data may produce.  Yes, this makes the perf optimization 
			// less aggressive, in that all reads that asked for fewer than AND 
			// returned fewer than _maxCharsPerBuffer chars won't get the user 
			// buffer optimization.  This affects reads where the end of the
			// Stream comes in the middle somewhere, and when you ask for 
			// fewer chars than than your buffer could produce.
			readToUserBuffer = desiredChars >= _maxCharsPerBuffer;

			do
			{
				byteLen = stream.Read(byteBuffer, 0, byteBuffer.Length);

				if (byteLen == 0)  // EOF
					return charsRead;

				// _isBlocked == whether we read fewer bytes than we asked for.
				// Note we must check it here because CompressBuffer or 
				// DetectEncoding will screw with byteLen.
				_isBlocked = (byteLen < byteBuffer.Length);

				// On the first call to ReadBuffer, if we're supposed to detect the encoding, do it.
				if (_detectEncoding && byteLen >= 2)
				{
					DetectEncoding();
					// DetectEncoding changes some buffer state.  Recompute this.
					readToUserBuffer = desiredChars >= _maxCharsPerBuffer;
				}

				if (_checkPreamble && byteLen >= _preamble.Length)
				{
					_checkPreamble = false;
					if (BytesMatch(byteBuffer, _preamble))
					{
						_detectEncoding = false;
						CompressBuffer(_preamble.Length);
						// CompressBuffer changes some buffer state.  Recompute this.
						readToUserBuffer = desiredChars >= _maxCharsPerBuffer;
					}
				}

				/*
				if (readToUserBuffer)
					Console.Write('.');
				else {
					Console.WriteLine("Desired chars is wrong.  byteBuffer.length: "+byteBuffer.Length+"  max chars is: "+encoding.GetMaxCharCount(byteLen)+"  desired: "+desiredChars);
				}
				*/

				charPos = 0;
				if (readToUserBuffer)
				{
					charsRead += decoder.GetChars(byteBuffer, 0, byteLen, userBuffer, userOffset + charsRead);
					charLen = 0;  // myStreamReader's buffer is empty.
				}
				else
				{
					charsRead = decoder.GetChars(byteBuffer, 0, byteLen, charBuffer, charsRead);
					charLen += charsRead;  // Number of chars in myStreamReader's buffer.
				}
			} while (charsRead == 0);
			//Console.WriteLine("ReadBuffer: charsRead: "+charsRead+"  readToUserBuffer: "+readToUserBuffer);
			return charsRead;
		}


		// Reads a line. A line is defined as a sequence of characters followed by
		// a carriage return ('\r'), a line feed ('\n'), or a carriage return
		// immediately followed by a line feed. The resulting string does not
		// contain the terminating carriage return and/or line feed. The returned
		// value is null if the end of the input stream has been reached.
		//
		/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.ReadLine"]/*' />
		public override String ReadLine()
		{
			_lineLength = 0;
			//if (stream == null)
			//	__Error.ReaderClosed();
			if (charPos == charLen)
			{
				if (ReadBuffer() == 0) return null;
			}
			StringBuilder sb = null;
			do
			{
				int i = charPos;
				do
				{
					char ch = charBuffer[i];
					int EolChars = 0;
					if (ch == '\r' || ch == '\n')
					{
						EolChars = 1;
						String s;
						if (sb != null)
						{
							sb.Append(charBuffer, charPos, i - charPos);
							s = sb.ToString();
						}
						else
						{
							s = new String(charBuffer, charPos, i - charPos);
						}
						charPos = i + 1;
						if (ch == '\r' && (charPos < charLen || ReadBuffer() > 0))
						{
							if (charBuffer[charPos] == '\n')
							{
								charPos++;
								EolChars = 2;
							}
						}
 						_lineLength = s.Length + EolChars;
						_bytesRead = _bytesRead + _lineLength;
						return s;
					}
					i++;
				} while (i < charLen);
				i = charLen - charPos;
				if (sb == null) sb = new StringBuilder(i + 80);
				sb.Append(charBuffer, charPos, i);
			} while (ReadBuffer() > 0);
			string ss = sb.ToString();
 			_lineLength = ss.Length;
			_bytesRead = _bytesRead + _lineLength;
			return ss;
		}

		// No data, class doesn't need to be serializable.
		// Note this class is threadsafe.
		private class NullmyStreamReader : myStreamReader
		{
			public override Stream BaseStream
			{
				get { return Stream.Null; }
			}

			public override Encoding CurrentEncoding
			{
				get { return Encoding.Unicode; }
			}

			public override int Peek()
			{
				return -1;
			}

			public override int Read()
			{
				return -1;
			}

			/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.NullmyStreamReader.Read"]/*' />
			public override int Read(char[] buffer, int index, int count)
			{
				return 0;
			}

			/// <include file='doc\myStreamReader.uex' path='docs/doc[@for="myStreamReader.NullmyStreamReader.ReadLine"]/*' />
			public override String ReadLine()
			{
				return null;
			}

			public override String ReadToEnd()
			{
				return String.Empty;
			}
		}
	}
}
 
 

I haven't had time to go over this in detail, just a quick scan, but you didn't need to copy the code from StreamReader, did you?
You could have implemented a shadow ReadLine and added your own code, or you could overload ReadLine with a parameter and have your code there whilst maintaining the other ReadLine.
Personally I would have gone for the shadow method so you keep the same signature. Then in there you call the base ReadLine, do your count, and return the information the base gave you.

 
 

Sorry... that will teach me to explain something to someone and try to answer this at the same time (I was explaining shadowing in VB vs. new in C# to them, so had it on the brain). I meant to say I would override, not overload, ReadLine, adding your byte count there.

 
 

You'll have to explain to me the differences between override and overload. When I tried to derive from StreamReader, and provide my own ReadLine(), I ran into problems, because the core ReadLine() uses private member methods that I couldn't override or access.

 
 

Overloading means you have an extra method with the same name whose signature differs in the number, type, or order of its parameters.

So you can have a method public void mymethod()
and overload it with public void mymethod(string param1)
and more ... public void mymethod(int param1)

But it goes by parameter type, not parameter name, so you can't also have
public void mymethod(string param2) as another overload. However, you can overload as follows:
public void mymethod(int param1, string param2)
public void mymethod(string param3, int param4) as the signature has changed.

The return type is not part of the signature, so changing only that won't work. The advantage of overloading is that the compiler knows which method to call from the arguments, without you worrying about it.
You can also put the common work in one overload, e.g.
public void mymethod()
{
    mymethod(true);
}
public void mymethod(bool someflag)
{
    // common code goes here ...
}

Overriding is different. If you have a base class with
public virtual void mymethod()

then your derived class can override it:
public override void mymethod()
{
}
and your method gets called instead, even through a base-class reference. If you still want the base behaviour, you can call base.mymethod() at any point in your derived mymethod. You can only override a method that the base class declares as virtual, abstract, or override, and not one it has marked sealed.
You can even override and overload (even if the base has no overloads).
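To show the two side by side, here is a small self-contained sketch. BaseReader and DerivedReader are toy classes invented for this example and have nothing to do with the real StreamReader.

```csharp
public class BaseReader
{
    // 'virtual' is what allows a derived class to override this method.
    public virtual string Describe()
    {
        return "base";
    }
}

public class DerivedReader : BaseReader
{
    // Override: same signature, replaces the base behaviour,
    // and can still call the base version explicitly.
    public override string Describe()
    {
        return "derived+" + base.Describe();
    }

    // Overload: same name, different parameter list.
    public string Describe(int times)
    {
        string result = "";
        for (int i = 0; i < times; i++)
            result += Describe();
        return result;
    }
}
```

Calling Describe() through a BaseReader variable that holds a DerivedReader still runs the derived version, which is the point of overriding. The overload is picked purely from the arguments.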

In your case I would think you are only interested in the length of the information returned,
so in your derived class you would implement fields to store this (as you did in your derived class earlier) and override the ReadLine() method.
In your overridden method you would call the base ReadLine() and then use the info it returns to make additions to whatever you need. Do you really need to access the private members to do what you need to do? (I haven't looked at your code so cannot tell you - sorry for being lazy, but I haven't had time today.)
I am sure you would find a more elegant solution with the override.

 
 

ps if you are worried about the eol chars being one or two you can test for them in your derived class and make adjustments accordingly.

 
 

Ok, I understood all that, and really already knew it... but just wanted to double-check.

The reason I can't just override ReadLine() is that ReadLine() uses internal, private methods. To do what I need, I would have to rewrite ReadLine(). Ok, no problem. But doing that means I'd have to write my own ReadBuffer(), etc.

I think I already travelled down the roads you're pointing at, met with dead-ends, and so ended up where I'm at.

And where I'm at works perfectly for my needs... but take a closer look, and if there's a way to derive from StreamReader, with overridden methods, I'd like to see it.

 
 

if i get time i will have a look more closely :)
Anyone know how to get 26hrs out of a day yet?

 
 

Sure, if you're willing to experiment with personal quasiperiodicity.

 
 

I was having a similar problem and noticed there's a very useful StreamReader field called charPos which appears to be what you want. It gives you the position within the buffer. It shows up in the debug window but unfortunately it's private. The only way I could actually get the value was to use reflection:

// Requires: using System.IO; using System.Reflection;
// "charPos" and "charLen" are private implementation details of StreamReader,
// so their names may differ between framework versions.
// The arithmetic assumes a single-byte encoding (one char per byte).
long GetCharpos(StreamReader s)
{
    BindingFlags flags = BindingFlags.DeclaredOnly |
        BindingFlags.Public | BindingFlags.NonPublic |
        BindingFlags.Instance | BindingFlags.GetField;

    int charpos = (int) s.GetType().InvokeMember("charPos", flags, null, s, null);
    int charlen = (int) s.GetType().InvokeMember("charLen", flags, null, s, null);

    // Returned as a long so positions beyond 2GB do not overflow.
    return s.BaseStream.Position - charlen + charpos;
}

Of course, there may be a very good reason why this field is private, but it appears to work for me. So far. I'm not sure if this can be abused to set the position too.

-Matt

 
 

I just wanted to give a quick thanks!

I have implemented this in VB.NET and it made a HUGE difference over the only other solution I had found:

Dim TR As IO.TextReader = System.IO.File.OpenText(file)
Dim MyFileLine As String = Split(TR.ReadToEnd(), vbCrLf)(lineNumber - 1)

Here is a link to the VB.NET post with code:
http://www.daniweb.com/techtalkforums/post214443.html#post214443

 
 

Thanks for your insight!

I was having similar problems, and your adapted version of the StreamReader gave me enough information to solve them.

Please note that if you Seek() on the underlying stream, you must also update the BytesRead counter accordingly! Otherwise it holds a wrong value after the Seek() operation.
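As a rough illustration of that point, written against the standard StreamReader rather than the adapted class: ReaderSeek is a hypothetical helper that repositions the stream, drops the stale buffer, and returns the offset so the caller can reset whatever byte counter it keeps.

```csharp
using System.IO;

public static class ReaderSeek
{
    // Moves the reader to an absolute byte offset in the underlying stream.
    // DiscardBufferedData() throws away characters buffered from the old
    // position. The returned offset is the value any running byte counter
    // should be reset to.
    public static long SeekTo(StreamReader reader, long byteOffset)
    {
        reader.BaseStream.Seek(byteOffset, SeekOrigin.Begin);
        reader.DiscardBufferedData();
        return byteOffset;
    }
}
```

With the adapted reader from this thread, the same idea would mean assigning the new offset to its private _bytesRead field inside the class, since BytesRead as posted has no setter.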


Joep Grooten

 
 

Thank you for the MyStream class posted earlier in this thread (the one derived from StreamReader with the Pos property). I'm using it in my product.

Thanks & regards,
CRT.

 
 

Hi, this is my solution. I tried all of the approaches presented here, but in each I found drawbacks or bugs that made it unsuitable for me. I'm not claiming my solution is better; I just modified your suggestions.

using System;
using System.IO;
using System.Text;

/// <summary>
/// This class can be used only for text files.
/// In theory it can process any encoding, but it was tested only with ANSI and UTF-8.
/// If you want to use it with any other encoding, please test it first.
/// </summary>
public class PositionableStreamReader : StreamReader
{
    private long _position;

    public PositionableStreamReader(string fileName, Encoding enc) : base(fileName, enc)
    {
        // Account for the preamble (byte order mark) at the start of the file,
        // which identifies the encoding, if it is present.
        if (IsPreamble())
        {
            _position = this.CurrentEncoding.GetPreamble().Length;
        }
    }

    /// <summary>
    /// Checks whether the file starts with the preamble of the current encoding.
    /// An encoding may have no preamble at all.
    /// </summary>
    public bool IsPreamble()
    {
        byte[] preamble = this.CurrentEncoding.GetPreamble();
        bool res = true;
        for (int i = 0; i < preamble.Length; i++)
        {
            int dd = base.BaseStream.ReadByte();
            if (preamble[i] != dd)
            {
                res = false;
                break;
            }
        }
        Position = 0;
        return res;
    }

    /// <summary>
    /// Use this property to get and set the real position in the file.
    /// The position reported by BaseStream may be wrong because of buffering.
    /// </summary>
    public long Position
    {
        get { return _position; }
        set
        {
            base.BaseStream.Seek(value, SeekOrigin.Begin);
            this.DiscardBufferedData();
            _position = value; // keep the counter in sync with the stream
        }
    }

    public override string ReadLine()
    {
        string line = base.ReadLine();
        if (line != null)
        {
            // Assumes every line ends with Environment.NewLine. The count is off
            // for files that use another terminator or whose last line has none.
            _position += CurrentEncoding.GetByteCount(line)
                       + CurrentEncoding.GetByteCount(Environment.NewLine);
        }
        return line;
    }
}
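
The ReadLine() override above adds the byte count of Environment.NewLine for every line, which only holds when the file uses the platform's own terminator. One way around that guess is to probe the file once for the terminator it really uses. The sketch below is not part of the original class; NewLineProbe and DetectNewLineLength are hypothetical names, and it assumes a seekable stream, an ASCII-compatible encoding (ANSI or UTF-8) and one terminator style used consistently throughout the file.

```csharp
using System.IO;

public static class NewLineProbe
{
    // Returns the byte length of the first line terminator in the stream:
    // 1 for "\n" or a lone "\r", 2 for "\r\n", 0 if no terminator is found.
    // Assumes a seekable stream and an ASCII-compatible encoding.
    public static int DetectNewLineLength(Stream stream)
    {
        long start = stream.Position;
        try
        {
            int b;
            while ((b = stream.ReadByte()) != -1)
            {
                if (b == '\n')
                    return 1;
                if (b == '\r')
                    return stream.ReadByte() == '\n' ? 2 : 1;
            }
            return 0;
        }
        finally
        {
            // Leave the stream where we found it.
            stream.Position = start;
        }
    }
}
```

The result could then be added per line in place of the Environment.NewLine byte count. It does not help with files that mix terminator styles.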
 
0
 

Borrowing from Matt's reflection code, here are some extension methods that get the job done:

using System.IO;
using System.Reflection;

/// <summary>
/// Contains extension methods for this namespace.
/// </summary>
public static class ExtensionMethods
{
    // Flags for reading a private instance field through reflection.
    private const BindingFlags FieldFlags =
        BindingFlags.DeclaredOnly | BindingFlags.Public | BindingFlags.NonPublic |
        BindingFlags.Instance | BindingFlags.GetField;

    /// <summary>
    /// Gets the current read position of the StreamReader.
    /// </summary>
    /// <param name="streamReader">The StreamReader object to get the position for.</param>
    /// <returns>Current read position in the StreamReader.</returns>
    public static long GetPosition(this StreamReader streamReader)
    {
        // Based on code shared on www.daniweb.com by user mfm24(Matt).
        // "charPos" and "charLen" are private fields of the .NET Framework
        // StreamReader; other runtimes may rename them or block access to them.
        // typeof(StreamReader) is used so that derived readers work as well.
        int charpos = (int) typeof(StreamReader).InvokeMember(
            "charPos", FieldFlags, null, streamReader, null);
        int charlen = (int) typeof(StreamReader).InvokeMember(
            "charLen", FieldFlags, null, streamReader, null);

        // Assumes one byte per character. The result is a long so that
        // positions in files larger than 2 GB do not overflow.
        return streamReader.BaseStream.Position - charlen + charpos;
    }

    /// <summary>
    /// Sets the current read position of the StreamReader.
    /// </summary>
    /// <param name="streamReader">The StreamReader object to set the position for.</param>
    /// <param name="position">The position to move to in the file, starting from the beginning.</param>
    public static void SetPosition(this StreamReader streamReader, long position)
    {
        streamReader.BaseStream.Seek(position, SeekOrigin.Begin);
        streamReader.DiscardBufferedData();
    }
}
 
0
 

Unfortunately, when I try to use this code, it all builds but raises an exception at run-time:

System.FieldAccessException was unhandled
Message=System.IO.StreamReader.charPos
StackTrace:
at System.RuntimeType.InternalInvokeMember(Type thisType, String name, BindingFlags invokeAttr, Binder binder, Object target, Object[] args, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParameters, StackCrawlMark& stackMark)
at System.Type.InvokeMember(String name, BindingFlags invokeAttr, Binder binder, Object target, Object[] args)
at ExtensionMethods.GetPosition(StreamReader streamReader)

Is there something I've done wrong to call it, or does it mean that in my environment (Windows Phone development), this method won't work?

Regards

Philip
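
One possible way around the reflection problem is to skip StreamReader entirely and read the bytes yourself, so that no private fields are involved. My guess is that the platform refuses reflection over private framework fields, but that is an assumption. The sketch below uses only public Stream members; OffsetLineReader is a hypothetical name, and it assumes a seekable stream and an ASCII-compatible encoding (ANSI or UTF-8) so that '\r' and '\n' are always single bytes.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: reads lines straight from a seekable stream, so the
// stream's own Position is always the byte offset of the next line.
public sealed class OffsetLineReader
{
    private readonly Stream _stream;
    private readonly Encoding _encoding;

    public OffsetLineReader(Stream stream, Encoding encoding)
    {
        _stream = stream;
        _encoding = encoding;
    }

    // Byte offset of the next line to be read; set it to seek back.
    public long Position
    {
        get { return _stream.Position; }
        set { _stream.Position = value; }
    }

    // Returns the next line without its terminator, or null at end of stream.
    // Accepts "\n", "\r\n" and a lone "\r" as terminators.
    public string ReadLine()
    {
        int b = _stream.ReadByte();
        if (b == -1)
            return null;

        List<byte> bytes = new List<byte>();
        while (b != -1 && b != '\n')
        {
            if (b == '\r')
            {
                // Swallow the "\n" of a "\r\n" pair; otherwise step back one byte.
                int next = _stream.ReadByte();
                if (next != -1 && next != '\n')
                    _stream.Seek(-1, SeekOrigin.Current);
                break;
            }
            bytes.Add((byte)b);
            b = _stream.ReadByte();
        }

        byte[] raw = bytes.ToArray();
        return _encoding.GetString(raw, 0, raw.Length);
    }
}
```

Note the value of Position before each ReadLine() call to remember where a record starts, and assign it later to jump back. Reading one byte at a time relies on the underlying stream doing its own buffering, so it is worth measuring on large files.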
