Hi, I'm writing a piece of code that reads an HTML source file. The code is meant to read the HTML and pick out stock symbols. It seems to work fine, but as soon as it finds the first stock ticker, the program stops. Ideally, it should find all the stock tickers (there can be up to 25 symbols in the HTML).
What am I doing wrong? I would appreciate any help. My C# code is posted below:

using System;
using System.Collections;
using System.ComponentModel;
using System.Data;
using System.Web;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;
using System.Text;
using System.Diagnostics;

namespace pinksheets2
{
    class Program
    {
        static void Main(string[] args)
        {
            System.IO.StreamReader file = new System.IO.StreamReader("c:\\dlist.txt");
            // dump streamreader contents to string
            string strContent = file.ReadToEnd();

            Console.WriteLine(strContent);
            // search string
            string startStr = "../quote/quote.jsp?symbol=";
            // end string
            String endStr = ">";
            int startIndex;
            int endIndex;
            string record = null;

            String str = null;
            while ((record = strContent) != null)
            {
                startIndex = record.IndexOf(startStr);
                if (startIndex != -1)
                {
                    endIndex = record.IndexOf(endStr, startIndex);

                    if (endIndex != -1)
                    {
                        int length = endIndex - startIndex;
                        str = record.Substring(startIndex, length);
                        Console.WriteLine(str);
                        string words2 = str;
                        string[] split = str.Split(new Char[] { ' ', ',', '.', ':', '=', '>' });
                        Console.WriteLine("symbol is:" + split[4]);
                        Console.Read();
                    }
                }
            }
        }
    }
}

I can post the HTML too if needed.

TIA

Console.Read();

It's waiting for the user to press the ENTER key here, so it stops.

Thanks for the comment. I didn't realize that.

Even if I don't use Console.Read(), the real problem is that the code just goes back to the start of the HTML source, whereas I want it to continue reading the HTML from where it left off.

It wouldn't hurt if you posted the HTML as well.

Here's the part of the HTML I'm interested in:

<TR BGCOLOR="#ffffff" BORDERCOLOR="#FDEFF9">
<TD NOWRAP>A.B. WATLEY GROUP, INC.</td>
<TD NOWRAP>PS</td>
<TD> Com ($0.001)</td>
<TD NOWRAP><a href=../quote/quote.jsp?symbol=ABWG>ABWG</a></td>

</tr>

<TR BGCOLOR="#FDEFF9" BORDERCOLOR="#FDEFF9">
<TD NOWRAP>A.G. MEDIA GROUP, INC.</td>
<TD NOWRAP>PS</td>
<TD> Com ($0.001)</td>
<TD NOWRAP><a href=../quote/quote.jsp?symbol=AMGJ>AMGJ</a></td>

</tr>

<TR BGCOLOR="#ffffff" BORDERCOLOR="#FDEFF9">
<TD NOWRAP>A21, INC.</td>
<TD NOWRAP>PS/OTC BB</td>
<TD> Com ($0.001)</td>
<TD NOWRAP><a href=../quote/quote.jsp?symbol=ATWO>ATWO</a></td>

</tr>

<TR BGCOLOR="#FDEFF9" BORDERCOLOR="#FDEFF9">
<TD NOWRAP>AAA ENERGY, INC.</td>
<TD NOWRAP>PS/OTC BB</td>
<TD> Com ($0.001)</td>
<TD NOWRAP><a href=../quote/quote.jsp?symbol=AAAE>AAAE</a></td>

I would do this with a regex. Here's an example:

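// Requires: using System; using System.IO; using System.Text.RegularExpressions;
static void Main(string[] args)
{
    // Capture everything after "symbol=" up to the closing '>' of the tag.
    const string theRegexString = "quote\\.jsp\\?symbol=([^=>]+)\\s*>";
    Regex regex = new Regex(theRegexString, RegexOptions.Compiled);

    using (StreamReader inFile = new StreamReader(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + @"\test.txt"))
    {
        string contents = inFile.ReadToEnd();

        // Matches() returns every occurrence in the text, not just the first one.
        MatchCollection matches = regex.Matches(contents);
        foreach (Match match in matches)
        {
            // Groups[1] is the captured symbol.
            Console.WriteLine(match.Groups[1]);
        }
    }

    // Keep the console window open until a key is pressed.
    Console.ReadKey();
}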
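
If you would rather keep the IndexOf approach from the original post, the restart described earlier in the thread has a likely cause: record.IndexOf(startStr) always searches from index 0, and the while condition never becomes false, so every pass finds the first symbol again. Below is a minimal sketch that carries the search position forward instead. The ExtractSymbols helper, the SymbolScan class name and the inline sample string are illustrative assumptions, not part of the original program. The two sample rows are copied from the HTML posted above.

```csharp
using System;
using System.Collections.Generic;

class SymbolScan
{
    // Hypothetical helper: collects every symbol that follows the marker text.
    static List<string> ExtractSymbols(string html)
    {
        const string startStr = "quote.jsp?symbol=";
        const string endStr = ">";
        var symbols = new List<string>();
        int pos = 0; // where the next search begins

        while (true)
        {
            // Search from pos, not from the beginning of the string.
            int startIndex = html.IndexOf(startStr, pos, StringComparison.Ordinal);
            if (startIndex == -1)
                break; // no more markers: leave the loop

            // The symbol starts right after the marker text.
            int valueStart = startIndex + startStr.Length;
            int endIndex = html.IndexOf(endStr, valueStart, StringComparison.Ordinal);
            if (endIndex == -1)
                break; // marker without a closing '>'

            symbols.Add(html.Substring(valueStart, endIndex - valueStart).Trim());

            // Resume after this match on the next pass.
            pos = endIndex + 1;
        }

        return symbols;
    }

    static void Main()
    {
        // Assumption: two rows in the same shape as the HTML posted in the thread.
        string html =
            "<TD NOWRAP><a href=../quote/quote.jsp?symbol=ABWG>ABWG</a></td>\n" +
            "<TD NOWRAP><a href=../quote/quote.jsp?symbol=AMGJ>AMGJ</a></td>\n";

        foreach (string symbol in ExtractSymbols(html))
            Console.WriteLine("symbol is: " + symbol);
    }
}
```

Starting the Substring at valueStart rather than at startIndex also removes the need to Split the result and pick element 4, which depends on how many dots and other separators happen to appear in the link path.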