DaniWeb IT Discussion Community


webfort Aug 23rd, 2008 5:22 am
Screen Scrape remove spaces/line breaks between specified tags
 
Hi,

I'm doing a screen scrape of a web page, which works without any problems.

What I want to do is replace the contents of a tag. I can do this if the tag matches exactly, but in this page there are a lot of blank spaces.

lbltest.Text contains the page being scraped. The tag is formatted like this:

<li class="thisclass">
                         
                            TheText
                         
            </li>

I can't do a simple replace because of all the spaces, so I need to get it to look like this:
<li class="thisclass">TheText</li>

Any ideas how I might do this?

Thanks in advance

selvaganapathy Aug 24th, 2008 4:48 am
Re: Screen Scrape remove spaces/line breaks between specified tags
 
Hi,
Can you specify what method you are using to scrape the page? Have you heard of the Regex class?

webfort Aug 24th, 2008 6:21 am
Re: Screen Scrape remove spaces/line breaks between specified tags
 
Hi,

This is the method I used:
http://www.dotnetjohn.com/articles.aspx?articleid=93

I have not heard of that class.

selvaganapathy Aug 24th, 2008 1:30 pm
Re: Screen Scrape remove spaces/line breaks between specified tags
 
Regex is a class used for regular expressions. It is useful for parsing.

For more detail, refer to http://www.regular-expressions.info/dotnet.html

webfort Aug 24th, 2008 3:29 pm
Re: Screen Scrape remove spaces/line breaks between specified tags
 
Quote:

Originally Posted by selvaganapathy (Post 677020)
Regex is a class used for regular expressions. It is useful for parsing.

For more detail, refer to http://www.regular-expressions.info/dotnet.html

Thanks. How would I pick up on the line breaks and spaces? Would it be possible to show me an example?

iamthwee Aug 26th, 2008 6:58 am
Re: Screen Scrape remove spaces/line breaks between specified tags
 
Ever heard of Trim()? Use it!

webfort Aug 26th, 2008 7:50 am
Re: Screen Scrape remove spaces/line breaks between specified tags
 
Quote:

Originally Posted by iamthwee (Post 678254)
Ever heard of Trim()? Use it!

I think you have missed the point. I need to replace line breaks and spaces; if you look at the example given, Trim will not do this.

iamthwee Aug 26th, 2008 9:14 am
Re: Screen Scrape remove spaces/line breaks between specified tags
 
Incorrect, Trim() does remove line breaks and spaces from the start and end of a string. Please prove me wrong, but you won't.

I have just tested it:

Dim nl As String = System.Environment.NewLine
Dim test As String = "                  " & nl & nl & "    TheText  " & nl & nl

TextBox1.Text = test          'show original string in a multiline text box
TextBox2.Text = test.Trim()   'show trimmed string in a multiline text box

selvaganapathy Aug 26th, 2008 1:20 pm
Re: Screen Scrape remove spaces/line breaks between specified tags
 
Quote:

Originally Posted by webfort (Post 677087)
Thanks. How would I pick up on the line breaks and spaces? Would it be possible to show me an example?

Hi,

Please search Google for parsing HTML with the Regex class; you will find a lot. Once you are able to parse the HTML tags, you ultimately have to use String.Trim() to remove the unwanted white space.
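
The two suggestions in this thread (match the tag with Regex, then Trim() what is inside it) can be combined roughly as follows. This is only a minimal sketch, not a tested solution for the actual page: the module name TrimListItems, the field ItemPattern, and the function TidyItems are made up for illustration, and the pattern assumes the opening tag is written exactly as <li class="thisclass"> with no nested <li> elements inside it. Regular expressions are fragile against real-world HTML, so a page with different attribute order or quoting would need a different pattern.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TrimListItems

    ' Hypothetical pattern: group 1 is the opening tag, group 2 the inner
    ' text, group 3 the closing tag. RegexOptions.Singleline lets "." match
    ' line breaks, so the inner text may span several lines.
    Private ReadOnly ItemPattern As New Regex( _
        "(<li class=""thisclass"">)(.*?)(</li>)", _
        RegexOptions.Singleline Or RegexOptions.IgnoreCase)

    ' Rebuilds one matched element with its inner text trimmed.
    ' Trim() with no arguments strips spaces, tabs and line breaks
    ' from both ends of the string.
    Private Function TrimInner(ByVal m As Match) As String
        Return m.Groups(1).Value & m.Groups(2).Value.Trim() & m.Groups(3).Value
    End Function

    ' Returns the HTML with whitespace removed from inside each matching element.
    Function TidyItems(ByVal html As String) As String
        Return ItemPattern.Replace(html, AddressOf TrimInner)
    End Function

    Sub Main()
        Dim nl As String = Environment.NewLine
        Dim page As String = "<li class=""thisclass"">" & nl & _
                             "                         " & nl & _
                             "        TheText" & nl & _
                             "    </li>"

        ' Prints: <li class="thisclass">TheText</li>
        Console.WriteLine(TidyItems(page))
    End Sub

End Module
```

With the scraped page held in lbltest.Text as in the first post, the call would be something like lbltest.Text = TidyItems(lbltest.Text), after which a plain String.Replace on the exact tag should match.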


All times are GMT -4.