Aug 23rd, 2008

Screen Scrape remove spaces/line breaks between specified tags

Hi,

I'm doing a screen scrape of a web page, which works without any problems.

What I want to do is replace the contents of a tag. I can do this if the tag matches exactly, but in this page there are a lot of blank spaces.

lbltest.Text contains the page being scraped. The tag is formatted like this:

    <li class="thisclass">
     
    TheText
     
    </li>

I can't do a simple replace because of all the spaces and line breaks, so I need to get it to look like this:

    <li class="thisclass">TheText</li>

Any ideas how I might do this?

Thanks in advance
webfort

Aug 24th, 2008

Hi,
Can you specify which method you are using to scrape the page? Have you heard of the Regex class?
selvaganapathy

Aug 24th, 2008

Hi,

This is the method I used:
http://www.dotnetjohn.com/articles.aspx?articleid=93

I have not heard of that class.
webfort

Aug 24th, 2008

Regex is a class used for regular expressions. It is useful for parsing text.

For more detail, refer to http://www.regular-expressions.info/dotnet.html
selvaganapathy

Aug 24th, 2008

Thanks. How would I pick up on the line breaks and spaces? Would it be possible to show me an example?
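
As a rough illustration of the question above: in .NET regular expressions the \s character class matches spaces, tabs, and line-break characters alike, so a pattern such as \s+ picks up any run of them. The sketch below is only an example; the module and function names (WhitespaceDemo, CollapseWhitespace) are hypothetical and not taken from the thread or the linked article.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module WhitespaceDemo
    ' Replaces every run of white space (spaces, tabs, CR, LF) with one space.
    ' "\s+" means "one or more white-space characters".
    Public Function CollapseWhitespace(ByVal input As String) As String
        Return Regex.Replace(input, "\s+", " ")
    End Function

    Sub Main()
        ' Sample text mixing a line break, spaces, and a tab.
        Dim sample As String = "one" & vbCrLf & "   two" & vbTab & "three"
        Console.WriteLine(CollapseWhitespace(sample)) ' prints: one two three
    End Sub
End Module
```

Note that this collapses white space everywhere in the string, which may be more than is wanted for a whole scraped page.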
webfort

Aug 26th, 2008

Ever heard of Trim()? Use it!
iamthwee

Aug 26th, 2008

Quote originally posted by iamthwee:
Ever heard of Trim()? Use it!

I think you have missed the point. I need to replace line breaks and spaces; if you look at the example given, Trim will not do this.
webfort

Aug 26th, 2008

Incorrect. Trim does remove line breaks and spaces from the start and end of a string. Please prove me wrong, but you won't.

I have just tested it:

    Dim nl As String = System.Environment.NewLine
    ' Build a string with spaces and line breaks on both sides of the text
    Dim test As String = " " & nl & nl & " TheText " & nl & nl

    TextBox1.Text = test        ' show the original string in a multiline text box
    TextBox2.Text = test.Trim() ' show the trimmed string in a multiline text box
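
The disagreement in this thread seems to come down to which string Trim() is called on. String.Trim() strips white space, including line breaks, only from the start and end of the string it is given. A small sketch, assuming the markup from the first post (the module and function names are hypothetical):

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module TrimScopeDemo
    ' The text between the tags, padded with spaces and line breaks.
    Public Function BuildInner() As String
        Return vbCrLf & " " & vbCrLf & "TheText" & vbCrLf & " " & vbCrLf
    End Function

    ' The same text wrapped in the <li> element from the first post.
    Public Function BuildElement() As String
        Return "<li class=""thisclass"">" & BuildInner() & "</li>"
    End Function

    Sub Main()
        ' Trimming the inner text alone removes all of the padding.
        Console.WriteLine(BuildInner().Trim() = "TheText") ' prints True

        ' Trimming the whole element changes nothing, because the string
        ' starts with "<" and ends with ">", not with white space.
        Console.WriteLine(BuildElement().Trim() = BuildElement()) ' prints True
    End Sub
End Module
```

So Trim() does the job once the text between the tags has been isolated, but calling it on the whole page or the whole element leaves the interior white space in place.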
Last edited by iamthwee; Aug 26th, 2008 at 10:22 am.
iamthwee

Aug 26th, 2008

Quote originally posted by webfort:
thanks, how would I pick up on the line breaks and spaces, would it be possible to show me an example?

Hi,

Please search for examples of parsing HTML with the Regex class; you will find plenty. Once you are able to parse the HTML tags, you can use String.Trim() to remove the unwanted white space.
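
The two steps described above (find the tag with Regex, then Trim the text inside it) could be sketched roughly as follows. This is a minimal example rather than a robust HTML parser: it assumes the opening tag appears exactly as `<li class="thisclass">` with no other attributes, and the names TagTrimmer and TrimLiContents are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module TagTrimmer
    ' Group 1: opening tag, group 2: everything up to the nearest </li>,
    ' group 3: closing tag. Singleline lets "." match line breaks too.
    Private ReadOnly LiPattern As New Regex( _
        "(<li class=""thisclass"">)(.*?)(</li>)", _
        RegexOptions.Singleline Or RegexOptions.IgnoreCase)

    ' Rebuilds each matching element with its inner text trimmed.
    Public Function TrimLiContents(ByVal html As String) As String
        Return LiPattern.Replace(html, _
            Function(m As Match) m.Groups(1).Value & m.Groups(2).Value.Trim() & m.Groups(3).Value)
    End Function

    Sub Main()
        Dim html As String = "<li class=""thisclass"">" & vbCrLf & " " & vbCrLf & _
                             "TheText" & vbCrLf & " " & vbCrLf & "</li>"
        Console.WriteLine(TrimLiContents(html)) ' prints: <li class="thisclass">TheText</li>
    End Sub
End Module
```

In the original poster's page this would presumably be applied as `lbltest.Text = TrimLiContents(lbltest.Text)`, after which a plain string replace on the exact tag should match.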
selvaganapathy