Objective-C: Find and copy substrings from an NSString

Hi,

I have a piece of code which fetches the HTML source of a web page as a string. I want to build an array from it, in which I can look up data the user enters. To do that, I need to extract the useful information from the string and discard the rest. How do I search for two substrings in a string and copy the text in between?

My code:

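NSString *googleString = @"http://www.mypage.com";
NSURL *googleURL = [NSURL URLWithString:googleString];
NSError *error = nil;
NSString *googlePage = [NSString stringWithContentsOfURL:googleURL
                                                encoding:NSASCIIStringEncoding
                                                   error:&error];
if (googlePage == nil) {
    // The fetch or decoding failed; error describes why.
    NSLog(@"Could not load %@: %@", googleURL, error);
}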

returns something like:

<HTML>
<BODY>
<TABLE style="border: Solid 1px Black; border-collapse: collapse; font-family: arial; width: 100%;">
<tr>  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="1.htm" target="main">H4A</A>
  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="1.htm" target="main">Aanen </A>
  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">

<A HREF ="1.htm" target="main">Joeri</A>
  </td>
</tr>
<tr>  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="2.htm" target="main">H4A</A>
  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="2.htm" target="main">Ali </A>
  </td>

  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="2.htm" target="main">Sohail</A>
  </td>
</tr>
<tr>  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="3.htm" target="main">H4A</A>
  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="3.htm" target="main">Beerthuijzen </A>

  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="3.htm" target="main">Iris</A>
  </td>
</tr>

and so on...

Maybe someone has a better idea? For each row I need three strings: first name, surname, and page number (for example, from the last row: Iris, Beerthuijzen, 3).

Thanks in advance!

hiddepolen

HTML is only required to follow strict XML syntax when it is XHTML, and many web pages do not. The HTML you have posted comes close (the tags are properly nested and closed, although a valueless attribute such as NOWRAP is not valid XML), so a lenient XML/HTML parser can extract the data from it. If the page is not XML-compliant, you might have to use regex pattern matching to extract the data. I used this when I first started with regex
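
To make the regex route concrete, here is a rough sketch using NSRegularExpression from Foundation. It is only one way to do it, and it leans on assumptions taken from the sample you posted: every row contains exactly three links in the order code (H4A), surname, first name, and every HREF has the form "N.htm". The function name ExtractPeople is made up for this example.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns one
// @[firstName, surname, pageNumber] array per table row.
static NSArray *ExtractPeople(NSString *html) {
    // Capture 1: the page number in HREF ="N.htm"; capture 2: the link text.
    NSString *pattern = @"<A\\s+HREF\\s*=\\s*\"(\\d+)\\.htm\"[^>]*>([^<]*)</A>";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Bad pattern: %@", error);
        return @[];
    }

    NSArray *matches = [regex matchesInString:html
                                      options:0
                                        range:NSMakeRange(0, html.length)];
    NSCharacterSet *space = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSMutableArray *people = [NSMutableArray array];

    // Assumption: links come in groups of three (code, surname, first name).
    for (NSUInteger i = 0; i + 2 < matches.count; i += 3) {
        NSTextCheckingResult *codeLink = matches[i];
        NSTextCheckingResult *surnameLink = matches[i + 1];
        NSTextCheckingResult *firstLink = matches[i + 2];

        NSString *page = [html substringWithRange:[codeLink rangeAtIndex:1]];
        NSString *surname = [[html substringWithRange:[surnameLink rangeAtIndex:2]]
                                stringByTrimmingCharactersInSet:space];
        NSString *first = [[html substringWithRange:[firstLink rangeAtIndex:2]]
                              stringByTrimmingCharactersInSet:space];
        [people addObject:@[first, surname, page]];
    }
    return people;
}
```

Calling ExtractPeople(googlePage) on the rows you posted should give Joeri/Aanen/1, Sohail/Ali/2 and Iris/Beerthuijzen/3. If the real page has rows with a different number of links, the grouping by three will drift, so a proper HTML parser would be the safer choice there.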

Prabakar

© 2013 DaniWeb® LLC