
Objective-C: Find and copy substrings from NSString.

 

Hi,

I have a piece of code that downloads the HTML source of a web page as a string. I want to build an array from it in which I can look up data the user enters. To do that, I need to extract the useful information from the source and discard the rest. How do I search for two substrings in a string and copy the text between them?

My code:

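// Download the page source as a string.
NSString *googleString = @"http://www.mypage.com";
NSURL *googleURL = [NSURL URLWithString:googleString];
NSError *error = nil;
// googlePage is nil on failure; error then describes what went wrong.
NSString *googlePage = [NSString stringWithContentsOfURL:googleURL
                                                encoding:NSASCIIStringEncoding
                                                   error:&error];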

returns something like:

<HTML>
<BODY>
<TABLE style="border: Solid 1px Black; border-collapse: collapse; font-family: arial; width: 100%;">
<tr>  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="1.htm" target="main">H4A</A>
  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="1.htm" target="main">Aanen </A>
  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">

<A HREF ="1.htm" target="main">Joeri</A>
  </td>
</tr>
<tr>  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="2.htm" target="main">H4A</A>
  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="2.htm" target="main">Ali </A>
  </td>

  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="2.htm" target="main">Sohail</A>
  </td>
</tr>
<tr>  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="3.htm" target="main">H4A</A>
  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="3.htm" target="main">Beerthuijzen </A>

  </td>
  <td BGCOLOR="DCDCDC" NOWRAP style="border: Solid 1px Black; font-family: arial; padding: 2px; width: 100%;">
<A HREF ="3.htm" target="main">Iris</A>
  </td>
</tr>

and so on...

Maybe someone has a better idea? For each row I need three strings: first name, surname, and page number (for example, for the last row: Iris, Beerthuijzen, 3).

Thanks in advance!

 

HTML is supposed to follow strict XML syntax (not all web pages do). The HTML you have posted does follow XML syntax, so you can use an XML parser to extract the data from it. If a page is not XML-compliant, you may have to use regular-expression pattern matching to extract the data instead. I used this when I first started with regular expressions.
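
To make the regular-expression route concrete, here is a minimal sketch using Foundation's NSRegularExpression. It is only an illustration, not a definitive solution. The helper name ExtractRows and the dictionary keys are made up for this example. It also assumes that every table row contains exactly three links in the order shown in the question (group, surname, first name) and that each link points to a page named like "3.htm".

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns one dictionary per table row with the keys
// "page", "group", "surname", and "name".
static NSArray<NSDictionary<NSString *, NSString *> *> *ExtractRows(NSString *html) {
    // Matches e.g. <A HREF ="3.htm" target="main">Iris</A>.
    // Capture group 1 is the page number, capture group 2 is the link text.
    NSString *pattern = @"<A\\s+HREF\\s*=\\s*\"(\\d+)\\.htm\"[^>]*>([^<]*)</A>";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:html options:0 range:NSMakeRange(0, html.length)];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];

    // Returns the trimmed link text of the match at the given index.
    NSString *(^linkText)(NSUInteger) = ^NSString *(NSUInteger index) {
        NSString *raw = [html substringWithRange:[matches[index] rangeAtIndex:2]];
        return [raw stringByTrimmingCharactersInSet:whitespace];
    };

    NSMutableArray<NSDictionary<NSString *, NSString *> *> *rows = [NSMutableArray array];

    // Assumption: links come in groups of three per row (group, surname, first name).
    for (NSUInteger i = 0; i + 2 < matches.count; i += 3) {
        NSString *page = [html substringWithRange:[matches[i] rangeAtIndex:1]];
        [rows addObject:@{ @"page": page,
                           @"group": linkText(i),
                           @"surname": linkText(i + 1),
                           @"name": linkText(i + 2) }];
    }
    return rows;
}
```

With the googlePage string from the question, you would call ExtractRows(googlePage) and read row[@"name"], row[@"surname"], and row[@"page"] from each dictionary in the result. Because the pattern only looks at the anchor tags, it does not depend on the rest of the page being well-formed, but it will silently produce wrong rows if a row ever has more or fewer than three links.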
