0

Hello,

Can you help me scrape the contents of this table?

I'm interested in extracting the content of the href attribute and the text "Spa.".

<tr bgcolor="">
<td height="32" align="center"><a class="btn btn-mini enlace_link" style="text-decoration:none;" rel="nofollow" target="_blank" title="Ver..." href="http://freakshare.com/files/a5qgdilx/Juegos2.DVDSCRHQProper.Castellano.mp4.html"><i class="icon-download"></i><b>  Opcion   01</b></a></td>
<td align="left"><img src="http://www.google.com/s2/favicons?domain=freakshare.com" width="16" />freakshare</td>
<td align="center"><img src="http://www.yaske.net/theme/01/data/images/flags/es_es.png" width="19">Spa.</td> 
<td align="center" class="center"><span title="" style="text-transform:capitalize;">dvd screener</span></td>
<td align="center"><div class="star_rating" title="DVD SCREENER ( 3.5 de 5 )">
 <ul class="star"><li class="curr" style="width: 70%;"></li></ul>
</div></td> <td align="center" class="center">1 link</td> </tr>

Thank you very much.

Edited by DjFumon

0

This might be easier to achieve with jQuery. My first thought was PHP's DOM extension, but I'm not sure how flexible it is.

0

JS/jQuery can handle this well. The preg_ functions may be able to help in PHP, but they are not well suited to extracting data from HTML. You'd probably do better with an XML parser.

0

I have this, but I need to scrape the content of the href:

preg_match_all('/(<a class="btn btn-mini enlace_link" style="text-decoration:none;" rel="nofollow" target="_blank" title="Ver..." href=".*"><i class="icon-download"><\/i><b>.*<\/b><\/a>)/',$scraped_page,$res);
$ress = $res[0];

foreach ($ress as $output) {
    echo $output.'<br/>';
}

Edited by DjFumon

0
<?php
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->strictErrorChecking = false;
$doc->recover = true;
// initialise $filename below or hardcode the filename
// loadHTMLFile() parses HTML; load() expects well-formed XML
$doc->loadHTMLFile($filename);
$finder = new DOMXPath($doc);
$anchors = $finder->query('//a[@href]');

foreach ($anchors as $anchor) {
    $href = $anchor->getAttribute('href');
    echo $href, "\n";
}
?>

This should work.
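
To narrow this down to the download links in the table and also pick up the language label, a sketch along these lines may help. It assumes the row markup posted in the question (an `a` whose class contains `enlace_link` in the first cell, and the language text in the third cell of the same row). The `$html` string is a trimmed copy of that row used as sample input, not the live page, and the `$rows` array is just an illustrative name.

```php
<?php
// Trimmed copy of the row from the question, used as sample input (assumption).
$html = <<<'HTML'
<table>
<tr bgcolor="">
<td height="32" align="center"><a class="btn btn-mini enlace_link" rel="nofollow" target="_blank" title="Ver..." href="http://freakshare.com/files/a5qgdilx/Juegos2.DVDSCRHQProper.Castellano.mp4.html"><i class="icon-download"></i><b>  Opcion   01</b></a></td>
<td align="left"><img src="http://www.google.com/s2/favicons?domain=freakshare.com" width="16" />freakshare</td>
<td align="center"><img src="http://www.yaske.net/theme/01/data/images/flags/es_es.png" width="19">Spa.</td>
</tr>
</table>
HTML;

libxml_use_internal_errors(true);   // keep warnings about sloppy HTML quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$rows = [];
// Every anchor whose class attribute contains "enlace_link"
foreach ($xpath->query('//a[contains(@class, "enlace_link")]') as $a) {
    // Third cell of the row that contains this anchor (the language column)
    $langCell = $xpath->query('ancestor::tr[1]/td[3]', $a)->item(0);
    $rows[] = [
        'href' => $a->getAttribute('href'),
        'lang' => $langCell ? trim($langCell->textContent) : '',
    ];
}

foreach ($rows as $row) {
    echo $row['href'], ' | ', $row['lang'], "\n";
}
```

For the real page, the same queries should apply after loading the downloaded markup with `loadHTML()` or `loadHTMLFile()`, provided the table layout matches the snippet in the question.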

0

It doesn't work. I have this:

<?php
$scraped_page = file_get_contents("http://www.yaske.net/es/pelicula/0002470/ver-the-hobbit-the-desolation-of-smaug-online.html");

preg_match_all('/(<td height="32" align="center"><a class="btn btn-mini enlace_link" style="text-decoration:none;" rel="nofollow" target="_blank" title="Ver..." href=".*"><i class="icon-download"><\/i><b>.*<\/b><\/a><\/td>)/',$scraped_page,$res);

$ress = $res[1];

foreach ($ress as $output) {
    echo $output.'<br/>';
}
?>

But the output is:

Opcion 01
Opcion 02
Opcion 03
Opcion 04
Opcion 05
Opcion 06
Opcion 07
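
Only "Opcion 01" and so on show up most likely because the matched `<a>` tags are echoed as HTML, so the browser renders them as links and only the link text is visible. If a regex is still preferred, putting a capture group around the href value and using `[^"]*` instead of the greedy `.*` might be enough. The sketch below assumes the markup from the question, with `class` appearing before `href` inside the tag. The `$scraped_page` value here is a shortened sample string, not the live page, and `$links` is an illustrative name.

```php
<?php
// Shortened sample of the markup from the question (assumption, not the live page).
$scraped_page = '<td height="32" align="center"><a class="btn btn-mini enlace_link" rel="nofollow" href="http://freakshare.com/files/a5qgdilx/Juegos2.DVDSCRHQProper.Castellano.mp4.html"><i class="icon-download"></i><b>  Opcion   01</b></a></td>';

// Capture only the href value; [^"]* stops at the closing quote instead of matching greedily.
preg_match_all('/<a\s[^>]*class="[^"]*enlace_link[^"]*"[^>]*href="([^"]*)"/i', $scraped_page, $res);

$links = $res[1];   // group 1 holds the URLs

foreach ($links as $link) {
    // Escape before echoing so the browser shows the text instead of rendering markup.
    echo htmlspecialchars($link), "<br/>\n";
}
```

A pattern like this is still tied to the attribute order and quoting used on the page, which is why the parser-based approach suggested earlier in the thread tends to hold up better when the markup changes.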
