Hello,

Can you help me scrape the contents of this table?

I'm interested in extracting the content of the href and the text "Spa."

<tr bgcolor="">
<td height="32" align="center"><a class="btn btn-mini enlace_link" style="text-decoration:none;" rel="nofollow" target="_blank" title="Ver..." href="http://freakshare.com/files/a5qgdilx/Juegos2.DVDSCRHQProper.Castellano.mp4.html"><i class="icon-download"></i><b>  Opcion   01</b></a></td>
<td align="left"><img src="http://www.google.com/s2/favicons?domain=freakshare.com" width="16" />freakshare</td>
<td align="center"><img src="http://www.yaske.net/theme/01/data/images/flags/es_es.png" width="19">Spa.</td> 
<td align="center" class="center"><span title="" style="text-transform:capitalize;">dvd screener</span></td>
<td align="center"><div class="star_rating" title="DVD SCREENER ( 3.5 de 5 )">
 <ul class="star"><li class="curr" style="width: 70%;"></li></ul>
</div></td> <td align="center" class="center">1 link</td> </tr>

Thanks very much.

All 7 Replies

This might be easier to achieve with jQuery. My first thought was to use PHP's DOM extension, but I'm not sure how flexible it is.


JS/jQuery can handle this well. The preg_ functions may be able to help in PHP, but they are not well suited to extracting data from HTML. You'd probably do better with an XML parser.

I have this, but I need to scrape the content of the href:

preg_match_all('/(<a class="btn btn-mini enlace_link" style="text-decoration:none;" rel="nofollow" target="_blank" title="Ver..." href=".*"><i class="icon-download"><\/i><b>.*<\/b><\/a>)/', $scraped_page, $res);
$ress = $res[0];

foreach ($ress as $output) {
    echo $output . '<br/>';
}
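
This should work:

<?php
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$doc = new DOMDocument();
$doc->strictErrorChecking = false;
$doc->recover = true;
// initialise $filename below or hardcode the filename
$doc->loadHTMLFile($filename);      // loadHTMLFile() parses HTML; load() expects well-formed XML
$finder = new DOMXPath($doc);
$anchors = $finder->query('//a[@href]');

// print the href of every anchor on the page, one per line
foreach ($anchors as $anchor) {
    $href = $anchor->getAttribute('href');
    echo $href, PHP_EOL;
}
?>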
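
To also pick up the language text ("Spa.") next to each link, the same DOMXPath approach can be scoped to each table row. This is only a minimal sketch, assuming the row markup matches the sample in the question. The `$links` array, the wrapping `<table>` tag, and the rule that the language cell is the one containing a flag image (`/flags/` in its `src`) are my own assumptions, not something taken from the site:

```php
<?php
// Sample row copied from the question (trimmed); a real page would be fetched first.
$html = <<<'HTML'
<table>
<tr bgcolor="">
<td height="32" align="center"><a class="btn btn-mini enlace_link" style="text-decoration:none;" rel="nofollow" target="_blank" title="Ver..." href="http://freakshare.com/files/a5qgdilx/Juegos2.DVDSCRHQProper.Castellano.mp4.html"><i class="icon-download"></i><b>  Opcion   01</b></a></td>
<td align="left"><img src="http://www.google.com/s2/favicons?domain=freakshare.com" width="16" />freakshare</td>
<td align="center"><img src="http://www.yaske.net/theme/01/data/images/flags/es_es.png" width="19">Spa.</td>
</tr>
</table>
HTML;

libxml_use_internal_errors(true);   // keep warnings about sloppy markup quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$links = [];                        // hypothetical result list: one entry per row

// Visit every row that contains a download link
foreach ($xpath->query('//tr[.//a[contains(@class, "enlace_link")]]') as $row) {
    $anchor = $xpath->query('.//a[contains(@class, "enlace_link")]', $row)->item(0);
    // Assumption: the language cell is the <td> that holds the flag image
    $langCell = $xpath->query('.//td[img[contains(@src, "/flags/")]]', $row)->item(0);
    $links[] = [
        'href' => $anchor->getAttribute('href'),
        'lang' => $langCell ? trim($langCell->textContent) : '',
    ];
}

foreach ($links as $link) {
    echo $link['href'], ' | ', $link['lang'], PHP_EOL;
}
```

Querying relative to `$row` keeps each link paired with the language from its own row, which is hard to guarantee with a single regex over the whole page.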

That doesn't work. I have this:

<?php
$scraped_page = file_get_contents("http://www.yaske.net/es/pelicula/0002470/ver-the-hobbit-the-desolation-of-smaug-online.html");

preg_match_all('/(<td height="32" align="center"><a class="btn btn-mini enlace_link" style="text-decoration:none;" rel="nofollow" target="_blank" title="Ver..." href=".*"><i class="icon-download"><\/i><b>.*<\/b><\/a><\/td>)/', $scraped_page, $res);

$ress = $res[1];

foreach ($ress as $output) {
    echo $output . '<br/>';
}
?>

but the output is:

Opcion 01
Opcion 02
Opcion 03
Opcion 04
Opcion 05
Opcion 06
Opcion 07
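
A likely explanation for this output: the capture group wraps the whole `<td>...</td>` fragment, so each `$output` is a piece of HTML, and when it is echoed into a browser the anchor is rendered and only its visible text ("Opcion 01") shows. If a regex is still preferred over a parser, moving the parentheses so they enclose only the href value should return the URL itself. A rough sketch, assuming the attribute order and class names in the sample stay the same; the `$hrefs` name is mine:

```php
<?php
// Sample markup from the question; on the real page this would be the fetched HTML.
$scraped_page = '<td height="32" align="center"><a class="btn btn-mini enlace_link" style="text-decoration:none;" rel="nofollow" target="_blank" title="Ver..." href="http://freakshare.com/files/a5qgdilx/Juegos2.DVDSCRHQProper.Castellano.mp4.html"><i class="icon-download"></i><b>  Opcion   01</b></a></td>';

// Capture only the href value. [^"]* stops at the closing quote,
// so the match cannot run on to a later quote the way a greedy .* can.
preg_match_all(
    '/<a class="btn btn-mini enlace_link"[^>]*\shref="([^"]*)"/',
    $scraped_page,
    $res
);

$hrefs = $res[1];   // group 1 holds the captured URLs

foreach ($hrefs as $href) {
    // htmlspecialchars() makes the browser display the text instead of treating it as markup
    echo htmlspecialchars($href), '<br/>', PHP_EOL;
}
```

This remains fragile: any change in the attributes before `href` that introduces a `>` or reorders the class list would break the pattern, which is why the parser-based suggestions above are usually the safer route.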
