Hello everyone, I have read all the discussions about "web scraping" here in the DaniWeb forum, but I didn't find a solution to my problem.
I have to extract the "title" and "content" of news items from a website. After reading a lot of tutorials, I wrote these lines:
<?php
$dom = new DOMDocument();
$dom->load('http://www.php.net');
$title = $dom->getElementsByTagName('h2');
for ($i = 0; $i < $title->length; $i++)
    echo $title->item($i)->nodeValue . "<br/>";
?>
This works fine and prints the content of every "h2" element. However, when I need to scrape other elements from the page, it breaks: I tried creating another variable called $content and adding a new foreach loop, but it doesn't work.
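To make clearer what I am trying to achieve, here is a minimal sketch of reading two kinds of elements from the same document. It is only an illustration under assumptions: the markup in $html is a made-up sample rather than the real page, the "content" class name is hypothetical, and pairing titles with contents by position is a guess about how the news items are laid out. It uses loadHTML() on a string so that it runs without network access, and DOMXPath for the second query.

```php
<?php
// Hypothetical sample markup standing in for the real news page.
$html = <<<HTML
<html><body>
  <h2>First title</h2>
  <p class="content">First body text.</p>
  <h2>Second title</h2>
  <p class="content">Second body text.</p>
</body></html>
HTML;

$dom = new DOMDocument();
// Real-world HTML is rarely valid XML, so collect parser warnings quietly.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// First query: every <h2> element, as in the original snippet.
$titles = [];
foreach ($dom->getElementsByTagName('h2') as $node) {
    $titles[] = trim($node->nodeValue);
}

// Second query: XPath selects elements by attribute.
// The "content" class is an assumption made for this example.
$xpath = new DOMXPath($dom);
$contents = [];
foreach ($xpath->query('//p[@class="content"]') as $node) {
    $contents[] = trim($node->nodeValue);
}

// Pair each title with the content found at the same position, if any.
foreach ($titles as $i => $title) {
    echo $title . ': ' . ($contents[$i] ?? '') . "<br/>\n";
}
```

Each query gets its own result list and its own loop, so the second loop does not depend on the first one. For a live page, $dom->loadHTMLFile($url) could replace the loadHTML() call, assuming remote URL access is enabled in the PHP configuration.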
I don't think this is the best way to build a web scraper for the URL I have to scrape. Could someone point me to a tutorial that would help me understand this better, or suggest a PHP library that is easy to use? I have also read the documentation on www.php.net and googled around, but I still have some doubts.