Hello,

I'm trying to scrape a website using XPath and am running into a little trouble. This is the first time I've used XPath, so I'm still finding my way around :/

The relevant source code of the website I'm trying to scrape is:

<span id="ctl00_ContentPlaceHolder1_lblEvents">
	<div class="contentmain">
		<div class="textArea1">
			<strong>14 June 2010 - 14 September 2010</strong>
			<br />
			<a href="events.aspx?evID=6591" class="events">Davies Display</a>
			<br />
			<strong>Pontypridd 2010</strong>
			<br />
		</div>
		... There are more of these 'textArea1' divs, each with the same structure as the one above.
	</div>
	... Again, there are more of these 'contentmain' divs, each containing other 'textArea1' divs.
</span>

So far, I have written the following code, which gets all the 'contentmain' divs.

// Get the whole page source in order to filter out events
$RCT_Source = new DOMDocument;
$RCT_Source->loadHTMLFile('http://domain.co.uk/events.aspx');

$XPath = new DOMXPath($RCT_Source);

// Select every 'contentmain' div directly inside the events span
$Event_List = $XPath->query("//span[@id='ctl00_ContentPlaceHolder1_lblEvents']/div[@class='contentmain']");

foreach ($Event_List as $Event) {
	// TODO: fetch the 'textArea1' divs inside $Event and read their contents
}

But here's where I'm stuck.

What I need to do now is, for each $Event, fetch all the 'textArea1' divs and grab all of the data inside each div (the text within the <strong> tags, <a> tags, etc.).

Please reply if you'd like more info.

If you could provide any help whatsoever, it would be much appreciated.

Thanks.

Please give more details.
What you've provided is not sufficient and is a little confusing too.

Please specify clearly what you want.
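
For what it's worth, the step described in the question (for each 'contentmain' div, fetch its 'textArea1' divs and read the <strong> and <a> contents) could be sketched roughly as below. This is only a sketch under some assumptions: the markup is parsed from an inline copy of the snippet in the question rather than the live page, the array keys ('dates', 'title', 'url', 'extra') are names made up for illustration, and the guess that the first <strong> holds the dates and the second holds some extra detail is based solely on the one example shown. The key idea is passing the current node as the second argument to DOMXPath::query(), so the inner query runs relative to that node.

```php
<?php
// Sample markup copied from the question. For the real page you would
// call $doc->loadHTMLFile(...) as in the original code instead.
$html = <<<HTML
<span id="ctl00_ContentPlaceHolder1_lblEvents">
  <div class="contentmain">
    <div class="textArea1">
      <strong>14 June 2010 - 14 September 2010</strong>
      <br />
      <a href="events.aspx?evID=6591" class="events">Davies Display</a>
      <br />
      <strong>Pontypridd 2010</strong>
      <br />
    </div>
  </div>
</span>
HTML;

$doc = new DOMDocument;
// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$events = [];
$containers = $xpath->query(
    "//span[@id='ctl00_ContentPlaceHolder1_lblEvents']/div[@class='contentmain']"
);

foreach ($containers as $container) {
    // The second argument is the context node; the leading "./" keeps the
    // query relative to it instead of searching the whole document.
    foreach ($xpath->query("./div[@class='textArea1']", $container) as $area) {
        $strongs = $xpath->query("./strong", $area);
        $link    = $xpath->query("./a[@class='events']", $area)->item(0);

        // Key names and the meaning of each <strong> are assumptions.
        $events[] = [
            'dates' => $strongs->length > 0 ? trim($strongs->item(0)->textContent) : null,
            'title' => $link ? trim($link->textContent) : null,
            'url'   => $link ? $link->getAttribute('href') : null,
            'extra' => $strongs->length > 1 ? trim($strongs->item(1)->textContent) : null,
        ];
    }
}

print_r($events);
```

If the 'textArea1' divs on the real page turn out not to be direct children of 'contentmain', changing "./div[...]" to ".//div[...]" would search all descendants of the context node instead.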
