I'm trying to understand XPath, but I've come across an issue I cannot seem to find an answer for. In this case it seems that XPath is not returning what it should.

I've got a sample HTML file, test.html:




And my PHP file, test.php:

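echo "<pre>";
$url = "test.html";

// Collect libxml parse warnings internally instead of printing them
$oldSetting = libxml_use_internal_errors(true);

$html = new DOMDocument();
// Parse the file with the HTML parser before building the XPath object
$html->loadHTMLFile($url);
$xpath = new DOMXPath($html);

$titles = $xpath->query("//p");
foreach ($titles as $title) {
    echo $title->nodeValue . "<br />";
}

// Restore the previous libxml error handling
libxml_clear_errors();
libxml_use_internal_errors($oldSetting);

echo "</pre>";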

If I set the XPath query to //p, I get the content of every p tag on screen. That's good.
Set to /html//p, I get the same. That's good.
Set to //p[1], I get the first p tag. That's good.
Set to //p[5], I get the 5th p tag. That's good.

That's all groovy.

But if I query /html/div/p, I get nothing. I've tried a ton of similar queries with no luck.

I'm trying to read the URL of an image from a website. Using Firefox's Firebug plugin I can copy the XPath, and I get something like:


But in PHP I get no result unless I remove all the "[2]" predicates, take out some of the divs, and place a // before img.

So what's going on here? Every example I've read says this is correct, but in the very simple example above, a plain /html/div/p or /html/div//p does not work.

Thanks for your help!

It possibly has something to do with loadHTMLFile. If I use load instead (since your HTML is well-formed), $titles = $xpath->query("//html/div/p"); works as expected.
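
Here is a minimal sketch of what I suspect is going on. The original test.html is not shown, so the $markup string below is a hypothetical stand-in (well-formed, with no body element), and the variable names are mine. As far as I know, the HTML parser behind loadHTML and loadHTMLFile adds an implied body element when one is missing, so the div is no longer a direct child of html; the XML parser keeps the tree exactly as written. Printing getNodePath() for each match shows the path as PHP actually sees it.

```php
<?php
// Hypothetical stand-in for test.html: well-formed, but with no <body> element.
$markup = '<html><div><p>one</p><p>two</p></div></html>';

// Collect libxml parse warnings internally instead of printing them
$oldSetting = libxml_use_internal_errors(true);

// HTML parser: expected to insert the implied <body> around the <div>
$asHtml = new DOMDocument();
$asHtml->loadHTML($markup);
$htmlXpath = new DOMXPath($asHtml);

// XML parser: builds the tree exactly as the markup is written
$asXml = new DOMDocument();
$asXml->loadXML($markup);
$xmlXpath = new DOMXPath($asXml);

libxml_clear_errors();
libxml_use_internal_errors($oldSetting);

// Count the matches for the same absolute path under each parser
$htmlDirect  = $htmlXpath->query('/html/div/p')->length;
$htmlViaBody = $htmlXpath->query('/html/body/div/p')->length;
$xmlDirect   = $xmlXpath->query('/html/div/p')->length;

echo "loadHTML, /html/div/p:      $htmlDirect\n";
echo "loadHTML, /html/body/div/p: $htmlViaBody\n";
echo "loadXML,  /html/div/p:      $xmlDirect\n";

// Show where each <p> really sits in the HTML-parsed tree
foreach ($htmlXpath->query('//p') as $p) {
    echo $p->getNodePath(), "\n";
}
```

If the counts come out as I expect (no match for /html/div/p after loadHTML, but matches for /html/body/div/p), then either adding body to the path or loading the file with load should fix the simple example. The same getNodePath() check may help with the path copied from Firebug, since the tree the browser builds is not necessarily the tree PHP builds from the raw source.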

