Search a div using php

Question

bipies 0 Light Poster

12 Years Ago

Hi everybody!

Well, my question is as follows: I need to search for a div with a specific ID using PHP under WordPress. So far I have found images and links, but I need to find a div with a specific ID and copy it. For now I have tested with this:

$html->find('$html->find('#divid');');
$html->find('div[id=divid]');

and several more, but with no results. Am I doing something wrong?

Inside this id there is this kind of code:

....%3Fe%3D1309211119%26ri%3D1024%26rs%3D85%26h%3.....

Any idea?

Thanks in advance!

div dom find php seo

6 Contributors
32 Replies
506 Views
4 Days Discussion Span
Latest Post 12 Years Ago Latest Post by bipies

All 32 Replies

cjohnweb 14 User Title? What's that?

12 Years Ago

That's because of the nested apostrophes: the inner single quotes end the string early.

I'm not the best with XPath myself, but I've used it for some pretty slick things. I got the following code from (http://us.php.net/manual/en/class.domxpath.php) and I edited it to pull the content of any div with an id of player.

You may need to find another way to load the file you are working with, though (for example, save the result of file_get_contents($file) as a file, then open it with the loadHTMLFile function).

<?php
  
  $file = "file.txt"; // local file containing the HTML to search
  $doc = new DOMDocument();
  $doc->loadHTMLFile($file);
  
  $xpath = new DOMXPath($doc);
  
  // example 1: for everything with an id
  //$elements = $xpath->query("//*[@id]");
  
  // example 2: for node data in a selected id
  //$elements = $xpath->query("/html/body/div[@id='yourTagIdHere']");
  
  // example 3: same as above, but matching the div at any depth
  $elements = $xpath->query("//div[@id='player']");
  
  if ($elements !== false) { // query() returns false when the expression is invalid
    foreach ($elements as $element) {
  //    echo "<br/>[". $element->nodeName. "]";
  
      $nodes = $element->childNodes;
      foreach ($nodes as $node) {
        echo $node->nodeValue. "\n";
      }
    }
  }

?>
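
If saving the page to a file first is inconvenient, the same lookup can probably be done directly on a string. The following is only a minimal sketch: the sample markup, the example.com URL and the helper name find_div_text_by_id are made up for illustration, and it assumes the id contains no quote characters.

```php
<?php
// Hypothetical sample markup; in practice this string could come from file_get_contents($url).
$html = '<html><body><div id="other">skip me</div>'
      . '<div id="player">flv_url=http%3A%2F%2Fexample.com%2Fclip.flv&amp;more=1</div></body></html>';

// Hypothetical helper: returns the text of the first <div> with the given id, or null.
function find_div_text_by_id(string $html, string $id): ?string
{
    // Collect parser warnings internally; real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: $id contains no quote characters, so it can be embedded as-is.
    $nodes = $xpath->query("//div[@id='" . $id . "']");
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->textContent;
}

$playerText = find_div_text_by_id($html, 'player');
echo $playerText, "\n";
```

Note that textContent returns the text with entities already decoded, so an &amp; in the markup comes back as a plain &.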

TySkby 41 Junior Poster

12 Years Ago

*Edit- I understand now after thoroughly reading page 2.

I'm going to do some hunting- I wrote a PHP application that does almost this exact same thing. When I find it, I'll put a link to the source files if you'd like.


cjohnweb 14 User Title? What's that?

12 Years Ago

XPath is specifically used for querying the DOM. It really depends on the application you are trying to build. I can write scripts with PHP, cURL and XPath that do anything from logging in to sites to scraping information; heck, I could make a Googlebot clone that follows links, saves email addresses, you name it. I'd say that in this circumstance it depends on which language you are more familiar with, because I'm sure that, like you are saying, JavaScript could do this just as well as PHP, or even ASP.

cjohnweb 14 User Title? What's that?

12 Years Ago

I'm not sure I understand, but yes, it looks like you are almost done.

Give me a sample URL that I can test code with.


diafol · Answer 1 · 2011-06-28T03:29:29+00:00

What class are you using for the $html->find? Not sure how this method is supposed to work. What does find return? What are the allowed parameters? In what format are the parameters?

bipies 0 Light Poster · Answer 2 · 2011-06-28T03:42:35+00:00

wow, tons of questions ;)

I'm almost a newbie, but only almost.

Well I'm using this

Website: http://sourceforge.net/projects/simplehtmldom/

What it returns is a string with "invalid" characters like %, but I need all of them, the whole string. Is that possible?

ko ko 97 Practically a Master Poster · Answer 3 · 2011-06-28T19:00:49+00:00

You tested with what? This one?

$html->find('$html->find('#divid');');
$html->find('div[id=divid]');

Is that the correct usage? In my opinion, the first line won't work because of the way it is quoted. Furthermore, why is the call passed to itself as a parameter?

Perhaps you did not read the instructions carefully. I don't know what that snippet is supposed to do.

bipies 0 Light Poster · Answer 4 · 2011-06-28T19:36:35+00:00

Hi

I tested with both and nothing was returned,

and others like:

$html->find('a[class=miniature]');
$html->find('span[class=red]');
$html->find('img[class=borderx]');

are working fine. It must be something "stupid", but I don't know what :(

PS: all of them are preceded by a variable assignment, like: $picture = $html->find('img[class=borderx]');

bipies 0 Light Poster · Answer 5 · 2011-06-28T19:41:25+00:00

OK, I've found this inside the code mentioned before:

protected $self_closing_tags = array('img'=>1, 'br'=>1, 'input'=>1, 'meta'=>1, 'link'=>1, 'hr'=>1, 'base'=>1, 'embed'=>1, 'spacer'=>1);

which, I suppose, works with images. I want to work with FLV videos, which are the strings I am searching for. What do you think?

twiss 155 Veteran Poster · Answer 6 · 2011-06-28T20:15:45+00:00


Perhaps try:

$html->find('#divid');

bipies 0 Light Poster · Answer 7 · 2011-06-28T23:28:12+00:00

No luck :(
But thinking about it another way: if I know how a string starts (file=) and how it ends (.flv), is there a way to copy/retrieve the hole string?

diafol · Answer 8 · 2011-06-28T23:33:42+00:00

A string with a hole in it?

Interesting class. Did you check the documentation; http://simplehtmldom.sourceforge.net/manual.htm ?

bipies 0 Light Poster · Answer 9 · 2011-06-28T23:50:35+00:00

I've checked it, but I think that I'm confusing myself :)

Maybe it is... echo $html->getElementById("div1")->???

Checking right now.

twiss 155 Veteran Poster · Answer 10 · 2011-06-29T01:09:42+00:00

Yes, but looking at the source, that's just a wrapper of:

$html->find('#div1', 0);

Where the 0 makes it return the first one, I think (which seems useless to me, but they also included a getElementsById function, which returns all elements with the ID. Go figure).

cjohnweb 14 User Title? What's that? · Answer 11 · 2011-06-29T03:05:36+00:00

Give XPath a try, here is a sample I have in my personal library:

txt-file.txt

<?xml version="1.0"?>

<body>
This line of information will be pulled because it is in between the body tags!
</body>

And the PHP Code:

<?php
$filename = "txt-file.txt"; // xml formatted text file...

// open the file and load contents into $string
$fh = fopen($filename, "r") or die("Can't open file");
$string = fread($fh, filesize($filename)); 
fclose($fh);

// Get it ready for XPath
$xml = new SimpleXMLElement($string);

// Specify your XPath query / expression
$result = $xml->xpath('/body');

// Loop through each result XPath has returned
// (foreach replaces each(), which was removed in PHP 8)
foreach ($result as $node) {
    echo '/body: ',$node,"\n";
}

?>

So for your xpath, do something like:

<?php

$url = ""; // Set this to url

$string = file_get_contents($url);

// Get it ready for XPath
$xml = new SimpleXMLElement($string);

// Specify your XPath query / expression
$result = $xml->xpath('/body/div[@id='div-id']');

// Loop through each result XPath has returned
// (foreach replaces each(), which was removed in PHP 8)
foreach ($result as $node) {
    echo '/body: ',$node,"\n";
}

?>

~John

bipies 0 Light Poster · Answer 12 · 2011-07-01T00:30:56+00:00

Hi!

Sorry, it says this:

Parse error: syntax error, unexpected T_STRING in /home/*****/get.php on line 11

this line is:

$result = $xml->xpath('div[@id='player']');

twiss 155 Veteran Poster · Answer 13 · 2011-07-01T00:40:20+00:00

Change one of the quote pairs to double quotes.

ko ko 97 Practically a Master Poster · Answer 14 · 2011-07-01T00:44:03+00:00


Do as @twiss said, or escape it.

$result = $xml->xpath('div[@id=\'player\']');
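
To illustrate the two suggestions above, here is a small sketch of the quoting options side by side. The sample markup is made up, and DOMXPath is used here instead of SimpleXML only so that the snippet can run against an HTML string.

```php
<?php
// Three ways to write the same XPath expression as a PHP string literal.
$doubleOutside = "//div[@id='player']";    // double quotes outside, single quotes inside
$singleOutside = '//div[@id="player"]';    // single quotes outside, double quotes inside
$escaped       = '//div[@id=\'player\']';  // single quotes everywhere, inner ones escaped

// Hypothetical sample document to run the expressions against.
$doc = new DOMDocument();
$doc->loadHTML('<!DOCTYPE html><html><body><div id="player">video</div></body></html>');
$xpath = new DOMXPath($doc);

foreach ([$doubleOutside, $singleOutside, $escaped] as $expression) {
    echo $expression, ' => ', $xpath->query($expression)->length, " match(es)\n";
}
```

The first and third literals produce the identical string; the second differs only in which quote character XPath sees, which makes no difference to the query.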

bipies 0 Light Poster · Answer 15 · 2011-07-01T03:45:31+00:00

Guys, I love you all! It is beginning to work :)

Anyway, sorry if I'm messing up all the time. Maybe I should explain what the goal is, so here goes:

I must search inside a <div id="player"> ........ </div>, in a given remote HTML file, for a string that begins with: flv_url= and ends with &

I don't really know if this is the right method or if it's possible. I only know those two constants in the string, but I want to get the whole string. What do you think? Is it possible? :S

Thanks again!

cjohnweb 14 User Title? What's that? · Answer 16 · 2011-07-01T06:41:28+00:00

If I understand properly, you want to find a <div> with an id of player, then you want to take the information out of it IF that information begins with "flv_url=" and ends with "&"?

You will have to add some code at lines 22 - 25, something like:

$nodes = $element->childNodes;
foreach ($nodes as $node) {
$line_content $node->nodeValue;

preg_match('/(flv_url=).*?(&amp;)/is',$line_content,$return);
if(!empty($return[0])){$results[] = $line_content; unset($return);}
}
?>

You will probably have to make changes to that preg_match function. I know the pattern has to start with a "/" slash and end with "/is"; the ".*?" is like a wildcard matching everything in between. I'm just not very good with regex stuff. You may also anchor (I think that's what it's called) the beginning with a "^" char:

preg_match("/^(flv_url=).*?(&amp;)/is",$content,$return);

You should not need to escape the =, & and ; chars; they have no special meaning in a regex.

Sorry, I am out of time. I'll check back end of today.

Good luck!
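
As a rough illustration of the pattern pieces described above, the sketch below compares a lazy wildcard, a greedy wildcard and a "^" anchor. The sample text and the example.com URL are made up, and it assumes the text contains a plain "&" rather than "&amp;".

```php
<?php
// Hypothetical text as it might appear inside the div.
$text = 'width=640&flv_url=http://example.com/clip.flv&autoplay=1&';

// Lazy: .*? stops at the first "&" after flv_url=
preg_match('/flv_url=(.*?)&/', $text, $lazy);

// Greedy: .* runs on to the last "&" in the text
preg_match('/flv_url=(.*)&/', $text, $greedy);

// Anchored: ^ only matches when the text itself starts with flv_url=
$anchored = preg_match('/^flv_url=(.*?)&/', $text);

echo $lazy[1], "\n";    // the captured part between flv_url= and the first &
echo $greedy[1], "\n";  // the captured part up to the last &
echo $anchored, "\n";   // 1 if the anchored pattern matched, 0 otherwise
```

With this sample text the anchored pattern does not match, because flv_url= sits in the middle of the string rather than at its start.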

bipies 0 Light Poster · Answer 17 · 2011-07-01T07:45:36+00:00

What is the value for $line_content $node->nodeValue;

It shows a "Parse error: syntax error, unexpected T_VARIABLE in"

Thanks thanks thanks ;)

twiss 155 Veteran Poster · Answer 18 · 2011-07-01T11:25:11+00:00


It needs an = in between.

bipies 0 Light Poster · Answer 19 · 2011-07-01T18:37:07+00:00

Well, after all that, I think I'm "almost" there.

I think I must use the preg_match function, but I don't know how to use this function on a concrete URL or file, nor how to find the "·%·& strings that begin with file= and end with &amp

I'm being a bore, I know :(

cjohnweb 14 User Title? What's that? · Answer 20 · 2011-07-01T22:30:28+00:00

I'm gonna assume you don't know PHP at all, do you?

What is the URL you are trying to scrape?

twiss 155 Veteran Poster · Answer 21 · 2011-07-01T22:40:28+00:00


You'll need something like this: /^flv_url=.+?&$/

bipies 0 Light Poster · Answer 22 · 2011-07-02T00:16:40+00:00

@cjohnweb I'm such a n00b, sorry ;) I'm trying to do my best, but my knowledge of PHP is really limited.

@TySkby Thanks! This is what I needed, the construction of the preg_match! But in the end it was not the beginning and the end of the string; it was inside it, so I'm doing this:

<?php
$html = htmlspecialchars(file_get_contents('yourURL')); 


//$html->find preg_match(/^flv_url=.+?&amp;$/)

if(preg_match('/flv_url=.+?amp;/',$html))
    echo 'FOUND';
else
    echo 'NOT FOUND';   

?>

I'm getting "FOUND", so the last question: how can I print the result? I mean the string, or the part I'm interested in?
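
One possible way to print the matched part is to pass a third argument to preg_match and read the captured group from it. This is only a sketch: the page fragment, the flashvars attribute and the example.com URL are made up, and it assumes the value is URL-encoded and followed by a plain "&" (that is, the page was not run through htmlspecialchars first).

```php
<?php
// Hypothetical page fragment; in the script above this would be file_get_contents('yourURL').
$page = '<div id="player"><embed flashvars="w=640&flv_url=http%3A%2F%2Fexample.com%2Fclip.flv&h=480"></div>';

$decoded = null;

// The parentheses capture only the part between "flv_url=" and the next "&".
// $matches[0] holds the whole match, $matches[1] the first captured group.
if (preg_match('/flv_url=(.+?)&/', $page, $matches)) {
    $encoded = $matches[1];          // still percent-encoded
    $decoded = urldecode($encoded);  // turns %3A and %2F back into : and /
    echo 'FOUND: ', $decoded, "\n";
} else {
    echo 'NOT FOUND', "\n";
}
```

If the page contains several such strings, preg_match_all takes the same pattern and collects every match instead of only the first.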

bipies 0 Light Poster · Answer 23 · 2011-07-02T00:19:32+00:00

@TySkby I'm grabbing videos from different sources for myself, with my own player and all this stuff, under WordPress, using a custom-made engine (plugin) that does it, but I want more sites to be grabbed (yes, adult ones too)

:)

Thanks to everybody and really sorry if I'm boring you ;)

bipies 0 Light Poster · Answer 24 · 2011-07-02T00:21:17+00:00

@cjohnweb OK, but following the current approach we are "almost" done, aren't we?

bipies 0 Light Poster · Answer 25 · 2011-07-02T00:33:47+00:00

Well, a sample site to grab the code from would be this: http://freshmeat4u.com/beta/test.php

It is a "brute" copy-paste, so the videos are not working and the layout is a mess, but the code is there. WARNING: adult content, and it's not for spam :)
