0

I would really like for someone to take a little time and look over my code. I'm parsing some news content and I can insert the initial parse into my database which contains the news URL and the title. I'd like to expand it farther, to pass along each article link and parse the content of the article and include it in my database. The initial parsing works perfectly like this:

<?php
include_once ('connect_to_mysql.php');
include_once ('simple_html_dom.php');
$html = file_get_html('http://basket-planet.com/ru/');
$main = $html->find('div[class=mainBlock]', 0);
  $items = array();
  foreach ($main->find('a') as $m){
    $items[] = '("'.mysql_real_escape_string($m->plaintext).'",
                "'.mysql_real_escape_string($m->href).'")';
  }
$reverse = array_reverse($items);
mysql_query ("INSERT IGNORE INTO basket_news (article, link) VALUES 
             ".(implode(',', $reverse))."");
?>

As you can see, I'm using PHP Simple HTML DOM Parser. To expand, I'm trying to use mysqli statement where I can bind the parameters so all the html tags get inserted into my database. I've done this before with XML parsing. Problem is I don't know how to bind the array, and see whether my code is correct, if it will work this way... Here's the entire code:

<?php
$mysqli = new mysqli("localhost", "root", "", "test");
$mysqli->query("SET NAMES 'utf8'");
include_once ('simple_html_dom.php');
$html = file_get_html('http://basket-planet.com/ru/');
//find main news
$main = $html->find('div[class=mainBlock]', 0);
$items = array();
  foreach ($main->find('a') as $m){
    $h = file_get_html('http://www.basket-planet.com'.$m->href.'');
    $article = $h->find('div[class=newsItem]');
    //convert to string to be able to modify content
    $a = str_get_html(implode("\n", (array)$article));
      if(isset($a->find('img'))){
        foreach ($a->find('img') as $img){
          $img->outertext = '';}} //get rid of images
      if(isset($a->find('a'))){
        foreach ($a->find('a') as $link){
          $link->href = 'javascript:;';
          $link->target = '';}} //get rid of any javascript
      if(isset($a->find('iframe'))){
        foreach ($a->find ('iframe') as $frame){
          $frame->outertext = '';}} //get rid of iframes
     @$a->find('object', 0)->outertext = '';
     @$a->find('object', 1)->outertext = '';
     //modify some more to retrieve only text content
     //put entire content into a div (will if statements work here???)
     $text_content = '<div>'.$a.'<br>'.
       if (!empty($a->find('object', 0)->data)){
         echo '<a target="_blank" href="'.$a->find('object', 0)->data.'">Play Video</a>&nbsp;&nbsp;';}
       if (!empty($a->find('object', 1)->data)){
         echo '<a target="_blank" href="'.$a->find('object', 1)->data.'">Play Video</a>&nbsp;&nbsp;';}
       //couple more checks to see if video links are present
    .'</div>';
$items[] = '("'.$m->plaintext.'","'.$m->href.'","'.$text_content.'")';
//reverse the array so the latest items have the last id
$reverse = array_reverse($items);
$stmt = $mysqli->prepare ("INSERT IGNORE INTO test_news (article, link, text_cont) VALUES (?,?,?)");
$stmt->bind_param ???; //(implode(',', $reverse));
$stmt->execute();
$stmt->close();
?>

So the logic is for every href of an article found, I'm passing it to parse the content and I'm trying to add it to the array. I probably have a ton of errors but I can't test it yet because I don't know how to bind it to see if it works. And I'm also not sure if I can do the if statements inside $text_content div...meaning to display "Play Video" if they exist. So please, if someone can take time to work on this with me I would really appreciate it.

2
Contributors
1
Reply
15
Views
4 Years
Discussion Span
Last Post by pritaeas
This topic has been dead for over six months. Start a new discussion instead.
Have something to contribute to this discussion? Please be thoughtful, detailed and courteous, and be sure to adhere to our posting rules.