Hi Guys

I want to write a program that extracts links from a URL and then adds them to a MySQL database.

I have found some nice examples, e.g.:

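<?php
// Return every href/src value found in the page at $url.
function getlinks($url) {
    $data = file_get_contents($url);
    // Match href="..." or src='...' up to (not including) the closing quote.
    preg_match_all('/(href|src)\=(\"|\')[^\"\'\>]+/i', $data, $media);
    unset($data);
    // Strip the attribute name, "=" and opening quote, leaving only the value.
    $data = preg_replace('/(href|src)(\"|\'|\=\"|\=\')(.*)/i', "$3", $media[0]);
    return $data;
}

// Now use the function
echo "<xmp>";
var_dump(getlinks('http://www.google.com.au'));
echo "</xmp>";
?>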

Are there other methods that I can use and how do I get the output into a database?

Thank You

Last Post by diafol
0

I guess that getlinks() returns an array containing all the links in a page. If you loop over that array, you can save each link to the database with an ordinary SQL INSERT statement.
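
On the "other methods" part of the question: one alternative to the regex is to let PHP's DOM extension parse the page and then read the href and src attributes. The following is only a minimal sketch, assuming the DOM extension is enabled (it usually is by default). extractLinks() is a made-up helper name, not something from the original example.

```php
<?php
// Sketch: collect href/src attribute values with DOMDocument instead of a regex.
// extractLinks() is a hypothetical helper name, not part of the original post.
function extractLinks(string $html): array
{
    // loadHTML() rejects an empty string, so return early.
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; buffer parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    $xpath = new DOMXPath($doc);
    // Select every href and src attribute anywhere in the document.
    foreach ($xpath->query('//@href | //@src') as $attr) {
        $links[] = $attr->nodeValue;
    }

    // Drop duplicates and reindex the array from 0.
    return array_values(array_unique($links));
}
```

It takes the HTML as a string, so it could be called as extractLinks(file_get_contents($url)), and it returns a plain array of links just as getlinks() does.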

0

Thanks a lot for your response, but I am a bit of a newbie at PHP. Would you be able to put the above statement into code and elaborate on it a bit, please?

0
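foreach ($data as $link) {
    // Escape the value so quotes inside a link cannot break the query.
    $link = mysql_real_escape_string($link);
    mysql_query("INSERT INTO table_name (link, something, something_else) VALUES ('" . $link . "', 'blah1', 'blah2')");
}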

Basically, once you have your $data, it is an array. That means:

$data = array("http://link1.com", "http://link2.com", "and so on...");

foreach goes over each item in the $data array and runs whatever is inside the {} once per item. In this case, we are inserting each link into the database.

Hope this sheds a little light on the situation.
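
The mysql_* functions used in this thread were removed in PHP 7.0, so on a current PHP version the same loop would need mysqli or PDO. Below is a rough sketch of the loop using a PDO prepared statement. saveLinks() and the table name "links" are placeholders chosen for illustration, so adjust them to your own schema.

```php
<?php
// Sketch: insert each link with one prepared statement (PDO).
// saveLinks() and the table "links" are hypothetical names for illustration.
function saveLinks(PDO $pdo, array $links): int
{
    // Prepare once, execute once per link.
    $stmt = $pdo->prepare('INSERT INTO links (link) VALUES (:link)');

    $saved = 0;
    foreach ($links as $link) {
        // The value is bound as a parameter, so quotes in a link cannot break the SQL.
        $stmt->execute([':link' => $link]);
        $saved++;
    }

    return $saved;
}

// Possible usage (host, database name and credentials are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'password');
// $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// echo saveLinks($pdo, $data) . " links saved";
```

Binding the value as a parameter removes the need to escape each link by hand, which matters here because the links come from pages you do not control.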

0

Like this? How would you change it?

<?php

$url = "http://www.google.co.za",

function getlinks($url) {
    $data=file_get_contents($url);
    preg_match_all('/(href|src)\=(\"|\')[^\"\'\>]+/i',$data,$media);
    unset($data);
    $data=preg_replace('/(href|src)(\"|\'|\=\"|\=\')(.*)/i',"$3",$media[0]);
    return $data;
    }

$con = mysql_connect("localhost","root","");
if (!$con)
  {
  die('Could not connect: ' . mysql_error());
  }

mysql_select_db("temp", $con);

mysql_query($sql,$con);

foreach($data as $link){
mysql_query("INSERT INTO first (link)
 VALUES('".$link."')");
}

mysql_close($con);


?>

Thanks for your input :-)

0

On line 21 you are executing a MySQL query with mysql_query($sql, $con), but I don't see where you define $sql or why you are running it. I think you can remove line 21.

0

Hi guys

Would anyone happen to know why this program is not working:

<?php

function getlinks($url) {
    $data=file_get_contents($url);
    preg_match_all('/(href|src)\=(\"|\')[^\"\'\>]+/i',$data,$media);
    unset($data);
    $data=preg_replace('/(href|src)(\"|\'|\=\"|\=\')(.*)/i',"$3",$media[0]);
    return $data;
    }


$con = mysql_connect("localhost","root","");
if (!$con)
  {
  die('Could not connect: ' . mysql_error());
  }
mysql_select_db("temp", $con);
$sql = "CREATE TABLE first
(
link NOT NULL
)";

getlinks('http://www.google.com');

mysql_select_db("temp", $con);

foreach($data as $link){
$tbl = ("INSERT INTO first (link)
 VALUES('"$link"')");
}

mysql_query($con, $sql, $tbl);

mysql_close($con);


?>

Thanks

0

You have some SQL in there that says to create a table, but you never actually execute that query, so I assume the table "first" doesn't exist.
Also, the parameters for your mysql_query() call are wrong: it should be mysql_query($query, $connection). You should create the table in phpMyAdmin and then just manipulate the data in that table from your scripts, instead of creating the table from your scripts.
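
To illustrate the first point: building the SQL string does nothing until it is sent to the server. Here is a small sketch using PDO, since the mysql_* functions no longer exist in PHP 7 and later. ensureLinkTable() is a made-up name, and VARCHAR(2048) is only an assumed column type (the posted CREATE TABLE gives the column no type at all, which MySQL will reject).

```php
<?php
// Sketch: define the CREATE TABLE statement and actually run it.
// ensureLinkTable() is a hypothetical helper; VARCHAR(2048) is an assumed column type.
function ensureLinkTable(PDO $pdo): void
{
    $sql = 'CREATE TABLE IF NOT EXISTS first (link VARCHAR(2048) NOT NULL)';
    // Assigning the string above has no effect on its own; exec() sends it to the server.
    $pdo->exec($sql);
}
```

IF NOT EXISTS makes the call safe to repeat, although creating the table once in phpMyAdmin, as suggested above, works just as well.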

0

$data=file_get_contents($url);

Many hosts consider this unsafe when it is used cross-domain (to fetch remote URLs), so they disable it. Check the allow_url_fopen setting in your phpinfo() output to see whether it is disabled. Just because it works for you on localhost doesn't mean it will work on your remote server.

BTW, this could open you up to a bunch of nasties, so be careful!

If it is disabled, a workaround is to use the cURL functions. See the PHP manual for these; there are some good examples there.
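
As a rough idea of what the cURL workaround could look like, here is a minimal sketch. It assumes the cURL extension is installed; fetchPage() is a made-up helper name and the timeout values are arbitrary.

```php
<?php
// Sketch: fetch a page with cURL as a stand-in for file_get_contents($url).
// fetchPage() is a hypothetical helper name; the timeout values are arbitrary.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }

    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => 5,    // seconds
        CURLOPT_TIMEOUT        => 15,   // seconds
    ]);

    // curl_exec() returns the body as a string, or false on failure.
    $body = curl_exec($ch);

    return is_string($body) ? $body : null;
}
```

Inside getlinks(), the line $data=file_get_contents($url); could then become $data = fetchPage($url);, with a check for null before running the regex.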

Edited by diafol: n/a
