
A function which picks up from a specified page all images that are larger than 50 KB. The function returns an array which contains each image URL and the image size in kilobytes. How do I start to do this? Do I use wget and exec? Is there an easier way to do it? So first I need to download all the images, then analyze them and get the URL and size.


There is a nice function called file_get_contents which gets the contents of a URL and stores it in a variable; you can then write it to a file or process it into a MySQL database, etc. For example:

<?php
//first to specify the url
$url='http://images.daniweb.com/logo.gif';
//now to retrieve it
$imagedata=file_get_contents($url);
//now to save it
file_put_contents('image.gif', $imagedata);
//and image.gif will be in the same directory as your php file

And there you go. As simple as that.


You can also send a HEAD request to the remote server and check whether it provides a Content-Length header, something like:

<?php
$url = 'http://www.website.tld/image01.jpg';
$head = get_headers($url, 1); // second argument gives an associative array, so the header position does not matter
$length = $head['Content-Length'];
if($length < 50000)
{
	echo 'too small: ';
}
else
{
	echo 'ok: ';
}
echo $length;
echo "\n";
?>

And for the array you can use a loop; it's simple:

<?php
$url = array(
	'http://www.website.tld/image01.jpg',
	'http://www.website.tld/image02.jpg',
	'http://www.website.tld/image031.jpg'
	);

$result = array();
foreach($url as $key)
{
	$head = get_headers($key, 1); // associative array, so the header position does not matter
	$length = $head['Content-Length'];
	if($length >= 50000)
	{
		$result[$key] = $length;
	}
	
}

print_r($result);
?>

But you still need to grab all the image links from the specific page (a rough sketch of that part follows). Good work.
Bye :)
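A minimal sketch of that missing piece, assuming PHP's DOM extension is available: it pulls the src attribute of every img tag on a page (the page URL below is only a placeholder).

<?php
// Sketch: collect the src attribute of every <img> tag on a page.
// $pageurl is a placeholder; adjust it to the page you want to scan.
$pageurl = 'http://www.website.tld/';
$html = file_get_contents($pageurl);

$doc = new DOMDocument();
@$doc->loadHTML($html); // @ silences warnings caused by sloppy markup

$links = array();
foreach ($doc->getElementsByTagName('img') as $img) {
    $links[] = $img->getAttribute('src');
}

print_r($links); // values are relative or absolute, exactly as written in the page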


cwarn23, I know that function, but the problem is that we don't know the names of the images, and we need to download all the images, not just one.


If you want to download more than one image then perhaps a loop might be best. For example:

<?php
//first to specify the url
$links=array(
'http://images.daniweb.com/1a.jpg',
'http://images.daniweb.com/2c.jpg',
'http://images.daniweb.com/3d.jpg',
'http://images.daniweb.com/4h.jpg',
'http://images.daniweb.com/5f.jpg',
'http://images.daniweb.com/6e.jpg',
'http://images.daniweb.com/7d.jpg');
foreach ($links AS $url) {
    //now to retrieve it
    $imagedata=file_get_contents($url);
    //now to save it
    file_put_contents(basename($url),$imagedata);
    //each image will be saved under its own name in the same directory as your php file
}


@siina
I was looking at cwarn23's code and tried to mix it with mine (hope that's not a problem ^_^ and that it works for you). This will scan the link you set and build an array of those images greater than 50 KB:

<?php
$url = "http://www.website.tld"; # no ending slash
$data = file_get_contents($url);
$pattern = "/src=[\"']?([^\"']?.*(png|jpg|gif))[\"']?/i"; # match src attributes ending in png, jpg or gif
preg_match_all($pattern, $data, $images);

function valid_url($u)
{
	if(preg_match('|^http(s)?://[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $u))	{ return true; }
	else { return false; }
}

# print_r($images); # uncomment to check $images array

$result = array();
foreach($images[1] as $key)
{
	$link = $url . $key;
	if(valid_url($link) === true)
	{
		$head = get_headers($link, 1); // associative array, so the header position does not matter
		$length = $head['Content-Length'];
		if($length >= 50000)
		{
			$result[$link] = $length;
		}
	}
}

if(empty($result))
{
	echo 'no data';
}else{
	print_r($result); # array to use for retrieving images
}
?>

This script is not perfect because it will only search for img and object tags, not for images included by CSS, and you still have to deal with relative paths, absolute paths, complete links and external images. Right now this example works only with absolute paths, so <img src="/images/pic01.jpg" /> rather than <img src="../images/pic01.jpg" /> or <img src="http://a-website.tld/images/pic01.jpg" />
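As a rough sketch of that path handling (not part of the script above; the function name and URLs are only illustrative), one way to turn the different src forms into full URLs might be:

<?php
// Sketch: normalise an img src value against the page it came from.
// $base is the page URL without a trailing slash; relative ".." paths are not resolved here.
function resolve_src($base, $src)
{
    if (preg_match('#^https?://#i', $src)) {
        return $src;                 // already a complete link, external or not
    }
    if (substr($src, 0, 1) === '/') {
        return $base . $src;         // absolute path on the same host
    }
    return $base . '/' . $src;       // naive handling of relative paths
}

echo resolve_src('http://www.website.tld', '/images/pic01.jpg') . "\n";
echo resolve_src('http://www.website.tld', 'images/pic02.jpg') . "\n";
echo resolve_src('http://www.website.tld', 'http://a-website.tld/images/pic01.jpg') . "\n";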


siina,
1) First you need to fetch all the HTML content from the site URL using file_get_contents.
2) Then find all the image tags in the HTML source using preg_match_all.
3) Loop over the images array and use file_get_contents again to grab each image source and save it in your folder (a rough sketch combining these steps follows).
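Putting those three steps together, a rough sketch of the kind of function siina asked for could look like the following. The function name, the regex, the placeholder URL and the 50000-byte cut-off are all only illustrative, and only complete URLs or absolute src paths are handled.

<?php
// Sketch: return an array of image URL => size in KB for images over 50 KB on a page.
function big_images($pageurl, $minbytes = 50000)
{
    $html = file_get_contents($pageurl);                            // step 1: grab the page
    preg_match_all('/<img[^>]+src=["\']?([^"\' >]+)/i', $html, $m); // step 2: find img src values

    $result = array();
    foreach ($m[1] as $src) {                                       // step 3: loop over the matches
        // complete links are kept as-is, absolute paths are joined to the page URL
        $link = preg_match('#^https?://#i', $src) ? $src : rtrim($pageurl, '/') . $src;

        $head = get_headers($link, 1);
        if ($head === false || !isset($head['Content-Length'])) {
            continue;                                               // no usable size information
        }
        $length = $head['Content-Length'];
        if (is_array($length)) {
            $length = end($length);                                 // take the last value after redirects
        }
        if ((int) $length >= $minbytes) {
            $result[$link] = round($length / 1024) . ' KB';
        }
    }
    return $result;
}

print_r(big_images('http://www.website.tld'));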
