I think I posted this some time ago, but I can't find the original thread to revive. I have some questions about extracting data from a particular webpage. What makes this case unusual is that there is a lot of whitespace inside the tags, and the data doesn't sit neatly within a single pair of tags, e.g. <span>data</span>. Instead it looks like this:

<td width="55%"><div class="value">
                                            &pound;6.99 <font size="3"> </font></div>
										    
                                        </td>

The "&pound;6.99" is what I want to extract and use. For example, this code works perfectly for a different website:

$url = 'http://www.cheapsmells.com/viewProduct.php?id=3462';
$html = file_get_contents($url);

preg_match('/<div class=\'productOurPrice\'?>(.+?)(\d+\.\d+)(.+?)?<\/div>/', $html, $match);
$out = $match[2];

The first example, with all the whitespace, comes from http://www.directcosmetics.com/results/products.cfm?ctype=ME&range=Hummer&code=34744. How can I adjust the above regex to obtain the required information, in this case literally "6.99" and nothing more? Is it even possible, given that the value isn't enclosed in its own pair of tags?

Any help you can offer would be greatly appreciated.

Cheers ;D

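<?php
// The /s modifier lets "." match newlines, so the capture can span several lines.
// ".*?" is lazy, so the match stops at the first closing </div>.
$reg = '/<div class="value">(.*?)<\/div>/s';
$data = '<td width="55%"><div class="value">
                                            &pound;6.99 <font size="3"> </font></div>

                                        </td>';
preg_match($reg, $data, $matches);
print $matches[1];
?>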

That works.
Cheers,
Naveen

That doesn't actually do what I need. I'll explain better: the URL being parsed will always be different, so you can't hardcode the tags and the content itself.

What I need to do is extract just the monetary value, "6.99", without the pound sign.

Any ideas on how I would do this?

Many thanks :)

Okay! After much head scratching and testing, this is what I found. (I hope the div class will remain the same! :S)

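<?php
$url = 'http://www.directcosmetics.com/results/products.cfm?ctype=ME&range=Hummer&code=34744';
$html = file_get_contents($url);
// /s lets "." match newlines; ".*?" is lazy so the match stops at the first <font> tag.
$reg = '/<div class="value">(.*?)<font size="3">(.*?)<\/font><\/div>/s';
preg_match($reg, $html, $matches);
// Strip the pound entity, then trim the surrounding whitespace and newlines.
print trim(str_replace("&pound;", "", $matches[1]));
?>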

Hope that helps ! :)
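For anyone finding this later: the pattern above captures everything between the opening div and the <font> tag, so the result still depends on that <font> tag being there. A minimal sketch of an alternative is to skip the whitespace and the &pound; entity inside the pattern and capture only the digits. The function name extractPrice and the $sample string are just for illustration, and this assumes the price always follows the opening <div class="value"> tag.

```php
<?php
// Hypothetical helper, not part of the original answer.
function extractPrice(string $html): ?string
{
    // \s*             skips the whitespace and newlines after the opening tag
    // (?:&pound;)?    optionally skips the currency entity
    // (\d+(?:\.\d+)?) captures only the number, e.g. 6.99
    $pattern = '/<div class="value">\s*(?:&pound;)?\s*(\d+(?:\.\d+)?)/';

    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Sample markup copied from the question.
$sample = '<td width="55%"><div class="value">
                                            &pound;6.99 <font size="3"> </font></div>
                                        </td>';

echo extractPrice($sample); // 6.99
```

Because \s matches newlines as well as spaces, no /s modifier is needed here, and the function returns null instead of raising a notice when the page layout changes and nothing matches.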

Thank you, mate. That is exactly what I need it to do. You are a legend!

You are welcome.
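On the original question of whether this is possible when the value isn't wrapped in its own tag: another option, sketched here only as an assumption-laden alternative to the regex approach, is to let PHP's DOM extension parse the markup and then pull the number out of the div's text. This assumes the DOM extension is installed and that the class name stays "value"; the function name extractPriceDom is made up for the example.

```php
<?php
// Hypothetical helper, not part of the thread's accepted answer.
function extractPriceDom(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Find the first <div class="value"> anywhere in the document.
    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//div[@class="value"]')->item(0);
    if ($node === null) {
        return null;
    }

    // textContent has the markup stripped and entities decoded;
    // keep only the first decimal number found in it.
    return preg_match('/\d+(?:\.\d+)?/', $node->textContent, $m) === 1 ? $m[0] : null;
}

$sample = '<html><body><table><tr><td width="55%"><div class="value">
    &pound;6.99 <font size="3"> </font></div>
</td></tr></table></body></html>';

echo extractPriceDom($sample); // 6.99
```

The trade-off is that this no longer cares about whitespace or about the <font> tag, but it does load the whole page into a DOM tree, which is heavier than a single preg_match call.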
