
I'm taking some baby steps in PHP.

Here is my simple code...

<?php 
$data = file_get_contents('http://www.blankwebsite.com');
//$regex = '/<TITLE>(.+?)\<\/TITLE\>/';
$regex = '/TITLE>(.+?)TITLE/';
preg_match($regex,$data,$match);
echo "blah";
echo "<br>";
echo $match[1];
?>

The target source is basically this...

<HEAD>
<TITLE>Blank website. Blank site. Nothing to see here.</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="keywords" CONTENT="blankwebsite, blank website">
<META NAME="description" CONTENT="This resource is provided as a free service for those patrons looking for nothing.">
<META NAME="Author" CONTENT="Traffic Names Ltd. 1999-2014 - http://www.dotcomagency.com - descriptive website names">
<META NAME="Copyright" CONTENT="Traffic Names Ltd. 1999-2014 - http://www.dotcomagency.com - descriptive dotcom domain names">
<LINK rel='stylesheet' type='text/css' href='/names/sitestyle.css'>
</HEAD>

My question is: why do I get the same result with both the active regex and the commented-out regex?
The result is...

blah
Blank website. Blank site. Nothing to see here.

Thanks for reading.

Last Post by Suzie999

Yes, the source above is taken from the website.

Which is a genuine site by the way.

(Edit) Wait, you meant my PHP page?

Yes, I see...

blah<br>Blank website. Blank site. Nothing to see here.</

Edited by Suzie999


In addition to Mr. Pritaeas's response, this

$regex = '/TITLE>(.+?)TITLE/';

gives us the expected result.

Blank website. Blank site. Nothing to see here.</

Yes, this </ is included. So, it isn't really the same result as

$regex = '/<TITLE>(.+?)\<\/TITLE\>/';

which will give us

 Blank website. Blank site. Nothing to see here.
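To see the difference between the two patterns directly, a minimal sketch like the following runs both against the title line from the question. The `$html` string and the variable names `$loose` and `$strict` are assumptions made for illustration; the sketch does not fetch the live site.

```php
<?php
// Sample input copied from the question's target source (assumed, not fetched live).
$html = '<TITLE>Blank website. Blank site. Nothing to see here.</TITLE>';

// Pattern without the angle brackets: the lazy group stops at the first
// "TITLE" it can, which sits inside the closing tag, so "</" is captured too.
preg_match('/TITLE>(.+?)TITLE/', $html, $loose);

// Pattern with the full tags: the capture ends right before "</TITLE>".
preg_match('/<TITLE>(.+?)<\/TITLE>/', $html, $strict);

// var_dump shows the exact strings, including characters a browser might not render.
var_dump($loose[1]);
var_dump($strict[1]);
```

Run from the command line, the two dumped strings differ only by the trailing `</`.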

One limitation of the regex above is that it will not match anything if the title tag is written in lowercase, as is common in HTML5 documents:

$title = '<title>Blank website. Blank site. Nothing to see here.</title>';

To make our regex case-insensitive, we can add the i modifier and change the pattern to

$regex = '/<title>(.*)<\/title>/i';

The above should return the title string from either

$title = '<title>Blank website. Blank site. Nothing to see here.</title>';

or

$title = '<TITLE>Blank website. Blank site. Nothing to see here.</TITLE>';

Test:

if (preg_match('/<title>(.*)<\/title>/i', $title, $matches)) {
    echo $matches[1];
}

This should print the title string for either of the $title variables.
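As a quick check of that claim, here is a small sketch that runs the case-insensitive pattern over both spellings in one go. The `$titles` array and the `$found` list are hypothetical names introduced only for this example.

```php
<?php
// Both spellings of the tag from the post above.
$titles = [
    '<title>Blank website. Blank site. Nothing to see here.</title>',
    '<TITLE>Blank website. Blank site. Nothing to see here.</TITLE>',
];

$found = [];
foreach ($titles as $title) {
    // The trailing "i" modifier makes the match case-insensitive.
    if (preg_match('/<title>(.*)<\/title>/i', $title, $matches)) {
        $found[] = $matches[1];
    }
}

// Print each captured title on its own line.
echo implode(PHP_EOL, $found), PHP_EOL;
```

Both inputs should yield the same captured text, without the surrounding tags.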


Ok yes.

With the commented regex I see this in the page source:

blah<br>Blank website. Blank site. Nothing to see here.

I was just looking at what was displayed in the browser.
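One way to avoid this kind of confusion when checking regex output in a browser is to escape the captured text first, so any leftover markup is shown literally. This is only a sketch: `$captured` and `$visible` are hypothetical names, and the string is the capture produced by the regex without angle brackets.

```php
<?php
// Hypothetical capture, as produced by the regex without angle brackets.
$captured = 'Blank website. Blank site. Nothing to see here.</';

// htmlspecialchars() converts "<" to "&lt;", so the browser displays it as text
// instead of trying to interpret it as the start of a tag.
$visible = htmlspecialchars($captured);

echo $visible;
```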

Thank you folks.
