Hi,

My problem is with a regular expression. I need to extract the content of the href attribute from each hyperlink tag in some HTML, but only when the link text contains a search term.
Here is a snippet:

$a = "nail"; // search criteria: text to look for between <a> and </a>
$s =<<<EOF
<a class="level2" onmouseout="this.style.background = '';" onmouseover="this.style.background ='#2c7cf4';" href="index.php?item&amp;module=1&amp;category=11" style="">  kaut kas cits </a>
<img> </img>
<br>
<a class="level2" onmouseout="this.style.background = '';" onmouseover="this.style.background ='#2c7cf4';" href="index.php?item&amp;module=1&amp;category=13" style="">  nails </a>
<a class="level2" onmouseout="this.style.background = '';" onmouseover="this.style.background ='#2c7cf4';" href="index.php?item&amp;module=1&amp;category=15" style="">  kaut kas cits2 </a>
EOF;
// this is as far as I got
$regexp = "#<a\s[^>]*href=\"([^\"]*)\"[^>]*>(.*{$a}+.*)</a>#siU";

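/* rule explanation
# - pattern delimiter
\s - whitespace char
. - any char (newline included only when the s modifier is set)
* - 0 or more
+ - 1 or more
s - (PCRE_DOTALL) modifier: . also matches newline
i - (PCRE_CASELESS) modifier: match UPPER or lower case letters
^ - negation, when it is the first char inside [...]
U - (PCRE_UNGREEDY) modifier: quantifiers match as little as possible
*/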
//finding all hyperlinks
preg_match_all($regexp, $s, $matches);
// result must be 'index.php?item&amp;module=1&amp;category=13'
var_dump($matches);

It would be great if you could help me out with this regexp. Thanks!

Here's what you want:

$a = "nail"; // search criteria: text to look for between <a> and </a>
$s =<<<EOF
<a class="level2" onmouseout="this.style.background = '';" onmouseover="this.style.background ='#2c7cf4';" href="index.php?item&amp;module=1&amp;category=11" style="">  kaut kas cits </a>
<img> </img>
<br>
<a class="level2" onmouseout="this.style.background = '';" onmouseover="this.style.background ='#2c7cf4';" href="index.php?item&amp;module=1&amp;category=13" style="">  nails </a>
<a class="level2" onmouseout="this.style.background = '';" onmouseover="this.style.background ='#2c7cf4';" href="index.php?item&amp;module=1&amp;category=15" style="">  kaut kas cits2 </a>
EOF;
// capture group 1 is the href value; preg_quote() escapes any regex metacharacters in $a
$regexp = '/<a\s[^>]*href="([^"]*)"[^>]*>[^<]*' . preg_quote($a, '/') . '[^<]*<\/a>/i';
//finding all hyperlinks
preg_match_all($regexp, $s, $matches);
// result must be 'index.php?item&amp;module=1&amp;category=13'
print_r($matches);

OUTPUT:

Array
(
    [0] => Array
        (
            [0] => <a class="level2" onmouseout="this.style.background = '';" onmouseover="this.style.background ='#2c7cf4';" href="index.php?item&amp;module=1&amp;category=13" style="">  nails </a>
        )

    [1] => Array
        (
            [0] => index.php?item&amp;module=1&amp;category=13
        )

)

You can then just read off the match(es) from $matches[1].

So, in other words, $matches[1][0] is your first match, $matches[1][1] is your second match (if it exists), $matches[1][2] is your third match (if it exists), and so on.
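
If you need this in more than one place, the lookup can be wrapped in a small function. This is only a minimal sketch: the name find_hrefs_by_link_text and the shortened sample HTML are made up for illustration, and it assumes the link text contains no nested tags. The search term goes through preg_quote() so that characters such as "+" or "." in it are matched literally.

```php
<?php
// Hypothetical helper (name invented for this example): returns the href of
// every <a> tag whose link text contains $needle, case-insensitively.
function find_hrefs_by_link_text(string $html, string $needle): array
{
    // preg_quote() escapes regex metacharacters in the search term;
    // '/' is passed as well because it is the pattern delimiter.
    $pattern = '/<a\s[^>]*href="([^"]*)"[^>]*>[^<]*'
             . preg_quote($needle, '/')
             . '[^<]*<\/a>/i';

    // preg_match_all() returns the number of matches (0 if none) or false on error.
    if (!preg_match_all($pattern, $html, $matches)) {
        return [];
    }
    return $matches[1]; // capture group 1: the href values
}

// Shortened sample markup, assumed for illustration only.
$html = '<a class="x" href="index.php?category=11"> something else </a>' . "\n"
      . '<a class="x" href="index.php?category=13"> nails </a>' . "\n"
      . '<a class="x" href="index.php?category=15"> Nail polish </a>';

foreach (find_hrefs_by_link_text($html, 'nail') as $i => $href) {
    echo $i, ': ', $href, "\n";
}
// prints:
// 0: index.php?category=13
// 1: index.php?category=15
```

The function returns an empty array when nothing matches, so the foreach loop is safe without an extra isset() check.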

Thanks for your answer. This is what I came up with yesterday evening:

$regexp = "#<a\s[^>]*href=\"([^\"]*)\"[^>]*>\s*{$criteria}\s*</a>#siU";

which is pretty similar to your regexp :-)
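
One difference may be worth checking, assuming $criteria holds the same search word as $a above. With \s* on both sides, the link text has to be exactly the search word (plus optional whitespace), whereas [^<]* on both sides only requires the link text to contain it. A small sketch of the two behaviours, using made-up sample markup:

```php
<?php
// Assumption: $criteria plays the same role as $a earlier in the thread.
$criteria = 'nail';

// Sample markup invented for this comparison.
$html = '<a href="index.php?category=13"> nails </a>' . "\n"
      . '<a href="index.php?category=14"> nail </a>';

$quoted = preg_quote($criteria, '#'); // '#' is the pattern delimiter here

// Exact: link text must be the search word, optionally padded with whitespace.
$exact = '#<a\s[^>]*href="([^"]*)"[^>]*>\s*' . $quoted . '\s*</a>#siU';

// Contains: any text without "<" may surround the search word.
$contains = '#<a\s[^>]*href="([^"]*)"[^>]*>[^<]*' . $quoted . '[^<]*</a>#siU';

preg_match_all($exact, $html, $exactMatches);
preg_match_all($contains, $html, $containsMatches);

print_r($exactMatches[1]);    // only the category=14 link
print_r($containsMatches[1]); // both links
```

Which one is right depends on whether "nail" should also find "nails".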
