Is it possible to include the character < in a regular expression? I don't seem to be able to find a reference to it anywhere and am unable to build a regex with just this one character.

Yes, there is nothing special about the < character.

eg:

preg_match("/</", "<div>some text</div>", $matches);

Is there a specific way you're using <?

A common use in PHP is matching whole tags:

preg_match("/<(.*?)>/", "<div>some text</div>", $matches);

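With preg_match, this matches only the first tag, <div>, and captures div in $matches[1]. To match both <div> and </div>, use preg_match_all with the same pattern.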
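To see the difference between the two functions, here is a minimal sketch using the same pattern and sample string as above. The variable names $html, $first and $all are just placeholders for illustration.

```php
<?php
$html = "<div>some text</div>";

// preg_match() stops after the first match.
// $first[0] holds the whole match, $first[1] the captured group.
preg_match("/<(.*?)>/", $html, $first);
print $first[0] . "\n"; // <div>
print $first[1] . "\n"; // div

// preg_match_all() collects every match.
// With the default flags, $all[0] lists the whole matches
// and $all[1] lists the captured groups.
preg_match_all("/<(.*?)>/", $html, $all);
print implode(" ", $all[0]) . "\n"; // <div> </div>
```

Note that < needs no escaping in either call.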

Thanks digital-ether,
Every time I tried putting < in a regex, I found it didn't work. I'm simply trying to stop people inserting code/links in various text areas. I shall try your regex, which seems straightforward, and get back to you.
I'm wondering if I might have a problem with my keyboard.

Well digital-ether, it didn't work.

Here's my code
$comment = htmlentities($_POST);
if (preg_match("/</", $comment)) { print 'string is NOT OK!'; } else { print 'string is OK!'; }
When I input the character <, I get "string is OK!".
However, if I substitute A for < in the regex, then inputting A as $comment gives me "string is NOT OK!", i.e. it works.

My thanks to digital-ether, who was absolutely correct.
For the benefit of others who may be experiencing regex problems: the character '<', being an HTML special character, is converted to &lt; by htmlentities, which I had in my script.
When I changed the regex to "/&lt;/" instead of "/</", my if statement worked.
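A short sketch of what was happening, assuming a sample input string ($raw is a made-up value, not the original form data):

```php
<?php
$raw = "<b>hi</b>";

// htmlentities() replaces < and > with the entities &lt; and &gt;.
$escaped = htmlentities($raw);
print $escaped . "\n"; // &lt;b&gt;hi&lt;/b&gt;

// After escaping there is no literal < left to match.
var_dump(preg_match("/</", $escaped));    // int(0)
var_dump(preg_match("/&lt;/", $escaped)); // int(1)

// The raw string still matches the original pattern.
var_dump(preg_match("/</", $raw));        // int(1)
```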

Regards to all
Taffd

Hi Taffd,

Yes, it changes to &lt; because you used htmlentities() on it.
Normally, you want to use htmlentities() last.

So do your checks first with the raw input, then use htmlentities() as the final transformation.

E.g.:
Say you use this for a forum. When the user submits a post, you check the raw post for HTML tags. Then you can save the post to the database or file, etc.
When it's time to display the post, you pass it through htmlentities() just before output.
This preserves the original post and allows dynamic transformations on it later. (E.g. if you later decide to allow HTML tags, the original markup is still there.)
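The flow above could be sketched roughly like this. The function names containsTag() and renderPost() are hypothetical, the tag pattern is only one possible check, and a plain array stands in for the database or file.

```php
<?php
// Hypothetical helper: check the RAW input for anything shaped like a tag.
function containsTag(string $post): bool {
    return preg_match("/<[^>]*>/", $post) === 1;
}

// Hypothetical helper: escape only at display time.
function renderPost(string $storedPost): string {
    return htmlentities($storedPost, ENT_QUOTES, "UTF-8");
}

$submitted = 'Nice <a href="http://example.com">link</a>';
$posts = []; // stand-in for real storage

// 1. Check the raw input first.
if (containsTag($submitted)) {
    print "Post contains HTML tags\n";
}

// 2. Store the post unmodified, so the original is preserved.
$posts[] = $submitted;

// 3. Convert to entities as the final step, just before output.
print renderPost($posts[0]) . "\n";
```

Whether to reject a post that contains tags or just flag it is up to you. The point is only that the check sees the same text the user typed.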

Just a tip...

Applying a transformation before your checks can also be a security hazard, because the checks then run on text that differs from what the user submitted, and the transformation can produce results you did not anticipate. That isn't the case for your code, however...
