Hello,

I have 3 text boxes where users can enter their AIM and YIM nicknames and their MSN email. Once submitted, the values are sent and stored in a MySQL database.

I will use mysql_real_escape_string() of course, but I am not sure what the best way is to ensure no other malicious code has been entered.

I checked, and because the three fields above allow special characters, I cannot use strip_tags() and stripslashes().

They will be shown on a profile page, so I need to ensure users cannot enter HTML that would be output on the page and break it, or any other malicious code.

So my question is: what is the best way to validate the three fields above using PHP, apart from the obvious mysql_real_escape_string()?

I know I could use regex, but I don't understand it well enough to write three complex regexes. It would be interesting to know, as the majority of forums let people add their contact details like AIM, YIM and MSN, so I wonder how they validate them.

Any help would be appreciated.

Thanks for reading.

Hey.

You would probably be safest using Parameterized Statements. That way the database driver itself secures the input and makes sure it doesn't contain characters that would mess up the query.

For example, using the mysqli extension. (Taken from Wikipedia)

$db = new mysqli("localhost", "user", "pass", "database");
$stmt = $db -> prepare("SELECT priv FROM testUsers WHERE username=? AND password=?");
$stmt -> bind_param("ss", $user, $pass);
$stmt -> execute();
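
The Wikipedia snippet is a SELECT. For the case in the question (storing the three fields), a write could look roughly like the sketch below. The `profiles` table, its column names and the `save_contact_details()` helper are all made up for illustration, so adjust them to your own schema.

```php
<?php
// Hypothetical helper: stores the three contact fields for one user.
// Assumes a table `profiles` with columns user_id, aim, yim and msn.
function save_contact_details(mysqli $db, int $userId, string $aim, string $yim, string $msn): bool
{
    $stmt = $db->prepare('UPDATE profiles SET aim = ?, yim = ?, msn = ? WHERE user_id = ?');
    if ($stmt === false) {
        // prepare() returns false on failure when mysqli is not in exception mode.
        return false;
    }
    // "sssi" = three strings followed by one integer, in placeholder order.
    $stmt->bind_param('sssi', $aim, $yim, $msn, $userId);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

The values never become part of the SQL text, so quotes or other special characters in a nickname cannot change the query.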

As to displaying them safely...

MSN is easy. You can find regular expressions to verify emails all over the place.
But I'm not sure about AIM or YIM. I don't know what the usernames look like for those two. Could you give us examples?
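
For the MSN email, one option that avoids hunting for a regular expression is PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL. This is only a sketch, and the `is_valid_msn()` name is made up for illustration:

```php
<?php
// Hypothetical helper: true when $email is a syntactically valid address.
function is_valid_msn(string $email): bool
{
    // filter_var() returns the filtered string on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_valid_msn('someone@example.com'));       // bool(true)
var_dump(is_valid_msn('<script>alert(1)</script>')); // bool(false)
```

It only checks the format of the address, not whether the mailbox exists.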

However, to begin with, you could preg_replace all the <script>, <img> and <a> tags out using something like:

$regexp = '#<(script)[^>]*>([^<]*</\s*\\1\s*>)?|</?a[^>]*>|<img[^/>]*/?>#i';
$safe = preg_replace($regexp, '', $input);
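
Stripping a fixed list of tags leaves every other tag in place, so a common complement (not something proposed in this thread) is to escape the value when printing it, using PHP's htmlspecialchars(). A minimal sketch; the helper name `e()` is made up here:

```php
<?php
// Hypothetical helper: escapes a value for output inside an HTML page.
function e(string $value): string
{
    // ENT_QUOTES escapes both double and single quotes,
    // so the result can also be used inside an attribute value.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

echo e('<b onmouseover="x()">genie</b>'), "\n";
// prints: &lt;b onmouseover=&quot;x()&quot;&gt;genie&lt;/b&gt;
```

The browser then shows the text exactly as typed instead of interpreting it as markup.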

Hi,

Thanks for the replies. Regarding the first reply: I had a look but don't fully understand it.

To the second reply:

I just checked and for AIM it can be between 3 and 16 alphanumeric characters in length and must begin with a letter

For YIM:

Use 4 to 32 characters and start with a letter. You may use letters, numbers, underscores, and one dot (.)

Although I agree with using regular expressions, the problem I have is that some of the ones I find are either dodgy and not 100% working, or I hear they have security loopholes. I don't know what is safe and works, because my understanding of them is very limited.

Thanks for any help.

I just checked and for AIM it can be between 3 and 16 alphanumeric characters in length and must begin with a letter

For YIM:

Use 4 to 32 characters and start with a letter. You may use letters, numbers, underscores, and one dot (.)

Ok, those are pretty simple to validate. You could use these regular expressions:

$aim_regexp = '/\w[\w\d]{2,15}/i';
$yim_regexp = '/\w[\w\d_\.]{3,31}/i';

Although I agree with using regular expressions, the problem I have is that some of the ones I find are either dodgy and not 100% working, or I hear they have security loopholes. I don't know what is safe and works, because my understanding of them is very limited.

Those two I posted are very simple, relatively speaking.

Basically, you start by choosing a delimiter: a single character to mark the start and end of the pattern. In the two expressions above, I used the / character, as in /\w[\w\d]{2,15}/i . In the one I posted earlier, I used # , as in #<(script)[^>]*>([^<]*</\s*\\1\s*>)?|</?a[^>]*>|<img[^/>]*/?>#i .

To the far right, after the second delimiter, you can specify a number of modifiers. In all of my preceding expressions, I used the i modifier. It makes the pattern case-insensitive, as in /\w[\w\d]{2,15}/i .

So the outer shell of both of the above expressions is //i .
If we take the AIM expression as an example, we need to match a single "letter" followed by 2-15 alpha-numeric characters.

In a regular expression \w represents all letters, and \d represents all numbers. So to match a single letter I do /\w/i .

Next I need to match 2-15 letters or numbers. I can use square-brackets to "group" characters, so that the engine will look for any character in the group. So to look for letters or numbers I create a group like [\w\d] . Put that together with what we had before and we get /\w[\w\d]/i .

That will only match a single letter, followed by a single letter or a number. We can use curly-brackets to indicate how many of a certain character we want matched. Adding {2,15} to a character, or a character group, tells the expression we want between 2 and 15 of those matched.

Adding that to our earlier pattern we get /\w[\w\d]{2,15}/i . This will match a single letter, followed by 2-15 letters or numbers, which is what the AIM spec you found tells us.
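
To see the effect of the {2,15} quantifier, here is a small sketch that runs the intermediate and the final pattern against the same input. The variable names are just for illustration:

```php
<?php
$input = 'genieuk';

// One \w character followed by exactly one more: only the first two characters match.
preg_match('/\w[\w\d]/i', $input, $m);
echo $m[0], "\n"; // ge

// With {2,15} the bracketed part may repeat, so the whole name is matched.
preg_match('/\w[\w\d]{2,15}/i', $input, $m);
echo $m[0], "\n"; // genieuk

// Fewer than three characters in total: no match at all.
var_dump(preg_match('/\w[\w\d]{2,15}/i', 'ab')); // int(0)
```

The third argument to preg_match() receives the matched text, which is handy for checking what a pattern really picks up.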

Wow Alti :)

Thanks ever so much for explaining and doing the regex patterns for me.

Superb thanks ever so much :)

Regards,
Mathew

Hi Alti,

Both the AIM and YIM patterns don't seem to work.

If I enter just letters, like genieuk, it says it's invalid. I also tried letters and numbers, following the regex rule, and it still says invalid.

Any suggestions?

if ( !empty($aim) && !preg_match('/\w[\w\d]{2,15}/i')) {
		 $aim_invalid .= "<p><span class=\"error\">Your <b>AIM</b> nickname is not valid</span></p>";
		 $err++;
	}
	
if ( !empty($yim) && !preg_match('/\w[\w\d_\.]{3,31}/i')) {
		 $yim_invalid .= "<p><span class=\"error\">Your <b>AIM</b> nickname is not valid</span></p>";
		 $err++;
	}

Thanks.

You left out the second parameter of the preg_match function.
It should be:

if (!empty($aim) && !preg_match('/\w[\w\d]{2,15}/i', $aim)) {
	$aim_invalid .= "<p><span class=\"error\">Your <b>AIM</b> nickname is not valid</span></p>";
	$err++;
}

But!.. I did realize one thing that will affect the regular expressions.
I was under the impression that \w represented [a-zA-Z] , but in reality it appears to represent [a-zA-Z0-9_] .

Which means both the expressions need to be changed:

$aim_regexp = '/^[a-z][a-z0-9]{2,15}$/i';
$yim_regexp = '/^[a-z][\w\.]{3,31}$/i';

When creating character groups, you can use a dash ( - ) to indicate a range of characters. So [a-z0-9] means: All characters from a to z, and all numbers.

Also, note that I added ^ at the front of the patterns, and $ at the end. Those represent the beginning and end of the input, respectively. Adding them like that means that the pattern must match the entire input string, or it will fail.

In this case it ensures that there is nothing but the name in the input string. Without these "anchors", you could sneak in extra text before and after the name. (Forgot to add them in my last post. Sorry :))
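
A sketch of how the anchored patterns could be wrapped up, with two small additions that are not from the posts above: the D modifier, which stops $ from also matching just before a trailing newline, and a dot count for YIM, since the character class alone would accept several dots while the quoted rule allows one. The function names are made up for illustration.

```php
<?php
// Hypothetical helpers built on the anchored patterns.
function is_valid_aim(string $aim): bool
{
    // D: $ matches only at the very end of the string.
    return preg_match('/^[a-z][a-z0-9]{2,15}$/iD', $aim) === 1;
}

function is_valid_yim(string $yim): bool
{
    // The pattern checks length and allowed characters;
    // substr_count() enforces "at most one dot".
    return preg_match('/^[a-z][\w.]{3,31}$/iD', $yim) === 1
        && substr_count($yim, '.') <= 1;
}

var_dump(is_valid_aim('genieuk'));        // bool(true)
var_dump(is_valid_aim('<b>genieuk</b>')); // bool(false)
var_dump(is_valid_aim("genieuk\n"));      // bool(false)
var_dump(is_valid_yim('genie.uk_1'));     // bool(true)
var_dump(is_valid_yim('g.e.n.i.e'));      // bool(false)
```

The second call is the case the anchors exist for: the unanchored pattern would have found genieuk inside the tags and reported a match.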

Thanks so much once again. Works great now :) . Excellent support from you Alti.

Thanks as always.
