Regex to remove 'Illegal Characters'

Thread Solved

Venom Rush
  #1
23 Days Ago
Hi all

I've been trying to find a regular expression that checks whether input contains any of the following characters:

`~!@#$%^&*()-=+\|/?.>,<;:'"[{]}

I want to allow users to enter the letters a-z, digits and underscores, as well as any special character that resembles a letter, such as é, ê, ô or ÿ.

So far I have the following, which also rejects the special characters that I want to allow:
    /[^\w\s]/g
The code I'm using is as follows:
    function checkName (strng) {
      var error = "";

      var illegalChars = /[^\w\s]/g; // allow letters, numbers, and underscores
      if (strng == "") {
        error = "Please enter your name.\n";
      }
      else if ((strng.length < 2)) {
        error = "The name is the wrong length.\n";
      }
      else if (illegalChars.test(strng)) {
        error = "The name contains illegal characters.\n";
      }
      return error;
    }
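The reason this pattern rejects é and similar letters is that \w in JavaScript covers only the ASCII set [A-Za-z0-9_], so every accented letter lands in the negated class. A minimal sketch of that behaviour follows; the function name hasNonWordChar and the sample names are made up for illustration, and the /g flag is left off so that test() keeps no state between calls.

```javascript
// \w in JavaScript is ASCII-only: [A-Za-z0-9_]
var notWordOrSpace = /[^\w\s]/; // no /g flag, so test() keeps no state

// Hypothetical helper: true when the string has a character outside \w and \s.
function hasNonWordChar(s) {
  return notWordOrSpace.test(s);
}

console.log(hasNonWordChar("Zoe"));     // false: plain ASCII letters
console.log(hasNonWordChar("Zoé"));     // true: é is outside \w
console.log(hasNonWordChar("Zoe_2 x")); // false: underscore, digit and space pass
```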
d5e5
  #2
23 Days Ago
Try the following character class:
    var illegalChars = /[\u0021-\u002f\u003a-\u0040\u005b-\u005e\u0060\u007b-\u007e]/g; // Don't allow any of these
JavaScript supports specifying Unicode characters with hexadecimal escapes like \u0060, and ranges of them like \u007b-\u007e.

There is a fun website at http://hamstersoup.com/javascript/re...ss_tester.html that gives you the unicode expressions for any character class you specify.
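One way to check that the escaped ranges cover exactly the characters listed in the question is to test each one individually. The sketch below does that; the helper name missedCharacters and the sample name are made up for illustration, and the class is written without the /g flag so each test() call starts from the beginning of its input.

```javascript
// Character class from the post, written without the /g flag.
var illegalChars = /[\u0021-\u002f\u003a-\u0040\u005b-\u005e\u0060\u007b-\u007e]/;

// The characters listed in the question (backslash and double quote escaped).
var listed = "`~!@#$%^&*()-=+\\|/?.>,<;:'\"[{]}";

// Hypothetical helper: collect any character that the class fails to match.
function missedCharacters(chars) {
  var missed = [];
  for (var i = 0; i < chars.length; i++) {
    if (!illegalChars.test(chars.charAt(i))) {
      missed.push(chars.charAt(i));
    }
  }
  return missed;
}

console.log(missedCharacters(listed).length); // 0: every listed character is covered
console.log(illegalChars.test("Zoé_2"));      // false: letters, é, underscore and digits are allowed
```

The underscore (\u005f) sits between the \u005b-\u005e range and \u0060, which is why the class has to skip over it rather than use one continuous range.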
Venom Rush
  #3
21 Days Ago
Thanks for that, but there seems to be one problem: the 'illegal' characters are only detected if they are the first or last letter. If one is anywhere between valid characters, it goes undetected. Is there any way I could change the code to fix this?
d5e5
  #4
21 Days Ago
Venom Rush wrote: "...the 'illegal' characters are only detected if they are the first or last letter. If one is anywhere between valid characters, it goes undetected."
That's strange; I can't reproduce the problem. Here is the test script I'm using. Can you give me an example to put in the teststring variable that results in undetected illegal characters?
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd"
    >
    <html lang="en">
    <head>
    <title>Test regex punctuation filter</title>

    <script type="text/javascript">
    function checkName (strng) {
      var error = "";
      var illegalChars = /[\u0021-\u002f\u003a-\u0040\u005b-\u005e\u0060\u007b-\u007e]/g; // Don't allow any of these
      if (strng == "") {
        error = "Please enter your name.\n";
      }
      else if ((strng.length < 2)) {
        error = "The name is the wrong length.\n";
      }
      else if (illegalChars.test(strng)) {
        error = "The name contains illegal characters.\n";
      }
      return error;
    }

    // The following lines test the function with a string containing illegal chars.
    var teststring = "Here is a string with ill:egal ch@racters";
    var errmsg = checkName(teststring);
    //alert("Testing \'" + teststring + "\' results in \'" + errmsg + "\'");
    document.writeln("Testing <p><b>" + teststring + "</b><p> results in <p><b>" + errmsg);
    </script>

    </head>
    <body>

    </body>
    </html>
d5e5

Simplified my test script
  #5
21 Days Ago
I simplified my test script. It still seems to find illegal characters wherever they are in the test string.
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd"
    >
    <html lang="en">
    <head>
    <title>Test regex punctuation filter</title>

    <script type="text/javascript">
    function checkString (strng) {
      var error = false;
      var illegalChars = /[\u0021-\u002f\u003a-\u0040\u005b-\u005e\u0060\u007b-\u007e]/g; // Don't allow any of these
      error = (illegalChars.test(strng));
      return error;
    }

    // The following lines test the function with a string containing illegal chars.
    var teststring = "Here is a str&ing with illegal characters";
    var errmsg = checkString(teststring);
    document.writeln("Testing \"<b>" + teststring + "</b>\" results in <b>" + errmsg + "</b><p>");
    teststring = "Here is a string without any illegal characters";
    errmsg = checkString(teststring);
    document.writeln("Testing \"<b>" + teststring + "</b>\" results in <b>" + errmsg + "</b><p>");
    </script>

    </head>
    <body>
    </body>
    </html>
Venom Rush
  #6
17 Days Ago
Strange. Try using a single word in your test, with an illegal character in the middle of the word. That's what I've been testing, and it only detects the illegal character at the start or end of the word.
d5e5

Remove the /g modifier
  #7
17 Days Ago
In that regex for illegal characters I should not have put the /g modifier at the end. All we need to know is whether there is at least one illegal character in the string, so the search can stop at the first one. With the /g (global) modifier, test() records where its last match ended in the regex's lastIndex property and starts the next search from that position rather than from the start of the string. We don't need /g, and it was giving us inconsistent results.

In your script, try replacing the regex like this:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd"
    >
    <html lang="en">
    <head>
    <title>Test regex punctuation filter</title>

    <script type="text/javascript">
    function checkString (strng) {
      var error = false;
      //var illegalChars = /[\u0021-\u002f\u003a-\u0040\u005b-\u005e\u0060\u007b-\u007e]/g; // The /g (for global) is a goof.
      var illegalChars = /[\u0021-\u002f\u003a-\u0040\u005b-\u005e\u0060\u007b-\u007e]/; // NOT global.
      error = (illegalChars.test(strng));
      return error;
    }

    // The following lines test the function with a string containing illegal chars.
    var teststring = "Java$cript"; // Illegal character. Test should return 'true'
    var errmsg = checkString(teststring);
    document.writeln("Testing \"<b>" + teststring + "</b>\" results in <b>" + errmsg + "</b><p>");
    teststring = "Javascript"; // No illegal character. Test should return 'false'
    errmsg = checkString(teststring);
    document.writeln("Testing \"<b>" + teststring + "</b>\" results in <b>" + errmsg + "</b><p>");
    </script>

    </head>
    <body>

    </body>
    </html>
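The inconsistent results with /g can be reproduced in isolation. A regex with the global flag stores the end of its last match in lastIndex, and the next test() call on the same regex object starts searching from that index, even if it is given a different string. The sketch below keeps the regex in a variable outside the function so that one object is reused on every call. Engines following ECMAScript 3 also reused a single object for a regex literal written inside a function, and it is an assumption here that this is what happened in the thread; the names sharedGlobal, plain, checkGlobal and checkPlain are made up for illustration.

```javascript
// One regex object reused across calls, WITH the /g flag.
var sharedGlobal = /[\u0021-\u002f\u003a-\u0040\u005b-\u005e\u0060\u007b-\u007e]/g;
// Same class without /g: test() always searches from index 0.
var plain = /[\u0021-\u002f\u003a-\u0040\u005b-\u005e\u0060\u007b-\u007e]/;

function checkGlobal(s) { return sharedGlobal.test(s); }
function checkPlain(s) { return plain.test(s); }

// First call finds "$" at index 4 and leaves lastIndex at 5.
console.log(checkGlobal("Java$cript"), sharedGlobal.lastIndex); // true 5
// Second call starts at index 5, so the "$" at index 1 is never examined.
console.log(checkGlobal("J$vascript"), sharedGlobal.lastIndex); // false 0
// A failed match resets lastIndex to 0, so the same input now matches.
console.log(checkGlobal("J$vascript"), sharedGlobal.lastIndex); // true 2

// Without /g the result depends only on the input string.
console.log(checkPlain("Java$cript"), checkPlain("J$vascript")); // true true
```

If a global regex has to be reused for some other reason, setting its lastIndex to 0 before each test() call has the same effect as dropping the flag for this purpose.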
Venom Rush
  #8
16 Days Ago
Perfect. Thanks for that d5e5. Works exactly the way I want now.

Tags
javascript, regex, unicode

This thread has been marked solved.

©2003 - 2009 DaniWeb® LLC