Regular Expression for Email

Here is a slightly odd regular expression for matching email addresses.

Email addresses come in various forms:

string1@somemail.com
string1@somemail.co.in
string1.string2@somemail.com
string1.string2@somemail.co.in

The following regular expression can find any of the addresses above:

import re

email2="santa.banta@gmail.co.in"
email1="arindam31@yahoo.co.in"
email="bogusemail123@sillymail.com"
email3="santa.banta.manta@gmail.co.in"
email4="santa.banta.manta@gmail.co.in.xv.fg.gh"
email5="abc.dcf@ghj.org"
email6="santa.banta.manta@gmail.co.in.org"

re.search('\w+[.|\w]\w+@\w+[.]\w+[.|\w+]\w+',email)

>>> x=re.search('\w+[.|\w]\w+@\w+[.]\w+[.|\w+]\w+',email2)
>>> x.group()
'santa.banta@gmail.co.in'
>>> x=re.search('\w+[.|\w]\w+@\w+[.]\w+[.|\w+]\w+',email1)
>>> x.group()
'arindam31@yahoo.co.in'
>>> x=re.search('\w+[.|\w]\w+@\w+[.]\w+[.|\w+]\w+',email)
>>> x.group()
'bogusemail123@sillymail.com'
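
One detail worth noting about the pattern above: inside square brackets, `|` is a literal character rather than alternation, so `[.|\w]` matches a dot, a pipe or a word character. A minimal sketch of the effect (the test strings are made up for illustration):

```python
import re

# Inside [...] the '|' is a literal pipe, not an "or".
char_class = re.compile(r'[.|\w]')

print(bool(char_class.fullmatch('.')))   # dot is accepted
print(bool(char_class.fullmatch('a')))   # word character is accepted
print(bool(char_class.fullmatch('|')))   # the pipe itself is accepted too

# So the pattern from the post also accepts a pipe inside the username.
pattern = r'\w+[.|\w]\w+@\w+[.]\w+[.|\w+]\w+'
m = re.search(pattern, 'santa|banta@gmail.co.in')
print(m.group())
```

If the intent was "a dot or a word character", `[.\w]` would express that without letting the pipe through.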

This seems too complicated, right?

So I generalized it a bit:

>>> x=re.search('(\w+[.|\w])*@(\w+[.])*\w+',email4)
>>> x.group()
'santa.banta.manta@gmail.co.in.xv.fg.gh'

The above regular expression can now match any number of dot-separated parts on either side of the '@'.

Now if you want only email addresses ending with '.in' or '.com',
you can add a variation:

>>> x=re.search('(\w+[.|\w])*@(\w+[.])*(com$|in$)',email)

You can try this out on various combinations.
If the expression does not fit somewhere, do tell me.
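
One way to try the combinations is to loop over the sample addresses. This is only a sketch: the helper name `check` is made up for illustration, and the two patterns are the ones from the post, written as raw strings.

```python
import re

samples = [
    "bogusemail123@sillymail.com",
    "santa.banta@gmail.co.in",
    "santa.banta.manta@gmail.co.in.xv.fg.gh",
    "abc.dcf@ghj.org",
    "santa.banta.manta@gmail.co.in.org",
]

general = r'(\w+[.|\w])*@(\w+[.])*\w+'
com_or_in = r'(\w+[.|\w])*@(\w+[.])*(com$|in$)'

def check(pattern, address):
    """Return the matched text, or None when re.search finds nothing."""
    m = re.search(pattern, address)
    return m.group() if m else None

for address in samples:
    print(address, '->', check(general, address), '|', check(com_or_in, address))
```

With the second pattern only the first two samples should match, since the others do not end in 'com' or 'in'.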

One assumption I have made: the username part of the address won't contain special characters, only word characters (letters, digits and underscore).

arindam31

Good stuff!

thines01

Thank you Sir

arindam31

When you have a useful piece of code and no issue, write a code snippet instead of a regular thread. You can choose this in the 'title' section when you start a new thread.

Gribouillis

Just some points: always use a raw string r' ' with regex.
For email validation use re.match(); for extracting use re.search() or re.findall().
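
A small sketch of the difference between these calls, and of why the raw string matters. The sample text and the deliberately loose pattern are made up for illustration; `re.fullmatch` needs Python 3.4 or later.

```python
import re

text = "contact: abc.dcf@ghj.org or bogusemail123@sillymail.com"
pattern = r'[\w.]+@[\w.]+'   # deliberately loose, for illustration only

# re.match only tries at the start of the string.
print(re.match(pattern, text))

# re.search finds the first occurrence anywhere in the string.
first = re.search(pattern, text).group()
print(first)

# re.findall returns every non-overlapping occurrence.
found = re.findall(pattern, text)
print(found)

# re.match does not require the match to reach the end of the string;
# re.fullmatch does.
trailing = "a@b.com and more"
print(bool(re.match(pattern, trailing)), bool(re.fullmatch(pattern, trailing)))

# Without the r prefix, '\b' is a single backspace character,
# not the two-character word-boundary escape.
print(len('\b'), len(r'\b'))
```

For validation this means `re.match` needs a `$` at the end of the pattern (or `re.fullmatch`), otherwise trailing junk is silently accepted.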

There is no simple solution for this; different RFC standards define what a valid email address is.
This has come up in several forums; here is a good discussion you can read.
http://stackoverflow.com/questions/201323/what-is-the-best-regular-expression-for-validating-email-addresses

Just a quick test with an invalid email address.

>>> e = 'Abc.@example.com'
>>> if re.match(r'(\w+[.|\w])*@(\w+[.])*\w+', e):
...     print('Successful match')
... else:
...     print('Match attempt failed')
...
Successful match

Good effort. This is not so much criticism; I think you, like many others, have stumbled into this.
A regex that validates email addresses according to RFC 822 looks like this:
http://ex-parrot.com/~pdw/Mail-RFC822-Address.html
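
Short of the full RFC pattern, a stricter variant can at least reject the trailing-dot case from the test above. This is a sketch only: it keeps the original post's assumption that names consist of word characters separated by single dots, and the names `STRICTER` and `looks_valid` are made up for illustration.

```python
import re

# Assumption (from the original post): names contain only word characters,
# separated by single dots. This is far from full RFC coverage.
STRICTER = re.compile(r'\w+(\.\w+)*@\w+(\.\w+)+')

def looks_valid(address):
    """True when the whole string fits the stricter pattern."""
    return STRICTER.fullmatch(address) is not None

for address in ['Abc.@example.com', 'santa.banta@gmail.co.in',
                '.abc@example.com', 'abc@example']:
    print(address, looks_valid(address))
```

Requiring at least one word character after every dot is what rules out `Abc.@example.com`, and `fullmatch` keeps trailing text from slipping through.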

snippsat

For some reason I prefer to use plain Python instead of the re language for this. Actually, one of my first code snippets here was on the same subject: www.daniweb.com/software-development/python/code/280071/1209215#post1209215
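
For readers who want a feel for the plain-Python approach, here is a minimal sketch. It is not the code from the linked snippet; the function name `is_simple_email` and the allowed character set are assumptions chosen for illustration.

```python
import string

# Assumed character set for illustration; real addresses allow more.
ALLOWED = set(string.ascii_letters + string.digits + '_')

def is_simple_email(address):
    """Check 'name@domain' where both sides are dot-separated allowed words."""
    local, sep, domain = address.partition('@')
    if not sep or '@' in domain:
        return False          # need exactly one '@'
    if '.' not in domain:
        return False          # require at least one dot in the domain
    for part in local.split('.') + domain.split('.'):
        # an empty part means a leading, trailing or doubled dot
        if not part or not set(part) <= ALLOWED:
            return False
    return True

print(is_simple_email('Abc.@example.com'))
print(is_simple_email('abc.dcf@ghj.org'))
```

Each rule is a separate, readable check, which makes it easier to see why a given address is rejected than with one long pattern.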

pyTony

When you have a useful piece of code and no issue, write a code snippet instead of a regular thread. You can choose this in the 'title' section when you start a new thread.

I'll keep this in mind, sir. Thanks for the suggestion.

arindam31

Good point, sir. This is a very useful tip:
for finding things, re.search or re.findall;
for matching things, re.match.

I'll keep this in mind.

arindam31

Wow, I never thought of this bug in the code. Very well caught, sir.
I will try to play with my code and see if I can work around the bug.

And I visited the link you provided. Man, that was a spiderweb. What is that, seriously? Far too complex.

arindam31

A more useful regular expression is explained at http://www.regular-expressions.info/email.html

r"\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"

I added the lowercase ranges so it does not need to be used with the ignore-case flag. The page also has a standards-conforming regular expression, but most features of the standard are not needed or helpful in practice.
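
A short sketch of this pattern used for extraction. The sample text and the name `EMAIL` are made up for illustration.

```python
import re

EMAIL = re.compile(r"\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b")

text = "Write to santa.banta@gmail.co.in or abc.dcf@ghj.org, not to foo@bar."
found = EMAIL.findall(text)
print(found)

# The {2,4} limit rejects longer top-level domains such as .museum.
long_tld = EMAIL.search('user@example.museum')
print(long_tld)

# The local part is a plain character class, so a trailing dot still passes.
trailing_dot = EMAIL.search('Abc.@example.com')
print(trailing_dot.group())
```

So this pattern is a practical trade-off rather than a validator: it still accepts `Abc.@example.com`, and it misses addresses whose last domain label is longer than four letters.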

pyTony

© 2013 DaniWeb® LLC