Inspired by the discussion at
http://www.daniweb.com/tutorials/tutorial238544.html
(test cases are from there)

I adapted my "magic" pattern matching function, originally written for numbers, and added a loop that checks the length of each part.

I like my own solution better; I do not know about you.

``````def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return True

if __name__ == '__main__':
    email1 = "test.@web.com"
    print email1,"is valid:",validateEmail(email1)
    email2 = "test+john@web.museum"
    print email2,"is valid:",validateEmail(email2)
    email3 = "test+john@web.m"
    print email3,"is valid:",validateEmail(email3)
    email4 = "a@n.dk"
    print email4,"is valid:",validateEmail(email4)
    email5 = "and.bun@webben.de"
    print email5,"is valid:",validateEmail(email5)``````
TrustyTony 888

Of course, after looking for a moment at these repeated tests written without a loop, I could not leave them as in the original:

``````# -*- coding: latin1 -*-
def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return True

if __name__ == '__main__':
    emails = [ "test.@web.com","test+john@web.museum", "test+john@web.m",
               "a@n.dk", "and.bun@webben.de","marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisa.hämäläinen@hel.fi"]
    print "Valid emails are:"
    for i in filter(validateEmail,emails): print '\t',i
    print "Non-ascii letters are nowadays allowed also in names!"
""" Output:
Valid emails are:
    and.bun@webben.de
    marjaliisa.hamalainen@hel.fi
    marjaliisa.hämäläinen@hel.fi
Non-ascii letters are nowadays allowed also in names!
"""``````

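A side note on why the hämäläinen address passes: the separator list is built with isalpha(), and that method accepts accented letters. Here is a minimal sketch, assuming Python 3 where strings are Unicode (the thread's code is Python 2); the helper name `separators` and the `ascii_only` switch are made up for illustration and are not part of the snippet above.

```python
def separators(address, ascii_only=False):
    """Return the characters the validator would treat as separators.

    With ascii_only=True, letters outside the ASCII range are also
    counted as separators (one possible restriction, not the only one).
    """
    return [ch for ch in address
            if not ch.isalpha() or (ascii_only and ord(ch) > 127)]


address = "marjaliisa.hämäläinen@hel.fi"
print(separators(address))                   # ['.', '@', '.']
print(separators(address, ascii_only=True))  # ['.', 'ä', 'ä', 'ä', '@', '.']
```

With the default setting only the dots and the @ remain, so the address keeps the ..@.. shape the validator expects.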
Why not regex?

``````import re

def validateEmail(address):
    pattern = "[\.\w]{2,}[@]\w+[.]\w+"
    if re.match(pattern, address):
        return True
    else:
        return False``````

The problem is that it does not reject test.@test.com as invalid. If anyone has an idea, please suggest it.
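One possible way to reject a trailing dot is to require that every dot in the name is followed by at least one word character. This is only a sketch under that assumption, written for Python 3; the name `validate_email_re` and the pattern are illustrative, and the pattern is deliberately narrow (no `+` or `-`, and it says nothing about what the standard allows).

```python
import re

# Word characters, optionally followed by ".word" groups, on both sides of @.
# The domain side needs at least one ".word" group.
PATTERN = r"\w+(?:\.\w+)*@\w+(?:\.\w+)+"


def validate_email_re(address):
    """Return True if the whole address fits the narrow pattern above."""
    return re.fullmatch(PATTERN, address) is not None


for candidate in ["test.@test.com", ".test@web.com", "and.bun@webben.de", "tony@localhost"]:
    print(candidate, "is valid:", validate_email_re(candidate))
```

Because re.fullmatch must consume the whole string, a dot directly before the @ (or at the very start) leaves nothing for the following `\w+` and the match fails.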

TrustyTony 888

OK, my code has a bug: before returning from the function, it does not check that the last part, which is left in the end variable, has a valid length. Luckily, because I reused the logic for another check, I noticed the bug while debugging it. Here is the correction:

``````# -*- coding: latin1 -*-
import re
def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return len(end)>1

def emailval(address):
    pattern = "[\.\w]{2,}[@]\w+[.]\w+"
    if re.match(pattern, address):
        return True
    else:
        return False

if __name__ == '__main__':
    emails = [ "test.@web.com","test+john@web.museum", "test+john@web.m",
               "a@n.dk", "and.bun@webben.de","marjaliisa.hämäläinen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@localhost']
    print "Valid emails are:"
    for i in filter(validateEmail,emails): print '\t',i
    for i in filter(emailval,emails):  print '\t',i
"""
Valid emails are:
    and.bun@webben.de
    marjaliisa.hämäläinen@hel.fi
    tony@localhost
    test.@web.com
    and.bun@webben.de
"""``````

This also confirms that the posted regexp is even more wrong than my previous check: it accepts test.@web.com.

TrustyTony 888

An update with better style than this newbie used before, changing the match to behave like this better regular expression, which is a slightly restricted version of the one from
http://www.regular-expressions.info/email.html (test.@web.com is not accepted)

Here you see some examples that the standard would pass:

``````# -*- coding: latin1 -*-
import re

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    # to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # only ascii values not all alpha
    sep = [code for code in address if not code.isalpha() or ord(code) > 128]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1] == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False

    return max_domain >= len(end)>1

def email_validate_re(address):
    """ from http://www.regular-expressions.info/email.html """
    pattern = r"\b[a-zA-Z0-9._%+-]*[a-zA-Z0-9_%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"
    return re.match(pattern, address)

if __name__ == '__main__':
    emails = [ "test.@web.com","test@com", "test+john@web.museum", "test+john@web.m",
               "and.bun@webben.de", "marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@veijalainen.localhost']
    print("Valid emails by my function are:")
    print('\t' + '\n\t'.join(email for email in emails if validate_email(email)))
    print('\t' + '\n\t'.join(email for email in emails if email_validate_re(email)))

"""
Valid emails by my function are:
    and.bun@webben.de
    marjaliisa.hamalainen@hel.fi

    and.bun@webben.de
    marjaliisa.hamalainen@hel.fi
"""``````
TrustyTony 888

Here you can see those email addresses that standard would allow, but you have never seen:

Though I like your function, pytony, there are a lot of emails your function may reject, which is quite alarming. Consider this:

``````# -*- coding: latin1 -*-
import re

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    # to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # only ascii values not all alpha
    sep = [code for code in address if not code.isalpha() or ord(code) > 128]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1] == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False

    return max_domain >= len(end)>1

def email_validate_re(address):
    """ from http://www.regular-expressions.info/email.html """
    pattern = r"\b[a-zA-Z0-9._%+-]*[a-zA-Z0-9_%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"
    return re.match(pattern, address)

if __name__ == '__main__':
    emails = [ "test.@web.com","test@com", "test+john@web.museum", "test+john@web.m",
               "and.bun@webben.de", "marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@veijalainen.localhost','richie_@ya.com']
    print("Valid emails by my function are:")
    print('\t' + '\n\t'.join(email for email in emails if validate_email(email)))
    print('\t' + '\n\t'.join(email for email in emails if email_validate_re(email)))

"""
Valid emails by my function are:
    and.bun@webben.de
    marjaliisa.hamalainen@hel.fi

    and.bun@webben.de
    marjaliisa.hamalainen@hel.fi
    richie_@ya.com
"""``````

At the end of the email list I added a perfectly acceptable address [ richie_@ya.com ], but your function rejected it.
But nice move anyway.
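To see where the two checks part ways on richie_@ya.com, it helps to look at what each one does with the underscore. The sketch below assumes Python 3; the pattern is the one posted above, while the helper name `separators` is made up for illustration.

```python
import re

# Pattern as posted in the thread (from regular-expressions.info).
PATTERN = r"\b[a-zA-Z0-9._%+-]*[a-zA-Z0-9_%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"


def separators(address):
    """Characters the isalpha-based check treats as separators."""
    return [ch for ch in address if not ch.isalpha() or ord(ch) > 128]


address = "richie_@ya.com"
seps = separators(address)
print(seps)                                     # ['_', '@', '.']
# The function wants only dots around a single '@'; here '_@' is left over.
print(''.join(seps).strip('.') == '@')          # False
# The regular expression lists '_' among the allowed name characters.
print(re.match(PATTERN, address) is not None)   # True
```

So the function rejects the address because the underscore counts as an unexpected separator, not because of any length limit.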

TrustyTony 888

You must then relax the limitation:

``````def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    # to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # acceptable in left side in username
    # only ascii values not all alpha
    sep = [code for code in address if ((not code.isalpha() and code not in accept_username)
                                        or ord(code) > 128)]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1].strip() == '@'):
        return False
    else: