1,105,391 Community Members

Validate email address form

Member Avatar
Reputation Points: 818 [?]
Q&As Helped to Solve: 1,056 [?]
Skill Endorsements: 42 [?]
 
0
 

Inspired by discussion in
http://www.daniweb.com/tutorials/tutorial238544.html
(test cases are from there)

I used my magic pattern matching function transformed from numbers and adding the parts length checking loop.

I like my own solution better, I do not know about you.

def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return True

if __name__ == '__main__':
    email1 = "test.@web.com"
    print email1,"is valid:",validateEmail(email1)
    email2 = "test+john@web.museum"
    print email2,"is valid:",validateEmail(email2)
    email3 = "test+john@web.m"
    print email3,"is valid:",validateEmail(email3)
    email4 = "a@n.dk"
    print email4,"is valid:",validateEmail(email4)
    email5 = "and.bun@webben.de"
    print email5,"is valid:",validateEmail(email5)
Member Avatar
Tony Veijalainen

Specialties:
IT/Science/Contracts/Religious translation/interpreting FIN-ENG-FIN
Python programming

Member Avatar
pyTony
pyMod
6,103 posts since Apr 2010
Reputation Points: 818 [?]
Q&As Helped to Solve: 1,056 [?]
Skill Endorsements: 42 [?]
Moderator
Featured
 
0
 

Of cause I looked a moment these looping without loop tests and could leave them as original:

# -*- coding: latin1 -*-
def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return True

if __name__ == '__main__':
    emails = [ "test.@web.com","test+john@web.museum", "test+john@web.m",
               "a@n.dk", "and.bun@webben.de","marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisa.hämäläinen@hel.fi"]
    print "Valid emails are:"
    for i in filter(validateEmail,emails): print '\t',i
    print "Non-ascii letters are nowadays allowed also in names!"
""" Output:
Valid emails are:
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi
	marjaliisa.hämäläinen@hel.fi
Non-ascii letters are nowadays allowed also in names!
"""
Member Avatar
ultimatebuster
Posting Whiz in Training
250 posts since Mar 2010
Reputation Points: 14 [?]
Q&As Helped to Solve: 69 [?]
Skill Endorsements: 0 [?]
 
0
 

Why not regex?

def validateEmail(address):
	pattern = "[\.\w]{2,}[@]\w+[.]\w+"
	if re.match(pattern, address):
		return True
	else:
		return False

The problem is that it won't detect test.@test.com as a false one. If anyone has an idea. Please suggest it.

Member Avatar
pyTony
pyMod
6,103 posts since Apr 2010
Reputation Points: 818 [?]
Q&As Helped to Solve: 1,056 [?]
Skill Endorsements: 42 [?]
Moderator
Featured
 
0
 

OK, my code has bug, it does not check the last part which is left in end variable before returning from function for valid length. Luckily, because I reused the logic for other check, I noticed the bug in debuging it, Here correction:

# -*- coding: latin1 -*-
import re
def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return len(end)>1

def emailval(address):
	pattern = "[\.\w]{2,}[@]\w+[.]\w+"
	if re.match(pattern, address):
		return True
	else:
		return False    

if __name__ == '__main__':
    emails = [ "test.@web.com","test+john@web.museum", "test+john@web.m",
               "a@n.dk", "and.bun@webben.de","marjaliisa.hämäläinen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@localhost']
    print "Valid emails are:"
    for i in filter(validateEmail,emails): print '\t',i
    print "Regexp gives wrong answer:"
    for i in filter(emailval,emails):  print '\t',i
"""
Valid emails are:
	and.bun@webben.de
	marjaliisa.hämäläinen@hel.fi
	tony@localhost
Regexp gives wrong answer:
	test.@web.com
	and.bun@webben.de
"""

Here also confirmation that the regexp posted is even more wrong than previous check.

Member Avatar
pyTony
pyMod
6,103 posts since Apr 2010
Reputation Points: 818 [?]
Q&As Helped to Solve: 1,056 [?]
Skill Endorsements: 42 [?]
Moderator
Featured
 
0
 

Update with better style than this newbie did and changing the match to be similar to this better regular expression, which is little restricted version from
http://www.regular-expressions.info/email.html (no test.@web.com accepted)

Here you see some examples that standard would pass:
http://en.wikipedia.org/wiki/Email_address#Valid_email_addresses

# -*- coding: latin1 -*-
import re

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    #to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # only ascii values not all alpha
    sep = [code for code in address if not code.isalpha() or ord(code) > 128]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1] == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False
            
    return max_domain >= len(end)>1

def email_validate_re(address):
    """ from http://www.regular-expressions.info/email.html """
    pattern = r"\b[a-zA-Z0-9._%+-]*[a-zA-Z0-9_%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"
    return re.match(pattern, address)

if __name__ == '__main__':
    emails = [ "test.@web.com","test@com", "test+john@web.museum", "test+john@web.m",
               "address@n.dk", "and.bun@webben.de","marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@veijalainen.localhost']
    print("Valid emails by my function are:")
    print('\t' + '\n\t'.join(email for email in emails if validate_email(email)))
    print("\nRegexp answer:")
    print('\t' + '\n\t'.join(email for email in emails if email_validate_re(email)))

"""
Valid emails by my function are:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi

Regexp answer:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi
"""
Member Avatar
pyTony
pyMod
6,103 posts since Apr 2010
Reputation Points: 818 [?]
Q&As Helped to Solve: 1,056 [?]
Skill Endorsements: 42 [?]
Moderator
Featured
 
0
 

Here you can see those email addresses that standard would allow, but you have never seen:
http://en.wikipedia.org/wiki/Email_address

Member Avatar
richieking
Posting Shark
926 posts since Jun 2009
Reputation Points: 40 [?]
Q&As Helped to Solve: 172 [?]
Skill Endorsements: 3 [?]
 
0
 

Update with better style than this newbie did and changing the match to be similar to this better regular expression, which is little restricted version from
http://www.regular-expressions.info/email.html (no test.@web.com accepted)

Here you see some examples that standard would pass:
http://en.wikipedia.org/wiki/Email_address#Valid_email_addresses

# -*- coding: latin1 -*-
import re

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    #to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # only ascii values not all alpha
    sep = [code for code in address if not code.isalpha() or ord(code) > 128]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1] == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False
            
    return max_domain >= len(end)>1

def email_validate_re(address):
    """ from http://www.regular-expressions.info/email.html """
    pattern = r"\b[a-zA-Z0-9._%+-]*[a-zA-Z0-9_%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"
    return re.match(pattern, address)

if __name__ == '__main__':
    emails = [ "test.@web.com","test@com", "test+john@web.museum", "test+john@web.m",
               "address@n.dk", "and.bun@webben.de","marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@veijalainen.localhost']
    print("Valid emails by my function are:")
    print('\t' + '\n\t'.join(email for email in emails if validate_email(email)))
    print("\nRegexp answer:")
    print('\t' + '\n\t'.join(email for email in emails if email_validate_re(email)))

"""
Valid emails by my function are:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi

Regexp answer:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi
"""

Though i like your function pytony, there are alot of emails your function may reject which will be very alarmin. consider this .....

# -*- coding: latin1 -*-
import re

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    #to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # only ascii values not all alpha
    sep = [code for code in address if not code.isalpha() or ord(code) > 128]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1] == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False
            
    return max_domain >= len(end)>1

def email_validate_re(address):
    """ from http://www.regular-expressions.info/email.html """
    pattern = r"\b[a-zA-Z0-9._%+-]*[a-zA-Z0-9_%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"
    return re.match(pattern, address)

if __name__ == '__main__':
    emails = [ "test.@web.com","test@com", "test+john@web.museum", "test+john@web.m",
               "address@n.dk", "and.bun@webben.de","marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@veijalainen.localhost','richie_@ya.com']
    print("Valid emails by my function are:")
    print('\t' + '\n\t'.join(email for email in emails if validate_email(email)))
    print("\nRegexp answer:")
    print('\t' + '\n\t'.join(email for email in emails if email_validate_re(email)))




Valid emails by my function are:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi

Regexp answer:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi
	richie_@ya.com

at the end of the list for email. i added a well accepted email but your function rejected it. [ richie_@ya.com ].
But nice move anyway.

Member Avatar
pyTony
pyMod
6,103 posts since Apr 2010
Reputation Points: 818 [?]
Q&As Helped to Solve: 1,056 [?]
Skill Endorsements: 42 [?]
Moderator
Featured
 
0
 

You must then relax the limitation:

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    # to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # acceptable in left side in username
    accept_username = '_-'
    # only ascii values not all alpha
    sep = [code for code in address if ((not code.isalpha() and code not in accept_username)
                                        or ord(code) > 128)]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1].strip() == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False
            
    return max_domain >= len(end) > 1
You
Post:
Start New Discussion
View similar articles that have also been tagged: