Validate email address form

15 Years Ago TrustyTony 0 2K Views

Inspired by discussion in
http://www.daniweb.com/tutorials/tutorial238544.html
(test cases are from there)

I reused my pattern-matching function, adapted from a version written for numbers, and added a loop that checks the length of each part.

I like my own solution better, I do not know about you.

def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return True

if __name__ == '__main__':
    email1 = "test.@web.com"
    print email1,"is valid:",validateEmail(email1)
    email2 = "test+john@web.museum"
    print email2,"is valid:",validateEmail(email2)
    email3 = "test+john@web.m"
    print email3,"is valid:",validateEmail(email3)
    email4 = "a@n.dk"
    print email4,"is valid:",validateEmail(email4)
    email5 = "and.bun@webben.de"
    print email5,"is valid:",validateEmail(email5)

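To make the idea easier to follow, here is a small Python 3 walkthrough of what the function computes for one address. It is only an illustrative sketch: the helper name split_parts is hypothetical and not part of the original snippet. Note that every non-letter counts as a separator, so addresses containing digits or "+" fail the shape check.

```python
def split_parts(address):
    """Return (separators, parts) the way the validator sees them.

    Hypothetical helper for illustration only.
    """
    # Every character that is not a letter is treated as a separator.
    separators = [ch for ch in address if not ch.isalpha()]
    parts = []
    rest = address
    for s in separators:
        # partition() cuts at the first occurrence of the separator.
        part, _, rest = rest.partition(s)
        parts.append(part)
    parts.append(rest)  # whatever follows the last separator
    return separators, parts


separators, parts = split_parts("and.bun@webben.de")
print(separators)                    # ['.', '@', '.']
print(''.join(separators).strip('.'))  # '@' means the shape is dots, one @, dots
print(parts)                         # ['and', 'bun', 'webben', 'de']
```

The validator then only has to check that the stripped separator string is exactly '@' and that each part is long enough.
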
TrustyTony 888 ex-Moderator

15 Years Ago

Of course, after looking for a moment at these tests, which repeat themselves without using a loop, I could not leave them as in the original:

# -*- coding: latin1 -*-
def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return True

if __name__ == '__main__':
    emails = [ "test.@web.com","test+john@web.museum", "test+john@web.m",
               "a@n.dk", "and.bun@webben.de","marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisa.hämäläinen@hel.fi"]
    print "Valid emails are:"
    for i in filter(validateEmail,emails): print '\t',i
    print "Non-ascii letters are nowadays allowed also in names!"
""" Output:
Valid emails are:
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi
	marjaliisa.hämäläinen@hel.fi
Non-ascii letters are nowadays allowed also in names!
"""

Edited 15 Years Ago by TrustyTony because: n/a

ultimatebuster 14 Posting Whiz in Training

15 Years Ago

Why not regex?

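import re

def validateEmail(address):
	# raw string, so the backslashes reach the regex engine unchanged
	pattern = r"[\.\w]{2,}[@]\w+[.]\w+"
	# re.match returns a match object or None; convert to a plain bool
	return bool(re.match(pattern, address))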

The problem is that it does not reject test.@test.com as invalid. If anyone has an idea, please suggest it.
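
One possible direction, sketched in Python 3 and not taken from the thread: require every dot in the address to be followed by at least one word character, and match the whole string with re.fullmatch instead of only its beginning. The function name validate_email_sketch is hypothetical, and the pattern is deliberately simple; it drops the two-character minimum of the pattern above and still rejects many addresses the standard allows, such as those containing "+".

```python
import re

# Word characters, optionally followed by ".word" groups, on both sides of @.
# The domain needs at least one ".word" group, so "tony@localhost" fails.
SKETCH_PATTERN = re.compile(r"\w+(?:\.\w+)*@\w+(?:\.\w+)+")


def validate_email_sketch(address):
    """Hypothetical helper: True if the whole address fits SKETCH_PATTERN."""
    return SKETCH_PATTERN.fullmatch(address) is not None


for candidate in ("test.@test.com", "and.bun@webben.de", "marjaliisah@hel."):
    print(candidate, "is valid:", validate_email_sketch(candidate))
```

Because a dot must always be followed by a word character, a local part ending in a dot can no longer match.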

TrustyTony 888 ex-Moderator

15 Years Ago

OK, my code has a bug: before returning, it does not check that the last part, which is left in the end variable, has a valid length. Luckily, because I reused the logic for another check, I noticed the bug while debugging it. Here is the correction:

# -*- coding: latin1 -*-
import re
def validateEmail(a):
    sep=[x for x in a if not x.isalpha()]
    sepjoined=''.join(sep)
    ## sep joined must be ..@.... form
    if sepjoined.strip('.') != '@': return False
    end=a
    for i in sep:
        part,i,end=end.partition(i)
        if len(part)<2: return False
    return len(end)>1

def emailval(address):
	pattern = "[\.\w]{2,}[@]\w+[.]\w+"
	if re.match(pattern, address):
		return True
	else:
		return False    

if __name__ == '__main__':
    emails = [ "test.@web.com","test+john@web.museum", "test+john@web.m",
               "a@n.dk", "and.bun@webben.de","marjaliisa.hämäläinen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@localhost']
    print "Valid emails are:"
    for i in filter(validateEmail,emails): print '\t',i
    print "Regexp gives wrong answer:"
    for i in filter(emailval,emails):  print '\t',i
"""
Valid emails are:
	and.bun@webben.de
	marjaliisa.hämäläinen@hel.fi
	tony@localhost
Regexp gives wrong answer:
	test.@web.com
	and.bun@webben.de
"""

This also confirms that the posted regexp is even more wrong than my previous check.
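
The effect of the fix can be seen in isolation with a short Python 3 sketch. The helper name check_parts and its check_last flag are hypothetical and exist only to compare the two versions side by side; the logic follows the function above.

```python
def check_parts(address, check_last):
    """Hypothetical helper: the thread's logic, with the final check optional."""
    sep = [x for x in address if not x.isalpha()]
    # The separators must look like dots, exactly one @, then dots.
    if ''.join(sep).strip('.') != '@':
        return False
    end = address
    for s in sep:
        part, _, end = end.partition(s)
        if len(part) < 2:
            return False
    # Before the fix the function returned True here unconditionally,
    # so the text after the last separator was never examined.
    return len(end) > 1 if check_last else True


print(check_parts("marjaliisah@hel.", check_last=False))  # True, the bug
print(check_parts("marjaliisah@hel.", check_last=True))   # False after the fix
```

For "marjaliisah@hel." the loop consumes "marjaliisah" and "hel", leaving an empty string in end, which only the corrected return statement rejects.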

Edited 15 Years Ago by TrustyTony because: Title was missing

TrustyTony 888 ex-Moderator

13 Years Ago

Here is an update with better style than this newbie managed back then. The match now behaves like the better regular expression below, which is a slightly restricted version of the one from
http://www.regular-expressions.info/email.html (test.@web.com is not accepted)

Here you can see some examples that the standard would accept:
http://en.wikipedia.org/wiki/Email_address#Valid_email_addresses

# -*- coding: latin1 -*-
import re

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    #to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # only ascii values not all alpha
    sep = [code for code in address if not code.isalpha() or ord(code) > 128]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1] == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False
            
    return max_domain >= len(end)>1

def email_validate_re(address):
    """ from http://www.regular-expressions.info/email.html """
    pattern = r"\b[a-zA-Z0-9._%+-]*[a-zA-Z0-9_%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"
    return re.match(pattern, address)

if __name__ == '__main__':
    emails = [ "test.@web.com","test@com", "test+john@web.museum", "test+john@web.m",
               "address@n.dk", "and.bun@webben.de","marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@veijalainen.localhost']
    print("Valid emails by my function are:")
    print('\t' + '\n\t'.join(email for email in emails if validate_email(email)))
    print("\nRegexp answer:")
    print('\t' + '\n\t'.join(email for email in emails if email_validate_re(email)))

"""
Valid emails by my function are:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi

Regexp answer:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi
"""

Edited 13 Years Ago by TrustyTony because: n/a

TrustyTony 888 ex-Moderator

13 Years Ago

Here you can see email addresses that the standard would allow but that you have probably never seen:
http://en.wikipedia.org/wiki/Email_address

richieking 44 Master Poster

13 Years Ago

Though I like your function, pyTony, it may reject a lot of real email addresses, which is alarming. Consider this:

# -*- coding: latin1 -*-
import re

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    #to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # only ascii values not all alpha
    sep = [code for code in address if not code.isalpha() or ord(code) > 128]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1] == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False
            
    return max_domain >= len(end)>1

def email_validate_re(address):
    """ from http://www.regular-expressions.info/email.html """
    pattern = r"\b[a-zA-Z0-9._%+-]*[a-zA-Z0-9_%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"
    return re.match(pattern, address)

if __name__ == '__main__':
    emails = [ "test.@web.com","test@com", "test+john@web.museum", "test+john@web.m",
               "address@n.dk", "and.bun@webben.de","marjaliisa.hamalainen@hel.fi",
               "marja-liisa.hämäläinen@hel.fi", "marjaliisah@hel.",'tony@veijalainen.localhost','richie_@ya.com']
    print("Valid emails by my function are:")
    print('\t' + '\n\t'.join(email for email in emails if validate_email(email)))
    print("\nRegexp answer:")
    print('\t' + '\n\t'.join(email for email in emails if email_validate_re(email)))




Valid emails by my function are:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi

Regexp answer:
	address@n.dk
	and.bun@webben.de
	marjaliisa.hamalainen@hel.fi
	richie_@ya.com

At the end of the email list I added a widely accepted address, richie_@ya.com, but your function rejected it.
Nice work anyway, though.

TrustyTony 888 ex-Moderator

13 Years Ago

You must then relax the limitation:

def validate_email(address):
    """ Validate by python equivalent to regular expression below """
    # to not allow single letter parts increase len_limit to 2 or more
    len_limit, max_domain = 1, 4
    # acceptable in left side in username
    accept_username = '_-'
    # only ascii values not all alpha
    sep = [code for code in address if ((not code.isalpha() and code not in accept_username)
                                        or ord(code) > 128)]
    if (# sep joined must be ..@.... form
        ''.join(sep).strip('.') != '@' or
        # must have point after @
        sep[-1].strip() == '@'):
        return False
    else:
        end = address
        for s in sep:
            part, s, end = end.partition(s)
            if len(part) < len_limit:
                return False
            
    return max_domain >= len(end) > 1
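
For readers on Python 3, the relaxed check might be ported roughly as follows. This is a sketch under assumptions, not the original code: the limits become keyword arguments, and the non-ASCII test is written as ord(ch) > 127. Two limitations carry over from the approach itself: the characters in accept_username are also accepted on the domain side, and digits still count as separators, so addresses containing them are rejected.

```python
def validate_email(address, len_limit=1, max_domain=4, accept_username='_-'):
    """Python 3 sketch of the relaxed validator; the parameters are assumptions."""
    # Separators: non-letters outside accept_username, plus any non-ASCII letter.
    sep = [ch for ch in address
           if (not ch.isalpha() and ch not in accept_username) or ord(ch) > 127]
    # Separators must be dots, exactly one @, dots; a dot must follow the @.
    if ''.join(sep).strip('.') != '@' or sep[-1] == '@':
        return False
    end = address
    for s in sep:
        part, _, end = end.partition(s)
        if len(part) < len_limit:
            return False
    # The final part (top-level domain) must have 2..max_domain characters.
    return 1 < len(end) <= max_domain


emails = ["test.@web.com", "test@com", "address@n.dk", "richie_@ya.com",
          "marja-liisa.hamalainen@hel.fi", "marjaliisah@hel."]
print("Valid emails by the sketch are:")
print('\t' + '\n\t'.join(e for e in emails if validate_email(e)))
```

With these inputs the sketch accepts address@n.dk, richie_@ya.com and marja-liisa.hamalainen@hel.fi and rejects the rest.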
