using regex on phone numbers

Question

mysticstylez 0 Newbie Poster

14 Years Ago

Hi,

I have a text file that contains a bunch of phone numbers. Some contain invalid characters and some are too long. How do I read this file into a list using regex? I only want valid phone numbers to populate the list, with the invalid ones skipped.

This is what I have so far:

phoneNumbers = []
file2read = open('PhoneNumbers.txt', 'r')
for currentline in file2read:

    phoneNumbers.append(currentline.rstrip())
    
file2read.close()

python


snippsat 661 Master Poster

14 Years Ago

What country do you want valid phone numbers for?

Or give an example of what should match and what should not.
Match:
(+44)(0)20-12341234
02012341234

No match:
(44+)020-12341234
1-555-5555

snippsat 661 Master Poster

14 Years Ago

I can write a regex solution tomorrow (away now).
It is not so difficult to filter out the numbers you need.

d5e5 109 Master Poster

14 Years Ago

#!/usr/bin/env python
import re
DataDir = '/home/david/Programming/Python'
phonefile = DataDir + '/' + 'PhoneNumbers.txt'

# Read each line from the file. If it starts with
# a string of exactly 10 consecutive digits, assume
# this is a phone number and print it.

ph_nbr_pattern = r'^(\d{10})(?:\s|$)'
compile_obj = re.compile(ph_nbr_pattern)

# 'with' closes the file automatically, even if an error occurs
with open(phonefile, 'r') as file2read:
    for currentline in file2read:
        match_obj = compile_obj.search(currentline)
        if match_obj:
            print(currentline.rstrip())

"""Output is:
4616186224
3501292628
2698109000
4398248508
8462632398
5414846117
9167449701
5097458418
"""


mysticstylez 0 Newbie Poster · Answer 1 · 2010-05-04T00:49:17+00:00

Sorry, I should've been more specific. I'm validating US phone numbers.

Here's what the txt file looks like:

4616186224
3501292628
2698109000
4398248508
8462632398
5414846117
9167449701
5097458418
4F882945714
456729994744
4563249

I need to filter out phone numbers that contain invalid characters or that have too many or too few characters. I'm guessing that regex is not needed for this. This can be done with a list comprehension, but how?
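For reference, a list-comprehension version of that idea might look like the sketch below. It assumes "valid" means exactly 10 digits and nothing else, and it inlines the sample lines instead of reading PhoneNumbers.txt. The helper name `is_valid` is made up for illustration.

```python
# Sketch only: assumes a valid number is exactly 10 digits, nothing else.
# The sample data is copied from the text file shown above.
lines = [
    "4616186224", "3501292628", "2698109000", "4398248508",
    "8462632398", "5414846117", "9167449701", "5097458418",
    "4F882945714", "456729994744", "4563249",
]

def is_valid(number):
    """Hypothetical helper: True for a string of exactly 10 digits."""
    # Note: str.isdigit() also accepts some non-ASCII digit characters.
    return len(number) == 10 and number.isdigit()

# Strip whitespace/newlines first, then keep only the valid entries.
phone_numbers = [line.strip() for line in lines if is_valid(line.strip())]
print(phone_numbers)
```

When reading from the real file, `lines` could be replaced by the open file object, since iterating over a file yields one line at a time.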

mysticstylez 0 Newbie Poster · Answer 2 · 2010-05-04T06:10:16+00:00

Awesome! Works like a charm, Thanks!

ultimatebuster 14 Posting Whiz in Training · Answer 3 · 2010-05-04T07:29:31+00:00

AMERICAN PHONE NUMBER!

import re
phonepattern = re.compile(r"(\d{3})\D*(\d{3})\D*(\d{4})")

I remember reading about it in the Dive Into Python book. I'm too lazy to check the actual regex pattern, but this is what I came up with off the top of my head.

snippsat 661 Master Poster · Answer 4 · 2010-05-04T12:42:55+00:00

@ultimatebuster
That regex will match "456729994744"

import re

test_input = '456729994744'

if re.match(r'(\d{3})\D*(\d{3})\D*(\d{4})', test_input):
    print ('That is a valid input')  #That is a valid input  
else:
    print ('This is not a valid input')

A fix would be this:
^(\d{3})\D*(\d{3})\D*(\d{4})$

Now it seems that this list only has numbers that should be matched on length.
So there is no need to match against numbers such as (573)8841878 or 322-3223-222.
In that case d5e5's solution works fine.
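To see the difference the anchors make, here is a small sketch that runs both patterns over a few of the sample numbers from this thread. The variable names and the dashed number `461-618-6224` are made up for illustration and do not come from the original file.

```python
import re

# Pattern as first posted (no anchors) and the anchored fix.
unanchored = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})')
anchored = re.compile(r'^(\d{3})\D*(\d{3})\D*(\d{4})$')

samples = [
    '4616186224',     # plain 10 digits
    '456729994744',   # too long
    '4F882945714',    # invalid character
    '4563249',        # too short
    '461-618-6224',   # hypothetical formatted number
]

# Map each sample to (matches unanchored, matches anchored).
results = {}
for s in samples:
    results[s] = (bool(unanchored.match(s)), bool(anchored.match(s)))
    print(s, results[s])
```

Only the too-long number is treated differently by the two patterns: without `$` the first 10 digits are enough for a match, while the anchored version requires the string to end after the last group.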

TrustyTony 888 pyMod Team Colleague Featured Poster · Answer 5 · 2010-05-04T15:02:38+00:00

Or do without re:

## tennumber filter without re

test="""4616186224
3501292628
2698109000
4398248508
8462632398
5414846117
9167449701
5097458418
4F882945714
456729994744
4563249
"""

def tennumbers(a):
    sep=[x for x in a if not x.isdigit()]
    if sep != []: return "" ## contains non-digit characters
    elif len(a) == 10: ## exactly ten digits
        return a+'\n'  ## maybe more useful than True/False
    else: return ""

## filter() returns an iterator in Python 3, so wrap it in list()
print(list(filter(tennumbers,test.splitlines())))
print('Or')
for i in test.splitlines():
    print(tennumbers(i), end='') ## result already ends with a newline; discarded numbers print nothing

print('Or like this:')
for i in test.splitlines():
    try:
        ## int() raises ValueError for anything that is not a number
        if 1e9<=int(i)<1e10:
            print(i)
    except ValueError:
        pass

ultimatebuster 14 Posting Whiz in Training · Answer 6 · 2010-05-04T17:57:48+00:00

What about extensions? That's why my regex has no $.
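One possible way to allow extensions and still reject over-long numbers is to keep the anchors and make the extension an explicit optional group. The sketch below assumes extensions are written like `x123` or `ext. 123`; the thread never shows a real example, so that format, the 1 to 5 digit limit, and the function name `parse_number` are all guesses for illustration.

```python
import re

# Anchored pattern with an optional trailing extension group.
# Assumption: an extension is "x" or "ext"/"ext." followed by 1-5 digits.
phone_with_ext = re.compile(
    r'^(\d{3})\D*(\d{3})\D*(\d{4})'      # area code, prefix, line number
    r'(?:\s*(?:x|ext\.?)\s*(\d{1,5}))?$'  # optional extension
)

def parse_number(text):
    """Hypothetical helper: return (area, prefix, line, ext) or None."""
    m = phone_with_ext.match(text)
    return m.groups() if m else None

for candidate in ['4616186224', '4616186224 x123',
                  '461-618-6224 ext. 42', '456729994744']:
    print(candidate, parse_number(candidate))
```

With this pattern the 12-digit entry is still rejected, because the extra digits are not introduced by an extension marker.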

TrustyTony 888 pyMod Team Colleague Featured Poster · Answer 7 · 2010-05-04T18:08:48+00:00

The input given only contained plain 10-digit numbers to accept, nothing else. If you need to match something more, I need an example of the form to accept, and then we can change the matching function.
