May 14th, 2004

Help with SED/AWK email parser

I have a bunch of text files in different formats, each with email addresses somewhere in the file. I would like to be able to parse any number of these files and pull out the email addresses. Here are the types of input:

CFO: some_cfo@domain.com

misterman@domain.com

The Main Man mainman@domain.com

To take care of the first two situations I have the following sed commands:

# Removes a leading title (everything up to and including the last colon)
sed -e 's/^.*://'

# Removes leading and trailing whitespace (spaces and tabs)
sed -e 's/^[[:blank:]]*//;s/[[:blank:]]*$//'
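As a rough sketch, the two commands above could be chained into one helper and run against the three sample lines. The function name `strip_label_and_trim` is made up for illustration, and `[[:blank:]]` (space or tab) is assumed as a portable stand-in for the tab placeholder.

```shell
# Hypothetical helper: drop a leading "Title:" label, then trim leading
# and trailing spaces and tabs from every line read on standard input.
strip_label_and_trim() {
    sed -e 's/^.*://' -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//'
}

# The three sample lines from this post.
printf '%s\n' 'CFO: some_cfo@domain.com' \
              'misterman@domain.com' \
              'The Main Man mainman@domain.com' | strip_label_and_trim
# prints:
# some_cfo@domain.com
# misterman@domain.com
# The Main Man mainman@domain.com
```

The third line comes through unchanged, which is exactly the case still left to solve.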

Those are both really simple, but for the life of me I can't figure out how to remove normal text from before the email address. I either end up clobbering the whole thing, or including it.

I just need to keep whatever is directly attached to the '@' and delete everything outside the whitespace around it. For example:

The Main Man mainman@domain.com

Here "The Main Man" is not part of the email address, while "mainman" and "domain.com" both are.

Any clues anyone?
Last edited by i686-linux; May 14th, 2004 at 2:56 pm. Reason: Formatting error
May 14th, 2004

Re: Help with SED/AWK email parser

Quote originally posted by i686-linux ...
I just need to keep whatever is directly attached to the '@' and delete everything outside the whitespace around it. For example:

The Main Man mainman@domain.com

Here "The Main Man" is not part of the email address, while "mainman" and "domain.com" both are.
And grep saves the day. Next time I'll RTFM better.

grep -o "[[:alnum:][:graph:]]*@[[:alnum:][:graph:]]*"

I haven't tested for many bugs/quirks in the results yet, but a few quick checks seemed to work fine.
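For anyone who wants to try it, here is a small sketch that runs the command above on the sample lines from the first post. The wrapper name `extract_emails` is made up for illustration. The second call shows one quirk worth knowing: `[:graph:]` also matches punctuation, so brackets or commas touching the address are kept.

```shell
# Hypothetical wrapper around the grep command from this post.
# With no file arguments it reads standard input.
extract_emails() {
    grep -o "[[:alnum:][:graph:]]*@[[:alnum:][:graph:]]*" "$@"
}

printf '%s\n' 'CFO: some_cfo@domain.com' \
              'misterman@domain.com' \
              'The Main Man mainman@domain.com' | extract_emails
# prints:
# some_cfo@domain.com
# misterman@domain.com
# mainman@domain.com

# Quirk: punctuation attached to the address is part of the match.
printf '%s\n' 'Contact <boss@domain.com>, please' | extract_emails
# prints: <boss@domain.com>,
```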

If anyone has any further ideas though they would still be greatly appreciated!
i686-linux

May 14th, 2004

Re: Help with SED/AWK email parser

Here are a few I've used in the past (dunno if they'll work for you since SED isn't my strong point):

# get Subject header, but remove initial "Subject: " portion
sed '/^Subject: */!d; s///;q'

# get return address header
sed '/^Reply-To:/q; /^From:/h; /./d;g;q'

# parse out the address proper. Pulls out the e-mail address by itself
# from the 1-line return address header (see preceding script)
sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'
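
A sketch of how the last two scripts might be piped together on a mail message. The function name `return_address` and the sample message are invented for illustration, and the one-line `;`-separated form is assumed to be accepted by your sed (it is by GNU sed).

```shell
# Hypothetical helper: print the bare return address of a mail message.
# Step 1 keeps the Reply-To: header if present, otherwise the From: header.
# Step 2 strips "(comments)", anything from ">" on, and everything up to
# the last ":" or "<".
return_address() {
    sed '/^Reply-To:/q; /^From:/h; /./d;g;q' "$@" |
        sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'
}

# Made-up sample message: headers, a blank line, then the body.
printf '%s\n' 'From: Jane Doe <jane@domain.com>' \
              'Subject: Hello' \
              '' \
              'Body text.' | return_address
# prints: jane@domain.com
```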
TKS
May 14th, 2004

Re: Help with SED/AWK email parser

Those look awfully familiar. Did you grab those off of a "100 useful SED scripts" site?
i686-linux

May 14th, 2004

Re: Help with SED/AWK email parser

I got them from my friend Josh, who probably did pull them off that very site.
TKS
May 14th, 2004

Re: Help with SED/AWK email parser

hello,

I am working with:

cat filename.txt | grep @

and getting the output reduced to just the lines that contain an email address. I am thinking that this will help. What I wonder is whether we can get grep to output only the matched expression instead of the whole dang line.

I am also wondering if AWK will do what you need.
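
One way AWK might handle it, as a minimal sketch: split each line into whitespace-separated fields and print only the fields that contain an '@'. The function name `emails_with_awk` is made up for illustration.

```shell
# Hypothetical helper: print every whitespace-separated field containing "@".
emails_with_awk() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /@/) print $i }' "$@"
}

printf '%s\n' 'CFO: some_cfo@domain.com' \
              'misterman@domain.com' \
              'The Main Man mainman@domain.com' | emails_with_awk
# prints:
# some_cfo@domain.com
# misterman@domain.com
# mainman@domain.com
```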

Christian
kc0arf

May 14th, 2004

Re: Help with SED/AWK email parser

Wow. Looks like a bunch of us working on it at the same time. Cool.

Christian
kc0arf

May 14th, 2004

Re: Help with SED/AWK email parser

Quote originally posted by kc0arf ...
cat filename.txt | grep @

and getting the output reduced to just the lines that contain an email address. I am thinking that this will help. What I wonder is whether we can get grep to output only the matched expression instead of the whole dang line.
Christian
That is what I posted about:

grep -o "[[:alnum:][:graph:]]*@[[:alnum:][:graph:]]*"

grep -o prints only the matched expression instead of the whole matching line.

I realized that, since [:graph:] already covers every alphanumeric character, this can be cut down to:

grep -o "[[:graph:]]*@[[:graph:]]*"
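
Since the goal was to go through any number of files, here is a sketch of how the shorter command might be applied to several files at once. The helper name `unique_emails` and the file names are made up, and the `-h` option (suppress file name prefixes) is assumed to be available, as it is in GNU and BSD grep.

```shell
# Hypothetical helper: list each distinct address found in the given files.
unique_emails() {
    grep -oh "[[:graph:]]*@[[:graph:]]*" "$@" | sort -u
}

# Demo with two throwaway files.
tmpdir=$(mktemp -d)
printf '%s\n' 'CFO: some_cfo@domain.com' 'misterman@domain.com' > "$tmpdir/a.txt"
printf '%s\n' 'The Main Man mainman@domain.com' 'misterman@domain.com' > "$tmpdir/b.txt"
unique_emails "$tmpdir"/*.txt
rm -r "$tmpdir"
```

The duplicate address that appears in both files is listed only once.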
i686-linux

May 14th, 2004

Re: Help with SED/AWK email parser

Using sed, you could also print just the matched pattern, similar to grep -o:

sed -n 's/.*\(pattern\).*/\1/p' file
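
A caveat and a possible workaround, as a sketch only. The leading `.*` is greedy, so dropping the grep pattern straight into the template can leave only the tail of the address. The two-step variant below (the function name `first_email_per_line` is made up) first strips everything up to the last whitespace before the first '@', then keeps the non-space run around it. It prints at most one address per line.

```shell
# Pitfall: the greedy ".*" swallows the local part of the address.
printf '%s\n' 'The Main Man mainman@domain.com' |
    sed -n 's/.*\([[:graph:]]*@[[:graph:]]*\).*/\1/p'
# prints: @domain.com

# Hypothetical helper: print the first address-like token on each line.
first_email_per_line() {
    sed -n -e 's/^[^@]*[[:space:]]//' \
           -e 's/^\([^[:space:]]*@[^[:space:]]*\).*/\1/p' "$@"
}

printf '%s\n' 'CFO: some_cfo@domain.com' \
              'misterman@domain.com' \
              'The Main Man mainman@domain.com' | first_email_per_line
# prints:
# some_cfo@domain.com
# misterman@domain.com
# mainman@domain.com
```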


Does the * in your grep -o example mean "any character"? I've never used that in a grep command before...
TKS
May 14th, 2004

Re: Help with SED/AWK email parser

* = any number of matches (including zero) of the previous expression

For example:

[[:graph:]]* is really "Any printable and visible (non-space) character repeated zero or more times"
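
A quick sketch of what "zero or more" means in practice. The function names `loose_match` and `strict_match` are made up for illustration. Because `*` allows zero repetitions, a lone '@' still matches, while the interval `\{1,\}` (one or more, in basic regex syntax) rejects it.

```shell
# Pattern from this thread: zero or more visible characters on each side.
loose_match() {
    grep -o "[[:graph:]]*@[[:graph:]]*" "$@"
}

# Variant requiring at least one visible character on each side.
strict_match() {
    grep -o "[[:graph:]]\{1,\}@[[:graph:]]\{1,\}" "$@"
}

printf '%s\n' 'meet me @ noon' | loose_match
# prints: @

printf '%s\n' 'meet me @ noon' 'mainman@domain.com' | strict_match
# prints: mainman@domain.com
```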
i686-linux

This thread is solved
