Hello. I have two files, and I want to suppress one file against the other. For example:

File 1 - suplist.txt
hello123
chris635
mike822

I then have another file

File 2 - checklist.txt
stephen929
mike822
hiop191

I can see that mike822 appears in the suppression list, so I do not want this name in my file.

This is what I have so far. It works, but it takes hours to run against a suppression list with 1,000,000 records in it:

while read -r NAME
do
    # Look the name up in the suppression list and count the matching lines
    grep "$NAME" suplist.txt > file
    wc=`wc -l < file`
    if [ $wc -eq 0 ]
    then
        echo "Not found in other file so output"
        echo "$NAME" >> good.list
    fi

done < checklist.txt


Hey There,

I didn't do any speed testing, and this is just off the top of my head (it seems like there must be a better way using sed or awk), but give this a try. It should work and, hopefully, run faster :)

grep -v -x -F -f suplist.txt checklist.txt >good.list
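
For what it's worth, the awk route hinted at above might look something like the sketch below. It assumes one name per line and exact whole-line matches, and the printf lines only recreate the sample data from the question so the snippet can be run on its own.

```shell
# Recreate the sample files from the question
printf '%s\n' hello123 chris635 mike822 > suplist.txt
printf '%s\n' stephen929 mike822 hiop191 > checklist.txt

# First file (NR == FNR): remember every line of suplist.txt in an array.
# Second file: print only the checklist.txt lines that were not remembered.
awk 'NR == FNR { seen[$0] = 1; next } !($0 in seen)' suplist.txt checklist.txt > good.list

cat good.list
```

Each file is read only once, and each lookup is an array access rather than a fresh scan of the suppression list. One caveat: if suplist.txt is empty, the NR == FNR test also holds for checklist.txt, so nothing is printed.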

Best wishes,

Mike

Thanks, Mike. That's great. The only question I have is:
What if I have 3 fields in one file and 4 in the other, but I only want to suppress on one field (which exists in both files, if that makes sense)?

Hey there,

You'd probably have to go with sed or awk for that, like:

awk -v string="$1" '{ if ($0 ~ string) print $1, $2, $4; else print $0 }'

and go from there. There's no easy way to print a range of fields in awk if you don't know what column the bad string is going to pop up in, assuming that's the field you don't want to print.

Check out this other threaded post for info on range printing in awk:

http://www.tek-tips.com/viewthread.cfm?qid=1117427&page=7
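
If the shared field sits in a known column of each file, a keyed lookup in awk is one possible way to suppress on just that field. This is only a sketch under assumptions the thread doesn't confirm: the file names sup3.txt and check4.txt are made up, fields are whitespace-separated, and the name is taken to be field 1 of the 3-field file and field 2 of the 4-field file.

```shell
# Hypothetical sample data: name is field 1 of sup3.txt and field 2 of check4.txt
printf '%s\n' 'hello123 a b' 'chris635 c d' 'mike822 e f' > sup3.txt
printf '%s\n' '1 stephen929 x y' '2 mike822 x y' '3 hiop191 x y' > check4.txt

# First file: record field 1 of every suppression row.
# Second file: keep only the rows whose field 2 was not recorded.
awk 'NR == FNR { seen[$1] = 1; next } !($2 in seen)' sup3.txt check4.txt > good4.list

cat good4.list
```

Adjust $1 and $2 to whichever columns hold the shared field, and add -F with the right delimiter if the files aren't whitespace-separated.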

Best of luck to you,

Mike
