0

Hello

I want to sort a file and then uniq it, ignoring the first field.

So I have:

REF | FOR | SUR
TLT090991|STEPHEN|GRIFFITHS
TLT090992|STEPHEN|GRIFFITHS

So I want to uniq, but ignore the REF field.

I had this, but it doesn't work:

cat $FILE | sort -t '|' -k2,3 | uniq > output

0

So what actual output are you expecting from that?

Does it matter which of TLT090991 or TLT090992 you get?

Reading the manual page seems to offer some ideas.
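One idea from the uniq manual page, as a rough sketch only: `-s N` makes uniq skip the first N characters of each line when comparing. This assumes every REF is exactly 9 characters wide, as in the sample data, so skipping 10 characters covers the REF plus the `|`. The scratch file is a hypothetical stand-in for `$FILE`.

```shell
# Sample data from the question, written to a scratch file (hypothetical stand-in for $FILE).
FILE=$(mktemp)
printf '%s\n' 'REF | FOR | SUR' \
              'TLT090991|STEPHEN|GRIFFITHS' \
              'TLT090992|STEPHEN|GRIFFITHS' > "$FILE"

# Sort on fields 2-3 so matching names end up adjacent, then let uniq
# skip the first 10 characters (9-character REF plus '|') when comparing.
# Assumption: REF has a fixed width; this breaks if REF lengths vary.
deduped=$(LC_ALL=C sort -t '|' -k2,3 "$FILE" | uniq -s 10)
printf '%s\n' "$deduped"

rm -f "$FILE"
```

If the REF width varies, this approach does not apply, since `uniq -f` only understands blank-separated fields, not `|`.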

0

You can also "cut" out the first field before you sort, if you want :)

, Mike
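A minimal sketch of the "cut" idea, assuming it is acceptable to lose the REF column entirely in the output (the scratch file is a hypothetical stand-in for `$FILE`):

```shell
# Sample data from the question, written to a scratch file (hypothetical stand-in for $FILE).
FILE=$(mktemp)
printf '%s\n' 'REF | FOR | SUR' \
              'TLT090991|STEPHEN|GRIFFITHS' \
              'TLT090992|STEPHEN|GRIFFITHS' > "$FILE"

# Drop field 1 (everything up to the first '|'), then sort and de-duplicate
# what is left. The REF values do not appear in the result at all.
names=$(cut -d '|' -f2- "$FILE" | sort -u)
printf '%s\n' "$names"

rm -f "$FILE"
```

This only fits if the REF value is not needed afterwards; the header line is treated like any other line and is sorted along with the data.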

0

Sorry?

I must have missed a post, because I could swear that you never posted any indication that you'd bothered to RTFM for uniq.

I posted a link at #2, and that's all the spoon-feeding you're going to get until you post something more concrete as to what you tried, the results (or lack thereof).

0

That's OK, no need to apologise.

-1

cat $FILE |wc -l
3
cat $FILE | sort -t '|' -k2,3 |tail -2| uniq > output

The tail count is the number of lines in the file minus 1, to drop the header line.

Is this what you want?

Votes + Comments
Maybe, but you're 3 YEARS TOO LATE to make a difference
1

Hi all! This can be done with a fairly simple one-liner!

As Salem pointed out, all the clues are in the man page, but I personally found the solution in 'sort' rather than 'uniq':

# The test file with your sample data
-> cat test.txt 
REF | FOR | SUR
TLT090991|STEPHEN|GRIFFITHS
TLT090992|STEPHEN|GRIFFITHS

# Test run with one-line sort command
-> sort -t\| -k2 -u test.txt 
REF | FOR | SUR
TLT090991|STEPHEN|GRIFFITHS

A trip through the man page for 'sort' reveals the following:

-t, --field-separator=SEP
use SEP instead of non-blank to blank transition

-k, --key=POS1[,POS2]
start a key at POS1, end it at POS2 (origin 1)

-u, --unique
with -c, check for strict ordering; without -c, output only the first of an equal run


TL;DR version:
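-t sets the field separator, in this case "|" (escaped with \ so the shell doesn't treat it as a pipe)
-k2 starts the sort key at field 2; with no end position given, the key runs to the end of the line
-u prints only the first line of each run of lines whose keys compare equal (so the REF field is ignored)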
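One caveat with the one-liner: the header line gets sorted along with the data, and only ends up first here because of how its key happens to compare. A rough sketch that keeps the header out of the sort, assuming the first line of the file is always the header (the scratch file is a hypothetical stand-in for test.txt). Note that POSIX only promises that one line per set of equal keys is kept, so which REF survives may depend on the sort implementation.

```shell
# Sample data from the question, written to a scratch file (hypothetical stand-in for test.txt).
FILE=$(mktemp)
printf '%s\n' 'REF | FOR | SUR' \
              'TLT090991|STEPHEN|GRIFFITHS' \
              'TLT090992|STEPHEN|GRIFFITHS' > "$FILE"

# Print the header untouched, then sort and de-duplicate only the data
# lines, comparing from field 2 to the end of the line.
with_header=$(
    head -n 1 "$FILE"
    tail -n +2 "$FILE" | sort -t '|' -k2 -u
)
printf '%s\n' "$with_header"

rm -f "$FILE"
```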

I hope this helps!
-G
