shell script for assigning unique id for strings in file

Question

aditya369 2 Newbie Poster

12 Years Ago

I have been working with web traces, and I need to convert a trace from one format to another.

For example, from

cs21 793468639 122791 173 icons-html/HomePageIcon.gif ALT=" 0 0.0
cs21 793468639 122791 173 icons-html/HomePageIcon.gif ALT=" 0 0.0
cs21 793468639 122791 173 icons-html/HomePageIcon.gif ALT=" 0 0.0

to

cs21 793468639 122791 173 -unique id- 0 0.0
cs21 793468639 122791 173 -unique id- 0 0.0
cs21 793468639 122791 173 -unique id- 0 0.0

In each line, the 5th column (which is a string) has to be converted to a unique ID (an integer value). There are a few thousand such lines in the text file.
How should I proceed to do this in a shell script?
Is there a way to do this?

Please help me out!

Strings will repeat, so the script should give every occurrence of the same string the same number, and each distinct string its own number.

shell-scripting


All 9 Replies

Watael 4 Junior Poster

12 Years Ago

Hi,

awk is the tool.

It has associative arrays: the subscript can be the string itself, and the value can be a counter that is incremented whenever the subscript isn't in the array yet.

So, if the string isn't in the array, increment the counter and store it under that string; then replace the string with the array's value.
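
The logic described above can be spelled out in long form. This is only a sketch: the function name `assign_ids` and the awk variables `col`, `id` and `count` are made up for illustration, and it assumes whitespace-separated fields with the string sitting in a single field.

```shell
# Sketch only: assign_ids, col, id and count are illustrative names,
# not something from the thread.
# Reads lines on stdin and replaces field number $1 (default 5)
# with an integer id; equal strings get equal ids.
assign_ids() {
    awk -v col="${1:-5}" '
    NF < col { print; next }     # too few fields: pass the line through
    {
        key = $col
        if (!(key in id)) {      # first time this string is seen
            id[key] = ++count    # hand out the next unused integer
        }
        $col = id[key]           # replace the string with its id
        print                    # awk rebuilds the line with single spaces
    }'
}
```

It could be run as `assign_ids 5 < trace.txt > trace.ids`, where both file names are placeholders. Because a field is assigned, awk rewrites the line with single spaces between fields, so any original column alignment is lost.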

Watael 4 Junior Poster

12 Years Ago

awk is the tool to use:
it's easier to ask awk whether an element is in an array (or not) than to ask bash the same thing.

echo "abc def ghi
jkl def mno
abc pqr stu" | awk '!($2 in a ){a[$2]=++n}{$2=a[$2];print}'
abc 1 ghi
jkl 1 mno
abc 2 stu

You see how simple it is.
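
One wrinkle when applying this to the trace lines from the question: in the sample, the string to replace appears to span two whitespace-separated fields (`icons-html/HomePageIcon.gif ALT="`), and the wanted output keeps only the last two fields after it. A possible sketch, assuming every line has four leading fields and two trailing fields around the string (the name `trace_ids` is made up here):

```shell
# Sketch only: trace_ids is an illustrative name. Assumes four leading
# fields, then the string (one or more fields), then two trailing fields.
trace_ids() {
    awk '
    NF < 7 { print; next }       # not in the assumed layout: leave as is
    {
        key = $5
        for (i = 6; i <= NF - 2; i++)
            key = key " " $i     # join fields 5..NF-2 into one string
        if (!(key in id))
            id[key] = ++count    # new string: next unused integer
        print $1, $2, $3, $4, id[key], $(NF - 1), $NF
    }'
}
```

With the three identical sample lines from the question, each line would come out as `cs21 793468639 122791 173 1 0 0.0`.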

Watael 4 Junior Poster

12 Years Ago

Hi,

Still pretty easy ;)

awk '{n=split($0,a,"/"); s=a[1]a[2]a[3]; if(s in A){}else{A[s]=++m}; sub(s,""); print A[s],$0}' UrFile
1 /"
1 /lib/logo/Ray_Chou_Logo0.small.gif"
1 /pointers/Home.html"
1 /lib/pics/bu-logo.gif"
2 /cstr/search"
2 /cstr/search"
2 /cstr/search?csl-tr-92-505"
2 /cstr/search"
1 /faculty/heddaya/navigation.html"
3 /TR/Search/"
3 /TR/Search/?olean=and&author=Singh&title=&abstract=foreign"
3 /TR/Search/"


aditya369 2 Newbie Poster · Answer 1 · 2013-02-15T03:14:09+00:00

Could you please explain a bit more? How do we actually pass a string to an array?
I have tried, but I end up getting errors. I use Ubuntu 10 and bash for shell scripts.

aditya369 2 Newbie Poster · Answer 2 · 2013-02-15T08:33:45+00:00

Thanks a lot! Yes, it was simple, but I was stuck somewhere and couldn't get it.
Now I've got it. Thank you so much!

aditya369 2 Newbie Poster · Answer 3 · 2013-02-24T09:41:06+00:00

I have another problem. It is the same kind of conversion, but a bit more complicated: now I want a unique number for a substring of each string in a text file containing a few thousand lines (strings).

If a substring that was already assigned a number appears again in another line, it must get the same number.

For example, the lines in the text file look like:

"http://cs-www.bu.edu/"
"http://cs-www.bu.edu/lib/logo/Ray_Chou_Logo0.small.gif"
"http://cs-www.bu.edu/pointers/Home.html"
"http://cs-www.bu.edu/lib/pics/bu-logo.gif"
"http://www.cs.indiana.edu/cstr/search"
"http://www.cs.indiana.edu/cstr/search"
"http://www.cs.indiana.edu/cstr/search?csl-tr-92-505"
"http://www.cs.indiana.edu/cstr/search"
"http://cs-www.bu.edu/faculty/heddaya/navigation.html"
"http://cs-tr.cs.cornell.edu/TR/Search/"
"http://cs-tr.cs.cornell.edu/TR/Search/?olean=and&author=Singh&title=&abstract=foreign"
"http://cs-tr.cs.cornell.edu/TR/Search/"

Here the server part (i.e. "http://cs-www.bu.edu) must be assigned a number, and that number must be the same in every line with that server. When a new server appears, the number should increase by 1, and so on until the end of the file.

So the output looks like:

1/"
1/lib/logo/Ray_Chou_Logo0.small.gif"
1/pointers/Home.html"
1/lib/pics/bu-logo.gif"
2/cstr/search"
2/cstr/search"
2/cstr/search?csl-tr-92-505"
2/cstr/search"
1/faculty/heddaya/navigation.html"
3/TR/Search/"
3/TR/Search/?olean=and&author=Singh&title=&abstract=foreign"
3/TR/Search/"

aditya369 2 Newbie Poster · Answer 4 · 2013-02-24T09:41:28+00:00

Please help me out! Please!

Watael 4 Junior Poster · Answer 5 · 2013-02-25T01:20:02+00:00

Oops :(
I made a small mistake: the s variable needs the // so that sub() can match and remove the server address.

awk '{n=split($0,a,"/"); s=a[1]"//"a[3]; if(s in A){}else{A[s]=++m}; sub(s,""); print A[s],$0}' UrFile
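
Two details of the command above may matter in practice: sub() treats its first argument as a regular expression, so the dots in the host name match any character, and `print A[s],$0` puts a space between the number and the path, while the requested output has none. A variation that avoids both by cutting the prefix by length could look like this. It is a sketch, not Watael's code: `server_ids` is a made-up name, and it assumes every line starts with a scheme, `//`, and a host.

```shell
# Sketch only: server_ids is an illustrative name, not from the thread.
# Assumes each line begins with scheme://host, as in the sample lines.
server_ids() {
    awk '
    {
        split($0, part, "/")             # part[1]=scheme, part[3]=host
        server = part[1] "//" part[3]    # rebuild the leading server string
        if (!(server in id))
            id[server] = ++count         # new server: next unused integer
        # The line starts with the server string, so drop it by length
        # instead of handing it to sub() as a regular expression.
        print id[server] substr($0, length(server) + 1)
    }'
}
```

Called as `server_ids < UrFile`, the first sample line would become `1/"`.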

aditya369 2 Newbie Poster · Answer 6 · 2013-02-25T15:04:39+00:00

Thanks, dude! Thanks a lot!
