How to delete lines with given pattern

Question

kosco 0 Newbie Poster

15 Years Ago

Hi,

I am simulating wireless networks and the simulator keeps putting runtime data to the output and eventually to my results file.

to give you an idea

....
1 8724593564 2 153 465
1 8725120550 14 900 259
Node 0 sends packet to 1
1 8725375953 22 654 339
1 8725533894 24 481 438
1 8725911788 25 508 488
1 8726442360 16 326 297
Node 0 sends packet to 3
1 8727100680 7 611 87
1 8727271901 6 577 125
1 8727967413 12 223 189
1 8728656825 3 278 262
1 8728757940 17 394 304
1 8728987830 1 155 192
1 8729210880 26 541 520
1 8729924822 11 314 191
1 8730971290 5 788 119
1 8732720640 29 496 172
1 8733388914 20 305 378
Node 0 sends packet to 5
1 8733771145 21 413 392
1 8733815812 23 517 368
1 8735005533 28 423 441
1 8735786288 19 658 300
1 8735894851 2 153 466
1 8736065396 27 570 558
......

How can I remove the unnecessary data so that my file looks like
Node 0 sends packet to 1
Node 0 sends packet to 3
Node 0 sends packet to 5

The unnecessary lines always start with '1 ' and always have 5 numbers separated by ' '.

I do not know scripting languages and would be grateful if you could write me a small code snippet for it.

Thanks*million in advance.

python


All 22 Replies

TrustyTony 888 ex-Moderator

15 Years Ago

Here are two example solutions: the first keeps only the lines starting with "Node", the second drops the lines starting with '1 '.

In the first solution I print the lines directly; in the second I keep them in a list, from where they can be printed or used for other purposes.

test="""
1 8724593564 2 153 465
1 8725120550 14 900 259
Node 0 sends packet to 1
1 8725375953 22 654 339
1 8725533894 24 481 438
1 8725911788 25 508 488
1 8726442360 16 326 297
Node 0 sends packet to 3
1 8727100680 7 611 87
1 8727271901 6 577 125
1 8727967413 12 223 189
1 8728656825 3 278 262
1 8728757940 17 394 304
1 8728987830 1 155 192
1 8729210880 26 541 520
1 8729924822 11 314 191
1 8730971290 5 788 119
1 8732720640 29 496 172
1 8733388914 20 305 378
Node 0 sends packet to 5
1 8733771145 21 413 392
1 8733815812 23 517 368
1 8735005533 28 423 441
1 8735786288 19 658 300
1 8735894851 2 153 466
1 8736065396 27 570 55
"""
print "Solution 1"
for i in test.split('\n'):
    if i[:4]=='Node':
        print i
print        
print "Solution 2"
interesting=filter(lambda x:x[:2]!='1 ',test.split('\n'))

print '\n'.join(interesting) ## put line change between lines back and printout

With a file (I assume you have one):

test="""
1 8724593564 2 153 465
1 8725120550 14 900 259
Node 0 sends packet to 1
1 8725375953 22 654 339
1 8725533894 24 481 438
1 8725911788 25 508 488
1 8726442360 16 326 297
Node 0 sends packet to 3
1 8727100680 7 611 87
1 8727271901 6 577 125
1 8727967413 12 223 189
1 8728656825 3 278 262
1 8728757940 17 394 304
1 8728987830 1 155 192
1 8729210880 26 541 520
1 8729924822 11 314 191
1 8730971290 5 788 119
1 8732720640 29 496 172
1 8733388914 20 305 378
Node 0 sends packet to 5
1 8733771145 21 413 392
1 8733815812 23 517 368
1 8735005533 28 423 441
1 8735786288 19 658 300
1 8735894851 2 153 466
1 8736065396 27 570 55
"""
open("results.log",'w').write(test) ## put to file for demonstrating file io
print "Solution 1"
for i in open("results.log").readlines():
    if i[:4]=='Node':
        print i, ## comma to stop automatic newline
print        
print "Solution 2"
interesting=filter(lambda x:x[:2]!='1 ',open("results.log").readlines())

print ''.join(interesting) ## line changes between lines are ready
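The code above is Python 2 (print statement, filter returning a list). For readers on Python 3, the same prefix test might look like the sketch below. It is only an illustration: keep_node_lines is a hypothetical helper name, and the sample is a shortened copy of the data from the question.

```python
# Python 3 sketch; keep_node_lines is a hypothetical helper name.
SAMPLE = """\
1 8724593564 2 153 465
1 8725120550 14 900 259
Node 0 sends packet to 1
1 8725375953 22 654 339
Node 0 sends packet to 3
1 8733388914 20 305 378
Node 0 sends packet to 5
1 8733771145 21 413 392
"""


def keep_node_lines(lines):
    """Return the lines that do not start with the noise prefix '1 '."""
    return [line for line in lines if not line.startswith("1 ")]


kept = keep_node_lines(SAMPLE.splitlines())
print("\n".join(kept))
```

To run it on a real file, pass the open file object instead of SAMPLE.splitlines(); the lines then keep their trailing newline, so join them with "" rather than "\n".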

snippsat 661 Master Poster

15 Years Ago

You can use module fileinput for this.

import fileinput

for line in fileinput.input("node.txt", inplace=1):
    line = line.strip()
    if not '1 87'in line:
        print line

'''Out--> to node.txt
Node 0 sends packet to 1
Node 0 sends packet to 3
Node 0 sends packet to 5
'''
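Note that the substring test '1 87' only works while every timestamp begins with 87, and it would also drop any other line that happens to contain that text. A Python 3 variant that tests the '1 ' prefix from the question instead might look like this sketch. The helper name strip_noise_in_place is an assumption, and the demonstration writes to a temporary file so that no real data is touched.

```python
import fileinput
import os
import tempfile


def strip_noise_in_place(path, prefix="1 "):
    """Rewrite the file at `path`, dropping lines that start with `prefix`."""
    with fileinput.input(files=[path], inplace=True) as stream:
        for line in stream:
            if not line.startswith(prefix):
                # With inplace=True, standard output is redirected into the file.
                print(line, end="")


# Demonstration on a throwaway file so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "node.txt")
    with open(demo_path, "w") as fh:
        fh.write("1 8724593564 2 153 465\nNode 0 sends packet to 1\n"
                 "1 8725375953 22 654 339\nNode 0 sends packet to 3\n")
    strip_noise_in_place(demo_path)
    with open(demo_path) as fh:
        result = fh.read()

print(result, end="")
```

Because the file is overwritten, it is worth keeping a copy of the original results (fileinput.input also accepts a backup argument for this).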

TrustyTony 888 ex-Moderator

15 Years Ago

You can use module fileinput for this.

import fileinput

for line in fileinput.input("node.txt", inplace=1):
    line = line.strip()
    if not '1 87'in line:
        print line

'''Out--> to node.txt
Node 0 sends packet to 1
Node 0 sends packet to 3
Node 0 sends packet to 5
'''

What is the benefit compared to

filter(lambda x:x[:4]!='1 87',open("node.txt").readlines())

or

for i in [x for x in open("node.txt").readlines() if x[:4] != '1 87']:
    print i,

Is it faster?

snippsat 661 Master Poster

15 Years Ago

I don't know whether it is faster; I would have to measure.
For this task I think both solutions will work fine.

I posted it more as an alternative to your solution, and to show that the fileinput module can be used for this.
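One way to do that measuring is the standard timeit module. The following is a rough Python 3 sketch on made-up in-memory data; it compares only the filter and list comprehension variants and leaves out file access, so it says nothing about fileinput itself. The function names are hypothetical.

```python
import timeit

# Made-up in-memory data: 1000 noise lines for every 10 "Node" lines.
lines = ["1 8724593564 2 153 465\n"] * 1000 + ["Node 0 sends packet to 1\n"] * 10


def with_filter():
    return list(filter(lambda x: x[:2] != "1 ", lines))


def with_comprehension():
    return [x for x in lines if x[:2] != "1 "]


# Both approaches must agree before their timings are worth comparing.
same_result = with_filter() == with_comprehension()

for func in (with_filter, with_comprehension):
    seconds = timeit.timeit(func, number=200)
    print(f"{func.__name__}: {seconds:.4f} s for 200 runs")
```

The numbers depend on the machine and the Python version, so they are best read as a relative comparison on your own data.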

snippsat 661 Master Poster

15 Years Ago

Is there a way to filter out lines based on the fact that there are 5 integers separated by ' '

Yes, you can look into regular expressions. A regular expression like ^'\d{5}' matches
'12345' but not 12345, '1234', '12345 or 'a2345. It matches a quoted run of exactly five digits, so it shows the idea rather than your exact line format.
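For the lines in the question (five integers separated by single spaces), a pattern along the following lines could be used. This is a Python 3 sketch under the assumption that the numbers are unsigned and that nothing else is on the line; FIVE_INTS and is_five_integers are hypothetical names.

```python
import re

# One or more digits, then exactly four more " digits" groups, and nothing else.
FIVE_INTS = re.compile(r"\d+(?: \d+){4}")


def is_five_integers(line):
    """True if the line is exactly five unsigned integers split by single spaces."""
    return FIVE_INTS.fullmatch(line.rstrip("\n")) is not None


sample = [
    "1 8724593564 2 153 465\n",
    "Node 0 sends packet to 1\n",
    "1 8725120550 14 900\n",      # only four numbers, so it is kept
]
kept = [line for line in sample if not is_five_integers(line)]
print("".join(kept), end="")
```

Using fullmatch rather than search means a line with extra text before or after the five numbers is not treated as noise.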

kosco 0 Newbie Poster · Answer 1 · 2010-04-20T03:08:25+00:00

Thanks tonyjv and snippsat.

I tried all three solutions (to get a feel for Python). They work great. I could not tell the difference in speed as my file is not very big.

However, I have another question on this issue. Is there a way to filter out lines based on the fact that there are 5 integers separated by ' '?
I feel this would be a more robust criterion for my filtering.

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 2 · 2010-04-20T04:13:32+00:00

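If you want to require exactly 5 numbers with exactly one space between them, we can adapt my answer for processing date strings:

def not_five_numbers(line):
    # collect every non-digit character; a data line leaves exactly four spaces
    # (the newline is stripped first, so a last line without one is handled too)
    sep=[x for x in line.rstrip('\n') if not x.isdigit()]
    return sep != [' ',' ',' ',' '] # True for the lines we want to keep

interesting = filter(not_five_numbers, open("results.log").readlines())
print ''.join(interesting)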

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 3 · 2010-04-23T13:25:19+00:00

Can you mark the thread solved, or tell us what is still unclear?

vukman 0 Newbie Poster · Answer 4 · 2010-04-30T20:53:21+00:00

Hello everyone,

I have a similar problem: from each block of four lines I want to print only the first two. For example:
Bo 1
10.91675884 8.759276111 12.34200701
-143774.527596 91282.3793501 152261.183894
173746478225. -23015774263.5 -175995146367.
Hb 2
11.55042178 9.122090713 11.40870008
117513.645155 11027.3013922 296416.392561
-31785744573.8 -18668416080.5 47895685829.6
Bo 3
11.84308567 9.003647629 13.85050633
-183101.193384 50910.2028029 183005.373899
136999567683. -95909198829.5 -170919900644.
Bo 4
11.48525608 7.348425949 13.27992051
-42903.8751132 65201.1595020 52332.3899919
169806954528. 138159206534. -171816552662.
Bo 5
11.60191797 7.341722410 11.49674523
-18883.6823513 210206.941628 53116.4456438
123482822379. 175651368013. -44203978532.3
becomes:
Bo 1
10.91675884 8.759276111 12.34200701
Hb 2
11.55042178 9.122090713 11.40870008
Bo 3
11.84308567 9.003647629 13.85050633
Bo 4
11.48525608 7.348425949 13.27992051
etc.
Apologies that there is no easy way to name each block of information, but each block does begin with two letters, and the number following the letters rises by 1 each time, in case these can be exploited.
Thanks in advance.

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 5 · 2010-05-01T00:18:38+00:00

OK, here is code that remembers when a header line was just printed and then also prints the line after it.

a= open('myfile.txt') # test
pr_this=False ## for printing next after alpha starting line

for i in a: 
    if pr_this:
        print i, ## line has newline, so use comma
        pr_this=False ## do not print until alpha line
    elif i and i[0].isalpha(): # not empty line and start with alphabet
        print i,
        pr_this=True ## print also next one
""" Output
Bo 1
10.91675884 8.759276111 12.34200701
Hb 2
11.55042178 9.122090713 11.40870008
Bo 3
11.84308567 9.003647629 13.85050633
Bo 4
11.48525608 7.348425949 13.27992051
Bo 5
11.60191797 7.341722410 11.49674523
"""

vukman 0 Newbie Poster · Answer 6 · 2010-05-04T15:10:18+00:00

Thank you very much! I see you are exploiting the alphabet bit with .isalpha, although the part about which lines to print, with True, False and pr_this, confuses me a little. Is there any chance that is easy to explain?
Thanks again, Isaac

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 7 · 2010-05-04T15:40:22+00:00

You mean, is there an easier way to understand it? This is an old style of programming, with pr_this as a status flag. It is like pushing a "ready to print" button after we have printed a header line; the flag is switched off again right after one line, as we only wanted the one line that follows. It keeps the loop simple, since the input comes from a file, which is sequential in nature.
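To make the flag easier to follow, here is the same idea as a small Python 3 generator, run on the first two blocks of the sample data. It is a sketch, not a replacement for the code above; header_and_next and print_next are hypothetical names.

```python
def header_and_next(lines):
    """Yield each line that starts with a letter, plus the line right after it."""
    print_next = False  # the "ready to print" flag
    for line in lines:
        if print_next:
            yield line
            print_next = False  # only one follow-up line per header
        elif line[:1].isalpha():  # empty lines give '' and are skipped
            yield line
            print_next = True


sample = [
    "Bo 1",
    "10.91675884 8.759276111 12.34200701",
    "-143774.527596 91282.3793501 152261.183894",
    "173746478225. -23015774263.5 -175995146367.",
    "Hb 2",
    "11.55042178 9.122090713 11.40870008",
    "117513.645155 11027.3013922 296416.392561",
    "-31785744573.8 -18668416080.5 47895685829.6",
]
selected = list(header_and_next(sample))
print("\n".join(selected))
```

The flag is set when a header line is seen and cleared as soon as one more line has been passed on, so the third and fourth lines of each block fall through both branches and are dropped.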

If you want to process the file in memory, there is a maybe clearer, or at least more Pythonic, way like this:

## read all file line by line to list, strip newline & other white space from right
a=[x.rstrip() for x in open('myfile.txt')] 
num=2 ## we want two lines each time

# not empty line and start with alphabet and the next line
# (+1 so a header exactly num lines before the end is not skipped)
b=[a[i:i+num] for i in range(len(a)-num+1) if a[i] and a[i][0].isalpha()]

for i in b:
    print i

""" Output
>>> 
['Bo 1', '10.91675884 8.759276111 12.34200701']
['Hb 2', '11.55042178 9.122090713 11.40870008']
['Bo 3', '11.84308567 9.003647629 13.85050633']
['Bo 4', '11.48525608 7.348425949 13.27992051']
['Bo 5', '11.60191797 7.341722410 11.49674523']
>>> 
"""

vukman 0 Newbie Poster · Answer 8 · 2010-05-04T15:47:42+00:00

Also, is it possible to get the output written to a file rather than the console? Would something like this work:

fout = open('CONFIG', 'w')
for i in a:
fout.print(i)
fout.close()

which I have adapted from:
fout = open('CONFIG', 'w')
for ln in lines:
fout.write(ln)
fout.close()

vukman 0 Newbie Poster · Answer 9 · 2010-05-04T15:48:43+00:00

But thank you, that is much clearer.

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 10 · 2010-05-04T15:57:12+00:00

fout = open('CONFIG', 'w')
for i in b:
    fout.write(i+'\n')

Next time, please wrap your code in code tags using the (code) button.
Tony

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 11 · 2010-05-04T17:12:52+00:00

Sorry, I meant this, b was list of lists:

## read all file line by line to list, strip newline & other white space from right
a=[x.rstrip() for x in open('myfile.txt')] 
num=2 ## we want two lines each time

# not empty line and start with alphabet and the next line
# (+1 so a header exactly num lines before the end is not skipped)
b=[a[i:i+num] for i in range(len(a)-num+1) if a[i] and a[i][0].isalpha()]

fout = open('CONFIG.txt', 'w')
for i in b:
    for j in i:
##        print j # debug
        fout.write(j+'\n')
fout.close()
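In Python 3 the same write loop is often put inside a with block, which closes the file even if an error occurs. The sketch below assumes b has the shape described above (a list of two-item lists); write_pairs is a hypothetical helper, and the demonstration writes into a temporary directory.

```python
import os
import tempfile

# b has the same shape as in the post above: a list of [header, data] pairs.
b = [
    ["Bo 1", "10.91675884 8.759276111 12.34200701"],
    ["Hb 2", "11.55042178 9.122090713 11.40870008"],
]


def write_pairs(pairs, path):
    """Write every string of every pair on its own line."""
    with open(path, "w") as fout:  # closed automatically, even on errors
        for pair in pairs:
            for line in pair:
                fout.write(line + "\n")


# Demonstration in a temporary directory; 'CONFIG.txt' is the name used above.
with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "CONFIG.txt")
    write_pairs(b, out_path)
    with open(out_path) as fin:
        written = fin.read()

print(written, end="")
```

File objects have no print method, which is why fout.print(i) fails; in Python 3, print(line, file=fout) is an alternative to fout.write(line + "\n").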

kosco 0 Newbie Poster · Answer 12 · 2010-06-23T02:57:55+00:00

Thanks for all the replies. This was my first post and I forgot to mark it as solved.

vukman 0 Newbie Poster · Answer 13 · 2010-07-01T19:46:31+00:00

Hello everyone. I have another text reordering problem that I hope you can help me with. I have uploaded a file called problem, which has columns of information. I would like to write out a file that has the info in the third column printed in lines of two, e.g. as in solution. Also, the columns are found in a broader text file, and if there were a way to search for them without cutting them out, that might be interesting. The only distinctive thing about them would be a bit of text just before the information I'm interested in.
Thanks in advance,
Isaac

Beat_Slayer 17 Posting Pro in Training · Answer 14 · 2010-07-01T22:00:57+00:00

Like this.

f_in = open('problem.txt').readlines()
f_out = open('code_solution.txt', 'w')

for line in f_in:
    data = line.rstrip('\n')
    # split on single spaces and drop the empty strings left by runs of spaces
    filtered = [x for x in data.split(' ') if x != '']
    count, numb1, numb2, name, letter = filtered

    print count, numb1, numb2, name, letter   # not needed for the file output

    f_out.write('%s  %s\n' % (numb1, numb2))

f_out.close()

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 15 · 2010-07-01T23:04:41+00:00

This thread was already marked solved, but your output does not match the format asked for in the second request.

My solution:

## join one field from each two consecutive lines

def readtwolines(filen):
    while True:
        lines=(filen.readline().split(), filen.readline().split())
        if lines[0]:
            yield lines
        else:
            return

infile=open('problem.txt')
outfile = open('ready_solution.txt', 'w')

for (a_line, b_line) in readtwolines(infile):
    outfile.write(a_line[2]+' ')
    outfile.write(b_line[2]+'\n')

outfile.close()
print('Result file contents:')
print(open('ready_solution.txt').read())

""" Output:
Result file contents:
10436.579757 12.628547
123.824960 0.247055
7.812651 0.042720
159681.982666 72.328997
669.045590 2.214502
50.538842 0.464781
"""

Beat_Slayer 17 Posting Pro in Training · Answer 16 · 2010-07-01T23:16:23+00:00

I had a similar task and just adapted the variables, but now I see that the output is a little different from what I thought.

Sorry.

vukman 0 Newbie Poster · Answer 17 · 2010-07-02T02:50:00+00:00

Apologies for posting on a solved thread, I only just noticed. Thank you for this incredibly elegant solution. I am really new to this, and I can follow you exactly (!).
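For reference, the pairing of consecutive lines used in the readtwolines solution can also be sketched in Python 3 with zip over a single iterator. The rows below are made up for illustration, since the uploaded problem.txt is not shown in the thread; only the third column reuses values from the output listed above. third_fields_in_pairs is a hypothetical name.

```python
def third_fields_in_pairs(lines):
    """Pair consecutive lines and return the third field of each as 'a b'."""
    rows = [line.split() for line in lines if line.strip()]
    it = iter(rows)
    # zip(it, it) pulls two rows at a time; a trailing unpaired row is dropped.
    return [first[2] + " " + second[2] for first, second in zip(it, it)]


# Made-up rows; the real problem.txt from the thread is not shown.
sample = [
    "1 0.5 10436.579757 x a",
    "2 0.5 12.628547 x b",
    "3 0.5 123.824960 y a",
    "4 0.5 0.247055 y b",
]
joined = third_fields_in_pairs(sample)
print("\n".join(joined))
```

Unlike the generator version, this sketch skips blank lines and silently ignores a final line that has no partner, which may or may not be what the real file needs.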
