Hi,

I am simulating wireless networks, and the simulator keeps writing runtime data to its output, which eventually ends up in my results file.

To give you an idea:


....
1 8724593564 2 153 465
1 8725120550 14 900 259
Node 0 sends packet to 1
1 8725375953 22 654 339
1 8725533894 24 481 438
1 8725911788 25 508 488
1 8726442360 16 326 297
Node 0 sends packet to 3
1 8727100680 7 611 87
1 8727271901 6 577 125
1 8727967413 12 223 189
1 8728656825 3 278 262
1 8728757940 17 394 304
1 8728987830 1 155 192
1 8729210880 26 541 520
1 8729924822 11 314 191
1 8730971290 5 788 119
1 8732720640 29 496 172
1 8733388914 20 305 378
Node 0 sends packet to 5
1 8733771145 21 413 392
1 8733815812 23 517 368
1 8735005533 28 423 441
1 8735786288 19 658 300
1 8735894851 2 153 466
1 8736065396 27 570 558
......

How can I remove the unnecessary data so that my file looks like this:
Node 0 sends packet to 1
Node 0 sends packet to 3
Node 0 sends packet to 5

The unnecessary lines always start with '1 ' and always contain 5 numbers separated by ' '.

I do not know scripting languages and would be grateful if you could write me a small code snippet for it.

Thanks a million in advance.

Here are some example solutions: the first keeps the lines starting with "Node", the second drops the lines starting with '1 '.

In the first solution I print the lines directly; in the second I leave them in a list, from where they can be printed or used for other purposes.

test="""
1 8724593564 2 153 465
1 8725120550 14 900 259
Node 0 sends packet to 1
1 8725375953 22 654 339
1 8725533894 24 481 438
1 8725911788 25 508 488
1 8726442360 16 326 297
Node 0 sends packet to 3
1 8727100680 7 611 87
1 8727271901 6 577 125
1 8727967413 12 223 189
1 8728656825 3 278 262
1 8728757940 17 394 304
1 8728987830 1 155 192
1 8729210880 26 541 520
1 8729924822 11 314 191
1 8730971290 5 788 119
1 8732720640 29 496 172
1 8733388914 20 305 378
Node 0 sends packet to 5
1 8733771145 21 413 392
1 8733815812 23 517 368
1 8735005533 28 423 441
1 8735786288 19 658 300
1 8735894851 2 153 466
1 8736065396 27 570 55
"""
print "Solution 1"
for i in test.split('\n'):
    if i[:4]=='Node':
        print i
print        
print "Solution 2"
interesting=filter(lambda x:x[:2]!='1 ',test.split('\n'))

print '\n'.join(interesting) ## put line change between lines back and printout

With a file (I assume you have one):

test="""
1 8724593564 2 153 465
1 8725120550 14 900 259
Node 0 sends packet to 1
1 8725375953 22 654 339
1 8725533894 24 481 438
1 8725911788 25 508 488
1 8726442360 16 326 297
Node 0 sends packet to 3
1 8727100680 7 611 87
1 8727271901 6 577 125
1 8727967413 12 223 189
1 8728656825 3 278 262
1 8728757940 17 394 304
1 8728987830 1 155 192
1 8729210880 26 541 520
1 8729924822 11 314 191
1 8730971290 5 788 119
1 8732720640 29 496 172
1 8733388914 20 305 378
Node 0 sends packet to 5
1 8733771145 21 413 392
1 8733815812 23 517 368
1 8735005533 28 423 441
1 8735786288 19 658 300
1 8735894851 2 153 466
1 8736065396 27 570 55
"""
open("results.log",'w').write(test) ## put to file for demonstrating file io
print "Solution 1"
for i in open("results.log").readlines():
    if i[:4]=='Node':
        print i, ## comma to stop automatic newline
print        
print "Solution 2"
interesting=filter(lambda x:x[:2]!='1 ',open("results.log").readlines())

print ''.join(interesting) ## line changes between lines are ready
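
For readers on Python 3, where print is a function and filter returns an iterator rather than a list, the same two filters might be sketched as below. The function names keep_node_lines and drop_data_lines are made up for this illustration and do not come from the posts above.

```python
# Python 3 sketch of the two filters; the function names are illustrative only.

def keep_node_lines(lines):
    """Solution 1: keep only the lines that start with 'Node'."""
    return [line for line in lines if line.startswith('Node')]

def drop_data_lines(lines):
    """Solution 2: drop the lines that start with '1 ' and keep everything else."""
    return [line for line in lines if not line.startswith('1 ')]

sample = [
    '1 8724593564 2 153 465\n',
    'Node 0 sends packet to 1\n',
    '1 8725375953 22 654 339\n',
    'Node 0 sends packet to 3\n',
]
# The lines still carry their newline, so join with the empty string.
print(''.join(keep_node_lines(sample)), end='')
print(''.join(drop_data_lines(sample)), end='')
```

The two are not quite equivalent: the second also keeps blank lines and any other line that does not start with '1 ', which is why the Python 2 "Solution 2" above prints an empty line first.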

Edited 6 Years Ago by pyTony: n/a

You can use module fileinput for this.

import fileinput

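# inplace=1 redirects print into node.txt, so the file itself is rewritten
for line in fileinput.input("node.txt", inplace=1):
    line = line.strip()
    # '1 87' matches only the timestamps in this sample; line.startswith('1 ') is the general test
    if '1 87' not in line:
        print line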

'''Out--> to node.txt
Node 0 sends packet to 1
Node 0 sends packet to 3
Node 0 sends packet to 5
'''
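
A possible Python 3 version of the same idea is sketched here. It tests for the '1 ' prefix described in the question, and it passes backup='.bak' so the original file is kept next to the rewritten one. The function name strip_data_lines_in_place is invented for this example.

```python
import fileinput

def strip_data_lines_in_place(path):
    """Rewrite `path`, dropping every line that starts with '1 '."""
    # inplace=True redirects print() into the file being read;
    # backup='.bak' keeps the original contents in path + '.bak'.
    with fileinput.input(files=path, inplace=True, backup='.bak') as stream:
        for line in stream:
            if not line.startswith('1 '):
                print(line, end='')  # the line already ends with a newline
```

Calling strip_data_lines_in_place("node.txt") would leave the filtered lines in node.txt and the untouched original in node.txt.bak.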

Edited 6 Years Ago by snippsat: n/a

What is the benefit compared to

filter(lambda x:x[:4]!='1 87',open("node.txt").readlines())

or

for i in [x for x in open("node.txt").readlines() if x[:4] != '1 87']:print i,

Is it faster?

Whether it is faster I don't know; you would have to measure.
For this task I think both solutions will work fine.

I posted it more as an alternative to your solution, and to show that the fileinput module can be used for this.
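
One way to do that measuring is the standard timeit module. The Python 3 sketch below compares only the two in-memory variants (filter with a lambda versus a list comprehension) on made-up data. The fileinput version does file I/O and would have to be timed separately on a real file. The list size and run count are arbitrary choices.

```python
import timeit

# Small in-memory stand-in for the log file; a real test would read it with open().
lines = ['1 8724593564 2 153 465\n', 'Node 0 sends packet to 1\n'] * 1000

def with_filter():
    return list(filter(lambda x: x[:2] != '1 ', lines))

def with_comprehension():
    return [x for x in lines if x[:2] != '1 ']

# Both variants must agree before their timings are worth comparing.
assert with_filter() == with_comprehension()

for func in (with_filter, with_comprehension):
    seconds = timeit.timeit(func, number=200)
    print('%-20s %.4f s for 200 runs' % (func.__name__, seconds))
```

The absolute numbers depend on the machine and the Python version, so only the relative difference is meaningful.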

Thanks tonyjv and snippsat.

I tried all three solutions (to get a feel for Python). They work great. I could not tell the difference in speed, as my file is not very big.

However, I have another question on this issue. Is there a way to filter out lines based on the fact that they contain 5 integers separated by ' '?
I feel this would be a more robust criterion for my filtering.

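Is there a way to filter out lines based on the fact that there are 5 integers separated by ' '?

Yes, you can look into regular expressions. A regular expression like ^'\d{5}' matches a quoted run of exactly five digits:
it matches '12345' but not 12345 (no quotes), '1234' (four digits), '12345 (closing quote missing) or 'a2345' (a letter among the digits). Note that this tests for five digits, not for five separate numbers, so the pattern needs adapting before it fits your lines.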

If you want to require exactly 5 numbers with exactly one space between them, we can adapt my answer for processing date strings:

def fivenumbers(a):
    sep=[x for x in a if not x.isdigit()]
    return sep != [' ',' ',' ',' ','\n'] # spaces and newline in the end

interesting = filter(fivenumbers, open("results.log").readlines())
print ''.join(interesting)
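
The same "five integers separated by single spaces" test can also be written as a regular expression. The following is a Python 3 sketch; the names FIVE_INTS and is_data_line are invented here, and the pattern assumes non-negative integers with no leading or trailing spaces.

```python
import re

# One run of digits, then exactly four more groups of "single space + digits".
FIVE_INTS = re.compile(r'^\d+( \d+){4}$')

def is_data_line(line):
    """True if the line is exactly five non-negative integers separated by single spaces."""
    return FIVE_INTS.match(line.rstrip('\n')) is not None

sample = [
    '1 8724593564 2 153 465\n',
    'Node 0 sends packet to 1\n',
    '1 8725120550 14 900\n',       # only four numbers, so it is kept
]
interesting = [line for line in sample if not is_data_line(line)]
print(''.join(interesting), end='')
```

Unlike the separator-list check, this pattern does not depend on the line ending with a newline, so it also handles a last line without one.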

Hello everyone,

I have a similar problem, in that I want to print just the first two lines of every block of four, e.g.:
Bo 1
10.91675884 8.759276111 12.34200701
-143774.527596 91282.3793501 152261.183894
173746478225. -23015774263.5 -175995146367.
Hb 2
11.55042178 9.122090713 11.40870008
117513.645155 11027.3013922 296416.392561
-31785744573.8 -18668416080.5 47895685829.6
Bo 3
11.84308567 9.003647629 13.85050633
-183101.193384 50910.2028029 183005.373899
136999567683. -95909198829.5 -170919900644.
Bo 4
11.48525608 7.348425949 13.27992051
-42903.8751132 65201.1595020 52332.3899919
169806954528. 138159206534. -171816552662.
Bo 5
11.60191797 7.341722410 11.49674523
-18883.6823513 210206.941628 53116.4456438
123482822379. 175651368013. -44203978532.3
becomes:
Bo 1
10.91675884 8.759276111 12.34200701
Hb 2
11.55042178 9.122090713 11.40870008
Bo 3
11.84308567 9.003647629 13.85050633
Bo 4
11.48525608 7.348425949 13.27992051 etc.
Apologies that there is no easy way to name each block of information, but each block does begin with two letters, and the number following the two letters rises by 1 each time, if these could be exploited.
Thanks in advance

OK, here is code that remembers when a header line was just printed and then prints the line after it as well.

a= open('myfile.txt') # test
pr_this=False ## for printing next after alpha starting line

for i in a: 
    if pr_this:
        print i, ## line has newline, so use comma
        pr_this=False ## do not print until alpha line
    elif i and i[0].isalpha(): # not empty line and start with alphabet
        print i,
        pr_this=True ## print also next one
""" Output
Bo 1
10.91675884 8.759276111 12.34200701
Hb 2
11.55042178 9.122090713 11.40870008
Bo 3
11.84308567 9.003647629 13.85050633
Bo 4
11.48525608 7.348425949 13.27992051
Bo 5
11.60191797 7.341722410 11.49674523
"""

Thank you very much! I see you are exploiting the alphabet bit with .isalpha, although the part about which lines to print, with True and False and pr_this, confuses me a little. Is there any chance that's easy to explain?
Thanks again, isaac

You mean, is there an easier way to understand it? This is an old style of programming, with pr_this as a status flag. It is like pushing a "ready to print" button after we have printed a start line; the button is switched off again right after one more line, as we only wanted the one following line. It is there to keep the loop simple, since the input comes from a file, which is sequential in nature.
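
To make the "button" idea concrete, here is the same loop as a small Python 3 function with each state change commented. The names header_and_next and print_next are chosen for this sketch, and it collects the lines in a list instead of printing them as it goes.

```python
def header_and_next(lines):
    """Return each line that starts with a letter, plus the single line after it."""
    kept = []
    print_next = False            # the "ready to print" button, initially off
    for line in lines:
        if print_next:
            kept.append(line)     # this is the line right after a header
            print_next = False    # switch the button off again
        elif line and line[0].isalpha():
            kept.append(line)     # header line such as 'Bo 1'
            print_next = True     # press the button: keep the next line too
    return kept

block = [
    'Bo 1',
    '10.91675884 8.759276111 12.34200701',
    '-143774.527596 91282.3793501 152261.183894',
    '173746478225. -23015774263.5 -175995146367.',
    'Hb 2',
    '11.55042178 9.122090713 11.40870008',
]
for line in header_and_next(block):
    print(line)
```

The third and fourth lines of the first block are skipped because the button is off when they arrive and they do not start with a letter.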

If you want to process the file in memory, there is a perhaps clearer, or at least more Pythonic, way like this:

## read all file line by line to list, strip newline & other white space from right
a=[x.rstrip() for x in open('myfile.txt')] 
num=2 ## we want two lines each time

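# for each non-empty line starting with a letter, take that line and the next one;
# the +1 lets a block that sits exactly at the end of the file be included
b=[a[i:i+num] for i in range(len(a)-num+1) if a[i] and a[i][0].isalpha()]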

for i in b:
    print i

""" Output
>>> 
['Bo 1', '10.91675884 8.759276111 12.34200701']
['Hb 2', '11.55042178 9.122090713 11.40870008']
['Bo 3', '11.84308567 9.003647629 13.85050633']
['Bo 4', '11.48525608 7.348425949 13.27992051']
['Bo 5', '11.60191797 7.341722410 11.49674523']
>>> 
"""

Edited 6 Years Ago by pyTony: num variable added

Also, is it possible to get the output written to a file rather than the console? Would something like this work:

fout = open('CONFIG', 'w')
for i in a:
    fout.print(i)
fout.close()

which I have adapted from:

fout = open('CONFIG', 'w')
for ln in lines:
    fout.write(ln)
fout.close()

fout = open('CONFIG', 'w')
for i in b:
    fout.write(i+'\n')

Please put your code in code tags next time, using the (code) button.
Tony

Edited 3 Years Ago by mike_2000_17: Fixed formatting

Sorry, I meant this; b was a list of lists:

## read all file line by line to list, strip newline & other white space from right
a=[x.rstrip() for x in open('myfile.txt')] 
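num=2 ## we want two lines each time

# for each non-empty line starting with a letter, take that line and the next one;
# the +1 lets a block that sits exactly at the end of the file be included
b=[a[i:i+num] for i in range(len(a)-num+1) if a[i] and a[i][0].isalpha()]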

fout = open('CONFIG.txt', 'w')
for i in b:
    for j in i:
##        print j # debug
        fout.write(j+'\n')
fout.close()
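
In Python 3 the same write loop is often wrapped in a with block, so the file is closed even if a write fails. The sketch below uses a hypothetical helper name, write_pairs, and writes into a temporary directory only so that the example does not overwrite a real CONFIG.txt.

```python
import os
import tempfile

def write_pairs(pairs, path):
    """Write every string in every pair to `path`, one string per line."""
    # The with block closes the file automatically, even on an error.
    with open(path, 'w') as fout:
        for pair in pairs:
            for line in pair:
                fout.write(line + '\n')

# b has the same shape as in the post above: a list of two-item lists.
b = [['Bo 1', '10.91675884 8.759276111 12.34200701'],
     ['Hb 2', '11.55042178 9.122090713 11.40870008']]

out_path = os.path.join(tempfile.mkdtemp(), 'CONFIG.txt')
write_pairs(b, out_path)
with open(out_path) as fin:
    print(fin.read(), end='')
```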

Thanks to all the replies. This was my first post and I forgot to mark it as solved.

Hello everyone. I have another text reordering problem that I hope you will be able to help me with. I have uploaded a file called problem, which has columns of information. I would like to write out a file that has the values from the third column printed two per line, as in the file solution. Also, the columns are found inside a broader text file, and if there were a way to search for them without cutting them out, that might be interesting. The only distinctive thing about them would be a bit of text just before the information I'm interested in.
Thanks in advance
isaac

Attachments
problem:
         1          10436.579759     10436.579757 Lennard-Jones A
         2             12.628547        12.628547 Lennard-Jones B
         3            123.089561       123.824960 Lennard-Jones A
         4              0.252853         0.247055 Lennard-Jones B
         5              7.812651         7.812651 Lennard-Jones A
         6              0.044149         0.042720 Lennard-Jones B
         7         159681.982693    159681.982666 Lennard-Jones A
         8             72.328997        72.328997 Lennard-Jones B
         9            705.509586       669.045590 Lennard-Jones A
        10              2.189752         2.214502 Lennard-Jones B
        11             50.955774        50.538842 Lennard-Jones A
        12              0.463123         0.464781 Lennard-Jones B
solution:
10436.579757	12.628547
123.824960	0.247055
etc etc

Like this.

f_in = open('problem.txt').readlines()
f_out = open('code_solution.txt', 'w')

for line in f_in:
    # split() with no argument splits on any run of whitespace and drops empty strings
    count, numb1, numb2, name, letter = line.split()

    print count, numb1, numb2, name, letter   # not needed for the file output

    f_out.write('%s  %s\n' % (numb1, numb2))

f_out.close()

Attachments
10436.579759  10436.579757
12.628547  12.628547
123.089561  123.824960
0.252853  0.247055
7.812651  7.812651
0.044149  0.042720
159681.982693  159681.982666
72.328997  72.328997
705.509586  669.045590
2.189752  2.214502
50.955774  50.538842
0.463123  0.464781

This thread was already solved, but your output does not match the format asked for in the second request.

My solution:

## join one field from each two consecutive lines

def readtwolines(filen):
    while True:
        lines=(filen.readline().split(), filen.readline().split())
        if lines[0]:
            yield lines
        else:
            return

result=[]
infile=open('problem.txt')
outfile = open('ready_solution.txt', 'w')

for (a_line, b_line) in readtwolines(infile):
    outfile.write(a_line[2]+' ')
    outfile.write(b_line[2]+'\n')

outfile.close()
print('Result file contents:')
print(open('ready_solution.txt').read())

""" Output:
Result file contents:
10436.579757 12.628547
123.824960 0.247055
7.812651 0.042720
159681.982666 72.328997
669.045590 2.214502
50.538842 0.464781
"""
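
The part of the question about finding the table inside a broader text file is not covered above. A Python 3 sketch of one possible approach follows. The marker text 'FITTED PARAMETERS' is purely hypothetical (the thread never says what text precedes the table), and the sketch assumes that the table starts on the line right after the marker and ends at the first line whose first field is not an integer. The function name third_column_pairs is also made up.

```python
def third_column_pairs(lines, marker=None):
    """Return (a, b) pairs of third-column values from consecutive table rows.

    If `marker` is given, rows are read only after the first line containing it.
    The table is assumed to end at the first line that is not a numbered row.
    """
    rows = iter(lines)
    if marker is not None:
        for line in rows:
            if marker in line:
                break                      # the table starts on the next line
    values = []
    for line in rows:
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            break                          # first non-table line ends the table
        values.append(fields[2])
    # Zipping one iterator with itself pairs item 0 with 1, 2 with 3, and so on;
    # an unpaired last row is silently dropped.
    it = iter(values)
    return list(zip(it, it))

text = """Some header text
  FITTED PARAMETERS   (hypothetical marker line)
         1          10436.579759     10436.579757 Lennard-Jones A
         2             12.628547        12.628547 Lennard-Jones B
         3            123.089561       123.824960 Lennard-Jones A
         4              0.252853         0.247055 Lennard-Jones B
unrelated trailing text
"""
for a, b in third_column_pairs(text.splitlines(), marker='FITTED PARAMETERS'):
    print(a, b)
```

With marker=None the function expects the table to start on the very first line, which matches the cut-out problem file.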

I had a similar task, and I just adapted the variables, but now I see that the output is a little different from what I thought.

Sorry.

Apologies for posting on a solved thread, I only just noticed. Thank you for this incredibly elegant solution. I am really new to this, and I can follow you exactly (!).
