Hi,

I'm new to Python. I want to read from a text file (sample contents below) and draw a scatter plot with Lane on the X-axis and EyVt and EyHt on the Y-axis.
I have some sample code, but I need help getting Python to read the Lane, EyVt and EyHt columns. Please help. Thanks.

import numpy as np
import pylab as pl

data=np.loadtxt('sampledata.txt')

pl.plot(data[:,0],data[:,1],'ro')
pl.xlabel('x')
pl.ylabel('y')
pl.xlim(0.0,10.)

pl.show()

Example of the text file content:
Platform: PC
Tempt : 25
TAP0 :0
TAP1 :1

+++++++++++++++++++++++++++++++++++++++++++++
Port Chnl Lane EyVt EyHt
+++++++++++++++++++++++++++++++++++++++++++++
0 1 1 75 55
0 1 2 10 35
0 1 3 25 35
0 1 4 35 25
0 1 5 10 20
+++++++++++++++++++++++++++++++++++++++++++++
Time: 20s

All 13 Replies

Unless every data file is going to have the same number of lines, I would use regular expressions.
Here's a way that just works (note that it's probably not the best way to write it :P, I'm not very used to regular expressions yet).
It reads through the whole file line by line and tests each line against the expression. Where a line matches, x.group(1) will be your Lane, x.group(2) your EyVt, and x.group(3) your EyHt. It'll do this as many times as it has to, whether the file has 5 lines of data or 500.

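import re
file = open("C:/Users/Enders/Desktop/sampledata.txt", "r")

for line in file:
    # Raw string so the backslashes reach the regex engine unchanged
    x = re.search(r"\d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+)", line)
    if x != None:  # re.search returns None for lines that do not match
        print(x.group(1))  # Lane
        print(x.group(2))  # EyVt
        print(x.group(3))  # EyHt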

For more on regular expressions:
http://docs.python.org/library/re.html

Thanks for your help. Could you please explain what \d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+) means?
I don't understand how it is able to find the Lane, EyVt and EyHt columns.

Thanks

\d matches any single digit (0 to 9, so no letters). The + operator means "one or more of the preceding", so \d+ matches a whole run of digits such as 1, 345, 23 or 563456.
\s matches any whitespace character (space, tab), and + has the same effect again.

The () puts that part of the pattern in a group, as in (\d+). The first group gets assigned to .group(1), the next to .group(2), and so on.

So the first \d+\s+ matches "0 ", the second matches "1 ", and so on. Makes sense?
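
To make the breakdown above concrete, here is a small sketch that applies the same pattern to one data row from the sample file. The variable names (pattern, row, match, lane, ey_vt, ey_ht) are only for illustration and are not from the original code.

```python
import re

# Same pattern as in the reply, written as a raw string so the
# backslashes reach the regex engine unchanged.
pattern = r"\d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+)"

row = "0 1 1 75 55"  # first data row of the sample: Port Chnl Lane EyVt EyHt
match = re.search(pattern, row)

# The first two \d+\s+ pairs consume Port and Chnl without capturing them;
# the three parenthesised groups capture Lane, EyVt and EyHt as strings.
lane, ey_vt, ey_ht = match.groups()
print(lane, ey_vt, ey_ht)  # prints: 1 75 55
```

Note that the captured values are strings, so they would still need int() before being used as numbers.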

Edit: like I said, this isn't the best way to write the expression (it's very messy). I'm pretty new at it, but it works! ^_^

Thanks for your patience. Regular expressions can be confusing. :)
How does it know that it needs to skip 7 lines and then skip the Port and Chnl columns before extracting data from Lane, EyVt and EyHt?
How does it know to stop extracting before reaching the +++++ line?

Because those lines don't match the expression, re.search returns None for them. (You can confirm this by removing the if x != None: check; you'll get an error.) x only has data in its groups if something in the line matched the expression.
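
As a rough illustration of that behaviour, the sketch below runs the same pattern over a handful of lines copied from the sample file. The names sample_lines, result and matched are made up for this example. Only the data row produces a match object; for every other line re.search returns None, and calling .group() on None raises AttributeError.

```python
import re

pattern = r"\d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+)"

# A few lines copied from the sample file, including headers and the footer.
sample_lines = [
    "Platform: PC",
    "TAP1 :1",
    "+++++++++++++++++++++++++++++++++++++++++++++",
    "Port Chnl Lane EyVt EyHt",
    "0 1 2 10 35",
    "Time: 20s",
]

matched = []
for line in sample_lines:
    result = re.search(pattern, line)
    if result is None:
        # Calling result.group(1) here would raise AttributeError.
        continue
    matched.append(result.groups())

print(matched)  # prints: [('2', '10', '35')]
```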

I have now combined the regex matching code with the plotting code, but I only get 1 dot in the graph. Any suggestions would be appreciated.

import re
import numpy as np
import pylab as pl

file = open("C:/Python25/myscript/plot/sampledata.txt", "r")

for line in file:
    x = re.search("\d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+)", line)
    if x != None:
##        print(x.group(1))       
##        print(x.group(2))
##        print(x.group(3))
        x1=x.group(1)
        y1=x.group(2)
        y2=x.group(3)
        

plot1=pl.plot(x1,y1,'r')
plot2=pl.plot(x1,y2,'go')


pl.title('Plot of y vs x')

pl.xlabel('x axis')
pl.ylabel('y axis')

pl.xlim(0.0,9.0)
pl.ylim(0.0,90.0)

pl.legend([plot1,plot2],('red line', 'green circles'),'best',numpoints=1)

pl.show()

Store the values in an array. You're overwriting your variables every time the loop runs again, lol.

Also, you should've renamed the variables to something that represents what they are...

If you have a lot of trouble I'll post some code, but you should be able to figure this out yourself.
Hint: initialize an array before the loop and append each point to it.

I put x1, y1 and y2 in to capture the extracted elements. When I print them, the values are correct:
('1', '75')
('1', '55')
('2', '10')
('2', '35')
('3', '25')
('3', '35')
('4', '35')
('4', '25')
('5', '10')
('5', '20')

Could you please show how to loop over the arrays? Thanks

import re
import numpy as np
import pylab as pl

file = open("C:/Python25/myscript/plot/sampledata.txt", "r")

for line in file:
    x = re.search("\d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+)", line)
    if x != None:
##        print(x.group(1))       
##        print(x.group(2))
##        print(x.group(3))
        x1=x.group(1)
        y1=x.group(2)
        y2=x.group(3)
        print (x1,y1)
        print (x1,y2)
##        plot1=pl.plot(x1,y1,'ro')
##        pl.show()
        

##plot1=pl.plot(x1,y1,'ro')
##plot2=pl.plot(x1,y2,'go')
##
##plot1=pl.plot(x1,y1,'ro')
##plot2=pl.plot(x1,y2,'go')
##
##
##pl.title('Plot of y vs x')
##
##pl.xlabel('x axis')
##pl.ylabel('y axis')
##
##pl.xlim(0.0,9.0)
##pl.ylim(0.0,90.0)
##
##pl.legend([plot1,plot2],('red circles', 'green circles'),'best',numpoints=1)
##
##pl.show()

I was looking for different ways of writing the regular expression (I thought it looked messy) and found another way.
Anyway, if you want to keep the original code:

x1 = []  # Lane
y1 = []  # EyVt
y2 = []  # EyHt
for line in file:
    # Raw string so the backslashes reach the regex engine unchanged
    x = re.search(r"\d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+)", line)
    if x != None:
        x1.append(x.group(1))
        y1.append(x.group(2))
        y2.append(x.group(3))
print(x1,y1,y2)

Second implementation

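x1 = []  # Lane
y1 = []  # EyVt
y2 = []  # EyHt
for line in file:
    # findall returns every run of digits in the line as a list of strings
    numbers = re.findall(r"\d+", line)
    if len(numbers) == 5:  # in the sample file only data rows have five numbers
        x1.append(numbers[2])
        y1.append(numbers[3])
        y2.append(numbers[4])
print(x1, y1, y2)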

Then loop through arrays later when actually plotting.
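
One detail worth noting about both versions above: the collected values are strings, not numbers. A minimal sketch of the second approach that converts to int while collecting is shown below, assuming lines laid out like the sample file. The helper name parse_rows and the list names are hypothetical, chosen for this example.

```python
import re

def parse_rows(lines):
    """Collect Lane, EyVt and EyHt as integers from rows with five numbers."""
    lanes, ey_vt, ey_ht = [], [], []
    for line in lines:
        numbers = re.findall(r"\d+", line)
        if len(numbers) == 5:
            lanes.append(int(numbers[2]))
            ey_vt.append(int(numbers[3]))
            ey_ht.append(int(numbers[4]))
    return lanes, ey_vt, ey_ht

# A few lines in the same layout as the sample file.
sample = [
    "Tempt : 25",
    "Port Chnl Lane EyVt EyHt",
    "0 1 1 75 55",
    "0 1 2 10 35",
    "Time: 20s",
]
lanes, ey_vt, ey_ht = parse_rows(sample)
print(lanes, ey_vt, ey_ht)  # prints: [1, 2] [75, 10] [55, 35]
```

The resulting lists can be handed to a plotting call such as pl.plot(lanes, ey_vt, 'ro') in one go, since plot accepts whole sequences; no per-point loop is needed.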


OK, now I think I'm looping: I'm using for item in x1, y1, y2 after declaring them as arrays. But when I add the plot statement it plots only one value; it doesn't seem to be iterating.

for line in file:
    x = re.search("\d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+)", line)
    if x != None:
##        print(x.group(1))       
##        print(x.group(2))
##        print(x.group(3))

        x1=x.group(1)
        y1=x.group(2)
        y2=x.group(3)
        #print (x1,y1)
        #print (x1,y2)


        for item in x1,y1,y2:
            print (x1,y1,y2)

            plot1=pl.plot(x1,y1,'ro')
            plot2=pl.plot(x1,y2,'go')
            pl.show()

matplotlib doesn't work with Python 3 yet, so I can't really help you any more.
Also, I just want to point out that you aren't really looping.
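
To see why that loop is not iterating over the data: inside the if block, x1, y1 and y2 each hold a single string from the current row, so for item in x1, y1, y2 only walks over a three-element tuple. The sketch below uses values from the first data row; the names items and lanes are made up for illustration.

```python
# Values as they would be after matching the first data row "0 1 1 75 55".
x1, y1, y2 = "1", "75", "55"

# `x1, y1, y2` after `in` builds the tuple ("1", "75", "55"),
# so the body runs three times, once per value in this one row.
items = []
for item in x1, y1, y2:
    items.append(item)
print(items)  # prints: ['1', '75', '55']

# Collecting across rows needs a list that grows with each match instead.
lanes = []
for row_lane in ["1", "2", "3"]:  # stand-in for the lane captured per line
    lanes.append(int(row_lane))
print(lanes)  # prints: [1, 2, 3]
```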

From what I read in the docs, this code should plot all your lines of data. Not too sure though.

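x1 = []  # Lane
y1 = []  # EyVt
y2 = []  # EyHt
for line in file:
    numbers = re.findall(r"\d+", line)
    if len(numbers) == 5:
        # Convert to int so the values are plotted as numbers, not text
        x1.append(int(numbers[2]))
        y1.append(int(numbers[3]))
        y2.append(int(numbers[4]))
# plot accepts whole lists, so one call per series draws every point
pl.plot(x1,y1,'ro')
pl.plot(x1,y2,'go')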

I gotta go now, good luck.

Thanks for your help. I'll dig further and post back once I find the solution. BTW, I'm using Python 2.5 on Windows XP.

Thanks for all the advice.
Solution:

import re
import pylab as pl

file = open("C:/Python25/myscript/plot/sampledata.txt", "r")

x1 = []  # Lane
y1 = []  # EyVt
y2 = []  # EyHt
for line in file:
    # In the sample file, only the data rows contain exactly five numbers
    numbers = re.findall(r"\d+", line)
    if len(numbers) == 5:
        # Convert to int so the values are plotted as numbers, not text
        x1.append(int(numbers[2]))
        y1.append(int(numbers[3]))
        y2.append(int(numbers[4]))
file.close()

# pl.plot returns a list of lines; unpack the single line for the legend
plot1, = pl.plot(x1, y1, 'ro')
plot2, = pl.plot(x1, y2, 'go')

pl.title('Plot of y vs x')
pl.xlabel('x axis')
pl.ylabel('y axis')

pl.xlim(0.0, 9.0)
pl.ylim(0.0, 90.0)

pl.legend([plot1, plot2], ('red circles', 'green circles'), loc='best', numpoints=1)

pl.show()