Hi,

I'm trying to apply a linear regression function to my numpy array and then store the results in a new array, but two things are not working for me.

def regressioncal(valarray):
    new_col = []
    linregres
    valarray = numpy.array(valarray)
    l=20000
    for t in xrange(1,5000,10):
        for j in xrange(1,5000,10):
            for di in range(len(valarray)):
                for dj in range(len(valarray[di])):
                    if(sum(t,j,100) >= l/2):
                            new_col.append(l - j - valarray[di][8])
                        else: 
                            new_col.append(l - (t + valarray[di][8]/2))

    numpy.insert(valarray, len(valarray)+1, 1, axis=1)

            slope, intercept, r_value, p_value, std_err = stats.linregress(valarray[:,8:9],valarray[:,5:6])  


            line = slope*valarray[:,5:6]+intercept
            err=sqrt(sum((line-valarray[:,5:6])**2)/len(valarray[:,5:6]))

            linregres.append((t,j,slope,intercept, r_value, p_value, std_err,err))

    return valarray

So basically I want to apply the linear regression to specific columns in the array, and one of them is the new column that I'm trying to append to the array before calculating the regression.

But the problems are:
1- The new column is not being appended to the array.
2- The indentation problem: numpy.insert should be outside the loops, but the linear regression calculation should be inside the (t,j) loops in order to get a different regression for each combination.

Thank you for your help.

If you use numpy, you want to avoid iterating item by item. In fact, the whole reason to use numpy is that you don't have to. If you do iterate element by element, you get no speed increase over plain Python (the speed increase from vectorizing can be on the order of 1000-fold, btw).
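To make that concrete, here is a minimal sketch of the same new-column calculation written once as a row-by-row loop and once as a whole-column operation. It rests on some assumptions about the question's intent: `sum(t,j,100)` is read as `t + j + 100`, one value is produced per row, and column 8 is the input column. The function names are made up for illustration.

```python
import numpy as np

def new_column_loop(valarray, t, j, l=20000):
    # Row-by-row version, roughly what the question's inner loops do
    # (assumption: one value per row, sum(t,j,100) means t + j + 100).
    out = []
    for row in valarray:
        if t + j + 100 >= l / 2:
            out.append(l - j - row[8])
        else:
            out.append(l - (t + row[8] / 2))
    return np.array(out)

def new_column_vectorized(valarray, t, j, l=20000):
    # Same arithmetic applied to the whole column at once.
    # The condition depends only on t and j, so it is checked once.
    base = valarray[:, 8]
    if t + j + 100 >= l / 2:
        return l - j - base
    return l - (t + base / 2)
```

Both functions should return the same values; the second one simply lets numpy do the per-element work.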

You can already do linear regression out of the box with numpy:

http://glowingpython.blogspot.com/2012/03/linear-regression-with-numpy.html

If you are making your own for a class or something, you still want to avoid iterating over the array itself. It does take some practice to get used to this.
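For the structure asked about in the question (one regression per (t, j) combination), a rough sketch could look like the following. It is not a drop-in fix: it assumes `sum(t,j,100)` means `t + j + 100`, that the new column is regressed against column 5, and it swaps in `np.polyfit` for the fit, so it returns only slope, intercept and RMS error. `stats.linregress` from the question could be called at the same place if r_value, p_value and std_err are needed. The name `regress_all` and its `stop`/`step` parameters are hypothetical.

```python
import numpy as np

def regress_all(valarray, l=20000, stop=5000, step=10):
    valarray = np.asarray(valarray, dtype=float)
    base = valarray[:, 8]   # column used to build the new column
    y = valarray[:, 5]      # column regressed against (assumption)
    results = []
    for t in range(1, stop, step):
        for j in range(1, stop, step):
            # Build the whole new column in one step, no per-row loop.
            if t + j + 100 >= l / 2:
                x = l - j - base
            else:
                x = l - (t + base / 2)
            # Degree-1 polynomial fit: returns [slope, intercept].
            slope, intercept = np.polyfit(x, y, 1)
            line = slope * x + intercept
            err = np.sqrt(np.mean((line - y) ** 2))
            results.append((t, j, slope, intercept, err))
    return results
```

The only Python-level loops left are the ones over t and j; everything that touches the array rows is done by numpy. The new column lives in a local variable per combination, so nothing has to be inserted into `valarray` at all.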

Comments
nice help and link

The idea in the first place is to fill the numpy array's new column inside the loop, because I'm not able to do that yet... and then I can look at how to minimise the regression's run time.
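On the column not being appended: `numpy.insert` does not change the array in place, it returns a new array, so the result has to be assigned to something. A small sketch with made-up data (the names `widened` and `stacked` are just for illustration):

```python
import numpy as np

valarray = np.arange(12, dtype=float).reshape(4, 3)
new_col = valarray[:, 0] * 2  # one value per row

# numpy.insert returns a copy; valarray itself is left untouched.
# The insert position is the number of columns, i.e. "after the last one".
widened = np.insert(valarray, valarray.shape[1], new_col, axis=1)

# column_stack gives the same result and reads a bit more directly.
stacked = np.column_stack((valarray, new_col))
```

In the question's code the call `numpy.insert(valarray, len(valarray)+1, 1, axis=1)` discards its return value, uses the row count rather than the column count as the position, and inserts the constant 1 rather than `new_col`, which would explain why nothing seems to happen.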
