My problem involves reading in and working with an Excel file that contains blank cells scattered throughout the spreadsheet. The cells are blank because I have expression data for various cell types, but for some cell types the data was not available. My program works fine (finding standard deviations, averages, etc.) when I fill the empty cells with 0, but when I leave them blank I get the following error:
ValueError: could not convert string to float:
Basically, when I try to append the data to a new array and convert it to floats, I get the error because of the blank cells. If I skip the float conversion, I get errors throughout the program because I can't compute the standard deviation. For example, I get the following error:
withinA1 = numpy.std(a1)
File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 2433, in std
return _wrapit(a, 'std', axis, dtype, out, ddof)
File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 37, in _wrapit
result = getattr(asarray(obj),method)(*args, **kwds)
TypeError: cannot perform reduce with flexible type
How can I read through the file and tell Python that the empty cells should just be skipped? I've tried:
for label in Reader:
while i < len(label):
# if either the item in col1 or col2 is empty remove both of them
if label[i] == '':
# otherwise, increment the index
i += 1
if len(label) == 0 or label[j] == '':
i += 1
for row in Reader:
if len(row) == 0 or row[i] == '':
To no avail. Any help would be greatly appreciated, thank you.
"""Generate a sequence of floats by converting every item in
a given sequence to float, and ignoring failing conversions"""
for x in sequence:
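The fragment above reads like the body of a generator whose surrounding lines were lost. A minimal sketch of what such a generator might look like follows; the name iter_floats and the sample row are hypothetical, and it assumes blank cells arrive as empty strings, for which float() raises ValueError.

```python
def iter_floats(sequence):
    """Yield float(x) for each item, skipping items that cannot be converted."""
    for x in sequence:
        try:
            yield float(x)
        except ValueError:
            # Blank cells arrive as '' and float('') raises ValueError,
            # so they are silently skipped here.
            continue


# Hypothetical row with two unusable cells ('' and 'n/a').
row = ["1.5", "", "2", "n/a", "3.25"]
values = list(iter_floats(row))
print(values)
```

Note that skipped cells leave no trace in the output, so the positions of the remaining values shift.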
Now if you want to append all the values in columns 4, 5, ... for all rows,
you can write a generator
"""Generate the cell values that we want to convert and append to Resistance"""
for row in reader:
for x in row[4:]:
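Put together, the two generators might be chained roughly like this. It is only a sketch: iter_cells, iter_floats, the first_col parameter, and the in-memory sample data are all made up for illustration, and it assumes the spreadsheet has been exported to CSV and is read with csv.reader.

```python
import csv
import io


def iter_cells(reader, first_col=4):
    """Yield every raw cell from column `first_col` onward, for all rows."""
    for row in reader:
        for x in row[first_col:]:
            yield x


def iter_floats(sequence):
    """Yield float(x) for each item, skipping failed conversions."""
    for x in sequence:
        try:
            yield float(x)
        except ValueError:
            continue


# Hypothetical stand-in for the real file: a header plus two data rows,
# each with one blank cell in the value columns.
sample = io.StringIO("id,a,b,c,v1,v2\nA1,x,y,z,1.0,\nA2,x,y,z,,2.5\n")
reader = csv.reader(sample)
next(reader)  # skip the header row

Resistance = list(iter_floats(iter_cells(reader)))
print(Resistance)
```

This produces one flat list, so the row boundaries are lost.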
The code does indeed work. The only problem is that the result is now one large array, and I need to be able to distinguish the original rows from one another so that I can calculate standard deviations as I see fit. Sorry if that was not clear.
Yes, Resistance is a list of lists. My raw data is in the form of an Excel spreadsheet. I have A1, A2, B1..B6, C1-C15, etc. My goal is to calculate a running standard deviation of each column (index) from A1 to A2, so STDEV of A1A2, A1A2, etc. for each of the subclasses. I will then find the average standard deviation within each subclass.
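One way to keep the rows distinguishable is to build one list of floats per row instead of a single flat list. The sketch below assumes that layout; row_floats, first_col, and the sample rows are hypothetical.

```python
import numpy


def row_floats(row, first_col=4):
    """Return the convertible cells of one row as floats, skipping blanks."""
    out = []
    for x in row[first_col:]:
        try:
            out.append(float(x))
        except ValueError:
            continue
    return out


# Hypothetical rows: a label, three unused columns, then the values.
rows = [
    ["A1", "", "", "", "1.0", "", "3.0"],
    ["A2", "", "", "", "2.0", "4.0", ""],
]

# One inner list per spreadsheet row, so row boundaries are preserved.
Resistance = [row_floats(r) for r in rows]

# Standard deviation within each row (numpy.std defaults to ddof=0).
row_std = [numpy.std(r) for r in Resistance]
print(Resistance, row_std)
```

Because blanks are dropped, the inner lists can have different lengths and a value's index no longer matches its original column, which matters for any column-wise calculation.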
Great, I guess I'm more of a beginner than I thought.
I am now getting this error:
ValueError: setting an array element with a sequence.
and I believe it is because, now that some of the zeros have been taken out, the arrays are of different sizes and the stdev indexing gets thrown off. Is there a way to tell Python to ignore an index position completely, rather than treating it as if it were not there?
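One possible approach, sketched here rather than taken from the thread, is to keep every cell in place and mark the blanks with NaN, so that all rows stay the same length. numpy.nanstd (available in NumPy 1.8 and later) then ignores the NaN entries when computing each column. The helper to_float_or_nan and the sample rows are hypothetical.

```python
import numpy


def to_float_or_nan(cell):
    """Convert a cell to float, using NaN as a placeholder for blanks."""
    try:
        return float(cell)
    except ValueError:
        return float("nan")


# Hypothetical value columns for three rows, with two blank cells.
rows = [
    ["1.0", "", "3.0"],
    ["2.0", "4.0", ""],
    ["3.0", "6.0", "5.0"],
]

# Every row keeps its full length, so this is a regular 2-D array.
data = numpy.array([[to_float_or_nan(c) for c in row] for row in rows])

# Column-wise standard deviation that skips the NaN placeholders.
col_std = numpy.nanstd(data, axis=0)
print(col_std)
```

Since the blanks still occupy their index positions, the columns stay aligned and the "setting an array element with a sequence" error from ragged rows should not arise.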