My problem involves reading in and working with an Excel file that contains blank cells scattered throughout the spreadsheet. The cells are blank because I have expression data for various cell types, but for some of the cell types the expression data was not available. My program works great (finding standard deviations, averages, etc.) when I fill the empty cells with 0, but when I leave them blank I get the following error:
ValueError: could not convert string to float:
Basically, when I try to append the data to a new array and convert it to floats, I get the error because of the blank cells. If I skip the float conversion, I get errors everywhere in the program because I can't compute the standard deviation. For example, I get the following error:
withinA1 = numpy.std(a1)
File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 2433, in std
return _wrapit(a, 'std', axis, dtype, out, ddof)
File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 37, in _wrapit
result = getattr(asarray(obj),method)(*args, **kwds)
TypeError: cannot perform reduce with flexible type
How can I read through and tell python that the empty cells should just be skipped over? I've tried:
for label in Reader:
while i < len(label):
# if either the item in col1 or col2 is empty remove both of them
if label[i] == '':
# otherwise, increment the index
i += 1
if len(label) == 0 or label[j] == '':
i += 1
for row in Reader:
if len(row) == 0 or row[i] == '':
To no avail. Any help would be greatly appreciated, thank you.
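For reference, one way to skip blank cells while reading is to filter each row before converting it. This is only a minimal sketch: it assumes the spreadsheet has been saved as CSV and is read with csv.reader, that the first column holds the label, and the sample data and the names sample, rows, label and values are made up for illustration.

```python
import csv
import io

# Hypothetical stand-in for the exported spreadsheet.
# A blank cell shows up as an empty field, which csv.reader returns as ''.
sample = io.StringIO("A1,1.0,,3.0\nA2,2.0,4.0,\n")

rows = []
for row in csv.reader(sample):
    label, cells = row[0], row[1:]
    # Keep only the non-blank cells, converted to float.
    values = [float(cell) for cell in cells if cell.strip() != '']
    rows.append((label, values))

print(rows)
```

Note that filtering like this shortens the rows that had blanks, so the remaining values no longer line up by column.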
Well, here is something which should always work ... Now if you want to append all the values in columns 4, 5, ... for all rows, you can write a generator ... Finally, here is how to append all the values ... Perhaps you could describe what this variable Resistance is. Is it a list, or a list of lists? What content do you expect Resistance to hold?
"""Generate a sequence of floats by converting every item in
a given sequence to float, and ignoring failing conversions"""
for x in sequence:
Now if you want to append all the values in columns 4, 5, ... for all rows, you can write a generator
"""Generate the cell values that we want to convert and append to Resistance"""
for row in reader:
for x in row[4:]:
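The fragments above appear to be the bodies of two generator functions whose def lines and yield statements did not survive. A complete sketch of the idea might look like the following. The function names gen_floats and gen_cells, the sample reader and the final extend call are assumptions made for illustration, not necessarily the original code.

```python
def gen_floats(sequence):  # hypothetical name
    """Generate a sequence of floats by converting every item in
    a given sequence to float, and ignoring failing conversions"""
    for x in sequence:
        try:
            yield float(x)
        except ValueError:
            # Blank or non-numeric cell: skip it.
            continue


def gen_cells(reader):  # hypothetical name
    """Generate the cell values that we want to convert and append to Resistance"""
    for row in reader:
        for x in row[4:]:
            yield x


# Hypothetical rows, shaped like what a csv reader would return.
reader = [
    ['A1', 'x', 'y', 'z', '1.0', '', '3.0'],
    ['A2', 'x', 'y', 'z', '', '5.0', 'n/a'],
]

Resistance = []
Resistance.extend(gen_floats(gen_cells(reader)))
print(Resistance)  # [1.0, 3.0, 5.0]
```

Because both generators are lazy, the values are converted one at a time as extend consumes them. The result is a single flat list, with no trace of which row each value came from.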
The code does work indeed; the only problem is that now it is one large array, and I need to be able to distinguish the original rows from one another so that I can calculate standard deviations as I see fit. Sorry if that was not clear.
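To keep the original rows distinguishable, the conversion can be done one row at a time so that the result is a list of lists rather than one flat list. A minimal sketch, assuming the data starts in column 4 as above; the helper name row_floats and the sample rows are hypothetical.

```python
def row_floats(cells):  # hypothetical helper
    """Return the convertible cells of one row as a list of floats."""
    result = []
    for cell in cells:
        try:
            result.append(float(cell))
        except ValueError:
            # Blank or non-numeric cell: leave it out of this row.
            pass
    return result


# Hypothetical rows, shaped like what a csv reader would return.
reader = [
    ['A1', 'x', 'y', 'z', '1.0', '', '3.0'],
    ['A2', 'x', 'y', 'z', '', '5.0', '6.0'],
]

# One inner list per spreadsheet row.
Resistance = [row_floats(row[4:]) for row in reader]
print(Resistance)  # [[1.0, 3.0], [5.0, 6.0]]
```

Each inner list now corresponds to one spreadsheet row, although rows that had blanks are shorter and their values have shifted left.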
Yes, Resistance is a list of lists. My raw data is in the form of an Excel spreadsheet. I have A1, A2, B1..B6, C1-C15, etc... My goal is to calculate a running standard deviation of each column (index) from A1 to A2, so STDEV OF A1A2, A1A2 etc. for each of the subclasses. I will then find the average standard deviation within each subclass.
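If I read the goal correctly, it is a standard deviation per column across the rows of one subclass, followed by the average of those deviations. A small sketch of that calculation for a subclass with no missing values follows. The data and the names subclass_a, columns, column_stdevs and average_stdev are made up, and statistics.pstdev (Python 3) is used because it is the population standard deviation, which is also what numpy.std computes by default.

```python
from statistics import mean, pstdev

# Hypothetical subclass "A": label -> row of expression values.
subclass_a = {
    'A1': [1.0, 2.0, 3.0],
    'A2': [3.0, 2.0, 7.0],
}

# zip(*rows) transposes the rows, so each tuple holds one column (index).
columns = list(zip(*subclass_a.values()))

# Standard deviation of each column across the rows of the subclass.
column_stdevs = [pstdev(column) for column in columns]

# Average standard deviation within the subclass.
average_stdev = mean(column_stdevs)

print(column_stdevs)
print(average_stdev)
```

The same pattern would be repeated for B1..B6, C1-C15 and so on, one dictionary (or list of rows) per subclass.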
Great, I guess I'm more of a beginner than I thought.
I am now getting this error:
ValueError: setting an array element with a sequence.
and I believe it is because, since we took out the blank cells (previously zeros), the arrays are now of different sizes and the stdev indexing gets thrown off. Is there a way to tell Python to ignore an index position completely rather than removing it, so that the remaining values keep their positions?
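That error is what NumPy typically raises when it is asked to build a numeric array from rows of unequal length. One common way around it is to keep a placeholder in every blank position, so all rows stay the same length, and to leave the placeholders out only when computing each statistic. The sketch below uses NaN as the placeholder and assumes Python 3 (on Python 2.7, float('nan') would replace math.nan and the deviation would come from numpy rather than the statistics module). The helper name to_float_or_nan and the sample data are hypothetical.

```python
import math
from statistics import pstdev


def to_float_or_nan(cell):  # hypothetical helper
    """Convert a cell to float, using NaN as a placeholder for blanks."""
    try:
        return float(cell)
    except ValueError:
        return math.nan


# Hypothetical rows of string cells; '' marks a blank cell.
rows = [
    ['1.0', '', '3.0'],
    ['3.0', '4.0', '7.0'],
    ['5.0', '6.0', ''],
]

# Every row keeps its original length, so the columns stay aligned.
table = [[to_float_or_nan(cell) for cell in row] for row in rows]

column_stdevs = []
for column in zip(*table):
    # Ignore the placeholders only while computing this column's statistic.
    present = [value for value in column if not math.isnan(value)]
    column_stdevs.append(pstdev(present) if present else math.nan)

print(column_stdevs)
```

With NumPy, numpy.nanstd(table, axis=0) should give the same per-column values from the NaN-padded table, since it skips NaN entries and uses the population standard deviation by default.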