We have the following dataframe (df):
print(df)
#Gene GSM772 GSM773 GSM774 GSM775 GSM776
0610007P14Rik 0.003485 0.003415 0.005431 0.003667 0.007146
0610009B22Rik 0.001220 0.001351 0.001762 0.001404 0.002177
0610009L18Rik 0.000055 0.000009 0.000152 0.000082 0.000179
0610009O20Rik 0.000000 0.006830 00000000 0.006653 0.006907
0610010F05Rik 0.008310 0.008329 0.007091 0.006919 0.006915
We want to calculate the geometric mean for every row.
Some rows contain zero values. The resulting warning can be ignored, so that the geometric mean for such a row is simply reported as zero.
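To make the zero case concrete, here is a minimal sketch of the geometric mean computed as exp(mean(log(x))), assuming only NumPy. The helper name row_gmean and the zero-containing row are made up for illustration; the first row is 0610010F05Rik from the table above.

```python
import numpy as np

def row_gmean(values):
    """Geometric mean of one row, computed as exp(mean(log(x)))."""
    values = np.asarray(values, dtype=float)
    # log(0) is -inf, so silence the divide-by-zero warning for this step only.
    with np.errstate(divide="ignore"):
        logs = np.log(values)
    # A single -inf makes the mean -inf, and exp(-inf) is exactly 0.0.
    return float(np.exp(logs.mean()))

# Row 0610010F05Rik from the table (no zeros).
no_zero = row_gmean([0.008310, 0.008329, 0.007091, 0.006919, 0.006915])
# Hypothetical row containing a zero.
with_zero = row_gmean([0.0, 0.006830, 0.006653, 0.006907])
print(no_zero, with_zero)
```

So a row with any zero comes out as zero without special handling; only the warning has to be silenced.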
We wrote the following Python script:
import numpy as np
from scipy import stats
# log(0) triggers a divide-by-zero warning for rows containing zeros; silence it
np.seterr(divide='ignore')
# row-wise geometric mean (axis=1) over every column after #Gene, i.e. GSM772..GSM776
gmean = stats.gmean(df.iloc[:, 1:], axis=1)
# assign() returns a new dataframe with the extra GeometricMean column
results = df.assign(GeometricMean=gmean)
print(results)
The following error is raised:
AttributeError: 'str' object has no attribute 'log'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "calculate_gmean.py", line 99, in <module>
scipy.stats.gmean(df.iloc[:,1:5],axis=1) #calculates gmean rowwise, axis=1 for rowwise
File "/home/.local/lib/python3.6/site-packages/scipy/stats/stats.py", line 402, in gmean
log_a = np.log(np.array(a, dtype=dtype))
TypeError: loop of ufunc does not support argument 0 of type str which has no callable log method
Can anyone please suggest the best way to resolve this issue?
Thanks !!
Problem solved. The script above actually works without any issue. Sorry, this question was posted too hastily. We cannot delete the question, so it will stay here; hopefully the script is useful to someone.
Note that this script will not work if any column passed to gmean contains strings, because gmean takes the logarithm of every value. After removing those columns, the script generates the last column with the geometric mean for every row without any problem.
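Instead of removing the string columns by hand, one option is to let pandas pick out the numeric columns. The following is only a sketch under that assumption, using a two-row frame rebuilt from the data in this post; select_dtypes and to_numpy are standard pandas calls, but this is not the script that was actually run.

```python
import pandas as pd
from scipy import stats

# Small frame mirroring the layout in this post: one string column plus numeric samples.
df = pd.DataFrame({
    "#Gene": ["0610007P14Rik", "0610009B22Rik"],
    "GSM772": [0.003485, 0.001220],
    "GSM773": [0.003415, 0.001351],
    "GSM774": [0.005431, 0.001762],
    "GSM775": [0.003667, 0.001404],
    "GSM776": [0.007146, 0.002177],
})

# Keep only numeric columns, so "#Gene" never reaches the log inside gmean.
numeric = df.select_dtypes(include="number")

# axis=1 gives one geometric mean per row; assign() returns a new frame.
results = df.assign(GeometricMean=stats.gmean(numeric.to_numpy(), axis=1))
print(results)
```

Printing df.dtypes first is a quick way to see whether a column that should be numeric was read in as object (string).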
print(df.shape)
(5, 6)
print(df)
#Gene GSM772 GSM773 GSM774 GSM775 GSM776
0 0610007P14Rik 0.003485 0.003415 0.005431 0.003667 0.007146
1 0610009B22Rik 0.001220 0.001351 0.001762 0.001404 0.002177
2 0610009L18Rik 0.000055 0.000009 0.000152 0.000082 0.000179
3 0610009O20Rik 0.006369 0.006830 0.007176 0.006653 0.006907
4 0610010F05Rik 0.008310 0.008329 0.007091 0.006919 0.006915
print(results)
#Gene GSM772 GSM773 GSM774 GSM775 GSM776 GeometricMean
0 0610007P14Rik 0.003485 0.003415 0.005431 0.003667 0.007146 0.004424
1 0610009B22Rik 0.001220 0.001351 0.001762 0.001404 0.002177 0.001548
2 0610009L18Rik 0.000055 0.000009 0.000152 0.000082 0.000179 0.000064
3 0610009O20Rik 0.006369 0.006830 0.007176 0.006653 0.006907 0.006782
4 0610010F05Rik 0.008310 0.008329 0.007091 0.006919 0.006915 0.007484