I'm trying to make a scatter plot of these two variables, but I get this error:
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (18,) + inhomogeneous part.
    def plot():
        Age_list=[]
        Lactate_list=[]
        for l in range(1,19):
            Age = test[test.ID == l]['age']
            Lactate = (test[test.ID == l]['VO2'].nlargest(n=5).mean())*(80/100)
            Lactate_list.append(Lactate)
            Age_list.append(Age)
        plt.scatter(Age_list, Lactate_list,color='purple')
        a, b = np.polyfit(Age_list, Lactate_list, 1)
        plt.plot(Age_list, a*np.array(Age_list)+b)
        plt.xlabel('Age')
        plt.ylabel('Lactate threshold')
        plt.title('Correlation between Age and Lactate threshold')
        plt.show()
If I print the lengths of Age_list and Lactate_list, they are the same, so I don't understand what the problem is. Also, the lactate threshold should be 80% of the value inside the parentheses. Is the way I computed it correct?
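The error is thrown because the elements of Age_list and Lactate_list do not have the same shape. Each element of Age_list is a pandas Series (every row of the 'age' column for that ID), while each element of Lactate_list is a scalar, because .mean() collapses the Series to a single number. Both lists contain 18 items, which is why their lengths match, but NumPy cannot turn a list of Series of different lengths into a regular array. Multiplying by 80/100 does give 80% of the mean, so that part is fine.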
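To make the shape mismatch concrete, here is a minimal sketch. The `demo` DataFrame is a made-up stand-in for `test` (assumed to hold several rows per ID), and the exact call inside matplotlib or NumPy that raises in the original code may differ, but the underlying conversion problem is the same.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `test` DataFrame: several rows per ID.
demo = pd.DataFrame({
    "ID":  [1, 1, 1, 2, 2],
    "age": [30, 30, 30, 41, 41],
    "VO2": [40.0, 42.0, 41.0, 35.0, 36.0],
})

age_as_series = demo[demo.ID == 1]["age"]          # Series with one row per measurement
age_as_scalar = demo[demo.ID == 1]["age"].iloc[0]  # a single number
lactate = demo[demo.ID == 1]["VO2"].mean() * 0.8   # .mean() already returns a scalar

print(type(age_as_series).__name__, age_as_series.shape)
print(age_as_scalar, lactate)

# Per-ID columns of different lengths cannot form a rectangular array.
ragged = [demo[demo.ID == i]["age"].to_numpy() for i in (1, 2)]
try:
    np.array(ragged, dtype=float)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

Picking one value per ID, as with `.iloc[0]` here, gives a flat list of numbers that converts cleanly.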
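Try this instead:

    import matplotlib.pyplot as plt
    import numpy as np

    def plot():
        # `test` is the DataFrame from the question, with columns ID, age and VO2
        age_list = []
        lactate_list = []
        for l in range(1, 19):
            # Take a single age per ID instead of the whole Series
            age = test[test.ID == l]['age'].values[0]
            # 80% of the mean of the five largest VO2 values for this ID
            lactate = (test[test.ID == l]['VO2'].nlargest(n=5).mean())*(80/100)
            lactate_list.append(lactate)
            age_list.append(age)
        plt.scatter(age_list, lactate_list, color='purple')
        # Fit a first-degree polynomial (straight line) and draw it
        a, b = np.polyfit(age_list, lactate_list, 1)
        plt.plot(age_list, a*np.array(age_list) + b)
        plt.xlabel('Age')
        plt.ylabel('Lactate Threshold')
        plt.title('Correlation between Age and Lactate Threshold')
        plt.show()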
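As an alternative to the explicit loop, the same per-ID values can be computed with `groupby`. This is only a sketch: `summarize_by_id` is a hypothetical helper name, the `demo` data is invented, and it assumes the age is constant within each ID (it simply takes the first one).

```python
import numpy as np
import pandas as pd

def summarize_by_id(df, top_n=5, fraction=0.8):
    """Return one row per ID: first age, and `fraction` of the mean of the top_n VO2 values."""
    grouped = df.groupby("ID")
    return pd.DataFrame({
        "age": grouped["age"].first(),
        "lactate": grouped["VO2"].apply(lambda s: s.nlargest(top_n).mean()) * fraction,
    })

# Invented data: three IDs with three measurements each.
demo = pd.DataFrame({
    "ID":  [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "age": [25, 25, 25, 40, 40, 40, 55, 55, 55],
    "VO2": [50.0, 48.0, 46.0, 42.0, 40.0, 38.0, 34.0, 32.0, 30.0],
})

summary = summarize_by_id(demo, top_n=2)
slope, intercept = np.polyfit(summary["age"], summary["lactate"], 1)
print(summary)
print(slope, intercept)
```

The `summary["age"]` and `summary["lactate"]` columns are plain one-dimensional Series of scalars, so they can be passed to `plt.scatter` and `np.polyfit` in the same way as the lists above.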