I have a df that looks like:
   CD1  CD2  CD3  ...  FG1  FG2
0  3.8  2.9  0    ...  0.1  0.1
1  0.1  0    4.1  ...  5.2  0
# 35 columns and 2 rows
And I plot a stacked bar chart using:
import matplotlib.pyplot as plt
import numpy as np

colors = plt.cm.jet(np.linspace(0, 1, 35))
df3.plot(kind='barh', stacked=True, figsize=(15, 10), color=colors, width=0.08)
My issue is that this plots all 35 columns, whereas I only want to plot the n columns with the highest values in each row, e.g. only CD1 and CD2 for row 0, and only CD3 and FG1 for row 1:
   CD1  CD2  CD3  ...  FG1  FG2
0  3.8  2.9  -    ...  -    -
1  -    -    4.1  ...  5.2  -
Is there a way to do this?
If I understand what you're asking for, you can accomplish this by taking the maximum of each column and then calling nlargest to pick the top columns (10 in this example):
df.max().nlargest(10)
The result is a Series of column maxima indexed by column name, so you can either plot it directly or use its index to select those columns from df before plotting.
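To make that concrete, here is a minimal sketch on a small made-up frame (five columns standing in for the 35 in the question; the values for the elided columns are not known, so only the ones shown are used). It first applies the column-wise `max().nlargest(n)` selection described above, and then shows a per-row variant, which is my reading of the example in the question rather than something the asker confirmed. The names `top_cols`, `top_overall`, `top_per_row`, and `to_plot` are just illustrative.

```python
import pandas as pd

# Hypothetical five-column stand-in for the 35-column frame in the question.
df = pd.DataFrame(
    {
        "CD1": [3.8, 0.1],
        "CD2": [2.9, 0.0],
        "CD3": [0.0, 4.1],
        "FG1": [0.1, 5.2],
        "FG2": [0.1, 0.0],
    }
)
n = 2  # number of columns to keep; chosen for illustration

# Approach from the answer: the n columns with the largest column-wise maximum.
top_cols = df.max().nlargest(n).index
top_overall = df[top_cols]

# Per-row variant: rank values within each row (1 = largest), mask the rest as NaN.
ranks = df.rank(axis=1, method="first", ascending=False)
top_per_row = df.where(ranks <= n)

# Drop columns that never reach a row's top n, and use 0 for the masked cells
# so they contribute nothing to the stacked bars.
to_plot = top_per_row.dropna(axis=1, how="all").fillna(0)
```

Either `top_overall` or `to_plot` can then be passed to the same `.plot(kind='barh', stacked=True, ...)` call from the question. You would probably also want to size the `colors` array to the number of remaining columns, e.g. `len(to_plot.columns)` instead of 35.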