I have a sparse data frame with more than 500 columns. I want to remove the columns whose entries sum to less than a threshold value, say 100. How can I do this in Python?
In R I can achieve this using:
df2 <- df51[,colSums(df51) >= 100]
In Python, that translates to:
df2 = df1.drop(df1.columns[df1.sum() < 100], axis=1)
The axis=1 option is for dropping columns while axis=0 is for rows.
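As a quick check, here is a minimal sketch on a toy frame. The column names, values, and the `threshold` variable are made up for illustration and are not from the question. It also shows a boolean-mask selection with `.loc`, which mirrors the R expression more directly by keeping the columns that meet the threshold instead of dropping the ones that do not.

```python
import pandas as pd

# Hypothetical toy data; the real frame would have 500+ columns.
df1 = pd.DataFrame({
    "a": [60, 50, 0],   # column sum is 110, so it is kept
    "b": [1, 0, 2],     # column sum is 3, so it is removed
    "c": [100, 0, 0],   # column sum is exactly 100, so it is kept
})
threshold = 100  # assumed cutoff, matching the example in the question

# Keep the columns whose sum is at or above the threshold
# (same shape as the R version: select with a boolean mask).
df2 = df1.loc[:, df1.sum() >= threshold]

# Equivalent drop-based form: remove the columns whose sum is below it.
df3 = df1.drop(columns=df1.columns[df1.sum() < threshold])

print(list(df2.columns))
```

Both forms should give the same result; which one reads better is mostly a matter of taste.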