r ggplot2

Plot feature importance computed by the ranger function


I need to plot the variable importance computed by the ranger function, because I have a large data table and randomForest doesn't work in my case.

This is my code:

library(ranger)
set.seed(42)
model_rf <- ranger(Sales ~ ., data = data[, -1], importance = "impurity")

Then I create a new data frame DF containing the importance values from the model above, like this:

> v<-as.vector(model_rf$variable.importance$Importance)
> w<-(as.vector((row.names(df))))
> DF<-cbind(w,v)
> DF<-as.data.frame(DF)
> DF
                           w                v
1                  DayOfWeek 376393213095.426
2                  Customers 1364058809531.96
3                       Open 634528877741.021
4                      Promo 261749509069.205
5               StateHoliday 5196666310.34041
6              SchoolHoliday  6522969049.3763
7                   DateYear  7035399071.0376
8                  DateMonth 20134820116.2625
9                    DateDay 37631766745.2306
10                  DateWeek 32834962167.9479
11                 StoreType 31568433413.5718
12                Assortment 20257406597.8358
13       CompetitionDistance  111847579772.77
14 CompetitionOpenSinceMonth 46332196019.0118
15  CompetitionOpenSinceYear 45548903472.6485
16                    Promo2                0
17           Promo2SinceWeek 50666744628.7906
18           Promo2SinceYear 40964066303.0407
19           CompetitionOpen 39927447341.0351
20                 PromoOpen  28319356095.063
21            IspromoinSales 2844220121.08598

I need to plot these values as a variable importance bar chart like this one:

(image of the desired plot, not preserved)

EDIT

As @Sam proposed, I tried to adapt this code:

> ggplot(DF, aes(x=reorder(w,v), y=v,fill=v))+ 
+   geom_bar(stat="identity", position="dodge")+ coord_flip()+
+   ylab("Variable Importance")+
+   xlab("")+
+   ggtitle("Information Value Summary")+
+   guides(fill=F)+
+   scale_fill_gradient(low="red", high="blue")

But I get this error:

Error: Discrete value supplied to continuous scale
In addition: There were 42 warnings (use warnings() to see them)

How can I do this, please? Thank you in advance!


Solution

  • This is untested, but I think it should give you what you are after:

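    library(ggplot2)
    library(tibble)  # provides enframe()

    # enframe() turns the named importance vector into a two-column tibble
    ggplot(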
        enframe(
            model_rf$variable.importance,
            name = "variable",
            value = "importance"
        ),
        aes(
            x = reorder(variable, importance),
            y = importance,
            fill = importance
        )
    ) +
        geom_bar(stat = "identity", position = "dodge") +
        coord_flip() +
        ylab("Variable Importance") +
        xlab("") +
        ggtitle("Information Value Summary") +
        guides(fill = "none") +
        scale_fill_gradient(low = "red", high = "blue")
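
The error in the question most likely comes from how DF was built: cbind() on a character vector and a numeric vector returns a character matrix, so after as.data.frame() the v column is no longer numeric and ggplot treats it as discrete. Below is a minimal sketch of that behaviour and a base R fix. The vector imp is a hypothetical stand-in for model_rf$variable.importance, filled with three values copied from the output in the question. The plotting step only runs if ggplot2 is installed.

```r
# Stand-in for model_rf$variable.importance (assumption: a named numeric vector)
imp <- c(DayOfWeek = 376393213095.426,
         Customers = 1364058809531.96,
         Promo2    = 0)

# What the question does: cbind() coerces everything to character
DF_bad <- as.data.frame(cbind(w = names(imp), v = imp))
class(DF_bad$v)   # not numeric (character, or factor in R < 4.0)

# Building the data frame directly keeps the importance column numeric
DF_good <- data.frame(w = names(imp), v = unname(imp),
                      stringsAsFactors = FALSE)
class(DF_good$v)  # numeric

# With a numeric v, the plotting code from the question can be built
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(DF_good, aes(x = reorder(w, v), y = v, fill = v)) +
    geom_col() +
    coord_flip() +
    ylab("Variable Importance") +
    xlab("") +
    guides(fill = "none") +
    scale_fill_gradient(low = "red", high = "blue")
}
```

With the real model, replacing imp by model_rf$variable.importance should give one bar per predictor. Call print(p) to draw the plot.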