r, dplyr, rcharts, data-manipulation

Adding zero-valued entries so that all groups have entries for the same items


I'm trying to use rCharts to create a stacked bar chart across a number of recorded regions (stacking the separate group values on top of each other). The data is in a format similar to the one below.

Region | Group | Value
----------------------
USA    |   A   |   5
USA    |   B   |   3
USA    |   C   |   1
UK     |   A   |   4
UK     |   B   |   6
France |   C   |   3

The code below produces a grouped bar chart, which works fine. However, the Stacked button does nothing to change the plot.

nPlot(Value ~ Region, group = 'Group', 
      data = example_data, 
      type = 'multiBarChart')

Looking at this thread, it seems the problem might be that some Regions don't have entries for all of the Groups present (e.g. the UK lacks an entry for C, and France lacks entries for A and B).

What I'm not sure of is how to add entries with Value == 0 so that every Region has an entry for every Group present, transforming the above data into this (the ordering of entries doesn't matter):

Region | Group | Value
----------------------
USA    |   A   |   5
USA    |   B   |   3
USA    |   C   |   1
UK     |   A   |   4
UK     |   B   |   6
UK     |   C   |   0
France |   A   |   0
France |   B   |   0
France |   C   |   3

This will ultimately be placed within the reactive component of a Shiny app, so efficient solutions in particular would be great.


Solution

  • We can use complete() from the tidyr package:

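    complete() is a wrapper around expand(), left_join() and replace_na() that's useful for completing missing combinations of data. It turns implicitly missing values into explicitly missing values, and the fill argument then replaces those missing values with the value you supply (0 here).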

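    library(tidyr)    # complete(); also re-exports the %>% pipe
    library(rCharts)  # nPlot()
    
    df %>% 
      # add a row with Value = 0 for every missing Region/Group combination
      complete(Region, Group, fill = list(Value = 0)) %>%
      # `data = .` passes the completed data frame on to nPlot()
      nPlot(Value ~ Region, group = 'Group', 
            data = ., 
            type = 'multiBarChart')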
    

    [Figures: the resulting chart in Grouped mode and in Stacked mode]
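
To make the "wrapper" description of complete() more concrete, here is a minimal sketch of the three steps written out by hand. It assumes dplyr is installed in addition to tidyr, uses an inline copy of the example data with character columns rather than factors, and the names `manual` and `via_complete` are purely illustrative. It is meant as an illustration of the idea, not as a statement of how complete() is implemented internally.

```r
library(tidyr)
library(dplyr)

# Inline copy of the example data (character columns, an assumption for brevity)
df <- data.frame(
  Region = c("USA", "USA", "USA", "UK", "UK", "France"),
  Group  = c("A", "B", "C", "A", "B", "C"),
  Value  = c(5L, 3L, 1L, 4L, 6L, 3L),
  stringsAsFactors = FALSE
)

manual <- df %>%
  expand(Region, Group) %>%                    # every Region paired with every Group
  left_join(df, by = c("Region", "Group")) %>% # bring Value back; missing pairs get NA
  replace_na(list(Value = 0L))                 # turn those NAs into explicit zeros

# The one-step version from the answer, for comparison
via_complete <- df %>%
  complete(Region, Group, fill = list(Value = 0L))

print(manual)
```

Both objects should hold nine rows, one per Region/Group pair, with zeros where the original data had no entry.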


    Data

    df <- structure(list(Region = structure(c(3L, 3L, 3L, 2L, 2L, 1L), .Label = c("France", 
    "UK", "USA"), class = "factor"), Group = structure(c(1L, 2L, 
    3L, 1L, 2L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
        Value = c(5L, 3L, 1L, 4L, 6L, 3L)), .Names = c("Region", 
    "Group", "Value"), class = "data.frame", row.names = c(NA, -6L))
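
Since the question mentions running this inside a Shiny reactive, a variant without extra package dependencies may also be worth a look. The sketch below is not part of the answer above: it swaps in base R's expand.grid() and merge() for complete(), works on an inline copy of the example data, and the names `all_combos` and `filled` are illustrative. Whether it is actually faster than complete() for a given data size would need benchmarking.

```r
# Inline copy of the example data (character columns, an assumption for brevity)
df <- data.frame(
  Region = c("USA", "USA", "USA", "UK", "UK", "France"),
  Group  = c("A", "B", "C", "A", "B", "C"),
  Value  = c(5L, 3L, 1L, 4L, 6L, 3L),
  stringsAsFactors = FALSE
)

# Every Region paired with every Group that appears in the data
all_combos <- expand.grid(
  Region = unique(df$Region),
  Group  = unique(df$Group),
  stringsAsFactors = FALSE
)

# Left join: keep all combinations; pairs absent from df get Value = NA
filled <- merge(all_combos, df, by = c("Region", "Group"), all.x = TRUE)

# Replace the NAs introduced by the join with explicit zeros
filled$Value[is.na(filled$Value)] <- 0L

print(filled)
```

The resulting `filled` data frame can be passed to nPlot() through its data argument in the same way as the output of complete().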