r count grepl

Using grep with count_if (expss package in R)


I'm trying to count the rows of a data frame in which a certain string appears (it will be a substring, e.g. "blue" inside a larger block of text), and then summarize those counts by another field.

Here's the code:

    totals_by_county <- county_data %>%
      group_by(county_data$county)%>%
      summarise(number_occurences = count(grepl('blue', county_data$color,ignore.case = TRUE))) 
    totals_by_county

And I get this error:

    no applicable method for 'summarise_' applied to an object of class "logical"

Is there a way to do this with the approach I'm using above?


Solution

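  • With grepl, use sum() instead of count(). dplyr's count() expects a data frame, not a logical vector, which is what triggers the error; sum() of a logical vector counts its TRUE values. Also refer to the columns by bare name (county, color) rather than county_data$..., so that the grouping is respected: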

    totals_by_county <- county_data %>%
        group_by(county) %>%
        summarise(number_occurences = sum(grepl('blue', color, ignore.case = TRUE))) 
    

    or, with count_if from expss:

    totals_by_county <- county_data %>%
        group_by(county) %>%
        summarise(number_occurences = count_if(perl('blue', ignore.case = TRUE), color)) 
    

    UPDATE with reproducible example:

    library(dplyr)
    library(expss)
    
    county_data = data.frame(
        county = c("A", "A", "A", "B", "B"),
        color = c("blue-blue", "red", "orange-blue", "yellow", "green"),
        stringsAsFactors = FALSE)
    
    
    county_data %>%
        group_by(county) %>%
        summarise(number_occurences = count_if(perl('blue', ignore.case = TRUE), color)) 
    
    # A tibble: 2 x 2
    # county number_occurences
    # <chr>              <int>
    # 1 A                  2
    # 2 B                  0
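
For comparison, here is a minimal sketch of the grepl version run on the same toy data, with the intermediate logical vector spelled out. The names `matches` and `base_totals` are only illustrative, and the tapply() cross-check is an extra that is not part of the original answer. Note that this counts matching rows, so "blue-blue" contributes 1, not 2.

```r
library(dplyr)

# Same toy data as in the reproducible example above
county_data <- data.frame(
    county = c("A", "A", "A", "B", "B"),
    color = c("blue-blue", "red", "orange-blue", "yellow", "green"),
    stringsAsFactors = FALSE)

# grepl() returns one TRUE/FALSE per row: does the pattern occur in it?
matches <- grepl("blue", county_data$color, ignore.case = TRUE)

# sum() treats TRUE as 1 and FALSE as 0, so within each group
# it counts the rows whose color contains "blue"
totals_by_county <- county_data %>%
    group_by(county) %>%
    summarise(number_occurences = sum(grepl("blue", color, ignore.case = TRUE)))

# Base R cross-check: sum the logical vector within each county
base_totals <- tapply(matches, county_data$county, sum)

print(totals_by_county)
print(base_totals)
```

Both approaches should give 2 for county A and 0 for county B, matching the count_if output above.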