
How to work with two data frames that interact with each other inside one function in R?


Running my custom function runLabel() on the starting data frame myDF produces the output shown below; the code that generates both follows immediately after it. The function builds two data frames, Label and selGrpCode. Label should then be modified using values from selGrpCode, with the result added (mutated) to Label as a new matchOnes column. However, the matchOnes column does not appear in the Label data frame.

> runLabel(myDF)
  Element Group eleCnt eleGrpCnt grpPrefix subGrpRnk grpCode
1       R     0      1         0         0         0     0.0
2       R     0      2         0         0         0     0.0
3       B     0      1         0         0         0     0.0
4       R     0      3         0         0         0     0.0
5       X     1      1         1         1         1     1.1
6       X     1      2         1         2         2     2.2
  grpCode
1     1.1
2     2.2

library(dplyr)

myDF <- data.frame(
    Element = c("R","R","B","R","X","X"),
    Group = c(0,0,0,0,1,1)
  )

runLabel <- function(x) {Label <- x %>%
  group_by(Element) %>%
    mutate(eleCnt = row_number()) %>%
    mutate(eleGrpCnt = ifelse(Group != 0,match(Group, unique(Group)),0)) %>%
  ungroup() %>%
  mutate(grpPrefix = eleCnt * eleGrpCnt) %>%
  mutate(subGrpRnk = ifelse(Group > 0, sapply(1:n(), function(x) sum(Element[1:x]==Element[x] & Group[1:x] == Group[x])),0)) %>%
  mutate(grpCode = as.numeric(paste(grpPrefix,subGrpRnk, sep = '.')))

selGrpCode <- Label %>% distinct(grpCode) %>% select(grpCode) %>% filter(grpCode > 0) %>% arrange(grpCode)

Label %>% mutate(matchOnes = ifelse(eleCnt < min(selGrpCode),eleCnt,0))

print.data.frame(Label)
print.data.frame(selGrpCode)
}

This is how the Label data frame should look (with the missing column added on the right):

  Element Group eleCnt eleGrpCnt grpPrefix subGrpRnk grpCode matchOnes
1       R     0      1         0         0         0     0.0         1
2       R     0      2         0         0         0     0.0         0
3       B     0      1         0         0         0     0.0         1
4       R     0      3         0         0         0     0.0         0
5       X     1      1         1         1         1     1.1         1
6       X     1      2         1         2         2     2.2         0 

How do I adjust the function so that the two data frames can be built and interact inside it? I would like to use dplyr where possible. Please don't simplify this by reducing it to one data frame: this code is a simplified version of more involved code that needs both data frames.


Solution

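  • dplyr verbs such as mutate() return a new data frame rather than modifying Label in place, so you have to assign the result back to Label. Otherwise the new column is computed and then discarded, and print.data.frame(Label) still shows the old data frame: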

    runLabel <- function(x) {
          ...
          Label <- Label %>% mutate(matchOnes = ifelse(eleCnt < min(selGrpCode),eleCnt,0))
          ...
    }
    
    

    And here is your full code:

    library(dplyr)
    
    myDF <- data.frame(
      Element = c("R","R","B","R","X","X"),
      Group = c(0,0,0,0,1,1)
    )
    
    runLabel <- function(x) {
      Label <- x %>%
        group_by(Element) %>%
        mutate(eleCnt = row_number()) %>%
        mutate(eleGrpCnt = ifelse(Group != 0,match(Group, unique(Group)),0)) %>%
        ungroup() %>%
        mutate(grpPrefix = eleCnt * eleGrpCnt) %>%
        mutate(subGrpRnk = ifelse(Group > 0, sapply(1:n(), function(x) sum(Element[1:x]==Element[x] & Group[1:x] == Group[x])),0)) %>%
        mutate(grpCode = as.numeric(paste(grpPrefix,subGrpRnk, sep = '.')))
    
      selGrpCode <- Label %>% distinct(grpCode) %>% select(grpCode) %>% filter(grpCode > 0) %>% arrange(grpCode)
    
      Label <- Label %>% mutate(matchOnes = ifelse(eleCnt < min(selGrpCode),eleCnt,0))
    
      print.data.frame(Label)
      print.data.frame(selGrpCode)
    }
    
    runLabel(myDF)
    #>   Element Group eleCnt eleGrpCnt grpPrefix subGrpRnk grpCode matchOnes
    #> 1       R     0      1         0         0         0     0.0         1
    #> 2       R     0      2         0         0         0     0.0         0
    #> 3       B     0      1         0         0         0     0.0         1
    #> 4       R     0      3         0         0         0     0.0         0
    #> 5       X     1      1         1         1         1     1.1         1
    #> 6       X     1      2         1         2         2     2.2         0
    #>   grpCode
    #> 1     1.1
    #> 2     2.2
    

    Created on 2022-09-12 with reprex v2.0.2
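
If you need both data frames after the function has run, and not just printed, one option is to return them together in a list. The following is only a minimal sketch: `addMatchOnes()` and the `toy` data frame are hypothetical names made up for illustration, and the helper assumes it receives a data frame that already has the eleCnt and grpCode columns. It also takes the minimum from the column explicitly with `selGrpCode$grpCode` rather than from the whole data frame.

```r
library(dplyr)

# Hypothetical helper (not part of the original code): expects a data
# frame that already contains the eleCnt and grpCode columns.
addMatchOnes <- function(Label) {
  # Second data frame derived from the first one
  selGrpCode <- Label %>%
    distinct(grpCode) %>%
    filter(grpCode > 0) %>%
    arrange(grpCode)

  # Assign the result back, otherwise the new column is lost
  Label <- Label %>%
    mutate(matchOnes = ifelse(eleCnt < min(selGrpCode$grpCode), eleCnt, 0))

  # Return both data frames so the caller can keep using them
  list(Label = Label, selGrpCode = selGrpCode)
}

# Toy input (assumption) using the eleCnt and grpCode values from the question
toy <- data.frame(
  eleCnt  = c(1, 2, 1, 3, 1, 2),
  grpCode = c(0, 0, 0, 0, 1.1, 2.2)
)

res <- addMatchOnes(toy)
res$Label
res$selGrpCode
```

The caller can then pick out `res$Label` or `res$selGrpCode` as needed, which keeps the two data frames separate instead of merging them into one.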