rfor-looppurrrsplit-apply-combinepmap

split apply combine w/ function or purrr package pmap?


This is a big problem for me to solve. If I had enough reputation to award a bounty I would!

Looking to balance territories of accounts of sales reps. I have the process broken up, and I don't really know how to do it across each region.

In this example there are 1000 accounts across 4 regions, each region with 2 subset Leagues, and then various owners of the accounts -- Some accounts are unowned. Each account has a random value between 1,000 and 100,000.

reproducible example:

Account List:

set.seed(1)
Accounts <- paste0("Acc", 1:1000)
Region <- c("NorthEast", "SouthEast", "MidWest", "West")
League <- sample(c("Majors", "Minors"), 1000, replace = TRUE)
AccValue <- sample(1000:100000, 1000, replace = TRUE)
Owner <- sample(c("Chad", NA, "Jimmy", "Adrian", NA, NA, "Steph", "Matt", "Jared", "Eric"), 1000, replace = TRUE)
AccDF <- data.frame(Accounts, Region, League, AccValue, Owner)
AccDF$Accounts <- as.character(AccDF$Accounts)
AccDF$Region <- as.character(AccDF$Region)
AccDF$League <- as.character(AccDF$League)
AccDF$Owner <- as.character(AccDF$Owner)

Summary of Ownership in region:

Summary <- AccDF %>%
  group_by(Region, League, Owner) %>%
  summarise(Count = n(),
            TotalValue = sum(AccValue))

Summary by Region, League:

Summary2 <- AccDF %>%
  group_by(Region, League) %>%
  summarise(Count = n(),
            TotalValue = sum(AccValue),
            AccountsPerRep = round(Count / 7, 0),
            ValuePerRep = TotalValue / 7)

That is all of the starting data, and I would like to do the following process to each grouping of the Summary2 table.

West Minors Example:

Total West Minors Accounts: 120

#break out into owned and unowned

WestMinorsOwned <- AccDF %>%
  filter(Region == "West",
         League == "Minors",
         !is.na(Owner))

WestMinorsUnowned <- AccDF %>%
  filter(Region == "West",
         League == "Minors",
         is.na(Owner))

#unassign accounts until threshold is hit

New.WestMinors <- WestMinorsOwned %>% 
  mutate(r = runif(n())) %>% 
  arrange(r) %>% 
  group_by(Owner) %>% 
  mutate(NewOwner = replace(Owner, cumsum(AccValue) > 600000 | row_number() > 14, NA)) %>% 
  ungroup(Owner) %>%
  mutate(Owner = NewOwner) %>%
  select(-r, -NewOwner)

After the Owner has been updated we bind back together the pieces to have the WestMinors Account Base, all with updated owners, hopefully balanced.

AssignableWestMinors <- bind_rows(filter(AccDF, Region == "West" & League == "Minors" & is.na(Owner)), 
                                  filter(New.WestMinors, is.na(Owner))) %>%
  arrange(desc(AccValue))

#check work
OwnerSummary <- New.WestMinors %>%
  filter(!is.na(Owner)) %>%
  group_by(Region, League, Owner) %>%
  summarise(Count = n(), TotalValue = sum(AccValue))

No one has more than 14 accounts or 600,000, so we're in a good place to start reassigning the unowned accounts to try to balance everyone together. The following for-loop looks at each name in the OwnerSummary for who has the smallest $$ assigned to them and assigns the most valueable account, and then moves through each account, attempting to balance each owner's share.

#Balance Unassigned

for (i in 1:nrow(AssignableWestMinors)){
  idx <- which.min(OwnerSummary$TotalValue)
  OwnerSummary$TotalValue[idx] <- OwnerSummary$TotalValue[idx] + AssignableWestMinors$AccValue[i]
  OwnerSummary$Count[idx] <- OwnerSummary$Count[idx] + 1
  AssignableWestMinors$Owner[i] <- as.character(OwnerSummary$Owner[idx])}

Now we just bind together the previously owned, and the newly assigned, and we have our finished balanced West Minors territory.

WestMinors.Final <- bind_rows(filter(New.WestMinors, !is.na(Owner)), AssignableWestMinors)

WM.Summary <- WestMinors.Final %>%
  group_by(Region, League, Owner) %>%
  summarise(Count = n(),
            TotalValue = sum(AccValue))

Everyone has a similar number of accounts, and the total $$ territory is all within reason.

Now I'm trying to do that for each grouping of the original 4 regions, 2 leagues. So doing this 8 times and then stitch it all together. Each subgroup has a different threshold for $$ value to aim for, and # of accounts as well. How can I break apart the original account base into 8 sections, apply all of this, and then combine it back together?


Solution

  • You should take advantage of ?dplyr::do to do the split-apply-combine operation that you want on subsets of Region-League. First, functionize your logic so that it can operate on a data frame dta which represents a subsetted version of the master dataframe AccDF.

    reAssign <- function(dta) {
      other_acct <- dta %>% 
        filter(!is.na(Owner)) %>% 
        mutate(r = runif(n())) %>% 
        arrange(r) %>% 
        group_by(Owner) %>% 
        mutate(NewOwner = replace(Owner, cumsum(AccValue) > 600000 | row_number() > 14, NA)) %>% 
        ungroup(Owner) %>%
        mutate(Owner = NewOwner) %>%
        select(-r, -NewOwner)
    
      assignable_acct <- other_acct %>% 
        filter(is.na(Owner)) %>% 
        bind_rows( filter(dta, is.na(Owner)) ) %>% 
        arrange(desc(AccValue))
    
      acct_summary <- other_acct %>%
        filter(!is.na(Owner)) %>%
        group_by(Owner) %>%
        summarise(Count = n(), TotalValue = sum(AccValue))
    
      # I have a feeling there's a much better way of doing this, but oh well...  
      for (i in seq(nrow(assignable_acct))) {
        idx <- which.min(acct_summary$TotalValue)
        acct_summary$TotalValue[idx] <- acct_summary$TotalValue[idx] + assignable_acct$AccValue[i]
        acct_summary$Count[idx] <- acct_summary$Count[idx] + 1
        assignable_acct$Owner[i] <- as.character(acct_summary$Owner[idx])
      }
      final <- other_acct %>% 
        filter(!is.na(Owner)) %>% 
        bind_rows(assignable_acct)
    
      return(final)
    }
    

    Then simply apply it to AccDF that has been grouped by Region, League.

    new_master <- AccDF %>% 
      group_by(Region, League) %>% 
      do( reAssign(.) ) %>% 
      ungroup() 
    

    Checking to make sure it's done it's job...

    new_master %>% 
      group_by(Region, League, Owner) %>%
      summarise(Count = n(),
              TotalValue = sum(AccValue)) %>% 
      as.data.frame()