πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„
Because it is large and was, until recently, full of students, my family does a gift exchange pool every year. We used to draw names from a hat, but sometimes people would get themselves or their spouses, and so in a fit of procrastination a few years ago I wrote an R function to generate the gift pool. I shared this function with two fellow nerds, who (I’m assuming also at the expense of more important things) took their own shots at it. A note: both have told me today that they now cringe at this code and feel their current selves could do better, to which I say: tough chestnuts, I have to post this today because reminderbot*. πŸŽ„πŸŽ„πŸŽ„

Per this tweet, I present to you three different versions of functions for generating a gift pool in R.

For this demonstration, we’ll use partners and individuals borrowed from the TV show Modern Family:

Cam & Mitchell
Jay & Gloria 
Manny 
Claire & Phil 

These go into a data frame called pool that we’ll use as inputs to the functions:

( pool <- data.frame( id = c(1, 2, 3, 4, 5, 6, 7), 
                           person = c("Cam", "Mitchell", 
                                      "Jay", "Gloria", 
                                      "Manny",
                                      "Claire", "Phil"
                                      ), 
                           partner = c(2, 1, 4, 3, 5, 7, 6)
                   )
 )
##   id   person partner
## 1  1      Cam       2
## 2  2 Mitchell       1
## 3  3      Jay       4
## 4  4   Gloria       3
## 5  5    Manny       5
## 6  6   Claire       7
## 7  7     Phil       6

…And we’ll set.seed() so you can reproduce the results of this otherwise stochastic process:

set.seed(123)

πŸŽ„ Myfanwy’s Original Version of the xmas() function

The xmas() function is actually three functions doing different steps. mkdf() samples the pool and matches those samples with their person ids; checkdf() checks the pairings; xmas() iterates this process until we get a valid set of draws and pairings.

# Myfanwy's original version: 

mkdf <- function(pooldf = pool) {
  
  id_lookup <- dplyr::select(pooldf, p2 = person, id2 = id) # create a lookup table for partners
  
  p2 <- factor(sample(pooldf$person, nrow(pooldf), replace = FALSE)) # sample the pool
  
  p2_id <- dplyr::left_join(data.frame(p2 = p2), id_lookup) # recover ids
  
  pp <- data.frame(cbind(pooldf, p2_id)) # assemble sampled pool dataframe
  
  return(pp)
}

checkdf <- function(df) {
  
  self <- df[df$person == df$p2, ] # get back any rows where someone drew themselves
  partner <- df[df$partner == df$id2, ] # get back any rows where someone drew their spouse
  
  selfs = nrow(self)
  partners = nrow(partner)
  
  return(sum(selfs, partners)) # sum the instances of invalid pairs
  
}

# while the results of checkdf > 0, keep running mkdf and checkdf until we get a configuration where no one has their spouse or themselves

xmas <- function(pool_df) {
  
  d <- mkdf(pooldf = pool_df)
  d2 <- checkdf(d)
  
  while( d2 > 0) {
    d <- mkdf(pooldf = pool_df)
    d2 <- checkdf(d)
  }
  return(d)
}

And now we call the function and see its return:

xmas(pool_df = pool)
## Joining, by = "p2"
##   id   person partner       p2 id2
## 1  1      Cam       2     Phil   7
## 2  2 Mitchell       1      Jay   3
## 3  3      Jay       4   Claire   6
## 4  4   Gloria       3 Mitchell   2
## 5  5    Manny       5   Gloria   4
## 6  6   Claire       7    Manny   5
## 7  7     Phil       6      Cam   1

At first I didn’t love the dplyr::left_join() messaging but then I realized it was kind of nice knowing how many draws it took the function to get a valid pairs data frame.

This function works - but eventually I’d like to add a couple of enhancements so that 1) folks don’t draw from their nuclear families (Jay & Gloria should be unable to draw Manny and vice-versa), and 2) folks can’t draw the same person they had the previous year.

πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„

Matt Espe’s version

πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„

Matt took my functions and made a more concise, single function. Note that because he’s using a for-loop, it’s best to pre-allocate a column of NA in the pool dataframe:

#-------------------------------------------------------#
# alternate version (author: Matt Espe)
# this function requires a column of NAs in the starting dataframe:
pool <- data.frame(
    id = c(1:7),
    person = c("Cam", "Mitchell",
               "Jay", "Gloria",
               "Manny",
               "Claire", "Phil"),
    partner = c(2, 1, 4, 3, 5, 7, 6),
    id2 = NA
)

xmas2 = function(pool_df) {
    # for 1:7, gift draw number is sampled from anyone BUT that person's spouse:
    
    for (i in seq(nrow(pool_df))) {
        pool_df$id2[i] = sample(setdiff(seq(nrow(pool_df)), c(pool_df$partner[i], pool_df$id2, i)),
                               1)
    }
    
    pool_df$p2 = pool_df$person[pool_df$id2] # make a column of who each person drew (p2)
    
    ok = all(!duplicated(pool_df$id2) &
                 pool_df$partner != pool_df$id2) # ok is a df where no one drew themselves and no one drew their spouse
    if (!ok)
        # if the conditions aren't met, return "somthing went wrong"
        stop("No good - try again")
    return(pool_df) # else, return the finished matches
}

xmas2(pool_df = pool) # call until you get a return
##   id   person partner id2       p2
## 1  1      Cam       2   7     Phil
## 2  2 Mitchell       1   6   Claire
## 3  3      Jay       4   2 Mitchell
## 4  4   Gloria       3   5    Manny
## 5  5    Manny       5   1      Cam
## 6  6   Claire       7   4   Gloria
## 7  7     Phil       6   3      Jay

πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„

Travis Hinkelman’s version

πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„

Travis’s version creates all possible combinations of giver/receivers and then drops the β€œforbidden” ones:

library(tidyverse)

d <- tibble(
  participant = c("Cam", "Mitchell", "Jay", "Gloria", "Manny", "Claire", "Phil"),
  spouse = c("Mitchell", "Cam", "Gloria", "Jay", NA, "Phil", "Claire"),
  couple = paste(participant, spouse, sep = "_")
)

xmas3 <- function(data) {
  # create all possible combinations with crossing()
  all <- crossing(giver = data$participant, receiver = data$participant) %>% 
    mutate(giver_receiver = paste(giver, receiver, sep = "_")) %>%
    # drop forbidden combinations
    filter(giver != receiver & !(giver_receiver %in% data$couple)) %>%  
    # drop giver_receiver because no longer needed
    select(-giver_receiver)                                                   
  
  out <- tibble(giver = NA, receiver = NA)
  for (i in data$participant) {
    # loop through each participant and exclude any participants already selected as receivers
    all_sub <- filter(all, giver == i & !(receiver %in% out$receiver))
    if (nrow(all_sub) > 0) {
      # randomly select one row from the subsetted data frame and add it to the output data frame
      out <- bind_rows(out, sample_n(all_sub, 1)) %>% filter(!is.na(giver)) 
      # tibble was initialized with NA values [instead of out <- tibble()] 
      # because need to check for receiver on first pass through the loop
    } else {
      stop("No good - try again")
    }
  }
  return(out)
}

xmas3(d)
## # A tibble: 7 x 2
##   giver    receiver
##   <chr>    <chr>   
## 1 Cam      Phil    
## 2 Mitchell Jay     
## 3 Jay      Manny   
## 4 Gloria   Cam     
## 5 Manny    Claire  
## 6 Claire   Gloria  
## 7 Phil     Mitchell

So there you have it - three options for meeting all your gift pool needs. If anyone has their own versions to add, ping me on Twitter and I’ll add them to this post in a few weeks, when I have not one but two manuscript review deadlines.

πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„




*This is a thing?! I have never felt so compelled to a deadline in my life. No one tell my supervisors about reminderbots please

πŸŽ„πŸŽ„πŸŽ„πŸŽ„πŸŽ„