🎄🎄🎄🎄🎄🎄🎄🎄
Because it is large and was, until recently, full of students, my family does a gift exchange pool every year. We used to draw names from a hat, but sometimes people would get themselves or their spouses, and so in a fit of procrastination a few years ago I wrote an R function to generate the gift pool. I shared this function with two fellow nerds, who (I’m assuming also at the expense of more important things) took their own shots at it. A note: both have told me today that they now cringe at this code and feel their current selves could do better, to which I say: tough chestnuts, I have to post this today because reminderbot*. 🎄🎄🎄
Per this tweet, I present to you three different versions of functions for generating a gift pool in R.
For this demonstration, we’ll use partners and individuals borrowed from the TV show Modern Family:
Cam & Mitchell
Jay & Gloria
Manny
Claire & Phil
These go into a data frame
called pool
that we’ll use as inputs to the functions:
( pool <- data.frame( id = c(1, 2, 3, 4, 5, 6, 7),
person = c("Cam", "Mitchell",
"Jay", "Gloria",
"Manny",
"Claire", "Phil"
),
partner = c(2, 1, 4, 3, 5, 7, 6)
)
)
## id person partner
## 1 1 Cam 2
## 2 2 Mitchell 1
## 3 3 Jay 4
## 4 4 Gloria 3
## 5 5 Manny 5
## 6 6 Claire 7
## 7 7 Phil 6
…And we’ll set.seed()
so you can reproduce the results of this otherwise stochastic process:
set.seed(123)
xmas()
functionThe xmas()
function is actually three functions doing different steps. mkdf()
samples the pool and matches those samples with their person ids; checkdf()
checks the pairings; xmas()
iterates this process until we get a valid set of draws and pairings.
# Myfanwy's original version:
mkdf <- function(pooldf = pool) {
id_lookup <- dplyr::select(pooldf, p2 = person, id2 = id) # create a lookup table for partners
p2 <- factor(sample(pooldf$person, nrow(pooldf), replace = FALSE)) # sample the pool
p2_id <- dplyr::left_join(data.frame(p2 = p2), id_lookup) # recover ids
pp <- data.frame(cbind(pooldf, p2_id)) # assemble sampled pool dataframe
return(pp)
}
checkdf <- function(df) {
self <- df[df$person == df$p2, ] # get back any rows where someone drew themselves
partner <- df[df$partner == df$id2, ] # get back any rows where someone drew their spouse
selfs = nrow(self)
partners = nrow(partner)
return(sum(selfs, partners)) # sum the instances of invalid pairs
}
# while the results of checkdf > 0, keep running mkdf and checkdf until we get a configuration where no one has their spouse or themselves
xmas <- function(pool_df) {
d <- mkdf(pooldf = pool_df)
d2 <- checkdf(d)
while( d2 > 0) {
d <- mkdf(pooldf = pool_df)
d2 <- checkdf(d)
}
return(d)
}
And now we call the function and see its return:
xmas(pool_df = pool)
## Joining, by = "p2"
## id person partner p2 id2
## 1 1 Cam 2 Phil 7
## 2 2 Mitchell 1 Jay 3
## 3 3 Jay 4 Claire 6
## 4 4 Gloria 3 Mitchell 2
## 5 5 Manny 5 Gloria 4
## 6 6 Claire 7 Manny 5
## 7 7 Phil 6 Cam 1
At first I didn’t love the dplyr::left_join()
messaging but then I realized it was kind of nice knowing how many draws it took the function to get a valid pairs data frame.
This function works - but eventually I’d like to add a couple of enhancements so that 1) folks don’t draw from their nuclear families (Jay & Gloria should be unable to draw Manny and vice-versa), and 2) folks can’t draw the same person they had the previous year.
🎄🎄🎄🎄🎄
🎄🎄🎄🎄🎄
Matt took my functions and made a more concise, single function. Note that because he’s using a for-loop, it’s best to pre-allocate a column of NA
in the pool
dataframe:
#-------------------------------------------------------#
# alternate version (author: Matt Espe)
# this function requires a column of NAs in the starting dataframe:
pool <- data.frame(
id = c(1:7),
person = c("Cam", "Mitchell",
"Jay", "Gloria",
"Manny",
"Claire", "Phil"),
partner = c(2, 1, 4, 3, 5, 7, 6),
id2 = NA
)
xmas2 = function(pool_df) {
# for 1:7, gift draw number is sampled from anyone BUT that person's spouse:
for (i in seq(nrow(pool_df))) {
pool_df$id2[i] = sample(setdiff(seq(nrow(pool_df)), c(pool_df$partner[i], pool_df$id2, i)),
1)
}
pool_df$p2 = pool_df$person[pool_df$id2] # make a column of who each person drew (p2)
ok = all(!duplicated(pool_df$id2) &
pool_df$partner != pool_df$id2) # ok is a df where no one drew themselves and no one drew their spouse
if (!ok)
# if the conditions aren't met, return "somthing went wrong"
stop("No good - try again")
return(pool_df) # else, return the finished matches
}
xmas2(pool_df = pool) # call until you get a return
## id person partner id2 p2
## 1 1 Cam 2 7 Phil
## 2 2 Mitchell 1 6 Claire
## 3 3 Jay 4 2 Mitchell
## 4 4 Gloria 3 5 Manny
## 5 5 Manny 5 1 Cam
## 6 6 Claire 7 4 Gloria
## 7 7 Phil 6 3 Jay
🎄🎄🎄🎄🎄
🎄🎄🎄🎄🎄
Travis’s version creates all possible combinations of giver/receivers and then drops the “forbidden” ones:
library(tidyverse)
d <- tibble(
participant = c("Cam", "Mitchell", "Jay", "Gloria", "Manny", "Claire", "Phil"),
spouse = c("Mitchell", "Cam", "Gloria", "Jay", NA, "Phil", "Claire"),
couple = paste(participant, spouse, sep = "_")
)
xmas3 <- function(data) {
# create all possible combinations with crossing()
all <- crossing(giver = data$participant, receiver = data$participant) %>%
mutate(giver_receiver = paste(giver, receiver, sep = "_")) %>%
# drop forbidden combinations
filter(giver != receiver & !(giver_receiver %in% data$couple)) %>%
# drop giver_receiver because no longer needed
select(-giver_receiver)
out <- tibble(giver = NA, receiver = NA)
for (i in data$participant) {
# loop through each participant and exclude any participants already selected as receivers
all_sub <- filter(all, giver == i & !(receiver %in% out$receiver))
if (nrow(all_sub) > 0) {
# randomly select one row from the subsetted data frame and add it to the output data frame
out <- bind_rows(out, sample_n(all_sub, 1)) %>% filter(!is.na(giver))
# tibble was initialized with NA values [instead of out <- tibble()]
# because need to check for receiver on first pass through the loop
} else {
stop("No good - try again")
}
}
return(out)
}
xmas3(d)
## # A tibble: 7 x 2
## giver receiver
## <chr> <chr>
## 1 Cam Phil
## 2 Mitchell Jay
## 3 Jay Manny
## 4 Gloria Cam
## 5 Manny Claire
## 6 Claire Gloria
## 7 Phil Mitchell
So there you have it - three options for meeting all your gift pool needs. If anyone has their own versions to add, ping me on Twitter and I’ll add them to this post in a few weeks, when I have not one but two manuscript review deadlines.
🎄🎄🎄🎄🎄
*This is a thing?! I have never felt so compelled to a deadline in my life. No one tell my supervisors about reminderbots please
🎄🎄🎄🎄🎄