I’m really sorry for the boring title, but I wanted the next person who searches for anything resembling those phrases to find this post, because I came up pretty empty looking for something like it earlier today.

I often find myself writing/needing functions that manipulate dataframes and output either a new dataframe or a plot. To give myself some practice writing functions that output ggplot2 plots, I wrote a function with the babynames package that would plot a single name over time. The function was simply:


library(babynames)
library(dplyr)
library(ggplot2)

flashback <- function(name, sex) {

  x <- filter(babynames, name == name, sex == sex)

  p <- ggplot(x, aes(x = year, y = n))

  p <- p + geom_line(color = "hotpink", size = 3) +
           geom_area(alpha = 0.5) +
           theme(legend.position = "none") +
           labs(x = "Year", y = "n") +
           ggtitle(paste(name, "Over the Years"))

  print(p)

}


But running flashback("Daphne", "F") takes an inordinately long time (when it doesn’t break my R session entirely) and yields a wonky-looking plot. No good.

Running everything separately, outside the function, works just fine and produces the proper plot:

x <- filter(babynames, name == "Daphne", sex == "F")

p <- ggplot(x, aes(x = year, y = n))

p <- p + geom_line(color = "hotpink", size = 3) +
         geom_area(alpha = 0.5) +
         theme(legend.position = "none") +
         labs(x = "Year", y = "n") +
         ggtitle("Daphne Over the Years")

print(p)



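So what’s going on? A friend of mine suggested it might have something to do with my use of dplyr’s “filter” function within the flashback function. dplyr::filter uses non-standard evaluation (NSE): it looks for names among the data frame’s columns before it looks at the function’s arguments. My arguments are called name and sex, exactly like the babynames columns, so name == name ends up comparing the column with itself, which is TRUE for every row, and nothing gets filtered out. That means the plot was being drawn from the entire dataset, which would explain both the wait and the wonkiness. Re-writing the function to avoid NSE gives us: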

flashback <- function(name, sex) {

  # Base-R subsetting: on the right of each ==, name and sex are the function arguments
  x <- babynames[babynames$name == name & babynames$sex == sex, ]
  # x <- babynames[babynames$name == "Daphne" & babynames$sex == "F", ] # function testing

  p <- ggplot(x, aes(x = year, y = n))

  p <- p + geom_line(color = "hotpink", size = 3) +
           geom_area(alpha = 0.5) +
           theme(legend.position = "none") +
           labs(x = "Year", y = "n") +
           ggtitle(paste(name, "Over the Years"))

  print(p)

}

flashback("Daphne", "F")


This version of the function returns a non-wonky plot every time. That said, if anyone knows of successful uses of NSE functions within other functions (or if you’ve caught something else in this code that might be problematic), I would absolutely love to hear about it.
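
For what it’s worth, here is a minimal sketch of one way filter might be kept inside a function. It assumes a reasonably recent dplyr, where the .data and .env pronouns are available inside filter. The toy data frame and the helper names pick_masked and pick_pronoun are made up for illustration. They only mimic the babynames columns, with invented numbers.

```r
library(dplyr)

# Toy stand-in for babynames (hypothetical values, same column names)
toy <- data.frame(
  year = c(2000, 2000, 2001, 2001),
  sex  = c("F", "M", "F", "F"),
  name = c("Daphne", "Fred", "Daphne", "Velma"),
  n    = c(10, 20, 12, 7)
)

# Same pattern as the first flashback(): inside filter(), `name` and `sex`
# are found among the columns first, so each column is compared with itself.
pick_masked <- function(name, sex) {
  filter(toy, name == name, sex == sex)
}

# .data$ asks for the column, .env$ asks for the function argument.
pick_pronoun <- function(name, sex) {
  filter(toy, .data$name == .env$name, .data$sex == .env$sex)
}

nrow(pick_masked("Daphne", "F"))   # 4: nothing was filtered out
nrow(pick_pronoun("Daphne", "F"))  # 2: only the Daphne rows
```

If this holds up on the real data, the filter line in flashback could be written the same way. Simply giving the arguments names that don’t match any column (say, target_name and target_sex) should sidestep the clash too.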