Description
I found a problem with the ggplot2-in-packages vignette.
I expected it to cover how to fully replace all behaviors from the now-deprecated aes_string() function. However, both of its recommended solutions (the .data pronoun and the embrace operator) have limitations that make them difficult to use to replace aes_string(), and the vignette does not cover how to work around these limitations.
Here is some test data for demonstrations:
library(ggplot2)
set.seed(1)
plotdat <- data.frame(height=rnorm(10,mean=160,sd=10),weight=rnorm(10,mean=70,sd=10))
plotdat$sex <- sample(c('F','M'),10,replace=TRUE)
plotdat$age <- round(rnorm(10,mean=40,sd=20))
plotdat$employment <- sample(c('Employed','Unemployed','Retired'),10,replace=TRUE)
The .data pronoun cannot handle the case where an aesthetic is set to NULL to indicate that it should not be used:
plot_data_pronoun <- function(dat,xvar,yvar,colorvar=NULL,shapevar=NULL) {
  myplot <- ggplot(dat,aes(x=.data[[xvar]],y=.data[[yvar]],color=.data[[colorvar]],shape=.data[[shapevar]])) +
    geom_point()
  return(myplot)
}
plot_data_pronoun(plotdat,'weight','height',colorvar='employment',shapevar=NULL)
# This fails with:
# Error in `geom_point()`:
#  ! Problem while computing aesthetics.
# Error occurred in the 1st layer.
# Caused by error in `.data[[NULL]]`:
#  ! Must subset the data pronoun with a string, not `NULL`.
The embrace operator can handle NULL, but it does not work when the variable name (or NULL) for an aesthetic is determined at run time and so cannot be written into the function call:
plot_embrace <- function(dat,xvar,yvar,colorvar=NULL,shapevar=NULL) {
  myplot <- ggplot(dat,aes(x={{ xvar }},y={{ yvar }},color={{ colorvar }},shape={{ shapevar }})) +
    geom_point()
  return(myplot)
}
whichvar <- sample(c('sex','employment'),1)
plot_embrace(plotdat,weight,height,colorvar=age,shapevar=whichvar)
# This produces an incorrect plot that does not use the column specified by "whichvar"
Both of these scenarios could be handled easily and seamlessly by aes_string():
plot_aes_string <- function(dat,xvar,yvar,colorvar=NULL,shapevar=NULL) {
  myplot <- ggplot(dat,aes_string(x=xvar,y=yvar,color=colorvar,shape=shapevar)) + geom_point()
  return(myplot)
}
# Scenario 1: Passing NULL for an unused aesthetic
plot_aes_string(plotdat,'weight','height',colorvar='employment',shapevar=NULL) # This works
# Scenario 2: The variable used for shape isn't determined until the code runs
whichvar <- sample(c('sex','employment'),1)
plot_aes_string(plotdat,'weight','height',colorvar='age',shapevar=whichvar) # This works
# Scenario 3: Both things at once
whichvar <- NULL
plot_aes_string(plotdat,'weight','height',colorvar='age',shapevar=whichvar) # This works
Please consider updating the ggplot2-in-packages vignette to demonstrate how tidy evaluation idioms can be used to allow a function which uses ggplot() to handle both of these scenarios.
(I am submitting this as a bug report rather than a StackExchange question because, while I can figure out a duct-tape workaround for these issues, I think it is important that the ggplot2 authors' intended best-practice solution be included in the official documentation. If there is currently no best-practice solution that handles both scenarios, one may need to be created to help users move away from aes_string().)
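For reference, one possible stopgap of the kind mentioned above is sketched here. It is not an official recommendation, and it may not be what the maintainers intend. It converts each column name to a symbol with rlang::sym() and injects it into aes() with `!!`, passing NULL through unchanged so that the aesthetic is left unset. The names `sym_or_null`, `plot_sym`, and `demo` are hypothetical and used only for illustration. Unlike aes_string(), this treats each string strictly as a column name rather than parsing it as an expression.

```r
library(ggplot2)
library(rlang)  # provides sym(); attaching rlang here is an assumption

# Hypothetical helper: turn a column name into a symbol, passing NULL through.
sym_or_null <- function(name) {
  if (is.null(name)) NULL else sym(name)
}

# Hypothetical plotting function: every aesthetic is given as a string or NULL.
plot_sym <- function(dat, xvar, yvar, colorvar = NULL, shapevar = NULL) {
  ggplot(dat, aes(
    x = !!sym_or_null(xvar),
    y = !!sym_or_null(yvar),
    color = !!sym_or_null(colorvar),
    shape = !!sym_or_null(shapevar)
  )) +
    geom_point()
}

# Small stand-in data set so the sketch is self-contained.
demo <- data.frame(
  height = c(150, 160, 170, 180),
  weight = c(55, 65, 75, 85),
  sex = c("F", "M", "F", "M"),
  employment = c("Employed", "Retired", "Employed", "Unemployed")
)

# Scenario 1: NULL for an unused aesthetic.
p1 <- plot_sym(demo, "weight", "height", colorvar = "employment", shapevar = NULL)

# Scenario 2: the shape variable is only known at run time.
whichvar <- sample(c("sex", "employment"), 1)
p2 <- plot_sym(demo, "weight", "height", shapevar = whichvar)
```

Whether this, a conditional use of .data, or something else is the intended replacement is exactly what I would like the vignette to clarify.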