9.3 Professional formatting

Formatting is what differentiates an exploratory graph from one you would present to others. The customizations available via ggplot() are extensive, from color choices to marker shapes to line widths. Here, we will focus on a few key elements: descriptive text (e.g. titles), axis scales, and legends.

9.3.1 Titles and labels

You can set the main title with ggtitle() and the axis labels with xlab() and ylab(). As with other ggplot() elements, you add these to the plot with the plus sign (+). Choose simple, descriptive labels and titles.

# Scatter plot with title and axis labels
ggplot(data=dcps,aes(x=ProfMath,y=ProfLang)) +
  geom_point() +
  ggtitle("Math and Language Proficiency in DC Public Schools (2018)") +
  xlab("Grade-level proficient in Math (% tested)") +
  ylab("Grade-level proficient in Language (% tested)")
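
The same three labels can also be set in a single call to labs(), which is part of ggplot2. The sketch below is only an illustration: because the dcps data are not reproduced here, it assumes a small made-up data frame (toy_schools) whose values are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for the dcps data (values are made up)
toy_schools <- data.frame(
  ProfMath = c(12, 35, 58, 71, 90),
  ProfLang = c(18, 40, 52, 77, 85)
)

# labs() sets the title and both axis labels in one call
p_labels <- ggplot(data = toy_schools, aes(x = ProfMath, y = ProfLang)) +
  geom_point() +
  labs(
    title = "Math and Language Proficiency (toy data)",
    x = "Grade-level proficient in Math (% tested)",
    y = "Grade-level proficient in Language (% tested)"
  )

# Printing the object draws the plot
p_labels
```

Whether you use ggtitle(), xlab(), and ylab() or a single labs() call is largely a matter of taste; the resulting plot is the same.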

9.3.2 Axis options

Sometimes the axis scales that R chooses by default do not communicate effectively. In these cases, we can set the endpoints, or limits, of each axis scale ourselves. The commands scale_x_continuous() and scale_y_continuous() let you customize the labeling and limits of each axis. As the names suggest, these commands are intended for continuous variables. Parallel commands exist for other variable types, including scale_x_discrete(), scale_x_date(), and scale_x_binned(). Others, like scale_x_log10(), transform the variable for display. Refer to the ggplot2 documentation for more details.

In the example below, the argument limits sets the minimum and maximum values displayed on each axis. Note that the limits must be specified as a vector, e.g. by using the concatenate function c(). By default, observations that fall outside the limits are dropped from the plot, and R prints a warning when this happens. The argument breaks indicates which values are labeled on each axis. This argument also takes a vector. Here, we used the function seq() to create a vector of the values from 0 to 100 in increments of 10.

# Scatter plot with custom axis limits and breaks
ggplot(data=dcps,aes(x=ProfMath,y=ProfLang)) +
  geom_point() +
  ggtitle("Math and Language Proficiency in DC Public Schools (2018)") +
  xlab("Grade-level proficient in Math (% tested)") +
  ylab("Grade-level proficient in Language (% tested)") +
  scale_x_continuous(limits = c(0,100),breaks=seq(0,100,10)) +
  scale_y_continuous(limits = c(0,100),breaks=seq(0,100,10))
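
To see what the vectors passed to limits and breaks contain, you can build them separately before using them. The sketch below also passes a function to the labels argument of the scale so that each break is shown with a percent sign. The helper pct_label() and the data frame toy_schools are hypothetical names introduced for illustration, and the data values are made up.

```r
library(ggplot2)

# The two vectors passed to limits and breaks
axis_limits <- c(0, 100)        # minimum and maximum of the axis
axis_breaks <- seq(0, 100, 10)  # 0, 10, 20, ..., 100
print(axis_breaks)

# Hypothetical helper: append a percent sign to each break value
pct_label <- function(x) paste0(x, "%")

# Hypothetical stand-in for the dcps data (values are made up)
toy_schools <- data.frame(
  ProfMath = c(12, 35, 58, 71, 90),
  ProfLang = c(18, 40, 52, 77, 85)
)

# Reuse the same vectors and label function on both axes
p_axes <- ggplot(data = toy_schools, aes(x = ProfMath, y = ProfLang)) +
  geom_point() +
  scale_x_continuous(limits = axis_limits, breaks = axis_breaks,
                     labels = pct_label) +
  scale_y_continuous(limits = axis_limits, breaks = axis_breaks,
                     labels = pct_label)

p_axes
```

Storing the limits and breaks in named objects keeps the two axes consistent and makes it easy to change both at once.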

9.3.3 Legend options

When you map a variable to the colour or fill aesthetic, ggplot() automatically creates a legend. You can control how that legend is displayed with the commands scale_color_discrete() or scale_fill_discrete(). As you can see below, the argument name sets the legend title (setting it to NULL, as here, removes the title), and the argument labels sets the text shown for each category. Note that the labels argument also takes a vector, here produced with the concatenate function c(). The labels are applied in the order of the variable's categories.

# Scatter plot with custom legend labels
ggplot(data=dcps,aes(x=ProfMath,y=ProfLang,colour=SchType)) +
  geom_point() +
  ggtitle("Math and Language Proficiency in DC Public Schools (2018)") +
  xlab("Grade-level proficient in Math (% tested)") +
  ylab("Grade-level proficient in Language (% tested)") +
  scale_x_continuous(limits = c(0,100),breaks=seq(0,100,10)) +
  scale_y_continuous(limits = c(0,100),breaks=seq(0,100,10)) +
  scale_color_discrete(name=NULL,
                       labels=c("Grades 1-6","Grades 7-8","Grades 9-12"))
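
If you would rather keep a legend title, pass a string to name instead of NULL. The sketch below does this and also moves the legend below the plot with theme(legend.position = "bottom"). It assumes a made-up data frame (toy_schools) in which SchType is a factor with three hypothetical levels; the categories in the actual dcps data may be coded differently.

```r
library(ggplot2)

# Hypothetical stand-in for the dcps data (values and levels are made up)
toy_schools <- data.frame(
  ProfMath = c(12, 35, 58, 71, 90, 44),
  ProfLang = c(18, 40, 52, 77, 85, 39),
  SchType  = factor(
    c("Elementary", "Middle", "High", "Elementary", "High", "Middle"),
    levels = c("Elementary", "Middle", "High")
  )
)

# The labels vector is matched to the factor levels in order:
# Elementary -> "Grades 1-6", Middle -> "Grades 7-8", High -> "Grades 9-12"
p_legend <- ggplot(data = toy_schools,
                   aes(x = ProfMath, y = ProfLang, colour = SchType)) +
  geom_point() +
  scale_color_discrete(name = "School type",
                       labels = c("Grades 1-6", "Grades 7-8", "Grades 9-12")) +
  theme(legend.position = "bottom")

p_legend
```

Setting the factor levels explicitly, as in this sketch, makes the order of the categories predictable, so the labels line up with the categories you intend.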