Introduction to the ggplot2 Package

One way to plot in R is to use the ggplot2 package from the Tidyverse. The “gg” stands for the “Grammar of Graphics”, and the package offers a convenient, consistent way to make almost any plot you want. It’s typically easiest to plot data in ggplot2 from a dataframe, which means we usually don’t need to make major changes to our data when we decide to change how we’re plotting something.

Components of a Good Plot

The function used to make a plot is ggplot(), which comes from the ggplot2 package. As a reminder, before using ggplot2 for the first time we have to install it with install.packages("ggplot2"). We only need to do this once; after that, we don’t re-install it, we just load the package at the beginning of our script with library(ggplot2).
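The install-once, load-every-time pattern can be sketched as below. The requireNamespace() guard is our own addition rather than part of the original instructions; it simply skips the install step when the package is already present.

```r
# Install ggplot2 only if it is not already available (needed once per machine).
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load the package; this line goes at the top of every script that plots.
library(ggplot2)
```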

There is a consistent template that we’ll need to use to get our plots to work:

ggplot(data = <DATA>,
  mapping = aes(<MAPPINGS>)) +
  <GEOM_FUNCTION>()

Here we have three main components:

the data call – this will almost always refer to a dataframe

the aesthetic mappings – these are the variables we’re plotting, and the specifications of how we want them to be displayed

the geom function – each type of plot has its own geom function (e.g. geom_point() for a scatterplot, geom_line() for a time series, etc.)
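To make the three components concrete, here is a minimal sketch of the template filled in. It assumes the built-in mtcars dataset (not part of this tutorial's data) so that it runs without any extra downloads; wt and mpg are two of its numeric columns.

```r
library(ggplot2)

p <- ggplot(data = mtcars,                    # the data call: a dataframe
            mapping = aes(x = wt, y = mpg)) + # the aesthetic mappings: which variables go where
  geom_point()                                # the geom function: draw a scatterplot

p  # printing the plot object draws it
```

Swapping geom_point() for another geom changes the plot type without touching the data or the mappings.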

A small point: unlike other Tidyverse packages, ggplot2 does not use the pipe %>% to link its functions together. Instead, we use the addition operator +.
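The two operators often appear in the same chain, which is where they are easiest to mix up. The sketch below is illustrative only and again assumes the built-in mtcars dataset: the pipe carries the dataframe through a dplyr step and into ggplot(), and from that point on the plot components are joined with +.

```r
library(dplyr)
library(ggplot2)

p <- mtcars %>%
  filter(cyl == 4) %>%                      # data steps are chained with the pipe
  ggplot(mapping = aes(x = wt, y = mpg)) +  # once ggplot() is called, switch to +
  geom_point()

p
```

Using %>% after ggplot() where a + belongs is a common slip and produces an error, so it is worth checking the operator first when a plot fails to build.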

By way of example, we’ll make a quick and dirty scatterplot of some long-term ecological data on salamanders and trout.

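library(tidyverse)        # loads ggplot2 along with the rest of the Tidyverse
library(lterdatasampler)  # provides the example dataset

# Store the and_vertebrates dataset in a dataframe called df
df <- lterdatasampler::and_vertebrates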

names(df)
##  [1] "year"        "sitecode"    "section"     "reach"      
##  [5] "pass"        "unitnum"     "unittype"    "vert_index" 
##  [9] "pitnumber"   "species"     "length_1_mm" "length_2_mm"
## [13] "weight_g"    "clip"        "sampledate"  "notes"

The most basic version of the plot we may want to make is a scatterplot of two continuous variables, say length and weight:

ggplot() +
  geom_point(data = df, aes(x = length_1_mm, y = weight_g))

Notice that here we place the data and the aesthetic mappings (mapping = aes()) inside the geom function rather than in ggplot(). Specifying them in each geom individually lets us plot from multiple dataframes or variables in the same figure.
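As a rough sketch of why per-geom data is useful, the example below layers two dataframes onto one plot. The manual and automatic dataframes are hypothetical, split from the built-in mtcars dataset purely for illustration, and the colours are arbitrary choices.

```r
library(ggplot2)

# Two separate dataframes (hypothetical names, split from mtcars for illustration)
manual    <- subset(mtcars, am == 1)
automatic <- subset(mtcars, am == 0)

p <- ggplot() +
  # each geom gets its own data and its own aesthetic mappings
  geom_point(data = manual,    mapping = aes(x = wt, y = mpg), colour = "steelblue") +
  geom_point(data = automatic, mapping = aes(x = wt, y = mpg), colour = "darkorange")

p
```

Because ggplot() itself is left empty, nothing is inherited between layers, so each geom must be given everything it needs.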

In the Iterative Plotting section we’ll go over how to add components to go from an ugly plot like this to a publication-ready graphic.