How to Use ggplot: Step Nine in Learning R Programming for Free
Posted on December 12, 2018
Author: Linda Stewart, Performance Architects

I hope you are enjoying the “Learning R Programming for Free” series; here are links to the previous segments (Step One, Step Two, Step Three, Step Four, Step Five, Step Six, Step Seven, Step Eight) to provide some helpful background.

In the previous installment, we learned about data frames, which are particularly useful for data sets because we can combine different types into one single object like we do when we create a spreadsheet.  We also learned that the core data type for building a data frame is a vector, but that a data frame can actually contain more complex data structures.

In this segment, I promised I would discuss “ggplot.”  According to the help file, ggplot “can be used to declare the input data frame for a graphic and to specify the set of plot aesthetics intended to be common throughout all subsequent layers unless specifically overridden.”  Well, that sounds rather complex!

Let’s look at the code template:

ggplot(df, aes(x, y, other aesthetics))

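So, it takes a “df” (a data frame).  We know about those!  Our axes will be “x” and “y.”  The other aesthetics map columns of the data frame to visual properties such as colour, size, and shape.  The chart type is not an aesthetic; it comes from the “geom” layer we add to the plot, as we will see shortly.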

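To use ggplot, we need to add a new package.  Packages are installed once using the “install.packages()” function and then loaded in each session using the “library()” function.  Note that the package is named “ggplot2,” while the function we call is “ggplot().”  First, we need to install the package:

Open the R console.  To install ggplot2, type: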

> install.packages("ggplot2")

It should be easy, but you may need to try a couple of mirror sites to get the package to install. Now, load ggplot in RStudio:

# Load ggplot (install it first if you need to)

library(ggplot2)
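
If you are not sure whether the package is already installed, one common pattern is to install it only when it is missing. This is a minimal sketch and not something the steps above require; it assumes your R session can reach a CRAN mirror.

```r
# Install ggplot2 only if it is not already available, then load it
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)
```

Running this twice is harmless: the second time, the install step is skipped.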

Let’s plot the cars from Installment 8, using the “mtcars” data frame:

library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg))
p

In the lower right panel of the RStudio IDE, there is a tabbed area.  Select the “Plots” tab.  We have a plot, but we cannot see any data points.  That is because ggplot() only sets up the data and the axes; nothing is drawn until we add a layer such as geom_point().  Look at the scale on the x axis, too: we don’t know what those numbers measure!  Let’s fix that, starting with the points:

> p <- ggplot(mtcars, aes(wt, mpg))
> p + geom_point(size=4)
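
To see what the x axis is measuring, it can help to look at the “wt” column directly. The following base R check is an illustrative aside (the name wt_range is just a placeholder). It shows values between roughly 1.5 and 5.4, and the mtcars help page (?mtcars) lists “wt” as weight in units of 1000 lbs.

```r
# mtcars ships with R (datasets package), so no extra packages are needed
wt_range <- range(mtcars$wt)
wt_range

# The same range expressed in plain pounds, as a sanity check
wt_range * 1000
```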

Fix the x axis label.  In mtcars, “wt” is recorded in thousands of pounds, so the label should say so:

> p <- ggplot(mtcars, aes(wt, mpg))
> p + labs(x = "wt (1000 lbs)") + geom_point(size=4)

Let’s color our plot:

p <- ggplot(mtcars, aes(wt, mpg, colour = cyl)) + geom_point(size=4)
p + labs(x = "wt (1000 lbs)")

Notice now that each dot is colored by the number of cylinders in the car engine, and a legend bar on the right explains the dot colorings.  Also notice how, as the car weight and number of cylinders increase, the mpg is reduced.
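
Because “cyl” is stored as a number, ggplot treats it as a continuous variable and draws a gradient colour bar. If you would rather have one distinct colour per cylinder count, one option is to wrap it in factor(). This is a sketch that goes slightly beyond the walkthrough above; the names cyl_factor and p_discrete and the legend title “cylinders” are illustrative choices.

```r
library(ggplot2)

# cyl takes only a few distinct values in mtcars; factor() makes that explicit
cyl_factor <- factor(mtcars$cyl)
levels(cyl_factor)

# Mapping colour to a factor gives a discrete legend instead of a gradient
p_discrete <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point(size = 4) +
  labs(x = "wt (1000 lbs)", colour = "cylinders")
p_discrete
```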

Let’s add a title to our plot:

p + labs(title = "Cars: MPG based on Vehicle Weight and Cylinders")

There you have it, some basics of ggplot.
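
One thing to watch: the earlier x axis label was added with “p + labs(...)” but never saved back into “p,” so adding only a title brings back the default “wt” label. A minimal sketch that sets both labels in a single labs() call might look like this (the name p_final is just a placeholder):

```r
library(ggplot2)

# Build the whole plot in one chain: data, aesthetics, points, then labels
p_final <- ggplot(mtcars, aes(wt, mpg, colour = cyl)) +
  geom_point(size = 4) +
  labs(
    title = "Cars: MPG based on Vehicle Weight and Cylinders",
    x = "wt (1000 lbs)"
  )
p_final
```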

In our next installment, I’ll discuss how to process Excel files using R and how to convert them to CSV.