Applying ggplot2 to Real Data

NOTE

You must turn in a PDF document of your R Markdown code. Submit this to D2L by 11:59 PM Eastern Time on Monday, September 18th.

Preliminaries

As always, we will first have to load ggplot2. To do this, we will load the tidyverse by running this code:

library(tidyverse)

Background

The New York City Department of Buildings (DOB) maintains a list of construction sites that have been categorized as “essential” during the city’s shelter-in-place pandemic order. They’ve provided an interactive map here where you can see the different projects. There’s also a link there to download the complete dataset.

For this exercise, you’re going to use this data to visualize the amounts or proportions of different types of essential projects in the five boroughs of New York City (Brooklyn, Manhattan, the Bronx, Queens, and Staten Island).

As you hopefully figured out by now, you’ll be doing all your R work in R Markdown. You can use an RStudio Project to keep your files well organized (either on your computer or on RStudio.cloud), but this is optional. If you decide to do so, either create a new project for this exercise only, or make a project for all your work in this class.

You’ll need to download one CSV file and put it somewhere on your computer (or upload it to RStudio.cloud if you’ve gone that direction)—preferably in a folder named data in your project folder. You can download the data from the DOB’s map, or use this link to get it directly:

R Markdown

Writing regular text with R Markdown follows the rules of Markdown. You can make lists; different-size headers, etc. This should be relatively straightfoward. We talked about a few Markdown features like bold and italics in class. See this resource for more formatting.

You’ll also need to insert your own code chunks where needed. Rather than typing them by hand (that’s tedious and you might miscount the number of backticks!), use the “Insert” button at the top of the editing window, or type ctrl + alt + i on Windows, or + + i on macOS.

Data Prep

Once you download the EssentialConstruction.csv file and save it in your project folder, you can open it and start cleaning. I’ll help with that. I’ll give you a .Rmd that will get you started on the Exercises below. Download this and use it in place of your lab assignment template.

Exercise 1 of 1: Essential pandemic construction

Using the Lab 02 template above, do the following:

A. Show the count or proportion of approved projects by borough using a bar chart. Make sure all the elements of your plot (axes, legend, etc.) are labeled.

B. Show the count or proportion of approved projects by category using a lollipop chart. Not sure of what a lollipop chart is? Google R ggplot lollipop. A huge portion of knowing how to code is knowing how to google, find examples, and figure out where to put your variables from your data! Make sure all the elements of your plot (axes, legend, etc.) are labeled.

You don’t need to make these super fancy, but if you’re feeling brave, experiment with adding a labs() layer or changing fill colors with scale_fill_manual() or with palettes.

Bonus

Overlay the data from Part 1 above onto a map of NYC. Make sure all the elements of your plot (axes, legend, etc.) are labeled.

Getting help

Use the SSC442 Slack if you get stuck (click the Slack logo at the top right of this website header).

Turning everything in

When you’re all done, click on the “Knit” button at the top of the editing window and create a PDF. Upload the PDF file to D2L.