Data Wrangling

Content for Thursday, November 30, 2023

Data to be wrangled

You work for a travel booking website as a data analyst. A hotel has asked your company for data on corporate bookings made at the hotel through your site. Specifically, the hotel has five corporations that are frequent customers, and it wants to know which of them spends the most. They’ve asked you to help out. Most of the corporate spending is in the form of room reservations, but there are also parking fees that the hotel wants included in the analysis. Your goal: total up spending by corporation, counting both rooms and parking, and report the biggest and smallest spenders.

Unfortunately, you only have the following data:

  • booking.csv - Contains the corporation name, the room type, and the dates someone from the corporation stayed at the hotel.

  • roomrates.csv - Contains the price of each room on each day

  • parking.csv - Contains the corporations that negotiated free parking for their employees

  • Parking at the hotel is $60 per night if you don’t have free parking. This hotel is in California, so everyone drives and parks when they stay.
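The parking rule can be sketched on a toy example. This is only an illustration, not the solution: the data frame `stays`, its columns `corporation` and `nights`, and the vector `free_parking` are all made up here, and the real files will likely be laid out differently. The only number taken from the problem is the $60 nightly rate.

```r
library(dplyr)

# Hypothetical toy data; the real column names in the .csv files may differ.
stays <- tibble(
  corporation = c("Corp A", "Corp B", "Corp A"),
  nights      = c(2, 3, 1)
)

# Hypothetical stand-in for what parking.csv tells you.
free_parking <- c("Corp B")

# Dollars per night, from the problem statement.
parking_rate <- 60

parking_fees <- stays %>%
  # Charge nothing if the corporation has free parking, else $60 per night.
  mutate(parking = if_else(corporation %in% free_parking, 0, parking_rate * nights)) %>%
  group_by(corporation) %>%
  summarize(parking_total = sum(parking), .groups = "drop")

parking_fees
```

Here Corp A pays for three nights in total across its two stays, while Corp B pays nothing. With the real data you would first have to work out the number of nights from the booking dates.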

Some tips:

  • Right-click on each of the links, copy the address, and pass that URL to read.csv to read in the .csv file.

  • You’ll need most of the tools we covered on Tuesday, including gather, separate, and more.

  • You’ll need lubridate and tidyverse loaded up.
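As a rough sketch of how these tools might fit together, here is a small example on invented data. Everything about the layout is an assumption: the wide rate table with one column per date, the combined `stay` column, and all column names are hypothetical, so check the real files before borrowing any of it. The inline text stands in for a file; with the real data you would give read.csv the copied URL instead.

```r
library(tidyverse)
library(lubridate)

# Hypothetical wide rate table: one row per room type, one column per date.
rates_csv <- "room_type,1/1/2020,1/2/2020
double,100,110
suite,200,220"

# check.names = FALSE keeps the date-like column names unchanged.
rates_wide <- read.csv(text = rates_csv, check.names = FALSE)

# gather() stacks the date columns into date/rate pairs;
# mdy() turns the month/day/year strings into real dates.
rates_long <- rates_wide %>%
  gather(key = "date", value = "rate", -room_type) %>%
  mutate(date = mdy(date))

# Hypothetical bookings where two pieces of information share one column.
bookings <- tibble(
  stay = c("Corp A_double", "Corp B_suite"),
  date = c("1/1/2020", "1/2/2020")
) %>%
  # separate() splits the combined column at the underscore.
  separate(stay, into = c("corporation", "room_type"), sep = "_") %>%
  mutate(date = mdy(date))

# Look up each night's rate, then total by corporation, biggest first.
room_spend <- bookings %>%
  left_join(rates_long, by = c("room_type", "date")) %>%
  group_by(corporation) %>%
  summarize(room_total = sum(rate), .groups = "drop") %>%
  arrange(desc(room_total))

room_spend
```

Note that gather still works but is superseded by pivot_longer in current versions of tidyr; either would do the reshaping step here.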

Your lab will be based on similar data (with more wrinkles to fix), so share your code with your group when you’re done.