Applied Data Wrangling

You must turn in a PDF document of your R Markdown code. Submit this to D2L by 11:59 PM Eastern Time on **Monday, December 4th.**

Data Wrangling Continued…

You still work for a travel booking website as a data analyst. The hotel has once again asked your company for data on corporate bookings at the hotel via your site. Specifically, they have five corporations that are frequent customers of the hotel, and they want to know who spends the most with them. They’ve asked you to help out. Most of the corporate spending is in the form of room reservations, but there are also parking fees that the hotel wants included in the analysis. Your goal: total up spending by corporation and report the biggest and smallest spenders inclusive of rooms and parking.

You did this already in class, but your boss now has some different data. It’s similar, and your code from before will help, but it has some new wrinkles in it to tackle. Do not use the data from this week’s Example. Here’s your new data:

  • booking.csv - Contains the corporation name, the room type, and the dates someone from the corporation stayed at the hotel. It was pulled by an intern who doesn’t understand date-time stamps.

  • roomrates.csv - Contains the price of each room on each day. The Lab 11 version of this data is from our German affiliate, so pay attention to the date format, and be careful: it’s very wide, so calling head() on it will print a lot of columns.

  • parking.csv - Contains the corporations who negotiated free parking for employees. It has been updated.

  • Parking at the hotel is $60 per night if you don’t have free parking. This hotel is in California, so everyone drives and parks when they stay.

(note: data is labeled Lab 11, but this is indeed Lab 13)
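The wrangling steps implied by the data descriptions above (reshape the wide rates, parse day-first dates, strip the time from the booking stamps, add parking) can be sketched on toy data. This is only a sketch under stated assumptions, not the layout of the real files: every column name and value below is invented, the rate columns are assumed to be named with day.month.year dates, and each booking row is assumed to be one night.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical stand-in for roomrates.csv: one row per room type,
# one column per day, with day.month.year column names (assumed layout).
rates_wide <- tibble(
  roomtype     = c("double", "suite"),
  `01.12.2023` = c(100, 200),
  `02.12.2023` = c(120, 240)
)

# Reshape wide -> long, then parse the day-first dates with dmy().
rates_long <- rates_wide %>%
  pivot_longer(-roomtype, names_to = "date", values_to = "rate") %>%
  mutate(date = dmy(date))

# Hypothetical stand-in for booking.csv: stays stored as date-time strings.
# as_date() drops the time of day so the dates can be joined to the rates.
bookings <- tibble(
  corporation = c("Acme", "Acme", "Globex"),
  roomtype    = c("double", "suite", "double"),
  stay        = c("2023-12-01 15:00:00", "2023-12-02 15:00:00",
                  "2023-12-02 16:30:00")
) %>%
  mutate(date = as_date(ymd_hms(stay)))

# Hypothetical stand-in for parking.csv: corporations with free parking.
free_parking <- tibble(corporation = "Globex", free = TRUE)

# Join rates and parking status, charge $60 per night unless parking is free,
# then total by corporation (assumes one booking row = one night).
spend <- bookings %>%
  left_join(rates_long, by = c("roomtype", "date")) %>%
  left_join(free_parking, by = "corporation") %>%
  mutate(parking = if_else(coalesce(free, FALSE), 0, 60)) %>%
  group_by(corporation) %>%
  summarise(total = sum(rate + parking), .groups = "drop") %>%
  arrange(desc(total))

print(spend)
```

With the real files, inspect the column names and a few values first (for example with names() or glimpse()) before deciding which parsing function fits; the formats assumed here may not match.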

EXERCISE 1

  1. As you did in class, but with your new set of data, total up spending by corporation and report the biggest and smallest spenders inclusive of rooms and parking.

  2. Visualize (using ggplot) each corporation’s spending at the hotel over time and by roomtype. Make one plot with ggplot that shows this.

  3. Visualize (using ggplot) the room rates over time by room type. Can you pick out one factor that determines when room prices are higher than usual? Note that each corporation gets the same room rate as the others on the same day, so this is about room rates, not corporate spending. Make two plots in total: the first showing the room rates over time by room type, and the second explaining some feature of one of the room rates (e.g. when is the double room rate high? When is it low?). The month(...), day(...) and wday(..., label = TRUE) functions from lubridate will help you figure out the patterns. Try exploring just one of the room types to start. You don’t have to perfectly analyze the room rate; just find one facet of the rate that changes regularly over time.
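As a rough illustration of how the lubridate helpers mentioned in item 3 can feed a plot, here is a minimal sketch on invented data. The rates, the column names, and the rule that makes some days more expensive are all made up for the example; the real data may follow a completely different pattern.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Toy data: two weeks of invented rates for a single room type.
# 2023-12-03 is a Sunday; with week_start = 7, wday() returns 1 for Sunday,
# so 6 and 7 are Friday and Saturday. The pricing rule is purely hypothetical.
rates <- tibble(
  date     = seq(ymd("2023-12-03"), by = "day", length.out = 14),
  roomtype = "double"
) %>%
  mutate(
    weekday = wday(date, label = TRUE),   # ordered factor of weekday labels
    month   = month(date),                # numeric month, here always 12
    rate    = if_else(wday(date, week_start = 7) %in% c(6, 7), 150, 100)
  )

# Summarise by a derived date feature to look for a regular pattern.
by_weekday <- rates %>%
  group_by(weekday) %>%
  summarise(mean_rate = mean(rate), .groups = "drop")

# Plot 1: rates over time, one line per room type.
p_time <- ggplot(rates, aes(x = date, y = rate, colour = roomtype)) +
  geom_line() +
  labs(x = "Date", y = "Room rate ($)", colour = "Room type")

# Plot 2: average rate by weekday for the chosen room type.
p_weekday <- ggplot(by_weekday, aes(x = weekday, y = mean_rate)) +
  geom_col() +
  labs(x = "Day of week", y = "Mean room rate ($)")

print(by_weekday)
```

The same group-and-summarise step works with month(date) or day(date) in place of the weekday. For item 2, one option is to map roomtype to colour and add facet_wrap(~ corporation) so a single ggplot shows spending over time for every corporation.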