Assignments
This course is the capstone of the Data Analytics Minor in the College of Social Science. Accordingly, you should—fingers crossed—enjoy data analysis. You will get the most of out this class if you:
- Engage with the readings and lecture materials
- Regularly use
R
(aka engage daily or almost every day in some way)
Each type of assignment in this class helps with one of these strategies.
Download and save the following files (right-click to Save Link As…)
Weekly Writings
To encourage you to actively engage with the course content, you will write a ≈150 word memorandum about the reading or lecture each week. That’s fairly short: there are ≈250 words on a typical double-spaced page. You must complete eleven of these in the course. I will drop your one lowest weekly writing score. Your actual prompt will be assigned in class, so you must login each day to ensure you get these assignments. To keep you on your toes, we will vary whether these are assigned on Tuesdays or Thursdays. Each week’s weekly writing will be due on D2L by 11:59pm on Saturday
You can do a lot of different things with this memo: discuss something you learned from the course content, write about the best or worst data visualization you saw recently, connect the course content to your own work, etc. These reflections let you explore and answer some of the key questions of this course, including:
- When is a link correlational vs causal? How can we still make useful statements about non-causal things?
- Why do we visualize data?
- What makes a great data analysis? What makes a bad analysis?
- How do you choose which kind of analysis method to use?
- What is the role of the data structure in choosing an analysis? Can we be flexible?
The course content for each day will also include a set of questions specific to that topic. You do not have to answer all (or any) of these questions. That would be impossible. They exist to guide your thinking and to make complex reading more digestible. The specific topic for each week will be assigned in class. (We can’t emphasize this enough.)
The TA will grade these mini-exercises using a very simple system:
- ✔+: (9.2 points (115%) in gradebook) Work shows phenomenal thought and engagement with the course content. We will not assign these often.
- ✔: (8 points (100%) in gradebook) Work is thoughtful, well-written, and shows engagement with the course content. This is the expected level of performance.
- ✔−: (4 points (50%) in gradebook) Work is hastily composed, too short, and/or only cursorily engages with the course content. This grade signals that you need to improve next time. I will hopefully not assign these often.
(There is an implicit 0 above for work that is not turned in by Saturday at 11:59pm). Notice that this is essentially a pass/fail or completion-based system. We’re not grading your writing ability; we’re not counting the exact number of words you’re writing; and we’re not looking for encyclopedic citations of every single reading to prove that you did indeed read everything. We are looking for thoughtful engagement. Read the material, engage with the work and you’ll get a ✓.
Weekly Writing Template
You will turn these reflections in via D2L. You will write them using R Markdown
and this weekly writing template (right-click to Save Link As…) . You must knit
your work to a PDF document (this will be what you turn in). D2L will have eleven weekly writing assignments available. Upload your first weekly writing assignment to number 1, your second (regardless of which week you are writing on) to number 2, etc.
Labs
Each week of the course has fully annotated examples of code that teach and demonstrate how to do specific tasks in R
. However, without practicing these principles and making graphics on your own, you won’t remember what you learn.
Please do not do labs more than one week ahead of time. I am updating the assignments as the semester proceeds, and you may do an entire assignment that is completely changed.
For example, to practice working with ggplot2 and making data-based graphics, you will complete a brief set of exercises over a few class sessions. These exercises will have 1–3 short tasks that are directly related to the topic for the week. You need to show that you made a good faith effort to work each question. There will also be a final question which requires significantly more thought and work. This will be where you get to show some creativity and stretch your abilities. Overall, labs will be graded the same check system:
- ✔+: (17.5 points (115%) in gradebook) Exercises are complete. Every task was attempted and answered, and most answers are correct. Knitted document is clean and easy to follow. Work on the final problem shows creativity or is otherwise exceptional. We will not assign these often.
- ✔: (15 points (100%) in gradebook) Exercises are complete and most answers are correct. This is the expected level of performance.
- ✔−: (7.5 points (50%) in gradebook) Exercises are less than 70% complete and/or most answers are incorrect. This indicates that you need to improve next time. We will hopefully not assign these often, but subpar work can expect a ✔−.
There is an implicit 0 for any assignment not turned in on time. If you have only partial work, then turn that in for partial credit. As noted in the syllabus, we are not grading your coding ability. We are not checking each line of code to make sure it produces some exact final figure, and we do not expect perfection. Also note that a ✓ does not require 100% success. You will sometimes get stuck with weird errors that you can’t solve, or the demands of pandemic living might occasionally become overwhelming. We are looking for good faith effort. Try hard, engage with the task, and you’ll get a ✓.
You may work together on the labs, but you must turn in your own answers.
Lab Template
You will turn these labs in via D2L. You will write them using R Markdown
and this lab assignment template. You must knit
your work to a PDF document (this will be what you turn in). Your output must be rendered in latex. I do not accept rendering to HTML or Word and then converting to PDF.
Projects
To give you practice with the data and design principles you’ll learn in this class, you will complete two projects en route to the overarching final project of the course. Both these mini projects and the final project must be completed in groups. I will assign groups after the drop deadline passes. Groups will be 2-3 people. You are allowed to form your own groups, but I will assign groups. More details will follow later.
The two (mini) projects are checkpoints to ensure you’re working on your project seriously. They will be graded using a check system:
- ✔+: (55 points (≈115%) in gradebook) Project is phenomenally well-designed and uses advanced R techniques. The project uncovers an important story that is not readily apparent from just looking at the raw data. I will not assign these often.
- ✔: (50 points (100%) in gradebook) Project is fine, follows most design principles, answers a question from the data, and uses R correctly. This is the expected level of performance.
- ✔−: (25 points (50%) in gradebook) Project is missing large components, is poorly designed, does not answer a relevant question, and/or uses R incorrectly. This indicates that you need to improve next time. I will hopefully not assign these often.
Because these mini projects give you practice for the final project, we will provide you with substantial feedback on your design and code.
Final project
At the end of the course, you will demonstrate your skills by completing a final project. Complete details for the final project (including past examples of excellent projects) are here. In brief, the final project has the following elements:
- You must find existing data to analyze.1 Aggregating data from multiple sources is encouraged, but is not required.
- You must visualize (at least) three interesting features of that data. Visualizations should aid the reader in understanding something about the data that might not be readily aparent.2
- You must come up with some analysis—using tools from the course—which relates your data to either a prediction or a policy conclusion. For example, if you collected data from Major League Baseball games, you could try to “predict” whether a left-hander was pitching based solely on the outcomes of the batsmen.3
- You must write your analysis as if presenting to a C-suite executive. If you are not familiar with this terminology, the C-suite includes, e.g., the CEO, CFO, and COO of a given company. Generally speaking, such executives are not particularly analytically oriented, and therefore your explanations need to be clear, consise (their time is valuable) and contain actionable (or valuable) information.
There is no final exam. This project is your final exam.
The project will not be graded using a check system, and will be graded by me (the main instructor, not a TA). I will evaluate the following four elements of your project:
- Technical skills: Was the project easy? Does it showcase mastery of data analysis?
- Visual design: Was the information smartly conveyed and usable? Was it beautiful?
- Analytic design: Was the analysis appropriate? Was it sensible, given the dataset?
- Story: Did we learn something?
If you’ve engaged with the course content and completed the exercises and mini projects throughout the course, you should do just fine with the final project.
Note that existing is taken to mean that you are not permitted to collect data by interacting with other people. That is not to say that you cannot gather data that previously has not been gathered into a single place—this sort of exercise is encouraged.↩︎
Pie charts of any kind will result in a 25% grade deduction.↩︎
This is an extremely dumb idea for a number of reasons. Moreover, it’s worth mentioning that sports data, while rich, can be overwhelming due to its sheer magnitude and the variety of approaches that can be applied. Use with caution.↩︎