stat 579
Final Project
Due in class, Tuesday/Thursday Nov 17/19.
Data
For this project you are asked to choose both the topic and the data yourself. You may work as part of a team.
The project has a written component with several stages. A first due date will be next week, the final write-up is due the week before the Thanksgiving break. The oral part of the project consists of a presentation in class. We will have project presentations during the last week of classes and finals week.
The topic should be within the realm of general knowledge and interest. You may revisit one of the topics from class, but take care to explore VERY different aspects of the topic and make sure to also get additional/other sources for your data.
Deliverables
There are several parts to this project: In the first, you'll need to identify a suitable topic for your project. In the second step, work out which data you will use and how. These two steps are linked - you might have great ideas for topics, but if you cannot find suitable data, it might not be advisable to proceed.
Next, discuss the questions you aim to answer, and download/incorporate the data necessary to answer them. In the last part, you'll try to answer your questions using the techniques discussed in class.
Some questions to think about:
- Are there other sources of data necessary to make valid cross-comparisons?
- Are there other sources of data that might be useful?
- What do you want to learn from this data?
- What data do you need to answer those questions?
- What data is available?
- What is your strategy for selecting data?
- How will you structure the data?
- What are the keys/id variables?
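The last few questions (structure, keys/id variables, cross-comparisons) can be checked with a few lines of code before you commit to a data source. The following is only a minimal sketch under stated assumptions, written in plain Python rather than any particular tool from class: the helper names `duplicate_keys` and `inner_join`, the column names, and all values are made up for illustration and are not real data. It checks whether a candidate set of columns uniquely identifies each record, then combines two sources on that key so that a per-capita rate can be compared across regions.

```python
from collections import Counter


def duplicate_keys(rows, key_cols):
    """Return the key tuples that occur more than once in rows (a list of dicts)."""
    counts = Counter(tuple(row[c] for c in key_cols) for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def inner_join(left, right, key_cols):
    """Join two lists of dicts on key_cols; assumes the key is unique in `right`."""
    lookup = {tuple(r[c] for c in key_cols): r for r in right}
    joined = []
    for row in left:
        match = lookup.get(tuple(row[c] for c in key_cols))
        if match is not None:
            joined.append({**row, **match})
    return joined


# Toy records with invented values (hypothetical, for illustration only).
population = [
    {"region": "A", "year": 2014, "population": 1000},
    {"region": "A", "year": 2015, "population": 1100},
    {"region": "B", "year": 2015, "population": 2000},
]
visits = [
    {"region": "A", "year": 2015, "visits": 550},
    {"region": "B", "year": 2015, "visits": 500},
]

# "region" alone repeats, so it is not a key; "region" + "year" is.
print(duplicate_keys(population, ["region"]))
print(duplicate_keys(population, ["region", "year"]))

# Combine the two sources on the key; a per-capita rate makes regions
# of different size comparable.
combined = inner_join(population, visits, ["region", "year"])
for row in combined:
    print(row["region"], row["year"], row["visits"] / row["population"])
```

If the duplicate check returns anything for the columns you planned to use as a key, the data need restructuring (or a larger key) before two sources can be combined safely.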
Due Dates
- Nov 2/4: one-paragraph description of the planned project; include potential data sources (web link, electronic data file and source); list of team members & team name
- Nov 16/18: (3 + 3 * team size) pages of write-up, e.g. 9 pages for a team of two
- Dec 16: final written project due.
Hints for writing a good project and presentation
- Pick a topic of general interest - try it out on your family or friends (outside your area of study). If their eyes glaze over after 1 minute, it's not a topic of general interest.
- Use current data - if you've found an exciting data set that is a couple of years old, try to find a current source and build your own new data. Nobody likes news from last year or left-over analyses.
- Show a good variety of plots - repeating the same type of plot is very tiring for your audience, so try to mix it up. Coming up with different types of charts will also inspire you to broaden your analysis!
- Don't show graphics or tables that you have not made yourself. Reproduce a graphic with your own data, if you like. Since you're supposed to demonstrate that you have mastered the data, you should be able to. Showing off somebody else's work reflects poorly on your skills.
- Each figure or table must have a caption. The caption consists of 2-3 major pieces: a) a description of how the table or graphic was constructed and the data source/data used (what is it? and how did you do it?), b) the main implication of the graphic or table (why did you do it?), and, if at all possible, c) a secondary finding that might lead to the next question of interest.
- While code is very important for this class, it does not have a place in the report (except for an electronic(!) appendix).
- No Models! While this is a Statistical Methods class, we need to level the playing field for everybody. Only use the statistical methods emphasized in class, i.e. demonstrate computational proficiency in accessing, organizing and re-structuring data; impress me with your graphical exploration skills. Don't use any statistical models for this project beyond a linear regression of Y on X, and only if you absolutely have to.
- Respond to all the feedback you get! Make the effort to think about the questions and suggestions you got when you presented, and react to them.
- Proofread your paper! At a minimum, run a spellchecker over it. Typos don't make the grader happy :)
Grading rubric: written project
Overall grade breakdown:
- Introduction: 10
- Questions and findings: 60
- Conclusion: 10
- Presentation: 15
- Reproducibility: 10
The grading rubric that I'll use for the written project is available as a PDF.
Grading rubric: oral presentation
More details later.
Some great projects in the past: