PB4A7 - Data Analysis Replication Report and Extension Assignment

Assignment Task

Part 1 - Replication

In your report, you will have to briefly summarise the study and hypothesis, and then replicate, or at least attempt to replicate, the main result(s) with at least one accompanying graph. To replicate the results, you should write an original Stata program in a do-file that re-estimates at least the main model from the paper.

Part 2 - Critique and extension

Critique the study: comment on the motivation, theory, model, estimation, interpretation and data, on any issues you find with the data, and on any threats to internal and external validity. You can reference other published papers as appropriate. Give a brief description of how you would extend the results of the paper given the chance. What would you do differently? What could you do better? What would you do next? What are the potential issues of RD designs that we need to be aware of? Propose an alternative technique that can address the limitations you have pointed out, and explain how it could be applied in a new study.

Main instruction

You will receive a cleaned dataset, so you do not have to clean and merge the datasets yourself; you will focus on the data analysis only. You must write an original program to replicate the main result(s) in the paper. The Stata programs provided by the authors are available for your reference and for your learning; do not use these programs in your replication, but rather the do-file provided on Moodle. The idea is that you should do your best to replicate the main finding by estimating the models. If you can replicate the results, report your results. If you come close, report that and explain your reasoning. If you can’t replicate the results, explain why (is it because you can’t figure it out, because assumptions made in the paper can be challenged, or some combination of those?).

The replication consists of six questions:

  1. Create the variables for Driving Under the Influence and a quadratic term for blood alcohol level, and produce two histograms of BAC1 (one treating it as a discrete variable, one as a continuous variable). Explain the differences between the histograms.

  2. Run regressions on the covariates (white, male, age and accident) to see whether there is a jump in the average value of each at the cutoff, and explain the results.

  3. Produce the main recidivism results of the paper (with our dataset) using recidivism as the dependent variable, and then again with the RDD bandwidth changed to 0.055 to 0.105.

  4. Replicate Q3 by running "donut hole regressions". Explain why one might need a donut hole regression and how to run one.

  5. Run local polynomials and explain why local polynomials might be needed after a donut hole.

  6. Produce an RD plot and a cmogram with bandwidths of 0.060 & 0.11. Explain what the respective graphs show.
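
To make the ideas behind questions 3 and 4 concrete, the following is a minimal conceptual sketch, not a solution and not the authors' method. The assignment itself must be completed in Stata; this illustration uses Python on simulated data only. Everything in it is an assumption made for illustration: the cutoff value `CUTOFF`, the simulated jump `TRUE_JUMP`, the donut width, the first bandwidth (0.03 to 0.13) and the helper names `fit_line` and `rd_jump`. It fits a separate straight line on each side of the cutoff within a chosen window and reports the gap between the two fitted values at the cutoff; the "donut" version simply drops observations lying very close to the cutoff before fitting.

```python
import random
from statistics import fmean

CUTOFF = 0.08      # assumed threshold of the running variable (illustrative)
TRUE_JUMP = -0.02  # jump built into the simulated data (illustrative)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rd_jump(x, y, lo, hi, donut=0.0, cutoff=CUTOFF):
    """Local linear RD estimate within the window [lo, hi].

    Observations closer to the cutoff than `donut` are dropped.
    The running variable is centred at the cutoff, so each fitted
    intercept is the predicted outcome exactly at the cutoff.
    """
    left, right = [], []
    for xi, yi in zip(x, y):
        if not (lo <= xi <= hi) or abs(xi - cutoff) < donut:
            continue  # outside the bandwidth, or inside the donut hole
        (right if xi >= cutoff else left).append((xi - cutoff, yi))
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left


# Simulated data: a linear trend plus a discontinuity at the cutoff.
random.seed(0)
x = [random.uniform(0.03, 0.13) for _ in range(4000)]
y = [
    0.12 + 0.3 * (xi - CUTOFF) + TRUE_JUMP * (xi >= CUTOFF) + random.gauss(0, 0.01)
    for xi in x
]

print("wide window   :", round(rd_jump(x, y, 0.03, 0.13), 4))
print("narrow window :", round(rd_jump(x, y, 0.055, 0.105), 4))
print("with donut    :", round(rd_jump(x, y, 0.03, 0.13, donut=0.001), 4))
```

Because the simulated outcome is continuous and exactly linear on each side, all three estimates should lie close to the simulated jump. With real data the estimates may differ across bandwidths and donut widths, and explaining such differences is part of what the questions ask for.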

2. Poster of the data analysis replication report

Prepare a poster summarising both the replication and the critique and extension. You can use an A0 (841 x 1189 mm or 33.1 x 46.8 inches) poster. The poster should adopt a clear and logical layout and should be informative to a person unfamiliar with the original study or replication.

For the Data Analysis Replication Report and Extension

  • You can access the paper from the American Economic Review journal website. The do-file and cleaned dataset are available on Moodle.
  • The word limit excludes the cover page, references in the bibliography, text in tables and figures, table and figure titles and notes, and the appendix. It includes all in-text citations, text in headings and sub-headings, and footnotes.
  • Use Times New Roman Font, font size 11, and 1.5 spacing, justified.
  • Number sections and subsections sequentially: 2, 2.1, 2.1.1, and so on.
  • Number your appendices sequentially and refer to them in the text where appropriate.
  • References: Use APA 6th edition referencing style.
  • All references mentioned in the Reference List must be cited in the text, and vice versa. Please make sure you check this before submitting.

Other presentation considerations:

  • All identifying information has been removed from the manuscript, including the author name and student ID number. Only your candidate number should appear on the cover sheet.
  • The submission has been spell-checked and grammar-checked.
  • Tables should not be a picture of Stata output.
  • The assessment criteria will include considerations like: clarity and precision of presentation; understanding of topic, theories and methods, and wider literature; critical insight and ability to appropriately evaluate different ideas; practical applications and use of statistical packages and tools; and coherence of structure and argument.

For the poster

  • No restrictions on formatting etc., so use your imagination!
  • The assessment criteria for the poster will include considerations like: clear and accurate communication of all key scientific information; a layout that is logical and easy to follow; an eye-catching and visually appealing design; being well-researched, appropriately referenced and well-prepared; being analytical, critical and synthetic; and showing understanding of the topic.
