Reproducible reporting

An introduction to Quarto

Division of Pharmacoepidemiology and Pharmacoeconomics
Brigham and Women’s Hospital
Harvard Medical School

August 25, 2024

Problem statement

Wait, but how was that done exactly?

  • More often than not, statistical and computational methods are described ambiguously, for example:

    “We measured the pre-exposure performance status within 90 days of the index date.”

  • Does the 90-day window include or exclude the index date? What was done if there were multiple performance assessments per patient? …

  • Take a moment to reflect: could you exactly reproduce a study you published 10 years ago based only on the paper’s methods section?
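
To illustrate how analytic code removes the ambiguity in the quoted sentence about the 90-day window, here is a minimal Python sketch. The function name, its parameters, the example records, and the rule "use the most recent assessment" are all hypothetical choices made for illustration, not the definition used in any particular study.

```python
from datetime import date, timedelta


def pre_exposure_status(records, index_date, lookback_days=90,
                        include_index_date=False):
    """Return the most recent score in the lookback window, or None.

    Hypothetical rules, stated explicitly so nothing is left to guesswork:
    - the window starts `lookback_days` days before the index date
    - the index date itself is excluded unless `include_index_date` is True
    - if several assessments fall in the window, the latest one is used
    """
    window_start = index_date - timedelta(days=lookback_days)
    if include_index_date:
        window_end = index_date
    else:
        window_end = index_date - timedelta(days=1)

    in_window = [(d, score) for d, score in records
                 if window_start <= d <= window_end]
    if not in_window:
        return None  # no qualifying assessment: treat as missing
    # Latest assessment date wins; same-day ties are not handled specially.
    return max(in_window, key=lambda rec: rec[0])[1]


# Made-up example data: (assessment date, performance score).
patient_a = [
    (date(2020, 1, 10), 0),  # 91 days before the index date: outside window
    (date(2020, 2, 15), 1),  # inside window
    (date(2020, 3, 20), 2),  # inside window, most recent before index date
    (date(2020, 4, 10), 3),  # on the index date itself
]
patient_b = [
    (date(2020, 6, 1), 1),   # only assessment is on the index date
]

status_excl = pre_exposure_status(patient_a, date(2020, 4, 10))
status_incl = pre_exposure_status(patient_a, date(2020, 4, 10),
                                  include_index_date=True)
status_missing = pre_exposure_status(patient_b, date(2020, 6, 1))
print(status_excl, status_incl, status_missing)
```

The two calls for the same patient give different answers depending on whether the index date is included, which is exactly the detail the prose sentence leaves open.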

Is there a reproducibility crisis?

Nature survey: More than 70% of researchers have tried and failed to reproduce another scientist’s experiments, and more than half have failed to reproduce their own experiments (Baker 2016)

What if…

If only there were a way to combine…

  • the narrative prose that explains the methods used

  • the analytic code we implemented to execute these methods

  • the corresponding results

…all in one report.

Literate programming

Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do (Donald Knuth, Turing Award recipient)

Definition

History of literate programming

  • Literate programming is a concept pioneered by Donald Knuth, a Turing Award recipient known for creating TeX.

  • The main idea behind the early form of literate programming was to upend the traditional programming practices of the time by systematically including human-readable text that accompanies the code and explains the logic and purpose of a program.

  • As he describes in “Literate Programming”, Knuth regards the programmer as an “essayist” who should strive to communicate the purpose of a program in order to create better code.

  • While initially centered in the domain of computer science, literate programming has more recently resurged in the interdisciplinary world of “data science”.

https://bernhardbieri.ch/blog/2022-08-25-litteralprogramminginstata/

Introduction to Quarto

Examples

Reproducible projects and manuscripts

References

Baker, Monya. 2016. “1,500 Scientists Lift the Lid on Reproducibility.” Nature 533 (7604): 452–54. https://doi.org/10.1038/533452a.