Software development


encore.analytics - An R package with useful wrappers to streamline common workflows in the emulation of oncology trials

Emulating oncology clinical trials with real-world data presents particular challenges, notably missing data and the need for careful propensity score analyses. The encore.analytics package addresses these challenges with a toolkit that connects multiple imputation and propensity score methodologies in a single workflow.

install.packages("pak")
pak::pak("janickweberpals/encore.analytics")

Transparency and reproducibility

The Effective Statistician Podcast: The FAIRification Of Research In Real-World Evidence: A Practical Introduction To Reproducible Analytic Workflows Using

https://podcastae8fac.podigee.io/370-new-episode

smdi - An R package to perform routine structural missing data investigations in real-world data

Partially observed covariates are a common challenge in the analysis of electronic health records. Backed by large-scale simulations, this package eases and streamlines the implementation of routine missing data checks to characterize the underlying missingness and make informed decisions about the appropriate analytical choice for your study.
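To make the idea of a routine missingness check concrete, here is a minimal, language-agnostic sketch in plain Python. It is not the smdi API, and the toy cohort, the covariate names (age, lab), and the two helper functions are hypothetical. It reports the share of patients with a missing covariate and the absolute standardized mean difference (ASMD) of a fully observed characteristic between patients with and without that covariate recorded.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical toy cohort: None marks a missing lab value.
patients = [
    {"age": 61, "lab": 1.2},
    {"age": 74, "lab": None},
    {"age": 58, "lab": 0.9},
    {"age": 80, "lab": None},
    {"age": 66, "lab": 1.5},
    {"age": 77, "lab": None},
    {"age": 55, "lab": 1.1},
    {"age": 69, "lab": 1.3},
]


def proportion_missing(rows, covariate):
    """Share of rows in which `covariate` is missing (None)."""
    return sum(r[covariate] is None for r in rows) / len(rows)


def abs_std_mean_diff(rows, partially_observed, fully_observed):
    """ASMD of `fully_observed` between rows with and without a
    recorded value of `partially_observed`."""
    seen = [r[fully_observed] for r in rows if r[partially_observed] is not None]
    unseen = [r[fully_observed] for r in rows if r[partially_observed] is None]
    # Pool the two sample variances with equal weight.
    pooled_sd = sqrt((variance(seen) + variance(unseen)) / 2)
    return abs(mean(seen) - mean(unseen)) / pooled_sd


p_miss = proportion_missing(patients, "lab")
asmd_age = abs_std_mean_diff(patients, "lab", "age")
print(f"lab missing: {p_miss:.1%}, ASMD of age: {asmd_age:.2f}")
```

A large ASMD indicates that patients with and without the covariate differ on observed characteristics, which argues against the covariate being missing completely at random. A real investigation would combine several such diagnostics rather than rely on one number.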

install.packages("smdi")
  • Link to package website

  • Package presentation at the New England Statistics Symposium 2023:

autoencoderPS - An autoencoder-based propensity score for causal inference

I’m fascinated by neural networks and deep learning, and the following paper highlights some of my thoughts on using these methods for causal inference:

Weberpals J, Becker T, Davies J, Schmich F, Rüttinger D, Theis FJ, Bauer-Mehren A. Deep learning-based propensity scores for confounding control in comparative effectiveness research: A large-scale, real-world data study. Epidemiology. 2021 May 1;32(3):378-88.

The development and analysis code published with the article “Deep learning-based propensity scores for confounding control in comparative effectiveness research: A large-scale, real-world data study” (Weberpals et al., Epidemiology, 2021) is available in the following GitHub repository:

https://github.com/janickweberpals/autoencoderPS

The computing code used in this study is available as Python Jupyter Markdown scripts (.html) in the supplementary material. All analyses described in the article were performed in R version 3.2.2. PCA was performed using scikit-learn, and the autoencoder was trained using Keras with a TensorFlow backend, both in Python version 3.6.0. The simulation code is available as R Markdown.