Presented at the 2019 ASA Symposium on Data Science and Statistics in Bellevue, Washington on May 31, 2019.  

Background: Defining baseline characteristics for covariate-adjusted analyses to increase study power is not new. Multifactorial, heterogeneous diseases, including Amyotrophic Lateral Sclerosis (ALS), Alzheimer’s Disease (AD), Parkinson’s Disease (PD), and Huntington’s Disease (HD), make it difficult to define baseline covariates that substantially improve study power. We developed a methodology for training machine-learning (ML) models on historical clinical trial patient data to produce a single prediction value that serves as a covariate in a trial’s statistical analysis. We have adapted this methodology across disease areas. We have also developed a rigorous audit methodology based on best practices in biostatistics, so that these new methods can be shared more easily in a field where rigorous vetting of new technologies is critical to adoption.

Objectives: To demonstrate through clinical trial simulation:

  • A rigorous methodology for preparing analysis datasets for ML modeling
  • A practical application of ML models to traditional biostatistical analysis
  • A scalable approach that is applicable to multiple heterogeneous disease areas in which a suitable covariate is lacking
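
As a rough illustration of the idea in the Background, the sketch below simulates a two-arm trial and compares the power of an unadjusted analysis with one that adjusts for a single baseline prediction. It is not the authors' model or simulation framework: the ML prediction is replaced by a synthetic standard-normal score, and the sample size, effect size, correlation `r`, and the function names are all hypothetical values chosen for illustration. Significance uses a large-sample normal approximation (|t| > 1.96).

```python
import numpy as np


def treatment_t_stat(y, treat, covariate=None):
    """OLS t-statistic for the treatment coefficient.

    If `covariate` is given, the model adjusts for it (ANCOVA-style).
    """
    cols = [np.ones_like(y), treat]
    if covariate is not None:
        cols.append(covariate)
    X = np.column_stack(cols)
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                # residual variance estimate
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)  # covariance of the estimates
    return beta[1] / np.sqrt(cov_beta[1, 1])


def simulate_power(n_per_arm=100, effect=0.3, r=0.6, n_sims=2000, seed=0):
    """Estimate power with and without adjusting for a prognostic score.

    Assumption: `pred` is a synthetic stand-in for an ML prediction whose
    correlation with the outcome is `r`; the outcome has unit variance.
    """
    rng = np.random.default_rng(seed)
    crit = 1.96  # two-sided 5% level, normal approximation
    n = 2 * n_per_arm
    treat = np.repeat([0.0, 1.0], n_per_arm)
    hits_unadjusted = 0
    hits_adjusted = 0
    for _ in range(n_sims):
        pred = rng.standard_normal(n)
        noise = rng.standard_normal(n)
        y = effect * treat + r * pred + np.sqrt(1.0 - r**2) * noise
        hits_unadjusted += abs(treatment_t_stat(y, treat)) > crit
        hits_adjusted += abs(treatment_t_stat(y, treat, pred)) > crit
    return hits_unadjusted / n_sims, hits_adjusted / n_sims


if __name__ == "__main__":
    unadjusted, adjusted = simulate_power()
    print(f"power without covariate: {unadjusted:.2f}")
    print(f"power with covariate:    {adjusted:.2f}")
```

In this toy setup the gain in power comes from the covariate absorbing outcome variance: the stronger the assumed correlation `r`, the smaller the residual variance and the standard error of the treatment estimate.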

Authors: Albert A. Taylor, Danielle Beaulieu, Dustin Pierce, Andrew Conklin, Jonavelle Cuerdo, Mike Keymer, David L. Ennist
