Poster: Rapid Deployment of a Machine Learning-based Derived Biomarker using Publicly Available Data Sources for Covariate Adjusted Descriptive Modeling

Presented at the 2019 ASA Symposium on Data Science and Statistics in Bellevue, Washington on May 31, 2019.

Background: Defining baseline characteristics for covariate-adjusted analyses to increase study power is not new. However, multifactorial, heterogeneous diseases, including Amyotrophic Lateral Sclerosis (ALS), Alzheimer’s Disease (AD), Parkinson’s Disease (PD), and Huntington’s Disease (HD), make it difficult to define baseline covariates that add substantial benefit to study power. We developed a methodology for training machine-learning (ML) models on historical clinical trial patient data to provide a single prediction value for use as a covariate in a trial’s statistical analysis. We have adapted this methodology across disease areas and developed a rigorous audit methodology based on best practices in biostatistics, so that these new methods can be shared more easily in a field where rigorous vetting of new technologies is critical to adoption.

Objectives: To demonstrate through clinical trial simulation:

A methodology for rigorous preparation of analysis datasets for ML modeling
A practical application of ML models to traditional biostatistical analysis
A scalable approach that is applicable to multiple heterogeneous disease areas in which a suitable covariate is lacking
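
As a rough illustration of the idea described in the Background, the following is a minimal simulation sketch, not the authors' implementation. It assumes a continuous outcome, synthetic data, and an ordinary least-squares model standing in for the ML model; the function names `ols_treatment_se` and `simulate`, the sample sizes, and the effect size are all hypothetical. A model is fit on simulated "historical" patients, its single prediction is computed for each simulated trial patient, and the treatment effect is estimated with and without that prediction as a covariate.

```python
import numpy as np

def ols_treatment_se(y, treat, covariate=None):
    """Return (estimate, standard error) of the treatment coefficient from OLS."""
    cols = [np.ones_like(y), treat]
    if covariate is not None:
        cols.append(covariate)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])  # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)           # coefficient covariance
    return beta[1], np.sqrt(cov[1, 1])

def simulate(seed=0, n_hist=2000, n_trial=200, n_features=5, effect=0.5):
    """Simulate one trial; all sizes and the effect are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=n_features)  # hypothetical prognostic weights

    # "Historical" patients: baseline features and observed outcomes.
    X_hist = rng.normal(size=(n_hist, n_features))
    y_hist = X_hist @ true_w + rng.normal(size=n_hist)

    # Stand-in for the ML model: linear least squares with an intercept.
    A_hist = np.column_stack([np.ones(n_hist), X_hist])
    w_hat, *_ = np.linalg.lstsq(A_hist, y_hist, rcond=None)

    # New trial with 1:1 randomization and an additive treatment effect.
    X_trial = rng.normal(size=(n_trial, n_features))
    treat = rng.permutation(np.repeat([0.0, 1.0], n_trial // 2))
    y_trial = X_trial @ true_w + effect * treat + rng.normal(size=n_trial)

    # Single prediction value per patient, computed from baseline data only.
    prediction = np.column_stack([np.ones(n_trial), X_trial]) @ w_hat

    unadjusted = ols_treatment_se(y_trial, treat)
    adjusted = ols_treatment_se(y_trial, treat, covariate=prediction)
    return unadjusted, adjusted

if __name__ == "__main__":
    (est_u, se_u), (est_a, se_a) = simulate()
    print(f"unadjusted: estimate={est_u:.3f}, SE={se_u:.3f}")
    print(f"adjusted:   estimate={est_a:.3f}, SE={se_a:.3f}")
```

In this toy setting the prediction explains much of the outcome variance, so the adjusted analysis should report a smaller standard error for the treatment effect, which is the mechanism by which such a covariate can increase study power. Real disease progression data would be far less well behaved than this synthetic example.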

Authors: Albert A. Taylor, Danielle Beaulieu, Dustin Pierce, Andrew Conklin, Jonavelle Cuerdo, Mike Keymer, David L. Ennist
