Cerner RWD Publications
Statistical learning for predicting severity among cardiovascular patients

A super learner ensemble of 14 statistical learning models for predicting COVID-19 severity among patients with cardiovascular conditions.



Cardiovascular and other circulatory system diseases have been implicated in the severity of COVID-19 in adults. This study provides a super learner ensemble of models for predicting COVID-19 severity among these patients.


This study used the COVID-19 Dataset of Cerner Real-World Data. Records of adult patients (18 years or older) diagnosed with cardiovascular diseases between 2017 and 2019 were retrieved, and 13 such conditions were identified. Among these patients, 33,042 who were admitted to 59 hospitals with a positive COVID-19 diagnosis between March 2020 and June 2020 were selected for the study. A total of 14 statistical and machine learning models were developed and combined into a more powerful super learner model for predicting COVID-19 severity on admission to the hospital.
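The combination step described above can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: it assumes scikit-learn and SciPy are available, uses synthetic data in place of the Cerner cohort, uses only three base learners instead of 14 (with scikit-learn's gradient boosting standing in for extreme gradient boosting), and assumes a non-negative least squares meta-learner, which the summary does not specify.

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict, train_test_split

# Synthetic stand-in data (assumption); the real cohort is not reproduced here.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Three illustrative base learners; names and settings are hypothetical.
base_learners = {
    "lasso_logistic": LogisticRegression(penalty="l1", solver="liblinear", C=0.5),
    "logistic": LogisticRegression(max_iter=1000),
    "boosted_trees_depth2": GradientBoostingClassifier(max_depth=2, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Level-one data: out-of-fold predicted probabilities, one column per learner.
Z_train = np.column_stack([
    cross_val_predict(model, X_train, y_train, cv=cv, method="predict_proba")[:, 1]
    for model in base_learners.values()
])

# Pooled out-of-fold AUROC of each base learner.
cv_aurocs = {name: roc_auc_score(y_train, Z_train[:, j])
             for j, name in enumerate(base_learners)}

# Meta-learner (assumed): non-negative weights, normalized to sum to 1.
raw_weights, _ = nnls(Z_train, y_train.astype(float))
weights = raw_weights / raw_weights.sum()

# Refit each base learner on all training data, then combine on the test set.
Z_test = np.column_stack([
    model.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    for model in base_learners.values()
])
ensemble_pred = Z_test @ weights
test_auroc = roc_auc_score(y_test, ensemble_pred)

print("Cross-validated AUROC per learner:", cv_aurocs)
print("Ensemble weights:", dict(zip(base_learners, weights.round(3))))
print("Ensemble AUROC on held-out test set:", round(test_auroc, 4))
```

Fitting the weights on out-of-fold predictions, rather than on in-sample predictions, keeps the meta-learner from rewarding base learners that merely overfit the training data.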


LASSO regression, a full extreme gradient boosting model with a tree depth of 2, and a full logistic regression model were the most predictive, with cross-validated AUROCs of 0.7964, 0.7961, and 0.7958, respectively. The resulting super learner ensemble model had a cross-validated AUROC of 0.8006 (range: 0.7814, 0.8163). The unbiased AUROC of the super learner model on an independent test set was 0.8057 (95% CI: 0.7954, 0.8159).
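For readers who want to see how a test-set AUROC and its 95% confidence interval might be computed, here is a minimal sketch. The summary does not say which interval method the study used; a percentile bootstrap is assumed here purely for illustration. The function names `auroc` and `bootstrap_auroc_ci` and the simulated labels and scores are hypothetical.

```python
import numpy as np

def auroc(y_true, scores):
    """AUROC as the probability that a random positive case outranks a
    random negative case, counting ties as one half."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # all positive-negative pairs
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def bootstrap_auroc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap confidence interval."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue  # skip resamples that contain only one class
        stats.append(auroc(y_true[idx], scores[idx]))
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return auroc(y_true, scores), float(lower), float(upper)

# Simulated test-set labels and risk scores (assumption, not study data).
rng = np.random.default_rng(42)
y_demo = rng.integers(0, 2, size=400)
scores_demo = y_demo + rng.normal(scale=1.2, size=400)

point, lower, upper = bootstrap_auroc_ci(y_demo, scores_demo)
print(f"AUROC = {point:.4f} (95% CI: {lower:.4f}, {upper:.4f})")
```

The pairwise comparison is quadratic in sample size, so for a test set of many thousands of patients a rank-based formula or a library routine would be a more practical choice.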


Highly predictive models can be built for the severity of COVID-19 in patients with cardiovascular and other circulatory conditions. Super learner ensembles can significantly improve on individual models and classical ensembles.