In a recent study posted to the medRxiv* preprint server, researchers examined the accuracy of the US Centers for Disease Control and Prevention (CDC) coronavirus disease 2019 (COVID-19) forecasting models.
Accurate predictive modeling of pandemic outcomes plays a critical role in developing strategies and policies to curb the spread of the pandemic. While several prediction models have been proposed, their accuracy and robustness over time and across models remain unclear.
About the study
In the present study, the researchers analyzed all US CDC COVID-19 forecasting models by categorizing them as per model type and estimating their mean percent error over different COVID-19 infection waves.
The team compared several US CDC COVID-19 forecasting models according to their quantitative characteristics by measuring their performance over various periods. The US CDC compiles weekly COVID-19 case forecasts at four horizons: one week, two weeks, three weeks, and four weeks ahead. Each week, the models issue a new forecast of incident COVID-19 cases for each of the four subsequent weeks. The forecast horizon was defined as the time span for which the forecast was to be prepared. The present study focused on assessing the performance of the four-week forecast models.
The forecasting models were grouped into five categories: ensemble, epidemiological (compartmental), hybrid, machine learning, and un-categorized. The team examined a total of 51 models. Since the CDC itself relies on an ensemble model, the researchers assessed whether this model was more accurate than any individual model. Mean absolute percent error (MAPE) was evaluated and reported for each model studied, and the models were compared according to their performance in each wave. The team defined the waves as (1) Wave I: 6 July 2020 to 31 August 2020; (2) Wave II: 1 September 2020 to 14 February 2021; (3) Wave III: 15 February 2021 to 26 July 2021; and (4) Wave IV: 27 July 2021 to 17 January 2022.
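MAPE is a standard error metric: the average of the absolute forecast errors, each expressed as a percentage of the observed value. The study does not publish its evaluation code, so the following is only a minimal sketch of how such a score could be computed; the toy weekly case counts are illustrative, not data from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percent error, in percent.

    Averages |actual - forecast| / actual over all weeks,
    then scales to a percentage. Assumes actual values are nonzero.
    """
    assert len(actual) == len(forecast)
    errors = [abs(a - f) / a for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical weekly observed case counts vs. a model's forecasts
observed = [1000, 1200, 900, 1100]
predicted = [1100, 1000, 1000, 1000]
print(round(mape(observed, predicted), 1))  # prints 11.7
```

A model with a MAPE of 11.7% is, on average, off by about 12% of the true weekly count; the wave-by-wave comparisons below rank models by exactly this kind of score.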
The performance of the forecasting models was measured against two baselines. Baseline-I was the ‘CovidHub-Baseline’ (the CDC’s baseline), which uses the most recent reported incidence as the median prediction for all future horizons. Baseline-II extrapolates a linear trend fitted to the reported COVID-19 cases from the two weeks preceding the forecast date. The team only considered models that had made predictions for a minimum of 25% of the target dates studied.
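The two baselines can be illustrated with a short sketch. This is an assumption-laden simplification of the descriptions above, not the CovidHub implementation: Baseline-I simply repeats the latest weekly count, and Baseline-II extends the straight line through the last two weekly counts.

```python
def baseline_flat(recent_incidence, horizon=4):
    # Baseline-I (CovidHub-style): repeat the most recent weekly
    # incidence as the prediction for every future week.
    return [recent_incidence] * horizon

def baseline_linear(two_weeks_ago, last_week, horizon=4):
    # Baseline-II: extend the straight line through the two most
    # recent weekly counts, clipped at zero since case counts
    # cannot be negative.
    slope = last_week - two_weeks_ago
    return [max(0, last_week + slope * h) for h in range(1, horizon + 1)]

# Hypothetical weekly counts: 400 cases two weeks ago, 500 last week
print(baseline_flat(500))         # prints [500, 500, 500, 500]
print(baseline_linear(400, 500))  # prints [600, 700, 800, 900]
```

Because these baselines require no modeling at all, a forecasting model is only useful insofar as its MAPE beats theirs, which is how the study frames the results that follow.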
The study results showed that during the first wave of the COVID-19 pandemic, the MAPE values were 14% for the Columbia_UNC-SurvCon model, 17% for USACE-ERDC_SEIR, and 25% for CovidAnalytics-DELPHI. Of the four models that outperformed both baselines, three were epidemiological models and one was a hybrid model. The team also found that the hybrid models performed best overall, with the lowest MAPE, followed by the epidemiological and then the machine learning models. In contrast, the ensemble models had the highest MAPE in the first wave, and none of them beat the MAPE of baseline-I.
During the second COVID-19 wave, the IQVIA_ACOE-STAN model performed best, with a MAPE of 5.5%. A total of 13 models surpassed both baselines, with MAPE values between 5% and 37%. The best-performing models in this wave included five ensemble models, four epidemiological models, two machine learning models, and two hybrid models. Notably, all ensemble models except UVA-Ensemble surpassed baseline-I, which had a MAPE of 37%. A wide spread in MAPE values was observed among the epidemiological models. Furthermore, in contrast to wave I, the ensemble models produced the most accurate forecasts in wave II, while the hybrid models were the least accurate.
During wave III, the performance of the ensemble models was comparable to that in the first wave. The baseline models reported comparatively high MAPE values of 74% for baseline-I and 77% for baseline-II. The best-performing model in this wave was USC-SI_kJalpha, with a MAPE of 32%. A total of 32 models performed better than the baselines, including 12 compartment models, three machine learning models, four hybrid models, eight ensemble models, and five un-categorized models.
In the fourth wave of the pandemic, a few models achieved a MAPE of 28%, while baselines I and II had MAPEs of 47% and 48%, respectively. The ensemble models performed best in this period, whereas the epidemiological models showed the highest MAPE.
To summarize, the study findings showed no significant differences in accuracy among the different CDC COVID-19 forecasting models. Furthermore, the models' error rates increased over the course of the pandemic. The researchers believe the present study can serve as a foundation for the development of more accurate and robust forecasting models.
medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.