Empirical evidence of the impact of study characteristics on the performance of prediction models: a meta-epidemiological study

Damen JAAG, Debray TPA, Pajouheshnia R, Reitsma JB, Scholten RJPM, Moons KGM, Hooft L

Objectives: To empirically assess the relation between study characteristics and prognostic model performance in external validation studies of multivariable prognostic models.

Design: Meta-epidemiological study.

Data sources and study selection: On 16 October 2018, we searched electronic databases for systematic reviews of prognostic models. Reviews from non-overlapping clinical fields were selected if they reported common performance measures (either the concordance (c)-statistic or the ratio of observed over expected number of events (OE ratio)) from 10 or more validations of the same prognostic model.

Data extraction and analyses: Study design features, population characteristics, methods of predictor and outcome assessment, and the aforementioned performance measures were extracted from the included external validation studies. Random effects meta-regression was used to quantify the association between the study characteristics and model performance.
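
For orientation, a random effects meta-regression with a single study characteristic is commonly written in the generic form below. This is the textbook formulation, not necessarily the exact specification fitted in this study, and all symbols are notation assumed here for illustration.

```latex
\hat{\theta}_i = \beta_0 + \beta_1 x_i + u_i + \varepsilon_i,
\qquad u_i \sim N(0, \tau^2),
\qquad \varepsilon_i \sim N(0, s_i^2)
```

Here $\hat{\theta}_i$ is the performance estimate from validation $i$ on a transformed scale (logit c-statistic or log OE ratio), $x_i$ is the study characteristic, $\beta_1$ is its association with performance, $\tau^2$ is the between-study variance, and $s_i^2$ is the within-study sampling variance.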

Results: We included 10 systematic reviews, describing a total of 224 external validations, of which 221 reported c-statistics and 124 reported OE ratios. Associations between study characteristics and model performance were heterogeneous across systematic reviews. C-statistics were most strongly associated with variation in population characteristics, outcome definitions and measurement, and predictor substitution. For example, validations with eligibility criteria comparable to those of the development study were associated with higher c-statistics than validations with narrower criteria (difference in logit c-statistic 0.21 (95% CI 0.07 to 0.35), similar to an increase from 0.70 to 0.74). Using a case-control design was associated with higher OE ratios than using data from a cohort (difference in log OE ratio 0.97 (95% CI 0.38 to 1.55), similar to an increase in OE ratio from 1.00 to 2.63).
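
The back-transformations quoted in the results can be checked with a few lines of arithmetic. The sketch below assumes the reported differences are simply added on the logit scale (c-statistic) or log scale (OE ratio) to the stated baseline values; the function names are hypothetical and chosen for illustration only.

```python
import math


def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def shift_c_statistic(baseline_c, logit_difference):
    """Add a difference on the logit scale and return the c-statistic."""
    return expit(logit(baseline_c) + logit_difference)


def shift_oe_ratio(baseline_oe, log_difference):
    """Add a difference on the log scale and return the OE ratio."""
    return baseline_oe * math.exp(log_difference)


# Values taken from the results: baselines 0.70 and 1.00,
# differences 0.21 (logit c-statistic) and 0.97 (log OE ratio).
c_new = shift_c_statistic(0.70, 0.21)
oe_new = shift_oe_ratio(1.00, 0.97)

print(f"c-statistic: 0.70 -> {c_new:.2f}")
print(f"OE ratio:    1.00 -> {oe_new:.2f}")
```

The c-statistic comes out at about 0.74, matching the text. The OE ratio comes out slightly above the reported 2.63, which is consistent with the reported value having been computed from an unrounded coefficient. The same functions can be applied to the confidence limits to express them on the original scales.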

Conclusions: Variation in the performance of prognostic models across studies is mainly associated with variation in case-mix, study design, outcome definitions and measurement methods, and predictor substitution. Researchers developing and validating prognostic models should be aware of the potential influence of these study characteristics on the predictive performance of prognostic models.

This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License, which permits others to distribute, remix, adapt, and build upon this work non-commercially, and to license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made are indicated, and the use is non-commercial.

CC BY-NC 4.0