Various statistical and machine learning algorithms can be used to predict treatment effects at the patient level using data from randomized clinical trials (RCTs). Such predictions can facilitate individualized treatment decisions. A range of methods and metrics has recently been developed for assessing the accuracy of such predictions. Here, we extend these methods, focusing on the case of survival (time-to-event) outcomes. We start by providing alternative definitions of the participant-level treatment benefit; subsequently, we summarize existing and propose new measures for assessing the performance of models estimating participant-level treatment benefits. We explore metrics assessing discrimination and calibration for benefit and decision accuracy. These measures can be used to assess the performance of statistical as well as machine learning models and can be useful during model development (i.e., for model selection or for internal validation) or when testing a model in new settings (i.e., in an external validation). We illustrate the methods using simulated data and real data from the OPERAM trial, an RCT in multimorbid older people, which randomized participants to either standard care or a pharmacotherapy optimization intervention. We provide R code implementing all models and measures.
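As a hedged sketch of one such definition (not the OPERAM analysis, and not the authors' released R code), the snippet below estimates participant-level benefit as the difference in model-predicted event-free probability at a fixed horizon, treated minus control, from a Cox model with treatment-covariate interactions fitted to simulated data. All variable names and the simulation settings are hypothetical.

```r
## Hypothetical sketch: participant-level treatment benefit for a
## survival outcome, defined as the difference in predicted
## event-free probability at a fixed horizon.
library(survival)

set.seed(42)
n  <- 500
df <- data.frame(
  trt   = rbinom(n, 1, 0.5),   # randomized treatment indicator
  age   = rnorm(n, 75, 5),     # hypothetical covariate
  frail = rbinom(n, 1, 0.3)    # hypothetical covariate
)
## Simulate event times whose treatment effect varies with 'frail'
lp    <- 0.03 * (df$age - 75) - (0.5 + 0.4 * df$frail) * df$trt
etime <- rexp(n, rate = 0.05 * exp(lp))
ctime <- runif(n, 0, 30)       # random censoring
df$time  <- pmin(etime, ctime)
df$event <- as.integer(etime <= ctime)

## Cox model with treatment-by-covariate interactions
fit <- coxph(Surv(time, event) ~ (age + frail) * trt, data = df)

## Predicted benefit: P(event-free at horizon | treated)
##                  - P(event-free at horizon | control)
horizon <- 12
s1 <- summary(survfit(fit, newdata = transform(df, trt = 1)),
              times = horizon)$surv
s0 <- summary(survfit(fit, newdata = transform(df, trt = 0)),
              times = horizon)$surv
df$benefit <- as.numeric(s1) - as.numeric(s0)
summary(df$benefit)
```

Discrimination and calibration for benefit can then be assessed against these participant-level predictions; the measures in the paper itself may differ in detail.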
Aims: Clinical guidelines often recommend treating individuals based on their cardiovascular risk. We revisit this paradigm and quantify the efficacy of three treatment strategies: (i) overall prescription, i.e. treatment of all individuals meeting the eligibility criteria of a trial; (ii) risk-stratified prescription, i.e. treatment of only those at elevated outcome risk; and (iii) prescription based on predicted treatment responsiveness.
Methods and results: We reanalysed the PROSPER randomized controlled trial, which included individuals aged 70–82 years with a history of, or risk factors for, vascular diseases. We derived and internally–externally validated a model predicting treatment responsiveness. We compared with placebo (n = 2913): (i) pravastatin for all participants (n = 2891); (ii) pravastatin in the presence of previous vascular diseases and placebo in the absence thereof (n = 2925); and (iii) pravastatin in the presence of a favourable prediction of treatment response and placebo in the absence thereof (n = 2890). The absolute difference in primary outcome events (a composite of coronary death, non-fatal myocardial infarction, and fatal or non-fatal stroke) per 10 000 person-years was: −78 events (95% CI, −144 to −12) when prescribing pravastatin to all participants; −66 events (95% CI, −114 to −18) when treating only individuals at elevated vascular risk; and −103 events (95% CI, −162 to −44) when restricting pravastatin to individuals with a favourable prediction of treatment response.
Conclusion: Pravastatin prescription based on predicted treatment responsiveness shows encouraging potential for cardiovascular prevention. Further external validation of our results and clinical experiments are needed.
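As a rough illustration of the kind of comparison reported above, the sketch below computes an absolute event-rate difference per 10 000 person-years between two prescription strategies, with a normal-approximation confidence interval based on Poisson variances. The counts are invented for illustration, not the PROSPER data, and this is not the paper's analysis code.

```r
## Hypothetical sketch: absolute rate difference per 10,000
## person-years between two arms/strategies, with a 95% CI.
rate_diff <- function(ev1, py1, ev0, py0, scale = 10000) {
  r1 <- ev1 / py1                       # event rate, strategy 1
  r0 <- ev0 / py0                       # event rate, strategy 0
  d  <- r1 - r0
  se <- sqrt(ev1 / py1^2 + ev0 / py0^2) # Poisson variance of each rate
  c(diff = d * scale,
    lo   = (d - 1.96 * se) * scale,
    hi   = (d + 1.96 * se) * scale)
}

## e.g. treated: 440 events over 9000 person-years;
##      placebo: 510 events over 9100 person-years (invented numbers)
rate_diff(440, 9000, 510, 9100)
```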
In recent years, there has been growing interest in the prediction of individualized treatment effects. While the literature on the development of such models is growing rapidly, there is little literature on the evaluation of their performance. In this paper, we aim to facilitate the validation of prediction models for individualized treatment effects. The estimands of interest are defined based on the potential outcomes framework, which facilitates a comparison of existing and novel measures. In particular, we examine existing measures of discrimination for benefit (variations of the c-for-benefit), and propose model-based extensions to the treatment effect setting for discrimination and calibration metrics that have a strong basis in outcome risk prediction. The main focus is on randomized trial data with binary endpoints and on models that provide individualized treatment effect predictions and potential outcome predictions. We use simulated data to provide insight into the characteristics of the discrimination and calibration statistics under consideration, and further illustrate all methods in a trial of acute ischemic stroke treatment. The results show that the proposed model-based statistics had the best characteristics in terms of bias and accuracy. While resampling methods adjusted for the optimism of performance estimates in the development data, they had a high variance across replications that limited their accuracy. Therefore, individualized treatment effect models are best validated in independent data. To aid uptake, an implementation of the proposed methods was made available in R.
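As one concrete example of the measures discussed, here is a hedged sketch of a c-for-benefit-style statistic in the spirit of van Klaveren et al.: treated and control participants are matched one-to-one by rank of predicted benefit, pair-level observed benefit is the difference in outcomes within a pair, and concordance between predicted and observed pair-level benefit is computed. The variable names (y, trt, pben) are hypothetical, and this is a simplified sketch, not the authors' released R implementation.

```r
## Hypothetical sketch of a c-for-benefit statistic for a binary
## outcome y (1 = event), randomized arm trt (1 = treated), and
## predicted benefit pben from some model.
c_for_benefit <- function(y, trt, pben) {
  ## Match treated and control participants 1:1 by rank of predicted benefit
  yt <- y[trt == 1][order(pben[trt == 1])]
  yc <- y[trt == 0][order(pben[trt == 0])]
  pt <- sort(pben[trt == 1])
  pc <- sort(pben[trt == 0])
  m  <- min(length(yt), length(yc))
  ## Pair-level observed benefit: event in control minus event in treated
  obs  <- yc[seq_len(m)] - yt[seq_len(m)]
  ## Pair-level predicted benefit: mean of the two predictions
  pred <- (pt[seq_len(m)] + pc[seq_len(m)]) / 2
  ## Concordance between predicted and observed pair-level benefit
  conc <- 0; tot <- 0
  for (i in 1:(m - 1)) for (j in (i + 1):m) {
    if (obs[i] != obs[j]) {          # only comparable (discordant) pairs
      tot <- tot + 1
      s <- sign(pred[i] - pred[j]) * sign(obs[i] - obs[j])
      conc <- conc + (s > 0) + 0.5 * (s == 0)  # ties count one half
    }
  }
  conc / tot
}
```

The quadratic loop is fine for a sketch; a production implementation would use a faster pairwise-concordance routine.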
The increasing availability of large combined datasets (or big data), such as those from electronic health records and from individual participant data meta-analyses, provides new opportunities and challenges for researchers developing and validating (including updating) prediction models. These datasets typically include individuals from multiple clusters, such as multiple centres, geographical locations, or different studies. Accounting for clustering is important to avoid misleading conclusions. It also enables researchers to explore heterogeneity in prediction model performance across centres, regions, or countries; to better tailor models to these different clusters; and thus to develop prediction models that are more generalisable. However, this requires prediction model researchers to adopt more specific design, analysis, and reporting methods than standard prediction model studies without inherent substantial clustering. Prediction model studies based on clustered data therefore need to be reported differently, so that readers can appraise the study methods and findings, which in turn supports the use and implementation of prediction models developed or validated on clustered datasets.
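To make the clustering point concrete, below is a minimal, hypothetical sketch of internal-external (leave-one-cluster-out) cross-validation, one common way to examine heterogeneity in model performance across clusters: the model is developed on all clusters but one and validated on the held-out cluster, in turn. The data, variable names, and model are invented for illustration.

```r
## Hypothetical sketch: internal-external cross-validation of a
## binary-outcome prediction model across clusters.
set.seed(1)
dat <- data.frame(
  cluster = rep(1:5, each = 200),  # e.g. centres or studies
  x1 = rnorm(1000),
  x2 = rnorm(1000)
)
dat$y <- rbinom(1000, 1, plogis(-1 + 0.8 * dat$x1 + 0.5 * dat$x2))

## Mann-Whitney AUC, computed directly to avoid external dependencies
auc <- function(y, p) {
  r  <- rank(p)
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc_by_cluster <- sapply(unique(dat$cluster), function(k) {
  train <- dat[dat$cluster != k, ]  # develop on all other clusters
  test  <- dat[dat$cluster == k, ]  # validate on the held-out cluster
  fit   <- glm(y ~ x1 + x2, family = binomial, data = train)
  auc(test$y, predict(fit, newdata = test, type = "response"))
})
auc_by_cluster  # spread across clusters signals (lack of) generalisability
```

Reporting cluster-specific performance like this, rather than a single pooled estimate, is exactly the kind of practice such studies need to describe.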