
Welcome to our research page featuring recent publications in the fields of biostatistics and epidemiology! These fields play a crucial role in advancing our understanding of the causes, prevention, and treatment of various health conditions. Our team is dedicated to pushing them forward through innovative studies and cutting-edge statistical analyses. On this page, you will find our collection of research publications describing the development of new statistical methods and their application to real-world data. Please feel free to contact us with any questions or comments.


Showing 10 of 10 publications

Evaluating individualized treatment effect predictions: A model-based perspective on discrimination and calibration assessment

In recent years, there has been a growing interest in the prediction of individualized treatment effects. While there is a rapidly growing literature on the development of such models, there is little literature on the evaluation of their performance. In this paper, we aim to facilitate the validation of prediction models for individualized treatment effects. The estimands of interest are defined based on the potential outcomes framework, which facilitates a comparison of existing and novel measures. In particular, we examine existing measures of discrimination for benefit (variations of the c-for-benefit), and propose model-based extensions to the treatment effect setting for discrimination and calibration metrics that have a strong basis in outcome risk prediction. The main focus is on randomized trial data with binary endpoints and on models that provide individualized treatment effect predictions and potential outcome predictions. We use simulated data to provide insight into the characteristics of the examined discrimination and calibration statistics, and further illustrate all methods in a trial of acute ischemic stroke treatment. The results show that the proposed model-based statistics had the best characteristics in terms of bias and accuracy. While resampling methods adjusted for the optimism of performance estimates in the development data, they had a high variance across replications that limited their accuracy. Therefore, individualized treatment effect models are best validated in independent data. To aid uptake, a software implementation of the proposed methods was made available in R.
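
For intuition, the sketch below illustrates a matched-pairs c-for-benefit of the kind mentioned in the abstract, on simulated trial data. It is a simplified, illustrative version (ties are simply discarded), not the authors' implementation, and the variable names and data-generating model are invented.

```r
## Illustrative c-for-benefit on simulated randomized trial data.
## Simplified sketch: 1:1 matching on predicted benefit, ties discarded.
set.seed(42)
n  <- 500
x  <- rnorm(n)                       # baseline covariate
tr <- rbinom(n, 1, 0.5)              # randomized treatment indicator
p0 <- plogis(-0.5 + x)               # outcome risk if untreated
p1 <- plogis(-1.0 + 0.8 * x)         # outcome risk if treated
y  <- rbinom(n, 1, ifelse(tr == 1, p1, p0))
pred_benefit <- p0 - p1              # predicted absolute benefit

## Match treated and control patients by rank of predicted benefit
it <- order(pred_benefit[tr == 1]); ic <- order(pred_benefit[tr == 0])
m  <- min(sum(tr == 1), sum(tr == 0))
y1 <- y[tr == 1][it][1:m];  y0 <- y[tr == 0][ic][1:m]
b  <- (pred_benefit[tr == 1][it][1:m] + pred_benefit[tr == 0][ic][1:m]) / 2
ob <- y0 - y1                        # observed pair-level benefit: -1, 0, 1

## Concordance between predicted and observed pair-level benefit
idx <- combn(m, 2)
cmp <- sign(b[idx[1, ]] - b[idx[2, ]]) * sign(ob[idx[1, ]] - ob[idx[2, ]])
mean(cmp[cmp != 0] > 0)              # c-for-benefit (ties discarded)
```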

Journal: Stat Med | Year: 2024

Developing more generalizable prediction models from pooled studies and large clustered data sets

Prediction models often yield inaccurate predictions for new individuals. Large data sets from pooled studies or electronic healthcare records may alleviate this with an increased sample size and variability in sample characteristics. However, existing strategies for prediction model development generally do not account for heterogeneity in predictor-outcome associations between different settings and populations. This limits the generalizability of developed models (even from large, combined, clustered data sets) and necessitates local revisions. We aim to develop methodology for producing prediction models that require less tailoring to different settings and populations. We adopt internal-external cross-validation to assess and reduce heterogeneity in models' predictive performance during model development. We propose a predictor selection algorithm that optimizes the (weighted) average performance while minimizing its variability across the hold-out clusters (or studies). Predictors are added iteratively until the estimated generalizability is optimized. We illustrate this by developing a model for predicting the risk of atrial fibrillation and updating an existing one for diagnosing deep vein thrombosis, using individual participant data from 20 cohorts (N = 10 873) and 11 diagnostic studies (N = 10 014), respectively. Meta-analysis of calibration and discrimination performance in each hold-out cluster showed that trade-offs between average performance and its heterogeneity occurred. Our methodology enables the assessment of heterogeneity of prediction model performance during model development in multiple or clustered data sets, thereby informing researchers on predictor selection to improve generalizability to different settings and populations and reduce the need for model tailoring. Our methodology has been implemented in the R package metamisc.
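
To make the internal-external cross-validation (IECV) idea concrete, here is a minimal sketch: each cluster is held out in turn, the model is developed on the remaining clusters, and discrimination and calibration are estimated in the hold-out cluster. The data frame, predictors, and use of the pROC package are illustrative assumptions, not the paper's code.

```r
## Minimal internal-external cross-validation (IECV) sketch
library(pROC)

iecv <- function(dat) {
  res <- lapply(unique(dat$cluster), function(cl) {
    dev <- dat[dat$cluster != cl, ]        # develop on the other clusters
    val <- dat[dat$cluster == cl, ]        # validate on the held-out cluster
    fit <- glm(y ~ x1 + x2, family = binomial, data = dev)
    lp  <- predict(fit, newdata = val)     # linear predictor in the hold-out
    c(cstat     = as.numeric(pROC::roc(val$y, plogis(lp), quiet = TRUE)$auc),
      cal_slope = unname(coef(glm(val$y ~ lp, family = binomial))[2]))
  })
  do.call(rbind, res)                      # one performance row per cluster
}

## Example with simulated clustered data (invented for illustration)
set.seed(1)
dat <- data.frame(cluster = rep(1:5, each = 200),
                  x1 = rnorm(1000), x2 = rnorm(1000))
dat$y <- rbinom(1000, 1, plogis(-1 + 0.8 * dat$x1 + 0.5 * dat$x2))
iecv(dat)
```

A predictor selection algorithm like the one proposed would wrap such a loop, adding candidate predictors while monitoring both the average and the between-cluster variability of these hold-out estimates.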

Journal: Stat Med | Year: 2021 | Citations: 17

Current Trends in the Application of Causal Inference Methods to Pooled Longitudinal Observational Infectious Disease Studies - A Protocol for a Methodological Systematic Review

Introduction: Pooling (or combining) and analysing observational, longitudinal data at the individual level facilitates inference through increased sample sizes, allowing for joint estimation of study- and individual-level exposure variables, and better enabling the assessment of rare exposures and diseases. Empirical studies leveraging such methods when randomization is unethical or impractical have grown in the health sciences in recent years. The adoption of so-called "causal" methods to account for measured and/or unmeasured confounders is an important addition to the methodological toolkit for understanding the distribution, progression, and consequences of infectious diseases (IDs) and interventions on IDs. In the face of the COVID-19 pandemic and in the absence of systematic randomization of exposures or interventions, the value of these methods is even more apparent. Yet to our knowledge, no studies have assessed how causal methods involving the pooling of individual-level, observational, longitudinal data are being applied in ID-related research. In this systematic review, we assess how these methods are used and reported in ID-related research over the last 10 years. Findings will facilitate evaluation of trends in causal methods for ID research and lead to concrete recommendations for how to apply these methods where gaps in methodological rigor are identified.

Methods and analysis: We will apply MeSH and text terms to identify relevant studies from EBSCO (Academic Search Complete, Business Source Premier, CINAHL, EconLit with Full Text, PsycINFO), EMBASE, PubMed, and Web of Science. Eligible studies are those that apply causal methods to account for confounding when assessing the effects of an intervention or exposure on an ID-related outcome using pooled, individual-level data from 2 or more longitudinal, observational studies. Titles, abstracts, and full-text articles will be independently screened by two reviewers using Covidence software. Discrepancies will be resolved by a third reviewer. This systematic review protocol has been registered with PROSPERO (CRD42020204104).

Journal: PLoS One | Year: 2021 | Citations: 3

How well can we assess the validity of non-randomised studies of medications? A systematic review of assessment tools

Objective: To determine whether assessment tools for non-randomised studies (NRS) address critical elements that influence the validity of NRS findings for comparative safety and effectiveness of medications.

Design: Systematic review and Delphi survey.

Data sources: We searched PubMed, Embase, Google, bibliographies of reviews and websites of influential organisations from inception to November 2019. In parallel, we conducted a Delphi survey among the International Society for Pharmacoepidemiology Comparative Effectiveness Research Special Interest Group to identify key methodological challenges for NRS of medications. We created a framework consisting of the reported methodological challenges to evaluate the selected NRS tools.

Study selection: Checklists or scales assessing NRS.

Data extraction: Two reviewers extracted general information and content data related to the prespecified framework.

Results: Of 44 tools reviewed, 48% (n=21) assessed multiple NRS designs, while the other tools specifically addressed case-control (n=12, 27%) or cohort studies (n=11, 25%) only. Response rate to the Delphi survey was 73% (35 out of 48 content experts), and consensus was reached in only two rounds. Most tools evaluated methods for selecting study participants (n=43, 98%), although only one addressed selection bias due to depletion of susceptibles (2%). Many tools addressed the measurement of exposure and outcome (n=40, 91%), and measurement and control for confounders (n=40, 91%). Most tools had at least one item/question on design-specific sources of bias (n=40, 91%), but only a few investigated reverse causation (n=8, 18%), detection bias (n=4, 9%), time-related bias (n=3, 7%), lack of new-user design (n=2, 5%) or active comparator design (n=0). Few tools addressed the appropriateness of statistical analyses (n=15, 34%), methods for assessing internal (n=15, 34%) or external validity (n=11, 25%) and statistical uncertainty in the findings (n=21, 48%). None of the reviewed tools investigated all the methodological domains and subdomains.

Conclusions: The acknowledgement of major design-specific sources of bias (eg, lack of new-user design, lack of active comparator design, time-related bias, depletion of susceptibles, reverse causation) and statistical assessment of internal and external validity is currently not sufficiently addressed in most of the existing tools. These critical elements should be integrated to systematically investigate the validity of NRS on comparative safety and effectiveness of medications.

Systematic review protocol and registration: https://osf.io/es65q.

Journal: BMJ Open | Year: 2021 | Citations: 7

Real-time imputation of missing predictor values improved the application of prediction models in daily practice

Objectives: In clinical practice, many prediction models cannot be used when predictor values are missing. We therefore propose and evaluate methods for real-time imputation.

Study design and setting: We describe (i) mean imputation (where missing values are replaced by the sample mean), (ii) joint modeling imputation (JMI, where we use a multivariate normal approximation to generate patient-specific imputations) and (iii) conditional modeling imputation (CMI, where a multivariable imputation model is derived for each predictor from a population). We compared these methods in a case study evaluating the root mean squared error (RMSE) and coverage of the 95% confidence intervals (i.e. the proportion of confidence intervals that contain the true predictor value) of imputed predictor values.
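
The sketch below illustrates the three imputation strategies for a single new patient. The development data, predictor names, and model forms are invented for illustration; this is not the study's actual implementation.

```r
## Real-time imputation of one missing predictor (x3) for a new patient
set.seed(7)
dev <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000))
dev$x3 <- 0.5 * dev$x1 - 0.3 * dev$x2 + rnorm(1000)
new_patient <- list(x1 = 1.2, x2 = -0.4, x3 = NA)   # x3 missing at run time

## (i) Mean imputation: replace the missing value by the development mean
x3_mean <- mean(dev$x3)

## (ii) JMI: conditional mean under a multivariate normal approximation
mu <- colMeans(dev); S <- cov(dev)
obs <- c(new_patient$x1, new_patient$x2)
x3_jmi <- mu["x3"] + S["x3", c("x1", "x2")] %*%
  solve(S[c("x1", "x2"), c("x1", "x2")], obs - mu[c("x1", "x2")])

## (iii) CMI: pre-fit an imputation model per predictor, apply in real time
cmi <- lm(x3 ~ x1 + x2, data = dev)
x3_cmi <- predict(cmi, newdata = as.data.frame(new_patient[c("x1", "x2")]),
                  interval = "prediction")           # estimate + 95% PI
x3_mean; as.numeric(x3_jmi); x3_cmi
```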

Results: RMSE was lowest when adopting JMI or CMI, although imputation of individual predictors did not always lead to substantial improvements as compared to mean imputation. JMI and CMI appeared particularly useful when the values of multiple predictors of the model were missing. Coverage reached the nominal level (i.e. 95%) for both CMI and JMI.

Conclusion: Multiple imputation using either CMI or JMI is recommended when dealing with missing predictor values in real-time settings.

Journal: J Clin Epidemiol | Year: 2021 | Citations: 20

Prognostic models for chronic kidney disease: a systematic review and external validation

Background: Accurate risk prediction is needed in order to provide personalized healthcare for chronic kidney disease (CKD) patients. An overload of prognosis studies is being published, ranging from individual biomarker studies to full prediction studies. We aim to systematically appraise published prognosis studies investigating multiple biomarkers and their role in risk predictions. Our primary objective was to investigate whether the prognostic models reported in the literature were of sufficient quality and to externally validate them.

Methods: We undertook a systematic review and appraised the quality of studies reporting multivariable prognosis models for end-stage renal disease (ESRD), cardiovascular (CV) events and mortality in CKD patients. We subsequently externally validated these models in a randomized trial that included patients from a broad CKD population.

Results: We identified 91 papers describing 36 multivariable models for prognosis of ESRD, 50 for CV events, 46 for mortality and 17 for a composite outcome. Most studies were deemed of moderate quality. Moreover, they often adopted different definitions for the primary outcome and rarely reported full model equations (21% of the included studies). External validation was performed in the Multifactorial Approach and Superior Treatment Efficacy in Renal Patients with the Aid of Nurse Practitioners trial (n = 788, with 160 events for ESRD, 79 for CV and 102 for mortality). The 24 models that reported full model equations showed great variability in their performance, although calibration remained fairly adequate for most models, except when predicting mortality (calibration slope >1.5).
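
For readers unfamiliar with the metrics reported here, the sketch below shows how the observed:expected ratio, calibration slope, and calibration intercept of an existing model can be computed in external validation data. The linear predictor and outcomes are simulated; this is not the review's code.

```r
## External validation metrics for one model; `lp` is the model's linear
## predictor in the validation data, `y` the observed binary outcome.
set.seed(3)
lp <- rnorm(788, -2, 1)
y  <- rbinom(788, 1, plogis(0.6 * lp - 0.4))   # miscalibrated on purpose
p  <- plogis(lp)                               # predicted risks

oe_ratio  <- sum(y) / sum(p)                             # observed:expected
cal_slope <- unname(coef(glm(y ~ lp, family = binomial))[2])
cal_itcpt <- unname(coef(glm(y ~ offset(lp), family = binomial))[1])
c(OE = oe_ratio, slope = cal_slope, intercept = cal_itcpt)
```

A calibration slope well above 1, as reported for the mortality models, indicates predictions that are too homogeneous, while a slope below 1 indicates overfitted, too-extreme predictions.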

Conclusions: This review shows that there is an abundance of multivariable prognosis models for the CKD population. Most studies were considered of moderate quality, and they were reported and analysed in such a manner that their results cannot directly be used in follow-up research or in clinical practice.

Journal: Nephrol Dial Transplant | Year: 2020 | Citations: 11

Handling missing predictor values when validating and applying a prediction model to new patients

Missing data present challenges for development and real-world application of clinical prediction models. While these challenges have received considerable attention in the development setting, there is only sparse research on the handling of missing data in applied settings. The key distinguishing feature of these settings is that missing data methods have to be applied to a single new individual, precluding direct application of the mainstay methods used during model development. Correspondingly, we propose that it is desirable to perform model validation using missing data methods that transfer to practice in single new patients. This article compares existing and new methods to account for missing data for a new individual in the context of prediction. These methods rely on (i) submodels fitted to the observed data only, (ii) marginalization over the missing variables, or (iii) imputation based on fully conditional specification (also known as chained equations). They were compared in an internal validation setting to highlight the use of missing data methods that transfer to practice while validating a model. As a reference, they were compared to the use of multiple imputation by chained equations in a set of test patients, because this has been used in validation studies in the past. The methods were evaluated in a simulation study where performance was measured by means of the optimism-corrected c-statistic and mean squared prediction error. Furthermore, they were applied in data from a large Dutch cohort of prophylactic implantable cardioverter defibrillator patients.
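
Approach (ii), marginalization, can be pictured as averaging the model's risk prediction over the distribution of the missing predictor given the observed ones. Below is a minimal Monte Carlo sketch under invented models and data; it is meant only to convey the idea, not to reproduce the article's methods.

```r
## Marginalizing a risk prediction over a missing predictor x2
set.seed(11)
n   <- 2000
x1  <- rnorm(n); x2 <- 0.6 * x1 + rnorm(n, sd = 0.8)
y   <- rbinom(n, 1, plogis(-1 + x1 + 0.7 * x2))
fit <- glm(y ~ x1 + x2, family = binomial)   # the prediction model
imp <- lm(x2 ~ x1)                           # model for p(x2 | x1)

## New patient with x2 missing: average the risk over draws of x2 | x1
x1_new <- 0.9
draws  <- rnorm(5000, mean = predict(imp, data.frame(x1 = x1_new)),
                sd = summary(imp)$sigma)
mean(predict(fit, data.frame(x1 = x1_new, x2 = draws), type = "response"))
```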

Journal: Stat Med | Year: 2020 | Citations: 25

Development and validation of a novel prediction model to identify patients in need of specialized trauma care during field triage: design and rationale of the GOAT study

BACKGROUND: Adequate field triage of trauma patients is crucial to transport patients to the right hospital. Mistriage and subsequent interhospital transfers should be minimized to reduce avoidable mortality, life-long disabilities, and costs. Availability of a prehospital triage tool may help to identify patients in need of specialized trauma care and to determine the optimal transportation destination.

METHODS: The GOAT (Gradient Boosted Trauma Triage) study is a prospective, multi-site, cross-sectional diagnostic study. Patients transported by at least five ground Emergency Medical Services to any receiving hospital within the Netherlands are eligible for inclusion. The reference standards for the need of specialized trauma care are an Injury Severity Score ≥ 16 and early critical resource use, which will both be assessed by trauma registrars after the final diagnosis is made. Variable selection will be based on ease of use in practice and clinical expertise. A gradient boosting decision tree algorithm will be used to develop the prediction model. Model accuracy will be assessed in terms of discrimination (c-statistic) and calibration (intercept, slope, and plot) on individual participant data from each participating cluster (i.e., Emergency Medical Service) through internal-external cross-validation. A reference model will be externally validated on each cluster as well. The resulting model statistics will be investigated, compared, and summarized through an individual participant data meta-analysis.
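
As a rough idea of what fitting a gradient boosted decision tree model for a binary triage outcome can look like, here is a sketch using the xgboost package. The protocol does not specify this package, and the predictors, data, and tuning values below are all invented.

```r
## Gradient boosted trees for a binary "needs specialized care" outcome
library(xgboost)

set.seed(5)
X <- matrix(rnorm(2000 * 4), ncol = 4,
            dimnames = list(NULL, c("age", "sbp", "gcs", "rr")))
y <- rbinom(2000, 1, plogis(-2 + 0.8 * X[, "gcs"]))  # arbitrary simulation

fit <- xgboost(data = X, label = y, nrounds = 100, max_depth = 3,
               eta = 0.1, objective = "binary:logistic", verbose = 0)
p   <- predict(fit, X)   # predicted probability per patient
```

In the study design, such a model would be developed and evaluated through the internal-external cross-validation loop over Emergency Medical Services described above.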

DISCUSSION: The GOAT study protocol describes the development of a new prediction model for identifying patients in need of specialized trauma care. The aim is to attain acceptable undertriage rates and to minimize mortality rates and life-long disabilities.

Journal: Diagn Progn Res | Year: 2019 | Citations: 11

Evidence synthesis in prognosis research

Over the past few years, evidence synthesis has become essential to investigate and improve the generalizability of medical research findings. This strategy often involves a meta-analysis to formally summarize quantities of interest, such as relative treatment effect estimates. The use of meta-analysis methods is, however, less straightforward in prognosis research because substantial variation exists in research objectives, analysis methods and the level of reported evidence.

We present a gentle overview of statistical methods that can be used to summarize data of prognostic factor and prognostic model studies. We discuss how aggregate data, individual participant data, or a combination thereof can be combined through meta-analysis methods. Recent examples are provided throughout to illustrate the various methods.
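
As a flavor of the simplest case covered by such overviews, aggregate prognostic factor estimates can be pooled with a random-effects meta-analysis. The sketch below uses the metafor package with invented numbers; it is illustrative, not an example from the paper.

```r
## Random-effects meta-analysis of prognostic factor estimates (log HRs)
library(metafor)

logHR <- c(0.42, 0.18, 0.55, 0.31, 0.09)   # study-specific log hazard ratios
seHR  <- c(0.15, 0.20, 0.25, 0.12, 0.18)   # their standard errors

res <- rma(yi = logHR, sei = seHR, method = "REML")
exp(c(res$b, res$ci.lb, res$ci.ub))        # summary HR with 95% CI
```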

Journal: Diagn Progn Res | Year: 2019 | Citations: 19

A framework for meta-analysis of prediction model studies with binary and time-to-event outcomes

It is widely recommended that any developed - diagnostic or prognostic - prediction model is externally validated in terms of its predictive performance, as measured by calibration and discrimination. When multiple validations have been performed, a systematic review followed by a formal meta-analysis helps to summarize overall performance across multiple settings, and reveals under which circumstances the model performs suboptimally and may need adjustment. We discuss how to undertake meta-analysis of the performance of prediction models with either a binary or a time-to-event outcome. We address how to deal with incomplete availability of study-specific results (performance estimates and their precision), and how to produce summary estimates of the c-statistic, the observed:expected ratio and the calibration slope. Furthermore, we discuss the implementation of frequentist and Bayesian meta-analysis methods, and propose novel empirically based prior distributions to improve estimation of between-study heterogeneity in small samples. Finally, we illustrate all methods using two examples: meta-analysis of the predictive performance of EuroSCORE II and of the Framingham Risk Score. All examples and meta-analysis models have been implemented in our newly developed R package "metamisc".
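
The metamisc package named in the abstract provides a valmeta() function for exactly this kind of meta-analysis. The sketch below shows a plausible call for pooling c-statistics; the numbers are invented, and the argument names should be checked against the package documentation.

```r
## Meta-analysis of c-statistics across external validations with metamisc
library(metamisc)

cstat    <- c(0.79, 0.74, 0.81, 0.76)   # c-statistics from four validations
cstat.se <- c(0.02, 0.03, 0.02, 0.04)   # their standard errors
n        <- c(1200, 850, 2100, 640)     # validation sample sizes

fit <- valmeta(cstat = cstat, cstat.se = cstat.se, N = n)
fit   # summary c-statistic with confidence and prediction intervals
```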

Journal: Stat Methods Med Res | Year: 2018 | Citations: 109