Evaluating eligibility criteria of oncology trials using real-world data and artificial intelligence

May 2021 Cancer trials Jolien Blokken

Overly restrictive eligibility criteria for clinical trials often limit the access of patients with cancer to potentially beneficial treatments. A recent analysis of real-world data with the computational framework Trial Pathfinder showed that many common criteria had only a minimal effect on trial hazard ratios, while the pool of eligible patients more than doubled on average when these criteria were relaxed.

Expert opinion

“Clinical trials measure the highest achievable survival of a very selected group of patients, whereas public health data coming from cancer registers measure the average survival achieved by all patients with cancer. The only way to change policy standards using real-world data is to have high quality data from a representative set of patients to ensure generalisability. This suggests the need for a global registry with curated data like in a clinical trial. We must develop real-world models using matched oncology clinical trials and real-world evidence to inform policymakers to make the best reimbursement choices. In addition, there is an important need to capture the safety and efficacy in the population that will use the drug after approval.”

Overly restrictive, and sometimes poorly justified, eligibility criteria are a key cause of suboptimal recruitment in clinical trials. Moreover, such restrictive trials do not fully capture the efficacy and safety of the drug in the populations that will eventually use the treatment after approval.1,2 How to broaden eligibility, however, remains a major challenge. Data-driven algorithms combined with real-world data could potentially be used to improve several aspects of clinical trials.3-5 To assess the potential of this strategy, the effect of relaxing specific eligibility criteria on treatment efficacy and cohort size in a real-world population was evaluated using the computational framework Trial Pathfinder. More specifically, Trial Pathfinder was used to emulate completed trials of advanced non-small-cell lung cancer (aNSCLC) using data from a nationwide database of electronic health records (Flatiron Health Database) comprising 61,094 patients with aNSCLC.

Trial Pathfinder

Trial Pathfinder was developed as a framework that integrates real-world data and systematically analyses the hazard ratio of overall survival for cohorts that are defined by different eligibility criteria. In the first step, trial emulation, individuals in the real-world dataset who met the available eligibility criteria as originally published in the clinical trial protocol were selected. Selected patients were subsequently assigned to the treatment groups that were consistent with their treatment records in the Flatiron database. Next, survival analysis for the emulated trials was performed, using the hazard ratio (HR) of overall survival (OS) as the outcome. The Trial Pathfinder emulation framework makes it possible to systematically vary the eligibility criteria in silico and quantify how the HR of OS changes with different combinations of criteria.6
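
The emulation steps can be illustrated with a minimal Python sketch. It is not the Trial Pathfinder code: the `Patient` fields, the two criteria and the toy records are hypothetical, and a crude ratio of death rates per person-month (which assumes a constant hazard in each arm) stands in for the model-based HR of OS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Patient:
    arm: str        # "treatment" or "control", as read from the treatment record
    months: float   # follow-up time in months
    died: bool      # True if death was observed during follow-up
    ecog: int       # hypothetical performance-status field
    albumin: float  # hypothetical laboratory value in g/dL


# A criterion is a named rule that returns True when a patient is eligible.
Criteria = Dict[str, Callable[[Patient], bool]]


def select_cohort(patients: List[Patient], criteria: Criteria) -> List[Patient]:
    """Keep the patients who satisfy every criterion (emulation step)."""
    return [p for p in patients if all(rule(p) for rule in criteria.values())]


def crude_hazard_ratio(cohort: List[Patient]) -> float:
    """Deaths per person-month in the treatment arm divided by the control arm.

    Assumption: constant hazard per arm; this is a simplification, not a
    fitted survival model.
    """
    def rate(arm: str) -> float:
        group = [p for p in cohort if p.arm == arm]
        person_months = sum(p.months for p in group)
        if person_months == 0:
            raise ValueError(f"no follow-up time in arm {arm!r}")
        return sum(p.died for p in group) / person_months

    control_rate = rate("control")
    if control_rate == 0:
        raise ValueError("no deaths in the control arm; ratio is undefined")
    return rate("treatment") / control_rate


# Toy records, invented for illustration only.
patients = [
    Patient("treatment", 20.0, False, 0, 4.0),
    Patient("treatment", 10.0, True, 1, 3.8),
    Patient("treatment", 10.0, True, 2, 3.0),
    Patient("control", 10.0, True, 0, 4.1),
    Patient("control", 10.0, True, 1, 3.9),
    Patient("control", 5.0, True, 2, 2.9),
]

protocol_criteria: Criteria = {
    "ecog_0_or_1": lambda p: p.ecog <= 1,
    "albumin_at_least_3_5": lambda p: p.albumin >= 3.5,
}

protocol_cohort = select_cohort(patients, protocol_criteria)
relaxed_cohort = select_cohort(patients, {})  # no criteria: everyone is eligible

for label, cohort in [("protocol", protocol_cohort), ("fully relaxed", relaxed_cohort)]:
    print(f"{label}: n={len(cohort)}, crude HR={crude_hazard_ratio(cohort):.2f}")
```

Varying the eligibility criteria in silico then amounts to calling `select_cohort` with different subsets of the criteria and comparing cohort size and HR.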

Effects of the eligibility criteria

The effects of the eligibility criteria were assessed by first estimating the HR of OS across all patients who received the trial drugs, using propensity scores to control for differences between groups. This analysis corresponds to the hypothetical setting in which the eligibility criteria were fully relaxed. Next, researchers emulated each aNSCLC trial using all of the original protocol criteria that could be encoded in the Flatiron database. Interestingly, only 30% of the patients in the Flatiron database who were treated with drugs tested in the trial actually met the trial eligibility criteria. Nevertheless, across the trials, the HR of the full patient population was comparable to, and sometimes even smaller than, the HR of the subset of patients who did meet the eligibility criteria. These results suggest that many patients who were excluded by the restrictive eligibility criteria of the trial could potentially benefit from treatment. Further analysis demonstrated that several commonly used inclusion/exclusion criteria do not substantially affect the HR of OS of a trial, or even reduce the observed efficacy. These criteria include clinical and laboratory measurements (blood pressure, albumin levels, lymphocyte or neutrophil counts, or alanine aminotransferase (ALT), alkaline phosphatase (ALP) and aspartate aminotransferase (AST) levels) as well as previous treatments (ALK, PD-L1, EGFR and CYP3A4 therapies, systemic or antineoplastic therapies).6
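
One common way to use propensity scores for this kind of adjustment is inverse probability of treatment weighting; whether this matches the exact procedure of the study is not stated here, so the following Python sketch is an assumption. The single `stratum` covariate, the toy records and the crude rate ratio (used in place of a fitted HR of OS) are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Toy records: (treated 1/0, stratum, months of follow-up, died 1/0).
# Fit patients are over-represented in the treated group on purpose.
Record = Tuple[int, str, float, int]
records: List[Record] = [
    (1, "fit", 20.0, 0), (1, "fit", 16.0, 1), (1, "fit", 24.0, 0), (1, "frail", 6.0, 1),
    (0, "fit", 12.0, 1), (0, "frail", 4.0, 1), (0, "frail", 5.0, 1), (0, "frail", 3.0, 1),
]


def propensity_by_stratum(data: List[Record]) -> Dict[str, float]:
    """Propensity score = observed fraction of treated patients per stratum."""
    treated: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for is_treated, stratum, _, _ in data:
        total[stratum] += 1
        treated[stratum] += is_treated
    return {s: treated[s] / total[s] for s in total}


def iptw_weights(data: List[Record], ps: Dict[str, float]) -> List[float]:
    """Weight 1/ps for treated and 1/(1 - ps) for untreated patients.

    Assumes every stratum contains both treated and untreated patients,
    so that no propensity score equals 0 or 1.
    """
    return [1 / ps[s] if t else 1 / (1 - ps[s]) for t, s, _, _ in data]


def weighted_rate_ratio(data: List[Record], weights: List[float]) -> float:
    """Weighted deaths per weighted person-month, treated over untreated."""
    deaths = {0: 0.0, 1: 0.0}
    months = {0: 0.0, 1: 0.0}
    for (t, _, m, d), w in zip(data, weights):
        deaths[t] += w * d
        months[t] += w * m
    return (deaths[1] / months[1]) / (deaths[0] / months[0])


ps = propensity_by_stratum(records)
unadjusted = weighted_rate_ratio(records, [1.0] * len(records))
adjusted = weighted_rate_ratio(records, iptw_weights(records, ps))
print(f"unadjusted ratio={unadjusted:.2f}, weighted ratio={adjusted:.2f}")
```

In this toy example the weighted ratio lies closer to 1 than the unadjusted one, because part of the apparent benefit came from treated patients being fitter at baseline.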

When only the subset of criteria that Trial Pathfinder identified as decreasing the hazard ratio of the trial was retained and the remaining restrictions were relaxed (so-called ‘data-driven criteria’), on average nine inclusion/exclusion criteria could be removed. Remarkably, the HR of OS decreased by 0.05 on average compared with using the full eligibility criteria, and the number of eligible patients more than doubled. These results indicate that clinical trials could be more inclusive of diverse populations by standardising and potentially broadening several eligibility criteria. In addition, trials with more relaxed laboratory thresholds for eligibility did not have more treatment withdrawals due to adverse events than trials with more stringent thresholds, and this held true across different types of cancer.6
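
The idea of keeping only the criteria that lower the HR can be sketched in Python as follows. How each criterion's contribution was attributed is not described here, so the sketch substitutes a plain leave-one-out comparison, which is a simplification and not necessarily the method used in the study. The records, the criterion names and the crude rate ratio standing in for the HR of OS are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Toy records: (arm, months of follow-up, died 1/0, ecog, albumin in g/dL).
Record = Tuple[str, float, int, int, float]
records: List[Record] = [
    ("treatment", 20.0, 0, 0, 4.0),
    ("treatment", 10.0, 1, 1, 3.8),
    ("treatment", 4.0, 1, 2, 3.9),
    ("treatment", 18.0, 0, 1, 3.0),
    ("control", 10.0, 1, 0, 4.1),
    ("control", 10.0, 1, 1, 3.9),
    ("control", 8.0, 1, 2, 3.7),
    ("control", 6.0, 1, 0, 3.1),
]

# Hypothetical eligibility rules; each returns True when a patient is eligible.
criteria: Dict[str, Callable[[Record], bool]] = {
    "ecog_0_or_1": lambda r: r[3] <= 1,
    "albumin_at_least_3_5": lambda r: r[4] >= 3.5,
}


def eligible(names: List[str]) -> List[Record]:
    """Patients who satisfy all of the named criteria."""
    return [r for r in records if all(criteria[n](r) for n in names)]


def crude_hr(cohort: List[Record]) -> float:
    """Deaths per person-month, treatment over control (constant-hazard assumption)."""
    def rate(arm: str) -> float:
        deaths = sum(r[2] for r in cohort if r[0] == arm)
        months = sum(r[1] for r in cohort if r[0] == arm)
        return deaths / months

    return rate("treatment") / rate("control")


all_names = list(criteria)
hr_full = crude_hr(eligible(all_names))

# Leave-one-out: keep a criterion only if dropping it would raise the HR.
kept: List[str] = []
for name in all_names:
    others = [n for n in all_names if n != name]
    if hr_full < crude_hr(eligible(others)):
        kept.append(name)

hr_data_driven = crude_hr(eligible(kept))
print(f"full criteria: n={len(eligible(all_names))}, crude HR={hr_full:.2f}")
print(f"kept {kept}: n={len(eligible(kept))}, crude HR={hr_data_driven:.2f}")
```

With these toy numbers the albumin threshold is relaxed, the eligible cohort grows and the crude HR does not get worse, which mirrors the direction of the reported findings without reproducing them.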


  1. Food and Drug Administration. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/enhancing-diversity-clinical-trial-populations-eligibility-criteria-enrollment-practices-and-trial (2020).
  2. Van Spall HG, et al. J Am Med Assoc. 2007;297:1233-40.
  3. Labrecque JA, et al. Eur J Epidemiol. 2017;32:473-5.
  4. Danaei G, et al. J Clin Epidemiol. 2018;96:12-22.
  5. Woo M. Nature. 2019;573:S100-2.
  6. Liu R, et al. Nature. 2021