Deep Learning-based Prediction of Clinical Trial Enrollment with Uncertainty Estimates

Clinical trial enrollment is a critical bottleneck in drug development: nearly 80% of trials fail to meet their enrollment targets, and each day of delay can cost up to $8 million. In this work, we introduce a multimodal deep-learning framework that predicts the total participant count and also quantifies the uncertainty around that prediction.

Challenges in Enrollment Forecasting

Traditional approaches fall into two camps:

- Deterministic models (e.g. tabular ML such as XGBoost or LightGBM) output a point estimate but ignore variability in recruitment rates.
- Stochastic models (e.g. Poisson or Poisson–Gamma processes) simulate recruitment and give confidence intervals, but often struggle with high-dimensional, heterogeneous data.

Model Architecture

Inputs
- Key: structured features (phase, country, therapeutic area, sponsor, planned sites, target enrollment)
- Context: free-text (title, objectives, inclusion/exclusion criteria)
Text Embedding
We concatenate all text fields and encode the result with a pre-trained Clinical Longformer (inputs of up to 4096 tokens).
Multimodal Fusion
- Categorical and numerical structured features pass through separate fully connected layers, yielding $z_{cat}$ and $z_{num}$ respectively.
- The text embedding $z_{emb}$ serves as the query in a multi-head attention layer, with $z_{cat}$ and $z_{num}$ as keys and values.
Output (Deterministic)
A final dense layer produces a single point estimate $\hat{N}$ for total enrollment, achieving $R^2 \approx 0.76$ and MAE $\approx 52$ on held-out trials.
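The fusion step can be illustrated with a minimal sketch. It assumes a single attention head and no learned projection matrices, whereas the model described above uses multi-head attention whose exact configuration is not given here. The function name `fuse` and the toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fuse(z_emb, z_cat, z_num):
    """Scaled dot-product attention with z_emb as the query and
    [z_cat, z_num] as both keys and values (simplification: one head,
    no learned query/key/value projections)."""
    keys = [z_cat, z_num]
    scale = math.sqrt(len(z_emb))
    weights = softmax([dot(z_emb, k) / scale for k in keys])
    # Weighted sum of the value vectors, one output entry per dimension.
    fused = [sum(w * k[i] for w, k in zip(weights, keys))
             for i in range(len(z_emb))]
    return fused, weights

# Toy 4-dimensional vectors; real embeddings would be much larger.
fused, weights = fuse([1.0, 0.0, 0.0, 0.0],
                      [2.0, 0.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0, 0.0])
```

In this toy case the query aligns with `z_cat`, so the attention weight on the categorical branch is the larger of the two.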

Modeling Uncertainty

To capture recruitment variability, we predict Gamma distribution parameters $(\alpha, \lambda)$ per site. Concretely:

$$ \mu \sim \mathrm{Gamma}(\alpha, \lambda), $$

and, given $\mu$, enrollments arrive as a Poisson process with rate $\mu$, so the number of enrollments over a period of length $t$ is Poisson-distributed with mean $\mu t$. This yields 90% confidence intervals that cover 78.7% of actual enrollments (median width ≈ 99 patients).
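As a rough illustration of how such intervals and their coverage can be computed, the sketch below draws per-site rates from a Gamma distribution, counts Poisson arrivals, and takes empirical quantiles of the total. It assumes that $\lambda$ is a rate parameter (so a site's expected rate is $\alpha/\lambda$) and that total enrollment is the sum over sites. The function names and the parameter values are hypothetical, not taken from the paper.

```python
import random

def poisson_count(rate, t, rng):
    """Number of arrivals in [0, t] for a Poisson process with the given rate."""
    n = 0
    clock = rng.expovariate(rate)  # exponential inter-arrival times
    while clock <= t:
        n += 1
        clock += rng.expovariate(rate)
    return n

def enrollment_interval(site_params, t, level=0.90, n_draws=1000, seed=0):
    """Approximate central interval for total enrollment over t months.

    site_params: list of (alpha, lam) per site; lam is ASSUMED to be a rate.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        total = 0
        for alpha, lam in site_params:
            # random.gammavariate takes shape and SCALE, hence 1 / lam.
            mu = rng.gammavariate(alpha, 1.0 / lam)
            total += poisson_count(mu, t, rng)
        totals.append(total)
    totals.sort()
    tail = (1.0 - level) / 2.0
    lo = totals[int(tail * n_draws)]
    hi = totals[int((1.0 - tail) * n_draws) - 1]
    return lo, hi

def coverage(intervals, actuals):
    """Fraction of actual values that fall inside their predicted interval."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, actuals))
    return hits / len(actuals)

# Ten hypothetical sites, each with alpha = 2 and lam = 1, over 12 months.
lo, hi = enrollment_interval([(2.0, 1.0)] * 10, t=12.0)
```

Applying `coverage` to the intervals and observed enrollments of a held-out set gives the kind of coverage figure quoted above.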

Predicting Recruitment Duration

Assuming patients arrive as a Poisson process with rate $\mu$ (so inter-arrival times are exponentially distributed), where $\mu\sim\mathrm{Gamma}(\alpha,\lambda)$, we:

- Infer $(\alpha,\lambda)$ and site startup delays.
- Simulate 1024 recruitment trajectories.
- Aggregate the trajectories to forecast total duration.
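The steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes $\lambda$ is a rate parameter, treats each site's startup delay as a fixed number of months, and defines duration as the time at which the target-th patient is enrolled across all sites. The function names and site parameters are hypothetical.

```python
import random
import statistics

def simulate_duration(site_params, target, rng):
    """One trajectory: months until `target` patients are enrolled."""
    arrivals = []
    for alpha, lam, delay in site_params:
        # Site-specific rate in patients per month; scale = 1 / lam.
        rate = rng.gammavariate(alpha, 1.0 / lam)
        t = delay  # the site starts recruiting after its startup delay
        # No site can contribute more than `target` of the first `target`
        # arrivals, so `target` arrivals per site are enough.
        for _ in range(target):
            t += rng.expovariate(rate)
            arrivals.append(t)
    arrivals.sort()
    return arrivals[target - 1]

def forecast_duration(site_params, target, n_traj=1024, seed=0):
    """Aggregate simulated trajectories into a median and a 5%-95% band."""
    rng = random.Random(seed)
    durations = [simulate_duration(site_params, target, rng)
                 for _ in range(n_traj)]
    cuts = statistics.quantiles(durations, n=20)  # cut points at 5%, ..., 95%
    return {"median": statistics.median(durations),
            "p05": cuts[0],
            "p95": cuts[-1]}

# Three hypothetical sites: (alpha, lam, startup delay in months).
sites = [(2.0, 1.0, 0.0), (2.0, 1.0, 1.0), (3.0, 1.0, 2.0)]
forecast = forecast_duration(sites, target=50)
```

Merging the per-site arrival times and reading off the target-th one avoids stepping through time on a grid, which keeps each trajectory exact under the stated assumptions.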

Results compared with the classic “fit & filter” approach:

- MAE: 7.52 vs 10.55 months
- 6-month CI coverage: 32.2% vs 14.9%

Data and Baselines

Dataset: 11 400+ completed trials from IQVIA DQS & Citeline, split 9410/1000/1000 (train/dev/test).
Baselines: XGBoost, LightGBM, BioBERT, ClinicalBERT, Llama 2 (7B, LoRA).
Performance: The deterministic model improves MAE by about 9% over LightGBM; the stochastic model attains the highest $R^2$ (0.77).
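For reference, the two metrics used in these comparisons can be computed as in the short sketch below. The enrollment numbers are made up for illustration and are not from the dataset.

```python
def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted enrollment."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual vs predicted enrollment for three trials.
actual = [100, 200, 300]
predicted = [110, 190, 330]
error = mae(actual, predicted)
fit = r2(actual, predicted)
```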

Implications and Future Work

Risk management: more reliable budgeting and site activation planning.
Efficiency: inference in ~0.07 s vs 8.74 s for classical methods.
Extensions: end-to-end LLM pipelines, real-time updating during ongoing trials, adaptation to regional recruitment dynamics.
