Clinical trial enrollment is a critical bottleneck in drug development: nearly 80% of trials fail to meet their target enrollment, and each day of delay can cost up to $8 million. In this work, we introduce a multimodal deep-learning framework that not only predicts the total participant count but also quantifies the uncertainty around that prediction.
Challenges in Enrollment Forecasting
Traditional approaches fall into two camps:
- Deterministic models – e.g. tabular ML like XGBoost or LightGBM – which output a point estimate but ignore variability in recruitment rates.
- Stochastic models – e.g. Poisson or Poisson–Gamma processes – which simulate recruitment and give confidence intervals, but often struggle with high-dimensional, heterogeneous data.
Model Architecture
Inputs
- Key: structured features (phase, country, therapeutic area, sponsor, planned sites, target enrollment)
- Context: free-text (title, objectives, inclusion/exclusion criteria)
Text Embedding
We concatenate all text fields and encode with a pre-trained Clinical Longformer (max 4096 tokens).
Multimodal Fusion
- Structured features pass through separate fully connected layers, yielding $z_{cat}$ and $z_{num}$.
- The text embedding $z_{emb}$ serves as the query in a multi-head attention layer, with $z_{cat}$ and $z_{num}$ as keys/values.
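The fusion step can be illustrated with a minimal sketch of scaled dot-product attention, where the text embedding queries the two structured-feature vectors. This is a simplification under stated assumptions, not the paper's implementation: it uses a single head, omits the learned query/key/value projections, and the 4-dimensional toy vectors are made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    d = len(query)
    # One similarity score per key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The fused vector is the attention-weighted average of the values.
    fused = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return fused, weights

# Toy vectors standing in for the real embeddings (hypothetical values).
z_emb = [0.2, -0.1, 0.4, 0.3]   # text embedding (query)
z_cat = [0.5, 0.1, -0.2, 0.0]   # categorical-feature embedding
z_num = [-0.3, 0.8, 0.1, 0.6]   # numerical-feature embedding

fused, weights = cross_attention(z_emb, [z_cat, z_num], [z_cat, z_num])
```

Here `weights` indicates how strongly the text representation attends to each structured-feature vector, and `fused` is what a downstream dense layer would consume.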
Output (Deterministic)
A final dense layer produces a single point estimate $\hat{N}$ for total enrollment, achieving $R^2 \approx 0.76$ and MAE $\approx 52$ on held-out trials.
Modeling Uncertainty
To capture recruitment variability, we predict Gamma distribution parameters $(\alpha, \lambda)$ per site. Concretely:
$$ \mu \sim \mathrm{Gamma}(\alpha, \lambda), $$
and, given $\mu$, enrollments arrive as a Poisson process with rate $\mu$, so the number of enrollments by time $t$ is Poisson with mean $\mu t$. This yields 90 % confidence intervals covering 78.7 % of actual enrollments (median width ≈ 99 patients).
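To make the Poisson–Gamma construction concrete, here is a small Monte Carlo sketch for a single site. It assumes $\lambda$ is a rate parameter (the text does not say whether it is a rate or a scale), and the values of $\alpha$, $\lambda$ and $t$ are hypothetical. Under that assumption the marginal count is negative binomial with mean $\alpha t / \lambda$, which the simulation should roughly reproduce.

```python
import random

def sample_site_count(alpha, lam, t, rng):
    """Draw one enrollment count for a site over a horizon of length t."""
    # Assumption: lam is a rate parameter, so the Gamma scale is 1 / lam.
    mu = rng.gammavariate(alpha, 1.0 / lam)
    # Count Poisson-process arrivals in [0, t] via exponential inter-arrival times.
    count, clock = 0, rng.expovariate(mu)
    while clock <= t:
        count += 1
        clock += rng.expovariate(mu)
    return count

def prediction_interval(samples, level=0.90):
    """Empirical central interval from Monte Carlo samples."""
    ordered = sorted(samples)
    lo_idx = int((1 - level) / 2 * (len(ordered) - 1))
    hi_idx = int((1 + level) / 2 * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

rng = random.Random(0)
alpha, lam, t = 2.0, 4.0, 12.0  # hypothetical: mean rate 0.5 patients/month over 12 months
counts = [sample_site_count(alpha, lam, t, rng) for _ in range(5000)]
low, high = prediction_interval(counts)
mean_count = sum(counts) / len(counts)
```

The spread between `low` and `high` is wider than a plain Poisson count with the same mean would give, because the Gamma prior adds site-to-site variability in the rate.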
Predicting Recruitment Duration
Assuming enrollments at each site arrive as a Poisson process with rate $\mu$ (so inter-arrival times are exponential), with $\mu\sim\mathrm{Gamma}(\alpha,\lambda)$, we:
- Infer $(\alpha,\lambda)$ and site startup delays.
- Simulate 1024 recruitment trajectories.
- Aggregate to forecast total duration.
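The three steps above can be sketched as a small simulation. This is an illustrative sketch rather than the authors' code: the per-site parameters and startup delays are hypothetical stand-ins for model outputs, $\lambda$ is assumed to be a rate, and the aggregation (median plus an empirical 90 % interval) is one plausible choice among several.

```python
import random

def simulate_duration(sites, target, rng):
    """Time until `target` patients are enrolled across all sites (one trajectory)."""
    arrivals = []
    for alpha, lam, startup_delay in sites:
        mu = rng.gammavariate(alpha, 1.0 / lam)  # assumption: lam is a rate
        clock = startup_delay                    # site enrolls nobody before it opens
        # No single site ever needs to contribute more than `target` patients.
        for _ in range(target):
            clock += rng.expovariate(mu)         # exponential inter-arrival time
            arrivals.append(clock)
    arrivals.sort()
    return arrivals[target - 1]                  # time of the target-th enrollment

def forecast_duration(sites, target, n_trajectories=1024, seed=0):
    """Aggregate simulated trajectories into a median and an empirical 90% interval."""
    rng = random.Random(seed)
    durations = sorted(simulate_duration(sites, target, rng) for _ in range(n_trajectories))
    median = durations[len(durations) // 2]
    low = durations[int(0.05 * (len(durations) - 1))]
    high = durations[int(0.95 * (len(durations) - 1))]
    return median, (low, high)

# Hypothetical sites: (alpha, lambda, startup delay in months).
sites = [(2.0, 4.0, 1.0), (3.0, 3.0, 2.5), (1.5, 2.0, 0.5)]
median, (low, high) = forecast_duration(sites, target=60)
```

Merging the per-site arrival times and reading off the target-th one avoids stepping through time on a fixed grid, so the simulated duration is exact for each sampled trajectory.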
Results vs. classic “fit & filter”:
- MAE: 7.52 vs 10.55 months
- 6-month CI coverage: 32.2 % vs 14.9 %
Data and Baselines
- Dataset: 11 400+ completed trials from IQVIA DQS & Citeline, split 9410/1000/1000 (train/dev/test).
- Baselines: XGBoost, LightGBM, BioBERT, ClinicalBERT, Llama 2 (7B, LoRA).
- Performance: Deterministic model improves MAE by ~9 % over LightGBM; stochastic model attains highest $R^2=0.77$.
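For readers who want to reproduce this kind of comparison on their own data, the metrics quoted here (MAE, $R^2$, and interval coverage) can be computed as in the sketch below. The numbers in the example are made up for illustration and are not taken from the paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted enrollments."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def interval_coverage(y_true, intervals):
    """Fraction of actual values that fall inside their predicted interval."""
    hits = sum(1 for t, (lo, hi) in zip(y_true, intervals) if lo <= t <= hi)
    return hits / len(y_true)

# Made-up enrollments, point predictions, and intervals (illustration only).
y_true = [100, 250, 40, 500]
y_pred = [120, 230, 60, 470]
intervals = [(80, 160), (200, 240), (20, 90), (400, 560)]

example_mae = mae(y_true, y_pred)
example_r2 = r2(y_true, y_pred)
example_coverage = interval_coverage(y_true, intervals)
```

Comparing empirical coverage against the nominal level (for example 90 %) is a quick check of whether the predicted intervals are too narrow or too wide.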
Implications and Future Work
- Risk management: more reliable budgeting and site activation planning.
- Efficiency: inference in ~0.07 s vs 8.74 s for classical methods.
- Extensions: end-to-end LLM pipelines, real-time updating during ongoing trials, adaptation to regional recruitment dynamics.
Links
- Based on the publication arXiv:2507.23607