Clinical trial enrollment is a critical bottleneck in drug development: nearly 80% of trials fail to meet their enrollment targets, and delays can cost up to $8 million per day. In this work, we introduce a multimodal deep‐learning framework that not only predicts total participant count but also quantifies uncertainty around those predictions.

Challenges in Enrollment Forecasting

Traditional approaches fall into two camps:

  • Deterministic models – e.g. tabular ML like XGBoost or LightGBM – which output a point estimate but ignore variability in recruitment rates.
  • Stochastic models – e.g. Poisson or Poisson–Gamma processes – which simulate recruitment and give confidence intervals, but often struggle with high-dimensional, heterogeneous data.

Model Architecture

  1. Inputs

    • Key: structured features (phase, country, therapeutic area, sponsor, planned sites, target enrollment)
    • Context: free-text fields (title, objectives, inclusion/exclusion criteria)

  2. Text Embedding

    We concatenate all text fields and encode them with a pre-trained Clinical Longformer (max 4096 tokens), producing the text embedding $z_{emb}$.

  3. Multimodal Fusion

    • Categorical and numerical structured features pass through separate fully connected layers, yielding $z_{cat}$ and $z_{num}$.
    • The text embedding $z_{emb}$ serves as the query in a multi‐head attention layer, with $z_{cat}$ and $z_{num}$ as keys/values.

  4. Output (Deterministic)

    A final dense layer produces a single point estimate $\hat{N}$ for total enrollment, achieving $R^2 \approx 0.76$ and MAE $\approx 52$ on held-out trials.
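
The fusion step can be illustrated with a small NumPy sketch. It is not the authors' implementation: the embedding width (`d_model`), the number of heads, the projection matrices (`W_q`, `W_k`, `W_v`, `W_o`) and the output weights (`w_out`) are all hypothetical and randomly initialized here, and the inputs stand in for the already-projected $z_{emb}$, $z_{cat}$ and $z_{num}$. It only shows the shape of the computation: one text query attending over two structured-feature tokens, followed by a dense output.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fusion(z_emb, z_cat, z_num, params, num_heads=4):
    """Fuse a text embedding (query) with structured embeddings (keys/values).

    z_emb, z_cat, z_num: arrays of shape (batch, d_model).
    Returns the fused vector (batch, d_model) and the attention
    weights (batch, num_heads, 2) over [z_cat, z_num].
    """
    batch, d_model = z_emb.shape
    d_head = d_model // num_heads
    kv = np.stack([z_cat, z_num], axis=1)            # (batch, 2, d_model)

    q = (z_emb @ params["W_q"]).reshape(batch, num_heads, d_head)
    k = (kv @ params["W_k"]).reshape(batch, 2, num_heads, d_head)
    v = (kv @ params["W_v"]).reshape(batch, 2, num_heads, d_head)
    k = k.transpose(0, 2, 1, 3)                      # (batch, heads, 2, d_head)
    v = v.transpose(0, 2, 1, 3)

    # Scaled dot-product attention, one query per head.
    scores = np.einsum("bhd,bhkd->bhk", q, k) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)               # (batch, heads, 2)
    heads = np.einsum("bhk,bhkd->bhd", weights, v)   # (batch, heads, d_head)

    fused = heads.reshape(batch, d_model) @ params["W_o"]
    return fused, weights

# Toy usage with random (untrained) weights; all sizes are assumptions.
rng = np.random.default_rng(0)
d_model = 16
params = {
    name: rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
    for name in ("W_q", "W_k", "W_v", "W_o")
}
z_emb = rng.normal(size=(3, d_model))   # stand-in for the text embedding
z_cat = rng.normal(size=(3, d_model))   # stand-in for categorical features
z_num = rng.normal(size=(3, d_model))   # stand-in for numerical features

fused, weights = cross_attention_fusion(z_emb, z_cat, z_num, params)
w_out = rng.normal(size=d_model)        # hypothetical final dense layer
n_hat = fused @ w_out                   # one point estimate per trial
```

Because the weights are untrained, `n_hat` is meaningless as a forecast; the sketch is only meant to make the query/key/value roles concrete.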

Modeling Uncertainty

To capture recruitment variability, we predict Gamma distribution parameters $(\alpha, \lambda)$ per site. Concretely:

$$ \mu \sim \mathrm{Gamma}(\alpha, \lambda), $$

and the number of enrollments at a site by time $t$ follows a Poisson process with rate $\mu$. The resulting 90% confidence intervals cover 78.7% of actual enrollments (median width ≈ 99 patients).
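
As a rough illustration of how such intervals can be obtained, the sketch below draws site-level rates from a Gamma distribution, draws Poisson counts for a fixed recruitment window, and reads a 90% interval off the simulated totals. It assumes $\lambda$ is a rate parameter (so $\mathbb{E}[\mu] = \alpha/\lambda$), which the text does not state; the per-site values, the 12-month window and the function name `simulate_enrollment` are made up for the example.

```python
import numpy as np

def simulate_enrollment(alpha, lam, t, n_sims=1024, seed=0):
    """Simulate total enrollment over a window of length t.

    alpha, lam: per-site Gamma shape and (assumed) rate parameters.
    Returns an array of n_sims simulated total enrollments.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    lam = np.asarray(lam, dtype=float)
    # NumPy parameterizes the Gamma by shape and scale, so scale = 1 / rate.
    mu = rng.gamma(shape=alpha, scale=1.0 / lam, size=(n_sims, alpha.size))
    # Given mu, the count in [0, t] is Poisson with mean mu * t.
    counts = rng.poisson(mu * t)
    return counts.sum(axis=1)

# Hypothetical parameters for three sites (rates in patients per month).
alpha = np.array([2.0, 3.0, 1.5])
lam = np.array([1.0, 2.0, 0.5])
t_months = 12.0

totals = simulate_enrollment(alpha, lam, t_months, n_sims=20000)
lower, upper = np.quantile(totals, [0.05, 0.95])   # central 90% interval
expected_mean = t_months * np.sum(alpha / lam)     # analytic mean of the total
```

Under this parameterization the simulated mean should sit close to $t \sum_i \alpha_i/\lambda_i$, and each site's variance is $\frac{\alpha t}{\lambda}\left(1 + \frac{t}{\lambda}\right)$, which is larger than the Poisson variance alone; that extra spread is what the intervals capture.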

Predicting Recruitment Duration

Assuming enrollments at each site arrive as a Poisson process with rate $\mu$ (so inter-arrival times are exponentially distributed), where $\mu\sim\mathrm{Gamma}(\alpha,\lambda)$, we:

  1. Infer $(\alpha,\lambda)$ and site startup delays.
  2. Simulate 1024 recruitment trajectories.
  3. Aggregate to forecast total duration.
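
The three steps above can be sketched as a Monte Carlo simulation. This is a minimal sketch under stated assumptions, not the paper's code: $(\alpha, \lambda)$ and the startup delays are passed in as if already inferred, $\lambda$ is treated as a rate parameter, each site recruits as a Poisson process (exponential gaps between patients) once its delay has elapsed, and the duration of a trajectory is the time at which the target-th patient is enrolled across all sites. The function name and the example values are hypothetical.

```python
import numpy as np

def simulate_durations(alpha, lam, startup_delay, target, n_sims=1024, seed=0):
    """Simulate the time needed to enroll `target` patients.

    alpha, lam: per-site Gamma shape and (assumed) rate parameters.
    startup_delay: per-site delay before recruitment starts.
    Returns an array of n_sims simulated durations.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    lam = np.asarray(lam, dtype=float)
    delay = np.asarray(startup_delay, dtype=float)
    n_sites = alpha.size

    # One recruitment rate per site and per trajectory.
    mu = rng.gamma(shape=alpha, scale=1.0 / lam, size=(n_sims, n_sites))
    # Exponential gaps with rate mu; no site can contribute more than
    # `target` of the first `target` patients, so `target` gaps suffice.
    gaps = rng.exponential(scale=1.0 / mu[..., None],
                           size=(n_sims, n_sites, target))
    arrivals = delay[None, :, None] + np.cumsum(gaps, axis=2)

    # Pool arrivals across sites and take the target-th earliest one.
    pooled = arrivals.reshape(n_sims, -1)
    return np.partition(pooled, target - 1, axis=1)[:, target - 1]

# Hypothetical three-site trial, time measured in months.
durations = simulate_durations(
    alpha=[2.0, 3.0, 1.5],
    lam=[1.0, 2.0, 0.5],
    startup_delay=[0.0, 1.0, 2.0],
    target=60,
)
point_forecast = float(np.median(durations))
lower, upper = np.quantile(durations, [0.05, 0.95])
```

Aggregating with a median and quantiles is one possible choice; the source only says the trajectories are aggregated, so the summary statistic here is an assumption.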

Results compared with the classic “fit & filter” approach:

  • MAE: 7.52 vs 10.55 months
  • 6-month CI coverage: 32.2 % vs 14.9 %

Data and Baselines

  • Dataset: 11 400+ completed trials from IQVIA DQS & Citeline, split 9410/1000/1000 (train/dev/test).
  • Baselines: XGBoost, LightGBM, BioBERT, ClinicalBERT, Llama 2 (7B, LoRA).
  • Performance: the deterministic model improves MAE by ~9 % over LightGBM; the stochastic model attains the highest $R^2$ (0.77).
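
For reference, the metrics quoted in this summary (MAE, $R^2$, interval coverage, median interval width, and relative MAE improvement) can be computed as in the sketch below. These are the standard definitions, assumed rather than taken from the paper's evaluation code, and the toy numbers are invented purely to exercise the functions.

```python
import numpy as np

def mae(y_true, y_pred):
    # Mean absolute error.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def interval_coverage(y_true, lower, upper):
    # Fraction of actual values that fall inside [lower, upper].
    y_true = np.asarray(y_true, float)
    inside = (y_true >= np.asarray(lower)) & (y_true <= np.asarray(upper))
    return float(np.mean(inside))

def median_interval_width(lower, upper):
    return float(np.median(np.asarray(upper, float) - np.asarray(lower, float)))

def relative_improvement(mae_baseline, mae_model):
    # Positive when the model has lower error than the baseline.
    return (mae_baseline - mae_model) / mae_baseline

# Invented toy data: four trials with actual and predicted enrollment.
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 320.0, 380.0])
lower, upper = y_pred - 15.0, y_pred + 15.0

toy_mae = mae(y_true, y_pred)                        # 15.0
toy_r2 = r2(y_true, y_pred)                          # 0.98
toy_coverage = interval_coverage(y_true, lower, upper)  # 0.5
toy_width = median_interval_width(lower, upper)      # 30.0
```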

Implications and Future Work

  • Risk management: more reliable budgeting and site activation planning.
  • Efficiency: inference in ~0.07 s vs 8.74 s for classical methods.
  • Extensions: end-to-end LLM pipelines, real-time updating during ongoing trials, adaptation to regional recruitment dynamics.