
The AI Bottleneck in Clinical Development Is Context, Not Reasoning

May 31, 2026 · 7 min read

The most useful AI agent conversation is not about the agent.

It is about the context the agent needs in order to do useful work.

That point showed up clearly in a recent post from Mike Fishbein, who argued that, after dozens of forward-deployed engineering engagements, the hardest part of building AI agents and tools was context extraction. The phrase comes from a different market, but the lesson applies cleanly to clinical development.

In regulated R&D, the bottleneck is rarely whether a model can summarize, classify, draft, retrieve, or reason over a narrow prompt. Models can do enough of that to make technical feasibility a weak filter by itself.

The harder question is whether the organization can make the right clinical context available in a form the workflow can trust.

Context is not just data

Clinical teams often talk about data readiness as if it were enough. It is necessary. It is not sufficient.

An AI-enabled clinical workflow needs data, but it also needs the surrounding operating context:

  • Which source system is authoritative for this fact?
  • Which document is controlled, draft, superseded, or local?
  • Which role is allowed to take which action?
  • Which decision requires medical, clinical, regulatory, quality, or vendor input?
  • Which evidence must be retained for audit or inspection?
  • Which rule is procedural, which is regulatory, and which is local habit?
  • Which exception should be escalated instead of automated?

That context is usually scattered across CTMS, EDC, eTMF, safety systems, RTSM, analytics environments, SOPs, trackers, vendor portals, email, meeting notes, and the heads of experienced people.

An AI tool that cannot see that context either stays superficial or becomes risky.

The pilot hides the context problem

Pilots often work because the context is manually supplied around them.

Someone picks the right documents. Someone explains the workflow. Someone knows which outputs are plausible. Someone catches the exception. Someone translates the result into the language the business can use.

That is fine for learning. It is not a production operating model.

When the pilot moves toward a live clinical workflow, the context problem becomes visible. The team has to answer questions the demo could avoid:

  • Where does the agent get the controlled input?
  • How does it know which version is current?
  • What permissions does it inherit?
  • What happens when source systems disagree?
  • Who reviews the output?
  • What evidence trail is preserved?
  • How is performance monitored after deployment?

If those answers are reconstructed from scratch for every use case, the organization is not building AI capability. It is building custom demos.

Context extraction should become an asset

Every serious AI initiative should leave something reusable behind.

Not only a model prompt. Not only a workflow automation. Not only a vendor demo.

It should leave behind context assets the next initiative can use:

  • A workflow map showing the real process, decision points, handoffs, and exceptions
  • A source-system map showing where governed data and content live
  • A decision-rights map showing where human accountability belongs
  • An evidence map showing what must be retained for audit, inspection, or quality review
  • Evaluation cases that represent real clinical scenarios, edge cases, and failure modes
  • Validation assumptions that clarify what the tool is allowed to do and what it is not allowed to do

Those assets compound.

The first use case may be slow because the organization has to extract the context manually. The second should be faster. The third should reuse more of the same operating model. If every use case starts from zero, the AI program is learning less than it thinks.
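One way to make that compounding visible is to track which context assets each use case needs and how many already exist when it starts. The sketch below is illustrative only: `ContextRegistry`, `UseCase`, and the asset identifiers are hypothetical names, not part of any real system, and the example use cases are invented to show the arithmetic.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    name: str
    # Hypothetical identifiers of the form "<asset kind>:<scope>".
    required_assets: set[str]


@dataclass
class ContextRegistry:
    """Tracks context assets left behind by earlier initiatives."""
    assets: set[str] = field(default_factory=set)

    def reuse_ratio(self, use_case: UseCase) -> float:
        """Share of the use case's required assets that already exist."""
        if not use_case.required_assets:
            return 1.0
        reused = use_case.required_assets & self.assets
        return len(reused) / len(use_case.required_assets)

    def deliver(self, use_case: UseCase) -> float:
        """Record the reuse ratio at start, then keep the new assets."""
        ratio = self.reuse_ratio(use_case)
        self.assets |= use_case.required_assets
        return ratio


# Invented example portfolio (assumption, for illustration only).
portfolio = [
    UseCase("site monitoring summary", {
        "workflow_map:site_monitoring", "source_system_map:CTMS",
        "source_system_map:EDC", "evidence_map:monitoring_visit",
    }),
    UseCase("data review triage", {
        "workflow_map:data_review", "source_system_map:CTMS",
        "source_system_map:EDC", "evidence_map:data_review",
    }),
    UseCase("query aging alerts", {
        "workflow_map:data_review", "source_system_map:EDC",
        "evidence_map:data_review", "evaluation_cases:query_aging",
    }),
]

registry = ContextRegistry()
ratios = [registry.deliver(uc) for uc in portfolio]
for uc, ratio in zip(portfolio, ratios):
    print(f"{uc.name}: {ratio:.0%} of required context already existed")
```

In this invented portfolio the reuse ratio rises from 0% to 50% to 75%. A ratio that stays near zero across a real portfolio would be one signal that every use case is starting from scratch.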

This changes prioritization

AI roadmaps usually rank use cases by business value and technical feasibility. Those still matter.

For regulated clinical development, I would add a third criterion: context extractability.

Before selecting a use case, ask:

  • Is the workflow important enough to change?
  • Is the required data, content, and process context findable?
  • Can the accountable decision points be made explicit?
  • Can the evidence trail be preserved?
  • Can the controls be explained to quality, clinical, regulatory, and technology leaders?
  • Will the context assets be reusable for adjacent workflows?

A use case with moderate value but strong context reuse may be a better early investment than a flashy use case that requires a bespoke context build and teaches the organization little.
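A minimal sketch of how the third criterion can change a ranking, assuming a simple 1 to 5 scale and a weighted sum. The `Candidate` class, the weights, and the scores are all hypothetical; a real roadmap would calibrate them with the functions involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    business_value: int          # 1 (low) to 5 (high), illustrative scale
    technical_feasibility: int   # 1 to 5
    context_extractability: int  # 1 to 5: is the context findable and reusable?


def priority_score(c: Candidate, weights=(1.0, 1.0, 1.0)) -> float:
    """Weighted sum over value, feasibility, and context extractability."""
    w_value, w_feasibility, w_context = weights
    return (
        w_value * c.business_value
        + w_feasibility * c.technical_feasibility
        + w_context * c.context_extractability
    )


# Invented scores (assumption, for illustration only).
candidates = [
    Candidate("flashy use case, bespoke context", 5, 4, 1),
    Candidate("moderate value, reusable context", 3, 4, 5),
]

# Ranking on value and feasibility alone (context weight set to zero).
two_criteria = sorted(
    candidates, key=lambda c: priority_score(c, (1.0, 1.0, 0.0)), reverse=True
)
# Ranking with context extractability included.
three_criteria = sorted(candidates, key=priority_score, reverse=True)

print("Two criteria:  ", [c.name for c in two_criteria])
print("Three criteria:", [c.name for c in three_criteria])
```

With these invented numbers the flashy use case leads on two criteria (9 versus 7), and the order flips once context extractability is counted (10 versus 12). The equal weights are a placeholder choice, not a recommendation.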

That is especially true in clinical development, where the same underlying context often matters across study startup, monitoring, data review, safety oversight, medical writing, vendor governance, and inspection readiness.

The work is cross-functional by nature

This is why AI delivery in life sciences cannot be treated as a pure engineering exercise.

The clinical function knows where the workflow breaks. Quality knows what has to be defensible. Regulatory knows what must survive external scrutiny. Data and technology teams know where the systems, integrations, permissions, and monitoring constraints sit. Vendors may know what is technically possible, but they cannot decide which organizational context matters most.

The useful work sits between those groups.

Someone has to extract the workflow. Someone has to name the decisions. Someone has to identify the source of truth. Someone has to separate rules from habits. Someone has to decide where automation is appropriate and where expert judgment remains accountable.

That is product work, operating model work, governance work, and implementation work at the same time.

The practical implication

If you are planning AI in clinical development, do not only ask, "What can the tool do?"

Ask, "What context would have to be made reliable for this tool to matter in production?"

That question will usually expose the real roadmap:

  • Which systems need to be connected
  • Which documents need better structure and lifecycle control
  • Which workflows need clearer decision rights
  • Which metrics prove the process changed
  • Which governance controls need to be designed before scale
  • Which context assets should be reused across the portfolio

AI agents may become more capable every quarter. That does not remove the enterprise work. It makes the enterprise work more valuable.

In clinical development, the strategic advantage will not come from having the most demos. It will come from making the organization's scientific, operational, and regulatory context usable enough that AI can safely change the work.

The model matters.

The context decides whether the model matters in production.

