AI ROI in Life Sciences Starts With the Process, Not the Pilot

A company full of AI pilots can still be standing still.

At a conference recently, a speaker described their company as an airport because it had so many pilots. The line was clever. It was also uncomfortably familiar. Life sciences companies have spent the last few years proving that AI can summarize a document, retrieve information, draft content, classify an item, or automate a follow-up. A lot of those pilots worked.

The question R&D and IT leaders have to answer now is harder: did the work get better, or did we just put a faster step inside the same process?

Pilots were a reasonable place to start. You cannot learn how your data behaves, where your governance gaps sit, or what your business teams will trust by keeping AI on a strategy slide. Learning matters. Return appears when a process that matters to drug development becomes faster, more capable, or better controlled in production.

That requires a different conversation than "where can we apply AI?"

It starts with the work we are willing to change.

We are teaching AI to carry waste faster

A senior leader told me recently that her team was sick of hearing her say they needed to reimagine their processes.

She should keep saying it.

Too many AI efforts begin by looking at an existing task and asking how a model can do that task faster. That approach feels practical. It is bounded. It is measurable. It is also how you automate around the very friction that should have been challenged in the first place.

Site training is a simple example.

You define the required curriculum. You assign it to site staff. You track completion. You send reminders to the people who have not completed it. If a person completed comparable training on a prior study, they submit evidence and someone reviews it to decide whether an exemption should be granted.

I have seen companies apply AI to automate the follow-up. The site gets a better reminder. The sponsor spends less time chasing completion.

That is useful. It also leaves the basic process alone.

What if the first question were not how to follow up faster? What if it were whether you needed to assign the training at all?

Prior training evidence already exists in controlled records for at least some site staff. AI could help retrieve that evidence before assignment, compare it to defined exemption criteria, surface candidates for review, and preserve the supporting trail. The accountable person still makes the decision where the decision requires judgment. The site simply avoids being asked to repeat training the sponsor already has evidence was completed.
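To make that flow concrete, here is a minimal sketch of the comparison step only, not a description of any real system. It assumes the prior training evidence has already been retrieved into simple records. Every name in it (TrainingRecord, ExemptionRule, screen_before_assignment), the course codes, the record pointers, and the two-year recency window are hypothetical and chosen for illustration. Nothing is granted automatically: candidates are surfaced with their supporting evidence for the accountable reviewer.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TrainingRecord:
    person_id: str
    course_code: str
    completed_on: date
    source_record: str  # pointer to the controlled record holding the evidence


@dataclass(frozen=True)
class ExemptionRule:
    course_code: str
    max_age_days: int  # hypothetical criterion: how recent prior training must be


def screen_before_assignment(staff_ids, required_courses, records, rules, today):
    """Split (person, course) pairs into exemption candidates and assignments.

    Candidates are only surfaced for review, each with the evidence pointer
    that triggered the match. Courses with no rule are always assigned.
    """
    rule_by_course = {rule.course_code: rule for rule in rules}
    candidates, assignments = [], []
    for person in staff_ids:
        for course in required_courses:
            rule = rule_by_course.get(course)
            evidence = [
                rec for rec in records
                if rule is not None
                and rec.person_id == person
                and rec.course_code == course
                and (today - rec.completed_on).days <= rule.max_age_days
            ]
            if evidence:
                latest = max(evidence, key=lambda rec: rec.completed_on)
                candidates.append((person, course, latest.source_record))
            else:
                assignments.append((person, course))
    return candidates, assignments


# Illustrative data only; none of these values come from a real study.
records = [
    TrainingRecord("staff-001", "COURSE-A", date(2025, 3, 1), "records/doc-123"),
    TrainingRecord("staff-002", "COURSE-A", date(2021, 1, 15), "records/doc-456"),
]
rules = [ExemptionRule("COURSE-A", max_age_days=730)]

candidates, assignments = screen_before_assignment(
    staff_ids=["staff-001", "staff-002"],
    required_courses=["COURSE-A", "COURSE-B"],
    records=records,
    rules=rules,
    today=date(2026, 1, 10),
)
print("For review:", candidates)
print("To assign:", assignments)
```

In this toy run, one person's recent COURSE-A record becomes a review candidate and everything else is assigned as usual. The design choice that matters is the order: evidence is checked before the assignment goes out, and the trail travels with the candidate.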

The reminder is not where the meaningful value sits. The unnecessary assignment is.

That distinction matters well beyond training. R&D processes are full of people retrieving information already held somewhere else, moving content between systems, reconstructing histories, and reviewing every item because we have not designed a reliable way to focus attention on the exceptions. If we use AI only to speed up those individual actions, the process inherits every limitation it had before the pilot began.

Regulated work is where the hesitation begins

There is a reason companies gravitate to safer administrative use cases. Nobody wants to be the leader explaining an AI-supported action that cannot be defended in an audit or a submission. Nobody knows exactly how every regulator will evaluate every future AI use. Data quality matters. Hallucinated output matters. Validation, accountability, traceability, and monitoring all matter.

That hesitation is reasonable. Treating it as a permanent stopping point is not.

There is a large, valuable space between an autonomous model making unchecked decisions and people continuing to perform every manual activity forever: AI retrieving controlled evidence, preparing a draft, identifying an exception, applying pre-defined logic, or assembling the information an accountable expert needs to make a better decision.

The FDA and EMA have already put shape around that direction in their principles for good AI practice in drug development. The language is familiar to anyone who has worked in regulated technology: a clear context of use, risk-based assessment, data governance and documentation, lifecycle management, multidisciplinary oversight, and information that can be understood by the people responsible for it.

There is no free pass for AI here. There is enough direction to begin using it responsibly.

Leaders who wait for the risk to disappear will wait forever. The work is to decide which risks can be controlled well enough to allow a valuable process to change.

Production does not require removing the expert

Merck gave us a useful production example in June 2025. The company reported that an internal generative AI platform was used to create and submit live clinical study reports. For fully human-reviewed first drafts, Merck reported that preparation time fell from an average of 180 hours to 80 hours. The first-draft timeline went from two to three weeks down to three to four days. Measured draft errors fell by 50 percent.
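For readers who want the preparation-time figures as a single number, the arithmetic is short. This sketch uses only the two averages Merck reported; the variable names are mine.

```python
# Reported averages for fully human-reviewed first drafts (hours per report).
baseline_hours = 180
reported_hours = 80

hours_saved = baseline_hours - reported_hours
reduction = hours_saved / baseline_hours

print(f"{hours_saved} hours saved per draft, {reduction:.0%} reduction")
# prints "100 hours saved per draft, 56% reduction"
```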

The time reduction gets the headline. The work around it deserves the attention: table preprocessing, data extraction, styling, validation, revamped operations, trained teams, and rigorous oversight by qualified medical writers.

The medical writer remained. The wasted part of the writer's time did not have to remain with them.

That is what production value looks like in a regulated workflow. A qualified person is still accountable for the record. The organization has changed what that person must spend time doing to reach an accountable result.

We need to stop treating human-in-the-loop as proof that the old operating model has to remain untouched. Human review may be appropriate for a long time in clinical development. It may be appropriate permanently for certain actions. But keeping human judgment does not mean keeping every manual retrieval step, every duplicate review, every handoff, and every existing staffing assumption.

If AI prepares evidence and the human evaluates it, the process has changed.

If AI is added and the human repeats the entire old process behind it, the company has purchased a new cost without creating much new capacity.

ROI has to show up in the work R&D is there to do

ROI conversations get weak when they start with token costs or hours saved in an isolated task. Hours saved belong in the calculation. They cannot carry it alone.

Within R&D, AI value should show up as speed, productive capacity, or controlled risk.

Speed matters because development time has a consequence. A faster review or a document prepared faster earns its value when it removes delay from work that can constrain development execution. Not every use case has to claim that it brings a therapy to market earlier by itself. That is too easy to overstate. But it should be possible to describe how it removes a real constraint from the work that gets a therapy there.

Capacity may be the more honest return for many AI investments. We keep reaching for "cheaper" because it fits a business case template. Drug development has no shortage of worthwhile work. If a clinical team can support more studies, if a data team can complete more analyses, or if an R&D organization can pursue more promising opportunities without increasing resources at the same rate, it has built a larger development engine.

AI introduces risk. It can also reduce risk when it makes evidence easier to find, applies a controlled process more consistently, identifies missing information earlier, or directs expert attention toward the cases that actually require judgment. In a regulated environment, more reliable and more defensible work has economic value even when the spreadsheet struggles to price it.

This is why the early ROI can be disappointing even when the application has merit. During adoption, companies keep the old review, add the new review, preserve the staffing, add governance, and pay for the integration and validation needed to move beyond the demo. They are running two operating models at the same time.

The model call is the easy cost to see. Production also requires data access, infrastructure, integration, validation, monitoring, quality oversight, training, support, and ownership before the result can be trusted in the workflow.

You may not know the full cost at the time you select the project. You do have to admit that the cost exists.

Choose the capability before you choose the pilot

R&D and IT cannot select useful AI projects from a list of what a model can do. Models can do enough things now that technical feasibility is no longer a meaningful filter by itself.

Start with the capability the company needs to improve.

Map the work before you automate it. Separate what is rules-based, what AI can prepare or route, and what still requires accountable judgment. Then measure the delay, risk, or capacity constraint the current process creates.

Where is development work being delayed? Where are skilled people reconstructing information that already exists? Where is the organization unable to take on more work without adding resources at the same rate? Where could a better-controlled process expose risk sooner or make an accountable decision easier to defend?

Those questions come from the business. The answers cannot become production without IT.

Business leaders know where the work breaks and which outcome matters. IT leaders know whether the data is accessible, whether the systems can support the workflow, what validation will require, what monitoring is possible, and whether the operating model can scale. Vendors can help prove what is technically possible. They cannot decide which organizational capability is worth changing.

A useful project is tied to a goal R&D cares about. It sits inside a repeatable process important enough to change. It has a baseline, even when the eventual value will take time to fully see. It has data that can be accessed and governed. It has a route into production that the accountable people are willing to own.

A viable AI use case is not automatically a valuable one. If it does not improve a capability the organization needs, it is a demonstration.

If AI works, roles will change

We also need to be more honest about the workforce conversation.

Saying AI will free every person to perform higher-value work is comforting. It is not a plan. Where a function is largely performing administrative work and the demand for that work is finite, automation will eventually reduce the number of people required to perform it. Pretending otherwise delays the preparation those people and their leaders deserve.

R&D carries a different opportunity. The demand for valuable drug development work is not exhausted. There are more candidates that could be investigated, more studies that could be run, more evidence that could be examined, and more decisions that deserve expert attention than organizations have capacity to pursue today.

Productivity can lower cost. In R&D, it can also expand the work that matters.

That still changes jobs. It raises the value of people who can apply judgment, interpret evidence, oversee controlled processes, and question whether the workflow itself is good enough. It should also force leaders to develop those capabilities deliberately rather than reassuring everyone that nothing material will change.

The point of AI in R&D cannot be to preserve every task we currently perform. The point is to increase the work we can responsibly accomplish.

The pilot was the beginning of the decision

The airport line works because every leader recognizes the comfort of a pilot. A pilot lets you experiment without deciding what has to change when the experiment succeeds.

We have learned enough to make that decision.

Keep using pilots to learn. Use them to test data, understand controls, build confidence, and find where the process fails. Then be willing to move toward the regulated workflows where the return is harder to earn and far more meaningful once it is earned.

R&D and IT leaders do not need perfect certainty. They need a shared view of which capabilities matter, what production will cost, where human accountability belongs, and which manual work no longer deserves protection.

AI does not become strategic because a pilot succeeded. It becomes strategic when it changes a capability R&D needs to bring more therapies forward, faster and with control.

Until then, you may have a very busy airport.

Sources

Merck, "Merck Expands Innovative Internal Generative AI Solutions Helping to Deliver Medicines to Patients Faster," June 25, 2025.
FDA and EMA, "Guiding Principles of Good AI Practice in Drug Development," January 2026.