From Pilot Sprawl to Production: How Enterprise Leaders Are Scaling AI

Many enterprise AI initiatives fail not because the technology is flawed, but because they fall into the “pilot trap.” Companies launch dozens of disconnected experiments, a phenomenon known as “pilot sprawl,” and those experiments never turn into business value.

Recent insights from technology leaders at MassMutual and Mass General Brigham (MGB) reveal how organizations are overcoming this hurdle. By replacing uncoordinated experimentation with rigorous governance, clear metrics, and strategic integration, these companies are turning AI potential into measurable productivity.

MassMutual: The Scientific Approach to Value

For MassMutual, a 175-year-old financial services institution, AI is no longer an experiment; it is a production reality. The company has successfully integrated AI into underwriting, claims, customer service, and IT. The results are striking:
Developer productivity has increased by 30%.
IT help desk resolution times dropped from 11 minutes to just one.
Customer service call durations were slashed from 15 minutes to roughly two minutes.

Defining Success Through Metrics

Sears Merritt, MassMutual’s head of enterprise technology and experience, emphasizes that every AI project must begin with a hypothesis. Rather than chasing trends, the team asks: What problem are we solving, and how will we prove it?

To avoid the friction of constant readjustment, MassMutual requires business partners to define “success” before a tool ever reaches production. This keeps the technical implementation aligned with actual departmental needs.
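One way to picture this discipline is to record the hypothesis, baseline, and target as data before launch, then check the measured result against it. The Python sketch below is illustrative only: the `SuccessCriterion` class and its field names are hypothetical, not MassMutual’s tooling, and the 2-minute target is an assumed value. The 11-minute baseline and 1-minute result are the help desk figures cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriterion:
    """A success definition agreed with business partners before launch (hypothetical)."""

    hypothesis: str
    metric: str
    baseline: float  # value measured before the AI tool was introduced
    target: float    # value the business partner agreed counts as success
    lower_is_better: bool = True

    def is_met(self, observed: float) -> bool:
        """Return True when the observed value reaches the agreed target."""
        if self.lower_is_better:
            return observed <= self.target
        return observed >= self.target

    def improvement_pct(self, observed: float) -> float:
        """Percent improvement relative to the baseline (positive means better)."""
        if self.lower_is_better:
            change = self.baseline - observed
        else:
            change = observed - self.baseline
        return 100.0 * change / self.baseline


help_desk = SuccessCriterion(
    hypothesis="An AI assistant shortens IT help desk resolution",
    metric="resolution_minutes",
    baseline=11.0,  # figure from the article
    target=2.0,     # assumed target, for illustration only
)

observed_minutes = 1.0  # figure from the article
print(help_desk.is_met(observed_minutes))
print(round(help_desk.improvement_pct(observed_minutes), 1))
```

With these inputs the criterion is met, and the drop from 11 minutes to one works out to an improvement of roughly 91%.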

Building for Flexibility

A key strategy for MassMutual is avoiding “vendor lock-in.” Because the AI landscape changes so rapidly, the company maintains a highly heterogeneous environment. It has built common service layers and APIs that sit between its AI models and its core systems (including legacy mainframes).

This architecture lets the company swap out a “best-of-breed” model for a newer, better one without rebuilding its entire infrastructure from scratch.
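The idea of a common service layer can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MassMutual’s actual design: `TextModel`, `ModelGateway`, the two vendor classes, and the `claims_summary` task name are all hypothetical, and the vendor classes return canned strings instead of calling a real model.

```python
from typing import Protocol


class TextModel(Protocol):
    """The only interface that core systems depend on (hypothetical)."""

    def complete(self, prompt: str) -> str: ...


class VendorAModel:
    """Stand-in for one vendor's model; returns a canned string."""

    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"


class VendorBModel:
    """Stand-in for a newer model from a different vendor."""

    def complete(self, prompt: str) -> str:
        return f"[vendor-b] {prompt}"


class ModelGateway:
    """Service layer that routes named tasks to whichever model is registered."""

    def __init__(self) -> None:
        self._models: dict[str, TextModel] = {}

    def register(self, task: str, model: TextModel) -> None:
        # Registering again under the same task name replaces the old model.
        self._models[task] = model

    def run(self, task: str, prompt: str) -> str:
        if task not in self._models:
            raise KeyError(f"no model registered for task {task!r}")
        return self._models[task].complete(prompt)


gateway = ModelGateway()
gateway.register("claims_summary", VendorAModel())
before = gateway.run("claims_summary", "Summarize claim 123")

# Swap the model; callers keep using the same task name and method.
gateway.register("claims_summary", VendorBModel())
after = gateway.run("claims_summary", "Summarize claim 123")

print(before)
print(after)
```

Because callers only know the task name and the `run` method, replacing a model is a one-line registration change and does not require touching the systems that consume it.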


Mass General Brigham: Pruning the “Thousand Flowers”

While MassMutual focused on rigorous measurement, Mass General Brigham (MGB) had to consolidate. For years, MGB’s 15,000 researchers used a variety of machine learning tools, which left a fragmented landscape of unmanaged pilots.

Moving from Sprawl to Strategy

CTO Nallan “Sri” Sriraman describes MGB’s initial approach as a “thousand flowers bloom” methodology: essentially, letting everyone experiment freely. However, he realized the organization didn’t have a thousand flowers; it had a few dozen disconnected attempts that lacked direction.

To fix this, MGB pivoted toward a more centralized strategy:
Leveraging Existing Platforms: Instead of building in-house tools, MGB began prioritizing AI features already being rolled out by their primary vendors (such as Epic, Microsoft, and ServiceNow).
Strategic “Planting”: Rather than letting anyone experiment, they now “carefully plant and nourish” AI initiatives by embedding AI champions within specific business groups.
Controlled Environments: “Small landing zones” let teams test sophisticated products safely, with controlled checks on token usage and model behavior.
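As a rough illustration of controlling token usage in a test environment, the Python sketch below caps how many tokens a pilot may consume. It reduces the landing-zone idea to a single budget check, which is a simplifying assumption; the `LandingZone` class, the `radiology-pilot` name, and the budget figures are hypothetical and not MGB’s implementation.

```python
class TokenBudgetExceeded(Exception):
    """Raised when a pilot tries to spend more tokens than it was granted."""


class LandingZone:
    """A small test environment with a hard cap on token usage (hypothetical)."""

    def __init__(self, name: str, token_budget: int) -> None:
        self.name = name
        self.token_budget = token_budget
        self.tokens_used = 0

    def charge(self, tokens: int) -> int:
        """Record usage and return the remaining budget.

        Calls that would exceed the budget are rejected and not recorded.
        """
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.tokens_used + tokens > self.token_budget:
            raise TokenBudgetExceeded(
                f"{self.name}: request for {tokens} tokens exceeds remaining budget"
            )
        self.tokens_used += tokens
        return self.token_budget - self.tokens_used


zone = LandingZone("radiology-pilot", token_budget=1000)
remaining = zone.charge(600)

try:
    zone.charge(500)  # would bring the total to 1100, above the cap
    over_budget_blocked = False
except TokenBudgetExceeded:
    over_budget_blocked = True

print(remaining, over_budget_blocked)
```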

The Non-Negotiable Guardrails

In a healthcare setting, an AI error can be life-altering, so MGB has implemented strict “human-in-the-loop” protocols. For instance, while AI may assist in generating radiology reports, a physician must always review and sign off on the final decision.
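A human-in-the-loop rule of this kind can be enforced in software by refusing to release any AI-drafted output that lacks a sign-off. The Python sketch below shows the pattern in its simplest form; `DraftReport`, `release`, and the `dr-001` identifier are hypothetical names chosen for illustration, not part of MGB’s systems.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DraftReport:
    """An AI-drafted report awaiting physician review (hypothetical)."""

    text: str
    signed_by: Optional[str] = None

    def sign_off(self, physician_id: str) -> None:
        if not physician_id:
            raise ValueError("a physician identifier is required")
        self.signed_by = physician_id


def release(report: DraftReport) -> str:
    """Release a report only after a physician has signed it."""
    if report.signed_by is None:
        raise PermissionError("report requires physician sign-off before release")
    return f"{report.text} (signed: {report.signed_by})"


draft = DraftReport(text="AI-drafted findings")

try:
    release(draft)  # no sign-off yet, so this is refused
    blocked = False
except PermissionError:
    blocked = True

draft.sign_off("dr-001")
final = release(draft)
print(blocked, final)
```

The design choice here is that the gate lives in the release path itself, so an unsigned report cannot leave the system even if a reviewer step is skipped upstream.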

Furthermore, the technical safeguards are absolute:
Data Privacy: Strict rules prevent Protected Health Information (PHI) from being entered into public AI tools.
The “Kill Switch”: Every operational AI system must have a “big red button” to shut it down immediately if it malfunctions.
Observability: Real-time dashboards monitor for “model drift” (when an AI’s accuracy degrades over time) to ensure safety and reliability.
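The kill switch and drift monitoring described above can be sketched together in Python. This is an illustration under stated assumptions: the class names, the window size, and the accuracy threshold are hypothetical, and wiring the monitor to trip the switch automatically is a design choice made for this example. The article itself describes dashboards and a manually pressed “big red button.”

```python
from collections import deque
from typing import Callable, Optional


class KillSwitch:
    """A shared flag that disables an AI system once engaged (hypothetical)."""

    def __init__(self) -> None:
        self.engaged = False
        self.reason: Optional[str] = None

    def engage(self, reason: str) -> None:
        self.engaged = True
        self.reason = reason


class DriftMonitor:
    """Tracks rolling accuracy over the most recent reviewed predictions."""

    def __init__(self, window: int, min_accuracy: float, switch: KillSwitch) -> None:
        self._outcomes: deque = deque(maxlen=window)
        self._min_accuracy = min_accuracy
        self._switch = switch

    def accuracy(self) -> float:
        return sum(self._outcomes) / len(self._outcomes)

    def record(self, correct: bool) -> None:
        """Store one reviewed outcome and trip the switch if accuracy has degraded."""
        self._outcomes.append(correct)
        window_full = len(self._outcomes) == self._outcomes.maxlen
        if window_full and self.accuracy() < self._min_accuracy:
            self._switch.engage(f"rolling accuracy {self.accuracy():.2f} below threshold")


def guarded_predict(switch: KillSwitch, model_fn: Callable[[str], str], x: str) -> str:
    """Run the model only while the kill switch is not engaged."""
    if switch.engaged:
        raise RuntimeError("AI system disabled by kill switch")
    return model_fn(x)


switch = KillSwitch()
monitor = DriftMonitor(window=4, min_accuracy=0.75, switch=switch)
for was_correct in [True, True, True, False, False]:
    monitor.record(was_correct)

print(switch.engaged, switch.reason)
```

In this run the fourth outcome leaves rolling accuracy exactly at the threshold, so nothing happens; the fifth pushes it below, the switch engages, and any later call to `guarded_predict` is refused.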


The Bottom Line

The transition from AI experimentation to AI production requires a shift in mindset: moving from “can we do this?” to “how does this drive value, and how do we govern it?”

Whether through MassMutual’s scientific rigor or MGB’s disciplined consolidation, the lesson is clear: successful enterprise AI is not about the number of pilots you run, but the discipline with which you scale them.