The Meta Work: why most agent projects fail before a line of code is written

Category: Partner Strategy
Tags: ai, agents, partner-strategy, governance, continuous-improvement, managed-services, copilot

MIT published a report last year called The GenAI Divide: State of AI in Business 2025. The headline finding has been quoted so often it's become background noise, which is a shame, because the number deserves a second look.

95% of enterprise generative AI pilots deliver zero measurable return. Not "modest" or "hard to quantify". Zero. Yes, the methodology has its critics, and even halved, the number is damning.

If you're a Microsoft Partner and your agent practice is built around designing, building, and implementing agents, that statistic should bother you. Because the research behind it is clear about why those projects fail. Model quality isn't the problem. Copilot Studio isn't the problem. The technology mostly works.

What fails is everything around the technology. The planning that decides which process to agent-ify first. The governance that keeps an agent estate from turning into shadow AI. And the continuous improvement work that makes an agent still useful six months after go-live, instead of rotting in place while the business process changes around it.

Flowchart contrasting what gets scoped and delivered in a typical agent engagement (agent design, Copilot Studio build, integration and deploy) with what actually decides success (governance, planning, continuous improvement). Both streams feed into agent go-live, then a six-month value check in which roughly 5% of projects realise ROI and 95% stall.
The visible work (left) is what most partners price and deliver. The meta work (right) is what decides whether the project survives six months.

I'm going to call this the meta work. And the question I want to put on the table for the channel is: who owns it? Because right now, mostly nobody does. And that's why the 95% number is as ugly as it is.

What the meta work actually is

Governance is the policy, registry, and control layer. Which agents exist, who owns each one, what data they can touch, what happens when they get it wrong, who signs off on a new one, and how you retire the old. Most customers are still operating without this. Gartner's July 2024 forecast put it plainly: "At least 30% of generative AI projects will be abandoned after proof of concept by the end of 2025, due to poor data quality, inadequate risk controls, escalating costs or unclear business value." Three of those four reasons are governance problems.
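
To make the registry concrete, here's a minimal sketch of what a single entry might capture. The schema is illustrative, not any standard; the point is that every field answers a governance question most customers currently can't.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class AgentStatus(Enum):
    PROPOSED = "proposed"    # awaiting sign-off
    APPROVED = "approved"    # cleared to run in production
    RETIRED = "retired"      # decommissioned, kept for audit


@dataclass
class AgentRegistryEntry:
    """One row in an agent registry: enough to answer 'which agents
    exist, who owns each one, and what data can they touch?'"""
    agent_id: str
    name: str
    business_owner: str                 # a named person, not a team alias
    data_scopes: list[str]              # e.g. ["SharePoint:HR", "Dataverse:Cases"]
    status: AgentStatus = AgentStatus.PROPOSED
    approved_by: str | None = None      # who signed off on a new one
    review_due: date | None = None      # lifecycle reviews need a date, not a vibe
    escalation_contact: str = ""        # who acts when the agent gets it wrong


# The registry itself is then just a queryable collection of entries.
registry: dict[str, AgentRegistryEntry] = {}
```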

Planning is the work that happens before anyone writes a prompt. Which business process is a genuine candidate? What does good look like? What's the kill criterion if it isn't working? How does this fit the roadmap? BCG's Where's the Value in AI? report found that only 4% of companies have cutting-edge AI capability across functions. The other 96% aren't struggling because they picked the wrong model. They're struggling because nobody did the planning.

Continuous Improvement is the loop that keeps an agent earning its keep: observability, evaluation, human feedback, drift detection, re-tuning, lifecycle retirement. This is the AgentOps discipline the industry is still inventing. It's the equivalent of what MLOps became for traditional machine learning, and it's the thing customers almost never have in-house. Without it, an agent is a point-in-time artefact. With it, the agent gets better every quarter.
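
To give a flavour of the observability half of that loop, here's a minimal drift check, assuming per-interaction evaluation scores are already being logged somewhere queryable. The threshold and the names are illustrative, not from any particular tool.

```python
from statistics import mean


def check_drift(baseline_scores: list[float],
                recent_scores: list[float],
                tolerance: float = 0.05) -> bool:
    """Flag an agent for review when its recent evaluation scores fall
    more than `tolerance` below the go-live baseline. The scores could be
    grounded-answer rates, task-completion rates, or human thumbs-up
    ratios: whatever the evaluation framework emits."""
    return mean(recent_scores) < mean(baseline_scores) - tolerance


# An agent that shipped at ~0.91 and has slipped to ~0.84: the business
# process has probably changed around it, and it's time for a re-tune.
if check_drift([0.90, 0.92, 0.91], [0.85, 0.83, 0.84]):
    print("Drift detected: schedule an evaluation and re-tuning pass")
```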

None of this is build work. All of it determines whether the build work was worth doing.

The ownership vacuum

When you ask customers who owns the meta work, you get hand-waving. When you ask partners, you also get hand-waving. The honest answer across most engagements I've seen is: nobody. It falls into the gap between what the statement of work covers and what operations are assumed to absorb.

Three positions exist in the channel, and each one has a steelman worth taking seriously.

The customer owns it. This is the traditional system-integrator view. It's their business, their risk appetite, their process. Partners who try to own governance and CI end up running the customer's business for them, which is a weird place to be commercially. The argument has force. Governance in particular is not something a partner can impose.

The partner owns it. This is the managed-services view, and it's the one the MIT data actually supports. Customers don't have AgentOps muscle. They don't have the observability tooling, the evaluation frameworks, or the specialist skills. Expecting them to build all of that while also running their day job is exactly how you end up in the 95%. If the partner doesn't own it, in practice nobody does.

It's shared, and that's the hardest one to price. Governance sits with the customer because only they can set risk appetite. CI needs both sides: the partner runs the platform, the customer owns the business context that tells the platform what "better" means. Planning is genuine co-creation. This is probably the right answer. It's also commercially awkward. Who gets the blame when an agent drifts and nobody spotted it? Shared accountability is easy to say and hard to write into a contract.

I don't think there's a single right answer. But I think there's a wrong one, which is the default position most engagements are sitting in today: implicit shared ownership, which in practice means no ownership. And that gets you RAND's finding that over 80% of enterprise AI projects fail, roughly double the rate of non-AI IT projects.

Before any pricing conversation, there's a simpler question every partner should answer for themselves: do we have these as services at all? Forget whether customers will pay. Just ask: if a customer requested agent governance design next week, is there a named offer in our catalogue, with a lead consultant, a methodology, and a deliverable? Or would we have to invent it on the way to the meeting?

Six services beyond the build

The point of this post isn't that partners should stop implementing agents. Implementation is the easy bit to package and it should keep happening. The point is that implementation is one of seven things a customer needs, and if the other six aren't in your portfolio, you're leaving the customer to improvise them. That's how stalls happen.

Here's the shortlist. Treat it as a capability-portfolio inventory rather than a menu to price.

  1. Agent governance design. Registry, lifecycle policy, responsible AI controls, approval workflows. A defined engagement with a clear deliverable and an artefact the customer keeps.

  2. Use case selection and value hypothesis. The "which process do we agent-ify first?" workshop, with real ROI models and kill criteria (there's a minimal sketch of a kill criterion after this list). This is what most customers are missing. McKinsey's 2025 survey found 88% of organisations use AI somewhere, but only 39% report enterprise-level EBIT impact. The gap is almost entirely about picking the right use cases.

  3. AgentOps platform services. The shared observability, evaluation, and feedback infrastructure. Managed service, recurring engagement, relatively well-bounded scope. This is the closest analogue to the SOC model most partners already understand.

  4. Continuous evaluation. Scheduled red-teaming, regression testing against business KPIs, prompt and model drift reviews. A quarterly cadence. Lighter to stand up than a full AgentOps platform and often a useful entry point into the operational stack.

  5. Change and adoption. Prompt engineering guilds, Copilot champion programmes, prompt libraries, executive communication. This is where most partners underinvest because it feels soft. It's also where the MIT data says the biggest ROI swings sit.

  6. Agent lifecycle reviews. Periodic "is this agent still earning its keep?" audits. The decommissioning conversation nobody wants to have but every customer needs. Low effort, high stickiness.
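
To make the kill-criterion idea from item 2 concrete, here's a minimal sketch of a value hypothesis expressed as something testable at review time. All the numbers and field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ValueHypothesis:
    """The pre-build bet: what the agent is supposed to deliver each
    month, and the point at which you stop and retire or rework it."""
    use_case: str
    monthly_cost: float             # run cost + licences + support
    expected_monthly_saving: float  # e.g. hours saved x loaded hourly rate
    kill_threshold: float = 0.5     # abandon if actuals fall below 50% of plan

    def should_kill(self, actual_monthly_saving: float) -> bool:
        return actual_monthly_saving < self.expected_monthly_saving * self.kill_threshold


# Six months in: the hypothesis said 400 hours a month saved (20,000 at a
# loaded rate of 50/hour), the telemetry says 150 (7,500). That triggers a
# retire-or-rework conversation on schedule, instead of the agent quietly
# rotting in place while everyone looks away.
h = ValueHypothesis("invoice triage", monthly_cost=3_000, expected_monthly_saving=20_000)
print(h.should_kill(actual_monthly_saving=7_500))  # True: below the kill line
```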

Quadrant chart of services plotted on two axes — one-off engagement to recurring service on the horizontal, advisory to operational on the vertical. Governance design and use case selection sit in the one-off advisory quadrant. Change and adoption and lifecycle reviews sit in recurring advisory. Continuous evaluation and AgentOps platform sit in recurring operational. The build is plotted as a reference point in one-off operational.
A healthy agent practice has capability in every quadrant. Most partners today cluster on the left, in one-off builds, and wonder why the customer relationship cools after go-live.

Most partners today are concentrated on the left of that chart, selling one-off builds. The relationship cools after go-live because there's nothing else in the catalogue to continue the conversation with.

Some partners will read that list and say "we already do most of that." Good. But is it a named service with a methodology, a lead, and an artefact? Or is it improvised each time a customer asks, drawn from the goodwill of whichever consultant happens to be on site? That's the difference between a portfolio and a habit. A portfolio scales. A habit doesn't.

The opportunity for distributors

There are more than 400,000 Microsoft partners worldwide. When you look at the channel as a whole, it makes no sense for even 10% of them to be standing up their own AgentOps platform, drafting their own governance methodology, or building their own evaluation harness. That's 40,000 partners reinventing the same six wheels in parallel, badly, while their customers wait.

This is where distributors have a genuine opening. Not as a logistics layer for licences, which is the role most of the channel still mentally files them under, but as the wholesale provider of the meta work that the long tail of partners cannot, or should not, build in-house.

Think about what that looks like as a portfolio:

  • A wholesale AgentOps platform. The observability, evaluation, and drift-detection stack, white-labelled for partners to wrap and resell. One distributor builds it once. Thousands of partners use it. Customers get a mature product instead of whichever home-grown dashboard their partner could throw together between projects.
  • A governance methodology in a box. Registry templates, RAI control libraries, lifecycle policies, approval workflows. Branded to the partner, built by people who do it full-time, kept current as the regulatory landscape moves.
  • A continuous evaluation service. Quarterly red-teaming and KPI regression delivered as a wholesale subscription. The partner owns the customer relationship and the business context. The distributor owns the test infrastructure and the specialists who run it.
  • A shared change and adoption library. Champion programme materials, prompt libraries, executive briefing decks. The kind of asset every partner needs and almost none can afford to build to a high standard for one customer at a time.

The economics here are the interesting part. A managed AgentOps platform is expensive to build and cheap to operate per tenant. That's a textbook distribution play: the cost sits with one party, the value gets shared across many. The same is true for a current, well-maintained governance library. Build once, license at scale, update centrally when something changes.

For the partner, the trade is straightforward. You stop trying to be world-class at six things you'll never get full utilisation on, and you focus on the two or three you can genuinely lead with: the customer relationship, the business-context expertise, the build itself. The rest you wholesale, mark up, and deliver under your own brand.

For the distributor, the opportunity is bigger than it looks. The partners who currently buy CSP licences from you are exactly the partners who will be losing customers in 2027 because their agent estate stalled. If you can offer them a wholesale meta-work portfolio that lets them keep those customers alive past the six-month mark, you're not just protecting your CSP base. You're moving up the value chain into a recurring services revenue line that is materially harder for a hyperscaler to disintermediate than a transactional licence.

I'd like to see the major distributors take this seriously. The honest assessment today is that most of them aren't there yet. There's investment in marketplace tooling, some interesting work in pre-sales enablement, and the usual partner training programmes. But a productised, wholesale meta-work portfolio? I haven't seen one I'd point a partner at and say "buy that, resell it, you're done." If a distributor reading this has one, I'd genuinely like to know.

The 95% number doesn't move until the long tail of the channel has access to mature meta-work services without having to build them. Distributors are the only layer of the channel structurally positioned to provide that at scale. Whether they take the opportunity is a different question.

What this really changes

The 95% number at the top of this post isn't a technology indictment. It's a portfolio indictment. The industry has built an agent services offer around the one thing that's easiest to scope, ship, and sign off on: the build. Everything that determines whether the build was worth doing sits in the white space around it, and until those capabilities are first-class offers in partner catalogues, the 95% won't move.

So the provocation is this. The partners who end up on the right side of the GenAI Divide aren't the ones with the deepest Copilot Studio skills, or the tightest Azure OpenAI integrations, or the best pre-sales demos. They're the ones who treat governance, planning, and continuous improvement as core service categories with the same discipline they already apply to build. As products in their own right, not value-adds or post-sale goodwill. Some partners will build that capability themselves. Most won't, and shouldn't, which is why the distributor question matters as much as the partner one.

That's a harder practice to stand up than a build team. It needs different skills, different tooling, different leadership. It takes longer to get right. It also compounds, which the build work doesn't. A build engagement ends. A governance or AgentOps relationship keeps generating new engagements off the back of the one before.

Most partner strategy decks I see are still weighted towards "more build capacity." I think that's the wrong bet for the next three years.

Tomorrow morning

Gartner's June 2025 follow-up predicts that over 40% of agentic AI projects will be cancelled by the end of 2027. That forecast is not about the technology letting people down. It's about the meta work being skipped.

Pick one of the six service categories above. Pick the one where the capability gap in your practice is biggest, not the one that's easiest to sell. Name an owner. Give them a week to draft a service definition, a methodology, and a deliverable. Pitch it internally before it goes anywhere near a customer. You'll learn more from one honest internal review than from six months of external market research.

If you're a distributor, the equivalent ask is harder and more interesting. Pick the one wholesale meta-work service you could productise in the next two quarters and put a named owner against it. The partners who'll need it are already on your CSP ledger.

The agents will keep getting built. The question is whether they'll keep being useful. And that answer is almost never determined by the build. It's determined by everything around it.

That's the work. And it needs an owner.


Sources

  1. MIT Project NANDA. The GenAI Divide: State of AI in Business 2025. https://nanda.media.mit.edu/ai_report_2025.pdf
  2. Gartner press release, 29 July 2024. Gartner Predicts 30% of Generative AI Projects Will Be Abandoned After Proof of Concept by End of 2025. https://www.gartner.com/en/newsroom/press-releases/2024-07-29-gartner-predicts-30-percent-of-generative-ai-projects-will-be-abandoned-after-proof-of-concept-by-end-of-2025
  3. Gartner press release, 25 June 2025. Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027. https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027
  4. Boston Consulting Group. Where's the Value in AI? (October 2024). https://www.bcg.com/publications/2024/wheres-value-in-ai
  5. RAND Corporation. Ryseff, J., De Bruhl, B. F., and Newberry, S. J. The Root Causes of Failure for Artificial Intelligence Projects and How They Can Succeed (2024). https://www.rand.org/pubs/research_reports/RRA2680-1.html
  6. McKinsey & Company. The State of AI: Global Survey 2025 — Agents, innovation, and transformation. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai