Two months after OpenAI declared the architecture for enterprise AI, Anthropic just confirmed it from the opposite direction. Stepwise execution — boxes and arrows, tools and sandboxes — is being productized. But the harder problem, the execution that requires genuine understanding, is where the real battle is just beginning. And it’s a battle the model providers can’t win alone.
When OpenAI launched Frontier, I wrote that it was less a product announcement and more an architectural declaration — a signal of where the entire enterprise AI industry is heading. Three ideas stood out: LLM-powered orchestration as a first-class infrastructure layer, a semantic layer to harmonize knowledge and data, and a comprehensive control tower for the agent lifecycle.
This week, Anthropic launched Claude Managed Agents. The other shoe has dropped. And the architecture is now clear.
Two strategies, one destination
What makes this moment significant isn’t that Anthropic shipped a good product — they did. It’s that two companies with fundamentally different go-to-market philosophies have independently converged on the same architectural conclusion.
Frontier is an enterprise-down play. It starts with the CIO, the org chart, and the compliance team. Agents as “AI coworkers” who need onboarding, institutional knowledge, and identity management. It’s the HR metaphor applied to software.
Managed Agents is a developer-up play. It starts with the engineering team that’s already three months into building an agent and drowning in infrastructure work — sandboxing, state management, credential handling, and error recovery. Anthropic looked at that pain and said: Stop. We’ll handle it. Here’s an API, here’s the pricing, go build.
Different entry points. Different buyers. Different language. But strip away the positioning and both announcements share the same core conviction: agents in production demand centralized governance, security, and control.
Frontier calls it “enterprise security and governance” — scoped agent identities, auditable actions, and Identity and Access Management (IAM) that spans humans and AI. Managed Agents calls it “trusted governance” — scoped permissions, execution tracing, credential management baked into the runtime. When two companies approaching the market from opposite ends arrive at the same requirement, that’s not marketing alignment. That’s a real architectural truth surfacing from production experience.
The wild west phase of agent building is over.
The layer they can’t build
Both announcements solve workflow execution. Neither solves business comprehension.
When an agent processes an invoice whose line items don’t match the PO, that’s not an orchestration problem; it’s a comprehension problem. Resolving it takes context that doesn’t live in OpenAI’s cloud or Anthropic’s runtime: your company’s definitions, your approval chains, your vendor naming conventions. That knowledge is institutional, specific, and constantly evolving, and it lives inside your enterprise. Which means the semantic layer is the one layer that cannot be managed by a model provider.
At Sema4.ai, this comprehension layer is the infrastructure we’ve been building from day one, powering the data-centric, judgment-heavy work across finance, procurement, and supply chain.
Frontier and Managed Agents are productizing workflow execution. That makes the semantic layer more valuable, not less.
Workflow execution, productized. YAWN.
From the start, we at Sema4.ai resisted building YAWN: Yet Another Workflow Navigator. This week’s announcement reinforces why that was the right bet, even when it felt like the risky one.
Anthropic’s announcement goes further than governance. What they’ve actually shipped is a comprehensive execution engine for stepwise agentic work, replacing boxes-and-arrows with a simpler contract: you define tasks, tools, and guardrails, and the agent runs the loop. It’s worth understanding the components because of what they displace.
The orchestration engine manages the agent loop — when to call tools, how to handle context, how to recover from errors. Before this, developers were building this with LangGraph, CrewAI, or hand-rolled chains. Anthropic is saying: we know how Claude reasons best, let us manage that loop. They claim up to 10 points better task success on hard problems compared to a standard prompting loop. If that holds, the case for external orchestration frameworks narrows considerably for teams already on Claude.
The agent harness wraps task management, evaluation, and multi-agent coordination around that loop. Agents can define outcomes, self-evaluate, spin up other agents to parallelize work. This replaces the observability and evaluation tooling teams have been stitching together from multiple vendors.
The code sandbox provides secure, isolated, persistent execution. Agents write and run code, work with files, use tools — all without touching production systems.
Three capabilities, bundled together at $0.08 per session-hour. Notion is running parallel task agents inside workspaces. Rakuten deployed specialist agents across five departments, each within a week. Sentry chained root-cause analysis directly into a Claude-powered agent that writes the fix and opens the PR — shipped in weeks instead of months.
These are real production workflows, not demo-day showcases. And for the class of work that can be decomposed into well-defined steps with clear tool boundaries — code generation, document processing, task routing — this is a genuinely transformative capability.
It also raises a bigger question about enterprise workflow economics. When Rakuten deploys specialist agents in a week and Sentry ships end-to-end workflows in weeks instead of months, the comparison to a six-month ServiceNow implementation at $100+ per seat is hard to ignore. If a managed agent runtime handles orchestration and governance, and a semantic layer captures the business process knowledge, what’s left for the traditional workflow platform? That’s not our battle — but it’s a question every enterprise software buyer is about to start asking.
But not all enterprise work fits in boxes and arrows.
Beyond boxes and arrows
Both Frontier and Managed Agents have made strong moves on workflow execution. Governance, sandboxing, orchestration, persistence — for stepwise, tool-calling workflows, these problems are being solved. But there’s a class of enterprise work that doesn’t reduce to boxes and arrows, and it’s the class that matters most.
Consider what actually happens in a finance reconciliation. An agent doesn’t just march through steps — it needs to interpret an invoice where the line items don’t match the PO because the vendor used different unit descriptions. It needs to understand that your company defines “margin” differently from the GAAP standard. It needs to know which approval chain applies to a PO above $50K, and that the threshold changed last quarter for one division but not another. It needs to reason about ambiguous data, apply business judgment, and adapt the workflow based on what it finds.
That’s not an orchestration problem. That’s a comprehension problem. And no amount of sandboxing or session persistence solves it.
This is the semantic layer. And to be clear: this layer doesn’t exist in opposition to the LLM; it’s deeply dependent on it. The reasoning capabilities that Claude Opus, OpenAI’s GPT-5 models, and Gemini have unlocked are what make a semantic layer possible in the first place. Without the model’s ability to interpret, infer, and reason, domain context is just static documentation.
But here’s what a model provider can’t do: they can’t know your business. They can’t know that your organization restructured its approval chains last quarter, or that your vendor master has three different naming conventions for the same supplier, or that “net 30” means something slightly different in your APAC contracts than in your North American ones. That context lives inside the enterprise. It’s institutional, specific, and constantly evolving.
Which means the semantic layer is the one layer that cannot be managed by anyone other than the business itself. OpenAI can’t host it. Anthropic can’t abstract it away. It has to be built, maintained, and governed by the people who understand the work — with the right infrastructure to make that practical.
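To make the contrast concrete, here is a deliberately tiny sketch of what enterprise-owned context looks like once it is captured as data an agent can consult at runtime. Every name, threshold, and alias below is hypothetical; the point is that only the business can author and maintain these facts.

```python
# Hypothetical fragment of an enterprise semantic layer, encoded as data.
# All divisions, thresholds, and vendor aliases are invented for illustration.

APPROVAL_THRESHOLDS = {     # changed last quarter for one division, not the other
    "emea": 50_000,
    "north_america": 75_000,
}

VENDOR_ALIASES = {          # three naming conventions, one supplier
    "Acme Corp": "ACME-001",
    "ACME Corporation": "ACME-001",
    "acme corp.": "ACME-001",
}


def needs_senior_approval(division: str, amount: float) -> bool:
    """Apply the division-specific approval rule the business actually uses."""
    return amount > APPROVAL_THRESHOLDS[division]


def canonical_vendor(name: str) -> str:
    """Normalize a vendor string to the canonical master record."""
    return VENDOR_ALIASES.get(name, name)
```

An agent with flawless orchestration but without these tables routes a $60K EMEA invoice to the wrong approver every time; with them, the same runtime makes the call the business expects.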
At Sema4.ai, that infrastructure is what we’ve been building — translating domain understanding of business documents, data sources, and line-of-business applications into something AI agents can work with, with full fidelity and accuracy. What started as SOP-style runbooks has evolved into a rich tapestry of skills, workflows, tools, and their combinations, powering data-centric agentic use cases across finance, procurement, and supply chain operations.
The more workflow execution engines that enter the market — Frontier, Managed Agents, and whatever comes next — the more the semantic layer becomes the critical dependency. It’s what separates agents that can follow a workflow from agents that understand the work.
The architecture is converging. We’re building the layer that makes it work.
Two months ago I said the industry was converging on a shared blueprint. This week confirmed it. The model layer is commoditizing. The stepwise execution layer is being productized. What remains is the hardest and most valuable piece: the intelligent execution that comes from teaching agents what the work actually means.
That’s where we live. And we’re just getting started.