🔗 Companion project: Maestro — ardeshir.io/maestro/

Barnyard: Agentic Architecture Beyond the Graph

The Architectural Shift

The biggest architectural shift in agentic AI during 2025–2026 is that teams stopped treating the LLM as “the application” and started treating it as a reasoning kernel inside a distributed systems architecture.

Modern agentic systems are now built more like operating systems or workflow runtimes:

Layer	Role
LLM	Planner / reasoner
Context layer	Memory + retrieval + session graph
Tool layer	APIs, MCP servers, browsers, terminals
Orchestrator	State machine / graph runtime
Execution fabric	Kubernetes, serverless, queues, workers
Observability	Traces, replay, audit, evaluation

The core problem is how to process large amounts of data continuously without blowing up context windows or hitting timeouts. Teams now address it with a combination of context engineering, stateful orchestration, retrieval pipelines, memory compression, streaming execution graphs, event-driven agent runtimes, and vector + structured cache systems.

The New Agentic Stack (2026)

Foundation Models Become Reasoning Engines

The modern architecture no longer pushes all data directly into GPT-4.5 or Claude Opus. Instead, the model sees only:

Current task
Compressed memory
Retrieved knowledge
Active tool state
Execution history

Everything else lives outside the context window.
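To make the five inputs above concrete, here is a minimal sketch of packing them into a prompt under a token budget. Everything in it is an assumption for illustration: the `ContextPart` type, the priority scheme, and the word-count `estimate_tokens` helper, which stands in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextPart:
    name: str      # e.g. "current_task", "compressed_memory"
    text: str
    priority: int  # lower number = more important


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())


def assemble_context(parts: list[ContextPart], token_budget: int) -> str:
    """Pack parts in priority order, skipping any part that would exceed the budget."""
    chosen: list[ContextPart] = []
    used = 0
    for part in sorted(parts, key=lambda p: p.priority):
        cost = estimate_tokens(part.text)
        if used + cost <= token_budget:
            chosen.append(part)
            used += cost
    return "\n\n".join(f"[{p.name}]\n{p.text}" for p in chosen)


parts = [
    ContextPart("current_task", "Summarize open incidents", 0),
    ContextPart("compressed_memory", "User prefers short answers", 1),
    ContextPart("retrieved_knowledge", "Incident 42 is a database outage " * 50, 2),
    ContextPart("active_tool_state", "ticket API session open", 3),
    ContextPart("execution_history", "step 1 listed incidents", 4),
]
prompt = assemble_context(parts, token_budget=20)
```

With a 20-token budget the oversized retrieval result is left out and the four small parts go in. A real system would more likely compress or re-retrieve the oversized part than drop it outright.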

This is why OpenAI, Anthropic, Microsoft, and Google are all converging on:

MCP (Model Context Protocol)
Externalized memory
Tool-native orchestration
Persistent sessions

…rather than “huge prompt engineering.”

The industry realized: context windows are expensive RAM, not databases.

The Most Important Evolution: Context Engineering

Context engineering is now the discipline replacing classic prompt engineering. The modern pipeline:

User Request
     ↓
Task Planner Agent
     ↓
Context Retrieval Layer
     ↓
Memory Compression Layer
     ↓
Tool Invocation Layer
     ↓
Execution Graph Runtime
     ↓
Model Reasoning Step
     ↓
Checkpoint + Persist State
     ↓
Continue / Spawn / Delegate
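A simplified sketch of that pipeline, with the tool and graph-runtime stages left out for brevity. The step functions, the `next_step` field, and the JSON checkpoint file are all hypothetical. The point is only that state is persisted after every step, so a restart resumes instead of starting over.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

State = dict
Step = Callable[[State], State]


def plan(state: State) -> State:
    return {**state, "plan": ["look up", "answer"]}


def retrieve(state: State) -> State:
    return {**state, "retrieved": f"notes about {state['request']}"}


def compress(state: State) -> State:
    # Placeholder compression: keep only the first 40 characters.
    return {**state, "memory": state["retrieved"][:40]}


def reason(state: State) -> State:
    # Stand-in for the model call.
    return {**state, "answer": f"answer based on: {state['memory']}"}


PIPELINE: list[Step] = [plan, retrieve, compress, reason]


def run(state: State, checkpoint: Path) -> State:
    """Run remaining steps, persisting state after each one so a crash can resume."""
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())
    start = state.get("next_step", 0)
    for index in range(start, len(PIPELINE)):
        state = PIPELINE[index](state)
        state["next_step"] = index + 1
        checkpoint.write_text(json.dumps(state))
    return state


checkpoint_path = Path(tempfile.mkdtemp()) / "state.json"
final = run({"request": "context engineering"}, checkpoint_path)
# A second call finds the checkpoint and has nothing left to do.
resumed = run({"request": "ignored"}, checkpoint_path)
```

A production runtime would use a durable store and idempotent steps rather than a local file, but the shape of the loop is the same.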

Why Old Agent Architectures Failed

2024-era systems failed because they:

Shoved everything into prompts — no external memory, no retrieval, no compression
Retried endlessly — no budget awareness, no graceful degradation
Lacked durable state — every session was ephemeral, every crash was total
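The second failure mode has a small fix that is easy to show. The sketch below assumes a flat cost per attempt and a hypothetical `fallback` callable. It retries only while the budget allows, then degrades to a cheaper answer instead of looping.

```python
from typing import Callable


def call_with_budget(
    action: Callable[[], str],
    fallback: Callable[[], str],
    cost_per_attempt: float,
    budget: float,
) -> tuple[str, float]:
    """Retry `action` while budget remains; degrade to `fallback` instead of looping forever."""
    spent = 0.0
    while spent + cost_per_attempt <= budget:
        spent += cost_per_attempt
        try:
            return action(), spent
        except RuntimeError:
            continue  # transient failure: retry only if the budget still allows it
    return fallback(), spent


attempts = {"count": 0}


def flaky() -> str:
    # Fails twice, then would succeed on the third attempt.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("tool unavailable")
    return "full answer"


result, spent = call_with_budget(
    flaky, lambda: "partial answer from cache", cost_per_attempt=1.0, budget=2.0
)
```

Here the budget covers two attempts, both fail, and the caller gets the cached partial answer rather than a third retry.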

MCP: Wire Format, Not Architecture

We’re not throwing MCP out. We’re saying it’s the wire format, not the architecture.

MCP gives you a standard way to expose tools and context sources. That’s valuable. But it doesn’t give you:

Continuations that sleep and resume
Budget-aware scheduling
Context field projections
Durable, migratable state

Those require a runtime. That’s Barnyard.
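As a rough illustration of the first and last items, here is a sketch of a continuation that sleeps on a named event and is woken from stored state. This is not Barnyard's actual design. The `Continuation` fields, the in-memory `Store`, and the `deliver` function are assumptions made up for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Continuation:
    id: str
    goal: str
    waiting_on: Optional[str] = None  # name of the event this continuation sleeps on
    findings: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "Continuation":
        return cls(**json.loads(raw))


class Store:
    """In-memory stand-in for a durable store; only serialized state is kept."""

    def __init__(self) -> None:
        self.rows: dict[str, str] = {}

    def save(self, c: Continuation) -> None:
        self.rows[c.id] = c.to_json()

    def load_all(self) -> list[Continuation]:
        return [Continuation.from_json(raw) for raw in self.rows.values()]


def deliver(store: Store, event: str, payload: str) -> list[str]:
    """Wake every continuation sleeping on `event`, record the payload, and persist it."""
    woken = []
    for c in store.load_all():
        if c.waiting_on == event:
            c.findings.append(payload)
            c.waiting_on = None
            store.save(c)
            woken.append(c.id)
    return woken


store = Store()
store.save(Continuation("c1", "check dataset A", waiting_on="dataset_a_ready"))
store.save(Continuation("c2", "check dataset B", waiting_on="dataset_b_ready"))
woken = deliver(store, "dataset_a_ready", "dataset A has 10k rows")
```

Because a sleeping continuation is nothing but a serialized row, it costs no compute while it waits and survives a process restart.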

What Becomes Possible

If Barnyard works, a researcher’s day looks like this:

They describe a hypothesis once. Twenty continuations spawn. Eighteen go to sleep waiting on data sources. Two start working immediately on tractable sub-questions. Over three weeks, findings accumulate in the notebook. The researcher reviews, prunes, redirects.

The system never “times out” because nothing is synchronous. Context never “blows up” because the model only ever sees a thin field projection. Costs stay bounded because the policy kernel enforces the budget vector.
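One way to read "the policy kernel enforces the budget vector" is as an admission check across several cost dimensions at once. The sketch below is an assumption about what that could look like. The three dimensions and the `PolicyKernel.admit` method are illustrative names, not a specified interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    tokens: int
    dollars: float
    wall_seconds: float

    def covers(self, cost: "Budget") -> bool:
        return (
            cost.tokens <= self.tokens
            and cost.dollars <= self.dollars
            and cost.wall_seconds <= self.wall_seconds
        )

    def minus(self, cost: "Budget") -> "Budget":
        return Budget(
            self.tokens - cost.tokens,
            self.dollars - cost.dollars,
            self.wall_seconds - cost.wall_seconds,
        )


class PolicyKernel:
    def __init__(self, budget: Budget) -> None:
        self.remaining = budget

    def admit(self, estimated_cost: Budget) -> bool:
        """Admit a step only if every budget dimension can cover it."""
        if not self.remaining.covers(estimated_cost):
            return False
        self.remaining = self.remaining.minus(estimated_cost)
        return True


kernel = PolicyKernel(Budget(tokens=1000, dollars=1.0, wall_seconds=60.0))
step_cost = Budget(tokens=600, dollars=0.25, wall_seconds=10.0)
first = kernel.admit(step_cost)   # fits in every dimension
second = kernel.admit(step_cost)  # only 400 tokens remain, so it is refused
```

A refused step does not have to fail. In a continuation-based runtime it could simply go back to sleep until more budget is granted.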

This is genuinely different from “LangGraph on Kubernetes.” It treats agentic work the way operating systems treat processes, the way databases treat queries, and the way markets treat capital — as a continuous, budgeted, schedulable activity rather than a function call.

Caveats

A few things worth being skeptical about:

The context field idea is the riskiest piece. Continuously re-ranking everything against a moving goal frame is expensive, and the ranking model becomes a critical dependency. Worth prototyping in isolation before committing.
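A prototype in isolation can be very small. The sketch below uses word-set overlap as a placeholder for the ranking model, which is an assumption since no scoring method is specified here. It shows the projection changing as the goal frame moves.

```python
def score(item: str, goal: str) -> float:
    """Jaccard overlap between word sets; a placeholder for a real ranking model."""
    a, b = set(item.lower().split()), set(goal.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def project(items: list[str], goal: str, k: int) -> list[str]:
    """Return the k items most relevant to the current goal frame."""
    return sorted(items, key=lambda item: score(item, goal), reverse=True)[:k]


notebook = [
    "protein folding benchmark results",
    "lunch order for friday",
    "folding simulation timeout logs",
]

early = project(notebook, "protein folding accuracy", k=1)
later = project(notebook, "simulation timeout", k=1)
```

Note that every goal change re-scores every item. That linear cost, multiplied by a real ranking model, is the expense this caveat is warning about.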

Continuation migration sounds elegant but is notoriously hard in practice. Erlang comes closest, with location-transparent processes and hot code loading, and almost nobody else gets even that far. Start with non-migratable continuations pinned to durable workflows and earn migration later.

The economics layer requires good cost/value estimates per step, which is an open research problem. Start with simple heuristics and instrument heavily.
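A simple starting heuristic might look like the sketch below: greedy selection by estimated value per unit cost, plus a relative-error measure to log against actual costs. The `CandidateStep` fields and both functions are hypothetical, and the value estimates are exactly the guesses described above as an open problem.

```python
from dataclasses import dataclass


@dataclass
class CandidateStep:
    name: str
    est_value: float  # heuristic guess
    est_cost: float   # heuristic guess, in the same unit as the budget


def schedule(steps: list[CandidateStep], budget: float) -> list[CandidateStep]:
    """Greedy: take steps in descending value-per-cost order while they fit."""
    chosen = []
    for step in sorted(steps, key=lambda s: s.est_value / s.est_cost, reverse=True):
        if step.est_cost <= budget:
            chosen.append(step)
            budget -= step.est_cost
    return chosen


def estimate_error(est_cost: float, actual_cost: float) -> float:
    """Relative error of a cost estimate, the kind of number worth logging per step."""
    return (actual_cost - est_cost) / est_cost


steps = [
    CandidateStep("A", est_value=8.0, est_cost=4.0),
    CandidateStep("B", est_value=3.0, est_cost=1.0),
    CandidateStep("C", est_value=5.0, est_cost=5.0),
]
picked = [s.name for s in schedule(steps, budget=5.0)]
error = estimate_error(est_cost=4.0, actual_cost=5.0)
```

Greedy ratio ordering is not optimal in general, but it is cheap and easy to reason about. The logged errors show where the estimates need work first.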

The industry has converged on MCP + graphs for a reason — it’s good enough for most use cases. Barnyard is a bet that “good enough” leaves a lot of researcher productivity on the table. That bet is defensible, but it’s a bet.
