🔗 See also: Maestro — ardeshir.io/maestro/

Continuation Architecture

Beyond RAG, Chat, and the Agent Hype Cycle

A continuation can sleep for three weeks waiting for an arXiv RSS event, wake, do five seconds of reasoning, spawn two children, and go back to sleep — without anyone “running” it.

This single sentence breaks every assumption baked into the current agent stack. There is no session. There is no “context window management.” There is no orchestrator polling in a loop. The unit of work is not a request-response pair — it is a continuation: a resumable, branchable, budgeted computation that persists across arbitrary time horizons.
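To make "resumable, branchable, budgeted" concrete, here is a minimal in-memory sketch of what a continuation record might hold. Every name in it (`Continuation`, `sleep_until`, `spawn`, `wake`, the `arxiv.rss` topic) is an illustrative assumption, the budget is collapsed to a single token count, and a real build would persist this record in a durable store rather than in process memory.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Continuation:
    """Hypothetical persisted record for one continuation."""
    goal: str
    budget_tokens: int                         # simplified: one budget dimension
    waiting_on: Optional[str] = None           # event topic it is sleeping on
    parent_id: Optional[str] = None
    state: dict = field(default_factory=dict)  # everything needed to resume
    id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def sleep_until(self, topic: str) -> None:
        """Park the continuation; nothing runs until an event on `topic` arrives."""
        self.waiting_on = topic

    def spawn(self, goal: str, budget_tokens: int) -> Continuation:
        """Branch: carve the child's budget out of this continuation's budget."""
        if budget_tokens > self.budget_tokens:
            raise ValueError("child budget exceeds parent budget")
        self.budget_tokens -= budget_tokens
        return Continuation(goal=goal, budget_tokens=budget_tokens, parent_id=self.id)


def wake(cont: Continuation, topic: str, payload: dict) -> list[Continuation]:
    """Resume a sleeping continuation if the event matches; return its children."""
    if cont.waiting_on != topic:
        return []                              # not the event it was waiting for
    cont.waiting_on = None
    cont.state["last_event"] = payload
    # Stand-in for the short burst of reasoning: branch into two children.
    children = [
        cont.spawn(f"summarize {payload['title']}", budget_tokens=1_000),
        cont.spawn(f"find related work for {payload['title']}", budget_tokens=2_000),
    ]
    cont.sleep_until(topic)                    # go back to sleep on the same stream
    return children


root = Continuation(goal="track new sampling papers", budget_tokens=10_000)
root.sleep_until("arxiv.rss")
kids = wake(root, "arxiv.rss", {"title": "A New Sampler"})
# root now holds 7_000 tokens, has two children, and is asleep again.
```

The point of the sketch is that nothing here is a running process: between events the continuation is only data, which is what lets it wait for weeks at no compute cost.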

What follows is the architecture that makes this real.

1. The Context Field (Replacing RAG)

Instead of a retrieval call per turn, each continuation has an associated context field, a continuously maintained, ranked projection of all available memory sources. Background workers recompute it whenever:

the continuation’s goal frame shifts
new data lands in any subscribed source
a sibling continuation publishes a finding

The field is maintained incrementally in the background rather than rebuilt on each request. The continuation doesn’t “query” it; it just reads the top of the field, which the triggers above keep fresh.

Implementation

A streaming materialized view (Materialize, RisingWave, or a custom Flink job) over your vector + structured + graph stores
Ranked by a small, fast model (the “context kernel”) that scores relevance to the current goal frame
With eviction-by-decay rather than fixed windows — old context fades unless reinforced

This replaces the four-layer memory hierarchy with one continuously projected surface. The big model never reads the memory stores directly; it sees only the field.
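Eviction-by-decay is easier to see in code than in prose. The sketch below is a toy, in-memory stand-in for the context field, assuming exponential decay with a configurable half-life. The class names, the half-life and floor parameters, and the idea that a relevance score arrives precomputed from the context kernel are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FieldEntry:
    text: str
    relevance: float        # score from the (hypothetical) context kernel, 0..1
    last_reinforced: float  # timestamp in seconds


class ContextField:
    """Toy context field: ranked entries that fade unless reinforced."""

    def __init__(self, half_life_s: float = 3600.0, floor: float = 0.05):
        self.half_life_s = half_life_s   # assumed decay half-life
        self.floor = floor               # entries weaker than this are evicted
        self.entries: dict[str, FieldEntry] = {}

    def strength(self, entry: FieldEntry, now: float) -> float:
        """Relevance halves for every half-life elapsed since reinforcement."""
        age = max(0.0, now - entry.last_reinforced)
        return entry.relevance * 0.5 ** (age / self.half_life_s)

    def upsert(self, key: str, text: str, relevance: float, now: float) -> None:
        """Called by background workers when data lands or the goal shifts."""
        self.entries[key] = FieldEntry(text, relevance, now)

    def reinforce(self, key: str, now: float) -> None:
        """Reset the decay clock for an entry that proved useful again."""
        if key in self.entries:
            self.entries[key].last_reinforced = now

    def top(self, k: int, now: float) -> list[str]:
        """Evict faded entries, then return the k strongest texts."""
        self.entries = {
            key: e for key, e in self.entries.items()
            if self.strength(e, now) >= self.floor
        }
        ranked = sorted(self.entries.values(),
                        key=lambda e: self.strength(e, now), reverse=True)
        return [e.text for e in ranked[:k]]


cf = ContextField(half_life_s=100.0, floor=0.1)
cf.upsert("a", "old note", relevance=0.9, now=0.0)
cf.upsert("b", "fresh paper", relevance=0.5, now=300.0)
print(cf.top(2, now=300.0))   # "old note" has decayed to 0.1125, still above the floor
print(cf.top(2, now=400.0))   # one more half-life and "old note" is evicted
```

There is no window size anywhere in this sketch. What the big model sees is determined by relevance and recency of reinforcement, which is the behavior the list above describes.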

2. Reactive Tool Mesh (Replacing MCP-as-RPC)

Tools register two interfaces: a command surface (do this now) and a stream surface (notify me when X). Continuations declare interest in streams; the mesh routes events.

Underneath, this can run on NATS JetStream, Kafka with a schema registry, or a CRDT-based gossip layer for federated work across organizations.

The crucial difference from MCP

Tools push, not just pull. A “literature monitor” tool isn’t called — it’s subscribed to, and it wakes continuations when relevant papers appear.

MCP remains useful at the edges — as the protocol tools speak when they’re called imperatively. But the default mode is reactive, not request-response.
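The two surfaces can be sketched with a few lines of in-process code. This is a stand-in under stated assumptions: `Mesh` plays the role that NATS JetStream or Kafka would play in a real deployment, and `LiteratureMonitor`, its `search` and `ingest` methods, and the `literature.new_paper` stream name are hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class Mesh:
    """In-process stand-in for the event mesh."""

    def __init__(self) -> None:
        self._subs: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, stream: str, handler: Handler) -> None:
        """A continuation declares interest in a stream."""
        self._subs[stream].append(handler)

    def publish(self, stream: str, event: dict) -> int:
        """Deliver an event to every subscriber; return how many were woken."""
        handlers = self._subs.get(stream, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


class LiteratureMonitor:
    """Hypothetical tool exposing both a command and a stream surface."""

    stream = "literature.new_paper"

    def __init__(self, mesh: Mesh) -> None:
        self.mesh = mesh
        self.seen: list[dict] = []

    def search(self, keyword: str) -> list[dict]:
        """Command surface: an imperative call, answered right now."""
        return [p for p in self.seen if keyword.lower() in p["title"].lower()]

    def ingest(self, paper: dict) -> int:
        """Stream surface: the tool pushes to subscribers when a paper appears."""
        self.seen.append(paper)
        return self.mesh.publish(self.stream, paper)


mesh = Mesh()
monitor = LiteratureMonitor(mesh)
woken: list[str] = []
mesh.subscribe(monitor.stream, lambda ev: woken.append(ev["title"]))
monitor.ingest({"title": "Continuations at Scale"})   # wakes the subscriber
monitor.search("scale")                               # imperative path still works
```

The `search` method is the part an MCP-style call would reach. The `ingest` path is the part that has no equivalent in a pure request-response protocol.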

3. The Researcher Pane (Replacing Chat UI)

The interface isn’t a chat window. It’s closer to a research notebook crossed with an air-traffic-control display:

Left rail — live continuations: what’s currently thinking, what’s sleeping, what’s waiting on you
Center pane — findings: published artifacts from completed sub-goals
Right rail — the context field: what the system currently thinks is relevant
Inline controls — fork, kill, re-budget, or merge continuations

The researcher operates on the swarm, not through a model.

4. The Economics Layer

This is the part nobody is doing well yet.

Every continuation carries a budget vector — not just tokens, but a portfolio:

Resource	Examples
Cheap model calls	Haiku-class, small rankers
Expensive model calls	Sonnet-class, Opus-class
Tool quotas	API calls, search, code execution
Human-attention quotas	Review requests, approvals
Real-money spend	Paid APIs, cloud compute
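One way to represent the portfolio in the table is a small value type with one field per resource. The field names, units, and the all-or-nothing `charge` rule below are assumptions for illustration, not a prescribed schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BudgetVector:
    """Illustrative budget portfolio, one field per row of the table."""
    cheap_calls: int
    expensive_calls: int
    tool_calls: int
    human_reviews: int
    usd: float

    def can_afford(self, cost: "BudgetVector") -> bool:
        """True only if every dimension covers the requested cost."""
        have, need = asdict(self), asdict(cost)
        return all(have[k] >= need[k] for k in have)

    def charge(self, cost: "BudgetVector") -> "BudgetVector":
        """Return the remaining budget, refusing to overspend any dimension."""
        if not self.can_afford(cost):
            raise ValueError("budget exceeded in at least one dimension")
        have, need = asdict(self), asdict(cost)
        return BudgetVector(**{k: have[k] - need[k] for k in have})


budget = BudgetVector(cheap_calls=100, expensive_calls=5, tool_calls=20,
                      human_reviews=2, usd=10.0)
one_big_call = BudgetVector(0, 1, 0, 0, 0.25)
budget = budget.charge(one_big_call)   # 4 expensive calls and 9.75 USD remain
```

Keeping the dimensions separate matters: a continuation with plenty of tokens left can still be out of human-attention quota, and a scalar budget cannot express that.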

A policy kernel (small, fast, deterministic) decides at each step:

Should this continuation use Haiku-class, Sonnet-class, or Opus-class reasoning right now?
Should it run synchronously or yield and come back?
Should it spawn a child or do the work inline?
Is the marginal value of the next token positive given the budget?

This is what makes “small models + big models together” actually work as a discipline rather than a vibe. The economics layer is a first-class control plane, not an afterthought.
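A policy kernel of this kind can be very small. The sketch below is one possible set of deterministic rules, assuming a difficulty estimate and an expected-value estimate are supplied by some cheap upstream classifier. The thresholds, the relative costs, and the names `Tier`, `Decision`, and `decide` are invented for the example and are not real model prices or an established API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    SMALL = "small"     # Haiku-class
    MEDIUM = "medium"   # Sonnet-class
    LARGE = "large"     # Opus-class


# Assumed relative cost per step in abstract budget units (not real prices).
COST = {Tier.SMALL: 1, Tier.MEDIUM: 5, Tier.LARGE: 25}


@dataclass(frozen=True)
class Decision:
    action: str                  # "run", "yield", or "stop"
    tier: Optional[Tier] = None


def decide(difficulty: float, expected_value: float, budget: int,
           blocked_on_event: bool = False) -> Decision:
    """Deterministic rules: the same inputs always give the same decision.

    difficulty: 0..1 estimate from a cheap classifier (hypothetical input).
    expected_value: estimated value of the next step, in budget units.
    """
    if blocked_on_event:
        return Decision("yield")         # nothing to do until the mesh wakes us
    # Pick the cheapest tier assumed adequate for the difficulty.
    if difficulty < 0.3:
        wanted = Tier.SMALL
    elif difficulty < 0.7:
        wanted = Tier.MEDIUM
    else:
        wanted = Tier.LARGE
    # Starting at the wanted tier, downgrade until the step is both
    # affordable and worth more than it costs.
    for tier in (Tier.LARGE, Tier.MEDIUM, Tier.SMALL):
        if COST[tier] > COST[wanted]:
            continue
        if COST[tier] <= budget and expected_value > COST[tier]:
            return Decision("run", tier)
    return Decision("stop")              # marginal value is not positive


decide(difficulty=0.9, expected_value=100, budget=1000)  # run on the large tier
decide(difficulty=0.9, expected_value=100, budget=10)    # downgraded to medium
decide(difficulty=0.1, expected_value=0.5, budget=1000)  # stop: not worth a call
```

Because the kernel is a pure function of its inputs, every routing decision can be logged and replayed, which is much harder when the big model decides its own spend.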

5. How This Maps to Real Infrastructure

Open Source Build

Component	Implementation
Continuation store	Temporal or Restate
Context field	RisingWave + Qdrant + small ranking model via vLLM
Reactive mesh	NATS JetStream
Policy kernel	Rust service
LLM dispatch	LiteLLM
Event log	Kafka or Redpanda

Azure Foundry Build

Component	Implementation
Continuations	Durable Functions or Foundry Agent threads
Context field	Azure AI Search + Cosmos DB change feed + Foundry-hosted small ranker
Reactive mesh	Event Grid + Service Bus
Policy kernel	Container App
Observability	App Insights + Foundry tracing

AWS EKS Build

Component	Implementation
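Continuations	Step Functions + DynamoDB (Express executions are capped at five minutes, so multi-week sleeps need Standard workflows or state parked in DynamoDB between wakes)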
Context field	OpenSearch + Kinesis Data Streams + Bedrock-hosted small ranker
Reactive mesh	EventBridge + MSK
Policy kernel	Lambda or Fargate service
Tool workers	EKS with Karpenter for elasticity

Key Insight

On all three stacks, MCP is still useful at the edges — as the protocol tools speak when they’re called imperatively. But the architecture’s center of gravity has shifted from request-response to event-driven continuations with continuous context projection.

The agent doesn’t chat. It thinks — across days, across models, across budgets — and the human steers the swarm.

Published on Sepahsalar.org