Nano decomposition
One agent, one output, one event. The pattern that replaces a twelve-field mega-prompt with twelve nanos that run in parallel, cache separately, and fail independently. We'll enrich a monster encounter across five axes to see it work.
The scenario
A new encounter just got added to a dungeon: "three bandits at a river crossing." Before it shows up on the DM's prep sheet, you want it enriched — threat tier, terrain, loot table, XP award, special abilities. Five axes, five short answers.
The instinct is to write one agent that returns all five fields in one JSON blob. It's less code. It's also the shape that breaks first, caches worst, and hides which axis went wrong. The alternative is to decompose.
The 1:1 rule
One agent produces one kind of answer. One nano per axis of enrichment, one structured output per nano, one event per result naming the fact that was established. If your output struct holds more than a couple of values serving different purposes, you probably need more nanos.
// The mega-agent instinct: one call, twelve outputs, everything coupled.
type EncounterDetails struct {
	Threat     string   `json:"threat"`    // trivial | minor | major | deadly
	Terrain    string   `json:"terrain"`   // forest | mountain | cave | city
	LootTable  string   `json:"lootTable"` // bandit-a | dragon-hoard | ...
	XP         int      `json:"xp"`
	Abilities  []string `json:"abilities"`
	Difficulty int      `json:"difficulty"` // 1–20
	Surprise   bool     `json:"surprise"`
	// ... and so on. All twelve axes share one cache key, one retry, one blast radius.
}

// The nano shape: one output per agent. Twelve agents.
type ThreatResult struct{ Threat string `json:"threat"` }
type TerrainResult struct{ Terrain string `json:"terrain"` }
type LootResult struct{ LootTable string `json:"lootTable"` }
type XPResult struct{ XP int `json:"xp"` }
// ... each axis its own agent, its own output type, its own cache entry.

The nano tier (gpt-5-nano or equivalent) is fast and cheap enough that "just make another one" is the right move. Decomposition is the payoff, not the cost.
Why bother
Four things get better the moment the twelve-field blob becomes twelve nanos.
Independently retriable
One axis times out, you retry just that axis. In a mega-prompt, one timeout takes out all twelve outputs and you start the whole thing over.
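Here is a minimal sketch of what "retry just that axis" could look like, assuming each nano call is wrapped in a plain Go function. The names `nanoCall`, `retryAxis`, `flakyThreat`, and `demoRetry`, along with the attempt count and backoff values, are hypothetical and not part of any SDK.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// nanoCall is a hypothetical stand-in for one nano invocation on one axis.
type nanoCall func(ctx context.Context) (string, error)

// retryAxis retries a single axis up to maxAttempts times with doubling
// backoff. It wraps one call only, so sibling axes are never re-run.
// It returns the output, the number of attempts made, and the last error.
func retryAxis(ctx context.Context, call nanoCall, maxAttempts int, backoff time.Duration) (string, int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		out, err := call(ctx)
		if err == nil {
			return out, attempt, nil
		}
		lastErr = err
		if attempt == maxAttempts {
			break
		}
		select {
		case <-ctx.Done():
			return "", attempt, ctx.Err()
		case <-time.After(backoff):
			backoff *= 2
		}
	}
	return "", maxAttempts, lastErr
}

// flakyThreat simulates a threat nano that times out twice, then succeeds.
func flakyThreat() nanoCall {
	calls := 0
	return func(ctx context.Context) (string, error) {
		calls++
		if calls < 3 {
			return "", errors.New("timeout")
		}
		return "major", nil
	}
}

// demoRetry runs the flaky nano through retryAxis with a tiny backoff.
func demoRetry() (string, int, error) {
	return retryAxis(context.Background(), flakyThreat(), 5, time.Millisecond)
}

func main() {
	out, attempts, err := demoRetry()
	fmt.Println(out, attempts, err)
}
```

The retry budget lives next to the one call it protects. A mega-prompt could use the same loop, but every attempt would pay for all twelve outputs again.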
Independently cacheable
A nano's inputs fit on a postcard — same encounter, same answer, cache hit. Mega-prompts rarely hit a cache because something in the 2,000-token input always shifts.
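To make "same encounter, same answer, cache hit" concrete, here is a small sketch of a per-axis cache key. It assumes the key is a hash of the axis name plus a whitespace- and case-normalized encounter description. The normalization rule, `cacheKey`, and `nanoCache` are illustrative assumptions, not how any particular SDK caches.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// cacheKey derives a key from the axis name and the normalized encounter
// text. The NUL separator keeps ("ab", "c") distinct from ("a", "bc").
func cacheKey(axis, encounter string) string {
	normalized := strings.ToLower(strings.Join(strings.Fields(encounter), " "))
	sum := sha256.Sum256([]byte(axis + "\x00" + normalized))
	return hex.EncodeToString(sum[:])
}

// nanoCache is a hypothetical in-memory cache. It is not safe for
// concurrent use; a real one would need a mutex or a shared store.
type nanoCache struct {
	entries      map[string]string
	hits, misses int
}

func newNanoCache() *nanoCache {
	return &nanoCache{entries: make(map[string]string)}
}

// get returns the cached answer for (axis, encounter), calling compute
// only on a miss.
func (c *nanoCache) get(axis, encounter string, compute func() string) string {
	key := cacheKey(axis, encounter)
	if v, ok := c.entries[key]; ok {
		c.hits++
		return v
	}
	c.misses++
	v := compute()
	c.entries[key] = v
	return v
}

func main() {
	c := newNanoCache()
	calls := 0
	assess := func() string { calls++; return "minor" }
	c.get("threat", "three bandits at a river crossing", assess)
	c.get("threat", "Three  bandits at a river crossing", assess)
	fmt.Println(calls, c.hits, c.misses)
}
```

Because the key covers one axis and one short input, a change to the loot prompt cannot invalidate cached threat answers.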
Independently parallelizable
N nanos run concurrently, bounded only by the rate limiter. A mega-prompt is one sequential call — parallelism caps at one.
Debuggable
Each nano has a typed input and a typed output you can inspect in isolation. Finding which axis of a twelve-output blob went wrong means diffing JSON. Finding which nano failed means reading the log line.
Decompose the events too
Decomposition only pays off if the scroll matches. Each nano writes its own event type; each topic names one kind of domain fact — a threat assessment, a terrain classification, a loot roll. They earn separate topics because they're different kinds of happenings, not because they each mutate a separate field somewhere downstream.
// RIGHT: one topic per kind of fact. Each nano's result is a distinct
// kind of judgment about the encounter — threat, terrain, loot, XP.
// They earn separate topics because a reactor that cares about threat
// has no reason to wake up for a loot roll.
event "encounter_threat_assessed" { encounterId string; threat string }
event "encounter_terrain_classified" { encounterId string; terrain string }
event "encounter_loot_rolled" { encounterId string; lootTable string }
event "encounter_xp_awarded" { encounterId string; xp int }
event "encounter_ability_added" { encounterId string; ability string }
// WRONG: one generic topic carrying several different kinds of fact
// behind a discriminator. Consumers have to switch on `axis` to
// recover what actually happened — the topic stopped routing.
// event "encounter_enriched" {
//     axis  string // "threat" | "terrain" | "loot" | "xp" | ...
//     value string // stringly-typed, different shape per axis
// }

The wrong version looks compact in the event definition, but every downstream consumer now needs a switch on axis to decide what the event means. That switch is covering for several different kinds of fact sharing one topic, which is exactly the anti-pattern the event page calls out.
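The routing argument can be sketched in Go. This is a toy in-memory dispatcher, not the scroll's real subscription API: `Event`, `router`, `on`, and `publish` are all hypothetical names. It only shows that when the topic names the kind of fact, a threat reactor is never woken by a loot roll and never needs a switch.

```go
package main

import "fmt"

// Event is anything that can name its own topic.
type Event interface{ Topic() string }

// One typed payload per kind of fact.
type ThreatAssessed struct{ EncounterID, Threat string }
type LootRolled struct{ EncounterID, LootTable string }

func (ThreatAssessed) Topic() string { return "encounter_threat_assessed" }
func (LootRolled) Topic() string     { return "encounter_loot_rolled" }

// router is a hypothetical in-memory dispatcher keyed by topic.
type router struct{ handlers map[string][]func(Event) }

func newRouter() *router {
	return &router{handlers: make(map[string][]func(Event))}
}

// on registers a handler for exactly one topic.
func (r *router) on(topic string, h func(Event)) {
	r.handlers[topic] = append(r.handlers[topic], h)
}

// publish delivers e only to handlers registered for e's topic.
func (r *router) publish(e Event) {
	for _, h := range r.handlers[e.Topic()] {
		h(e)
	}
}

// demoRouting publishes one loot event and one threat event, then reports
// how often the threat reactor woke up and which threat it saw.
func demoRouting() (wakes int, threat string) {
	r := newRouter()
	r.on(ThreatAssessed{}.Topic(), func(e Event) {
		wakes++
		threat = e.(ThreatAssessed).Threat // the topic implies the type
	})
	r.publish(LootRolled{EncounterID: "enc-1", LootTable: "bandit-a"})
	r.publish(ThreatAssessed{EncounterID: "enc-1", Threat: "minor"})
	return wakes, threat
}

func main() {
	wakes, threat := demoRouting()
	fmt.Println(wakes, threat)
}
```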
Fan out in parallel
Nanos want concurrency. The standard shape is a sync.WaitGroup with one goroutine per axis, each gated on a rate limiter and each logging and continuing on failure.
// The fan-out skeleton: N nanos in parallel, one atomic event per result.
var wg sync.WaitGroup
for _, axis := range encounterAxes { // threat, terrain, loot, xp, abilities
	wg.Add(1)
	go func(axis enrichAxis) {
		defer wg.Done()

		// 1. Wait for the nano-tier rate limiter. It gates throughput so
		//    parallel goroutines don't overwhelm the provider.
		if err := WaitNano(ctx); err != nil {
			return // context cancelled
		}

		// 2. Call the nano. The output is one small struct. A Go variable
		//    cannot take its type from a struct field, so this skeleton
		//    assumes a hypothetical axis.newResult() that returns a pointer
		//    to the axis's result type.
		out := axis.newResult()
		_, err := w.Execute(ctx, axis.agent,
			weave.Context{"encounter": encounter},
			out,
		)
		if err != nil {
			// 3. Log and continue: one axis failing must not block siblings.
			slog.Warn("nano failed", "agent", axis.agent.Name(), "err", err)
			return
		}

		// 4. Emit one atomic event. If this axis is retried later,
		//    the append is idempotent by topic + payload.
		axis.appendResult(ctx, dungeonScroll, encounter.ID, out)
	}(axis)
}
wg.Wait()

The four-step rhythm (wait, execute, log-and-return, emit) is the same in every enricher. The only things that change per axis are the agent, the output type, and the append call. That regularity is why this pattern scales across twenty enrichers without becoming a pile of idiosyncratic loops.
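If you want to run the shape without the SDK, here is a self-contained stand-in. It replaces the nano call with stub functions and collects results in a map instead of appending events. The `axis` struct, `fanOut`, `fixed`, and `demoFanOut` are hypothetical names, and the rate-limiter wait is omitted for brevity.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"sync"
)

// axis pairs a name with a stub standing in for the nano call.
type axis struct {
	name string
	run  func(ctx context.Context, encounter string) (string, error)
}

// fanOut runs every axis concurrently. A failing axis is recorded and
// skipped; it never blocks or cancels its siblings.
func fanOut(ctx context.Context, encounter string, axes []axis) (map[string]string, []string) {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex // guards results and failed
		results = make(map[string]string)
		failed  []string
	)
	for _, a := range axes {
		wg.Add(1)
		go func(a axis) {
			defer wg.Done()
			out, err := a.run(ctx, encounter)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				failed = append(failed, a.name)
				return
			}
			results[a.name] = out
		}(a)
	}
	wg.Wait()
	sort.Strings(failed) // goroutine order is nondeterministic
	return results, failed
}

// fixed returns a stub nano that always answers v.
func fixed(v string) func(context.Context, string) (string, error) {
	return func(context.Context, string) (string, error) { return v, nil }
}

// demoFanOut enriches one encounter across five axes, with xp timing out.
func demoFanOut() (map[string]string, []string) {
	axes := []axis{
		{"threat", fixed("minor")},
		{"terrain", fixed("forest")},
		{"loot", fixed("bandit-a")},
		{"xp", func(context.Context, string) (string, error) {
			return "", errors.New("timeout")
		}},
		{"abilities", fixed("ambush")},
	}
	return fanOut(context.Background(), "three bandits at a river crossing", axes)
}

func main() {
	results, failed := demoFanOut()
	fmt.Println(len(results), failed)
}
```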
Rate limiting by tier
Parallelism without rate limiting overwhelms the provider. Two tiers of limiter match the two tiers of agent — nanos get their own budget, mini-tier reconcilers get another so they don't starve the nanos.
// Two tiers of rate limiter, matching the two tiers of agent.
// Nano — 25 rps, burst 5 (cheap, parallelizable enrichment)
// Mini — 10 rps, burst 3 (reasoning, reconciliation)
limiter := pipeline.NewLLMLimiter()
ctx = pipeline.WithLimiter(ctx, limiter)
// Every Execute call is preceded by the appropriate wait.
if err := WaitNano(ctx); err != nil { return err }
_, err := w.Execute(ctx, assessThreat, weave.Context{...}, &out)
// Mini-tier reconcilers use a separate wait to avoid starving the nanos.
if err := WaitMini(ctx); err != nil { return err }
_, err = w.Execute(ctx, reconcileQuest, weave.Context{...}, &verdict)
// Wait functions are no-ops when no limiter is on the context,
// so tests don't need to install one.

Today the limiter lives in application code and is injected via context. Lifting it into the weave SDK so every Execute call is automatically gated by its agent's tier is on the roadmap; until then, the WaitNano / WaitMini pattern is the convention.
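The real limiter is application code that isn't shown here, so the following is only a guess at the shape: a context-carried limiter with one ticker per tier, and wait functions that fall through when no limiter is installed. `llmLimiter`, `newLLMLimiter`, `withLimiter`, and the bodies of `WaitNano` and `WaitMini` are assumptions. The ticker approach has no burst allowance; a token-bucket limiter such as `golang.org/x/time/rate` would be the more likely choice if you need one.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type limiterKey struct{}

// llmLimiter holds one ticker per tier: one permit per tick, no burst.
type llmLimiter struct{ nano, mini *time.Ticker }

// newLLMLimiter builds a limiter; both rates must be greater than zero.
func newLLMLimiter(nanoRPS, miniRPS int) *llmLimiter {
	return &llmLimiter{
		nano: time.NewTicker(time.Second / time.Duration(nanoRPS)),
		mini: time.NewTicker(time.Second / time.Duration(miniRPS)),
	}
}

func (l *llmLimiter) stop() { l.nano.Stop(); l.mini.Stop() }

func withLimiter(ctx context.Context, l *llmLimiter) context.Context {
	return context.WithValue(ctx, limiterKey{}, l)
}

// wait blocks until the chosen tier releases a permit or ctx is done.
// With no limiter on the context it returns nil immediately.
func wait(ctx context.Context, tier func(*llmLimiter) *time.Ticker) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	l, ok := ctx.Value(limiterKey{}).(*llmLimiter)
	if !ok {
		return nil
	}
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-tier(l).C:
		return nil
	}
}

func WaitNano(ctx context.Context) error {
	return wait(ctx, func(l *llmLimiter) *time.Ticker { return l.nano })
}

func WaitMini(ctx context.Context) error {
	return wait(ctx, func(l *llmLimiter) *time.Ticker { return l.mini })
}

// demoLimiter exercises the three cases: no limiter, gated, cancelled.
func demoLimiter() (unlimited error, elapsed time.Duration, cancelled error) {
	unlimited = WaitNano(context.Background()) // no limiter installed

	l := newLLMLimiter(100, 50) // fast rates so the demo finishes quickly
	defer l.stop()
	ctx := withLimiter(context.Background(), l)

	start := time.Now()
	for i := 0; i < 3; i++ {
		_ = WaitNano(ctx)
	}
	elapsed = time.Since(start)

	cctx, cancel := context.WithCancel(ctx)
	cancel()
	cancelled = WaitMini(cctx)
	return unlimited, elapsed, cancelled
}

func main() {
	unlimited, elapsed, cancelled := demoLimiter()
	fmt.Println(unlimited, elapsed > 0, cancelled)
}
```

Keeping the tiers on separate tickers is what stops a burst of nano calls from consuming the mini-tier budget, and the reverse.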
Why the scroll makes partial failure safe
Each nano's result lands as its own atomic event. A timeout on one axis leaves the other four happy on the scroll; retrying is surgical.
// What partial failure looks like on the scroll.
dungeon:castle-ravenfell
──────────────────────────────────────────
[42] encounter_threat_assessed { threat: "major" } // nano 1 ok
[43] encounter_terrain_classified { terrain: "cave" } // nano 2 ok
[44] encounter_loot_rolled { lootTable: "dragon-a" } // nano 3 ok
<nano 4 timed out — no event appended>
[45] encounter_ability_added { ability: "frightful" } // nano 5 ok
// Retry is surgical: re-run nano 4 against the same inputs and emit its
// event. Nothing upstream needs to care. This is what mega-prompt outputs
// cannot do: in a twelve-field blob, one timeout takes out all twelve.

Contrast this with a mega-prompt: one timeout, the entire twelve-field output is lost, and your retry has to redo the eleven axes that already succeeded. The scroll's append-only grain is what makes per-axis retry the natural move instead of a heroic one.
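Working out what to retry can be a small fold over the scroll. The sketch below assumes events are available as an in-memory slice and that an axis counts as done once at least one event with its topic exists for the encounter. `scrollEvent`, `enrichmentTopics`, and `missingTopics` are hypothetical names, and the encounter ID is invented for the example.

```go
package main

import "fmt"

// scrollEvent is a simplified, hypothetical view of one scroll entry.
type scrollEvent struct {
	Seq         int
	Topic       string
	EncounterID string
}

// enrichmentTopics lists the five topics an enriched encounter should have.
var enrichmentTopics = []string{
	"encounter_threat_assessed",
	"encounter_terrain_classified",
	"encounter_loot_rolled",
	"encounter_xp_awarded",
	"encounter_ability_added",
}

// missingTopics folds over the scroll and returns the topics that have no
// event yet for the given encounter, in enrichmentTopics order.
func missingTopics(scroll []scrollEvent, encounterID string) []string {
	seen := make(map[string]bool)
	for _, e := range scroll {
		if e.EncounterID == encounterID {
			seen[e.Topic] = true
		}
	}
	var missing []string
	for _, t := range enrichmentTopics {
		if !seen[t] {
			missing = append(missing, t)
		}
	}
	return missing
}

// demoScroll mirrors the partial failure above: nano 4 (xp) never landed.
func demoScroll() []scrollEvent {
	return []scrollEvent{
		{42, "encounter_threat_assessed", "enc-1"},
		{43, "encounter_terrain_classified", "enc-1"},
		{44, "encounter_loot_rolled", "enc-1"},
		{45, "encounter_ability_added", "enc-1"},
	}
}

func main() {
	fmt.Println(missingTopics(demoScroll(), "enc-1"))
}
```

The retry list comes from what is absent on the scroll, so no separate failure bookkeeping is needed.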
When not to decompose
- The output fields genuinely co-depend. If axis B's answer must consider what axis A said, a single reasoning call (mini tier) is honest. Don't fake independence by splitting into nanos that then need each other's output to be correct.
- The axes share a single expensive input. If each nano needs to re-read a 10k-token document, you'll burn the document's token cost N times. Either cache at the prompt layer, or consolidate into one call and accept the coupling.
- You only have one axis. If the work actually is a single output, one nano is the whole plan. Don't add fan-out machinery for a single call.
What's next
- Agent concept: tiers and the WithOutput contract that make nanos composable.
- Intent-driven pipelines: where the fan-out pattern slots in as the "enricher" reactor role.
- Reading scrolls with folds: how an enricher reads the state of prior nanos without a projection query.