Guide · 4

Nano decomposition

One agent, one output, one event. The pattern that replaces a twelve-field mega-prompt with twelve nanos that run in parallel, cache separately, and fail independently. We'll enrich a monster encounter across five axes to see it work.

The scenario

A new encounter just got added to a dungeon: "three bandits at a river crossing." Before it shows up on the DM's prep sheet, you want it enriched — threat tier, terrain, loot table, XP award, special abilities. Five axes, five short answers.

The instinct is to write one agent that returns all five fields in one JSON blob. It's less code. It's also the shape that breaks first, caches worst, and hides which axis went wrong. The alternative is to decompose.

The 1:1 rule

One agent produces one kind of answer. One nano per axis of enrichment, one structured output per nano, one event per result naming the fact that was established. If your output struct holds more than a couple of values serving different purposes, you probably need more nanos.

// The mega-agent instinct — one call, twelve outputs, everything coupled.
type EncounterDetails struct {
    Threat       string   `json:"threat"`        // trivial | minor | major | deadly
    Terrain      string   `json:"terrain"`       // forest | mountain | cave | city
    LootTable    string   `json:"lootTable"`     // bandit-a | dragon-hoard | ...
    XP           int      `json:"xp"`
    Abilities    []string `json:"abilities"`
    Difficulty   int      `json:"difficulty"`    // 1–20
    Surprise     bool     `json:"surprise"`
    // ... and so on. All twelve axes share one cache key, one retry, one blast radius.
}

// The nano shape — one output per agent. Twelve agents.
type ThreatResult  struct { Threat string  `json:"threat"` }
type TerrainResult struct { Terrain string `json:"terrain"` }
type LootResult    struct { LootTable string `json:"lootTable"` }
type XPResult      struct { XP int         `json:"xp"` }
// ... each axis its own agent, its own output type, its own cache entry.

The nano tier — gpt-5-nano or equivalent — is fast and cheap enough that "just make another one" is the right move. Decomposition is the payoff, not the cost.

Why bother

Four things get better the moment the twelve-field blob becomes twelve nanos.

Independently retriable

One axis times out, you retry just that axis. In a mega-prompt, one timeout takes out all twelve outputs and you start the whole thing over.
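
A per-axis retry can be as small as the helper below. This is a minimal sketch, not the guide's own code: `retryAxis`, its parameters, and the simulated XP nano are all hypothetical names introduced for illustration. The function passed in stands for one nano call.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryAxis (hypothetical helper) calls fn up to attempts times, pausing
// between tries, and stops early if ctx is cancelled while waiting.
func retryAxis(ctx context.Context, attempts int, pause time.Duration, fn func(context.Context) error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(ctx); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // out of attempts; report the last error
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(pause):
		}
	}
	return err
}

// demo simulates an XP nano that times out twice and then answers.
func demo() (calls int, err error) {
	xpNano := func(context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("timeout")
		}
		return nil
	}
	err = retryAxis(context.Background(), 5, time.Millisecond, xpNano)
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err) // 3 <nil>
}
```

Only the axis that failed is called again; the sibling axes never see the retry.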

Independently cacheable

A nano's inputs fit on a postcard — same encounter, same answer, cache hit. Mega-prompts rarely hit a cache because something in the 2,000-token input always shifts.
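
One way to picture a per-nano cache entry is a key derived from the agent name plus its small input. The sketch below is an assumption about how such a key could be built, not a description of how the weave SDK caches; `nanoInput`, `cacheKey`, and the agent names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// nanoInput is a hypothetical postcard-sized input for one nano.
type nanoInput struct {
	Description string `json:"description"`
}

// cacheKey hashes the agent name together with the JSON form of its input,
// so each nano gets its own entry for the same encounter.
func cacheKey(agent string, in nanoInput) (string, error) {
	payload, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	h := sha256.New()
	h.Write([]byte(agent))
	h.Write([]byte{0}) // separator between agent name and payload
	h.Write(payload)
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	in := nanoInput{Description: "three bandits at a river crossing"}
	threatKey, _ := cacheKey("assess_threat", in)
	again, _ := cacheKey("assess_threat", in)
	terrainKey, _ := cacheKey("classify_terrain", in)

	fmt.Println(threatKey == again)      // true: same agent, same input
	fmt.Println(threatKey == terrainKey) // false: different agent
}
```

Because the input is a single short field, nothing unrelated can shift and invalidate the key, which is the property the mega-prompt lacks.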

Independently parallelizable

N nanos run concurrently, bounded only by the rate limiter. A mega-prompt is one sequential call — parallelism caps at one.

Debuggable

Each nano has a typed input and a typed output you can inspect in isolation. Finding which axis of a twelve-output blob went wrong means diffing JSON. Finding which nano failed means reading the log line.

Decompose the events too

Decomposition only pays off if the events on the scroll are split the same way. Each nano writes its own event type, and each topic names one kind of domain fact: a threat assessment, a terrain classification, a loot roll. They earn separate topics because they're different kinds of happenings, not because they each mutate a separate field somewhere downstream.

// RIGHT: one topic per kind of fact. Each nano's result is a distinct
// kind of judgment about the encounter — threat, terrain, loot, XP.
// They earn separate topics because a reactor that cares about threat
// has no reason to wake up for a loot roll.
event "encounter_threat_assessed"    { encounterId string; threat string }
event "encounter_terrain_classified" { encounterId string; terrain string }
event "encounter_loot_rolled"        { encounterId string; lootTable string }
event "encounter_xp_awarded"         { encounterId string; xp int }
event "encounter_ability_added"      { encounterId string; ability string }

// WRONG: one generic topic carrying several different kinds of fact
// behind a discriminator. Consumers have to switch on `axis` to
// recover what actually happened — the topic stopped routing.
// event "encounter_enriched" {
//     axis  string  // "threat" | "terrain" | "loot" | "xp" | ...
//     value string  // stringly-typed, different shape per axis
// }

The wrong version looks compact in the event definition, but every downstream consumer now needs a switch on axis to decide what the event means. That switch is covering for several different kinds of fact sharing one topic — exactly the anti-pattern the event page calls out.
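
The routing argument can be illustrated with a toy in-memory dispatcher. This is a sketch under stated assumptions: `bus`, `subscribe`, `publish`, and the Go payload structs are hypothetical stand-ins, not the scroll's real API. It shows only that a reactor subscribed to the threat topic is never invoked for a loot roll.

```go
package main

import "fmt"

// Hypothetical Go mirrors of two of the event payloads above.
type ThreatAssessed struct {
	EncounterID string
	Threat      string
}

type LootRolled struct {
	EncounterID string
	LootTable   string
}

// bus is a toy dispatcher: one handler list per topic.
type bus struct {
	handlers map[string][]func(payload any)
}

func newBus() *bus {
	return &bus{handlers: make(map[string][]func(payload any))}
}

func (b *bus) subscribe(topic string, h func(payload any)) {
	b.handlers[topic] = append(b.handlers[topic], h)
}

// publish wakes only the handlers registered for this exact topic.
func (b *bus) publish(topic string, payload any) {
	for _, h := range b.handlers[topic] {
		h(payload)
	}
}

// demo returns every threat value the threat reactor was woken for.
func demo() []string {
	b := newBus()
	var seen []string
	b.subscribe("encounter_threat_assessed", func(p any) {
		seen = append(seen, p.(ThreatAssessed).Threat) // no switch on an axis field
	})
	b.publish("encounter_threat_assessed", ThreatAssessed{EncounterID: "enc-1", Threat: "minor"})
	b.publish("encounter_loot_rolled", LootRolled{EncounterID: "enc-1", LootTable: "bandit-a"})
	return seen
}

func main() {
	fmt.Println(demo()) // [minor]
}
```

With a single generic topic, the same handler would run for both publishes and would need a discriminator check before it could do anything.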

Fan out in parallel

Nanos want concurrency. The standard shape is a sync.WaitGroup with one goroutine per axis. Each goroutine waits on a rate limiter first, and on failure it logs and returns so its siblings keep going.

// The fan-out skeleton — N nanos in parallel, one atomic event per result.
var wg sync.WaitGroup
for _, axis := range encounterAxes {    // threat, terrain, loot, xp, abilities
    wg.Add(1)
    go func(axis enrichAxis) {
        defer wg.Done()

        // 1. Wait for the nano-tier rate limiter — gates throughput so
        //    parallel goroutines don't overwhelm the provider.
        if err := WaitNano(ctx); err != nil {
            return // context cancelled
        }

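        // 2. Call the nano. The output is one small struct.
        //    Go cannot declare a variable from a type stored in a field,
        //    so this assumes a (hypothetical) newResult constructor that
        //    returns a pointer to this axis's result type.
        out := axis.newResult()
        _, err := w.Execute(ctx, axis.agent,
            weave.Context{"encounter": encounter},
            out,
        )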
        if err != nil {
            // 3. Log and continue — one axis failing must not block siblings.
            slog.Warn("nano failed", "agent", axis.agent.Name(), "err", err)
            return
        }

        // 4. Emit one atomic event. If this axis is retried later,
        //    the append is idempotent by topic + payload.
        axis.appendResult(ctx, dungeonScroll, encounter.ID, out)
    }(axis)
}
wg.Wait()

The four-step rhythm (wait, execute, log-and-return, emit) is the same in every enricher. The only things that change per axis are the agent, the output type, and the append call. That regularity is why this pattern scales across twenty enrichers without becoming a pile of idiosyncratic loops.
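
For readers who want to run the shape end to end, here is a self-contained version with the SDK pieces stubbed out. It is a sketch under assumptions: `nanoFunc` stands in for an Execute call, a mutex-guarded map stands in for appending to the scroll, and the rate-limiter wait is left out for brevity. None of these names come from the weave SDK.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log/slog"
	"sort"
	"sync"
)

// nanoFunc is a hypothetical stand-in for one Execute call against one nano.
type nanoFunc func(ctx context.Context, encounter string) (string, error)

// enrichAxis pairs a topic with the nano that establishes that fact.
type enrichAxis struct {
	topic string
	run   nanoFunc
}

// fanOut runs every axis concurrently. Failures are logged and skipped,
// so one axis cannot block or discard its siblings' results.
func fanOut(ctx context.Context, encounter string, axes []enrichAxis) map[string]string {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		results = make(map[string]string, len(axes))
	)
	for _, axis := range axes {
		wg.Add(1)
		go func(axis enrichAxis) {
			defer wg.Done()
			out, err := axis.run(ctx, encounter)
			if err != nil {
				slog.Warn("nano failed", "topic", axis.topic, "err", err)
				return
			}
			mu.Lock()
			results[axis.topic] = out // stands in for appending one event
			mu.Unlock()
		}(axis)
	}
	wg.Wait()
	return results
}

// demo runs three axes, one of which times out, and returns the topics
// that produced a result, sorted for stable output.
func demo() []string {
	ok := func(v string) nanoFunc {
		return func(context.Context, string) (string, error) { return v, nil }
	}
	fail := func(context.Context, string) (string, error) {
		return "", errors.New("timeout")
	}
	axes := []enrichAxis{
		{topic: "encounter_threat_assessed", run: ok("minor")},
		{topic: "encounter_terrain_classified", run: ok("forest")},
		{topic: "encounter_xp_awarded", run: fail},
	}
	results := fanOut(context.Background(), "three bandits at a river crossing", axes)
	topics := make([]string, 0, len(results))
	for t := range results {
		topics = append(topics, t)
	}
	sort.Strings(topics)
	return topics
}

func main() {
	fmt.Println(demo()) // [encounter_terrain_classified encounter_threat_assessed]
}
```

The failed XP axis produces a warning log line and nothing else; the two successful axes land regardless.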

Rate limiting by tier

Parallelism without rate limiting overwhelms the provider. Two tiers of limiter match the two tiers of agent — nanos get their own budget, mini-tier reconcilers get another so they don't starve the nanos.

// Two tiers of rate limiter, matching the two tiers of agent.
// Nano  — 25 rps, burst 5   (cheap, parallelizable enrichment)
// Mini  — 10 rps, burst 3   (reasoning, reconciliation)

limiter := pipeline.NewLLMLimiter()
ctx = pipeline.WithLimiter(ctx, limiter)

// Every Execute call is preceded by the appropriate wait.
if err := WaitNano(ctx); err != nil { return err }
_, err := w.Execute(ctx, assessThreat, weave.Context{...}, &out)

// Mini-tier reconcilers use a separate wait to avoid starving the nanos.
if err := WaitMini(ctx); err != nil { return err }
_, err = w.Execute(ctx, reconcileQuest, weave.Context{...}, &verdict)

// Wait functions are no-ops when no limiter is on the context —
// tests don't need to install one.

Today the limiter lives in application code and is injected via context. Lifting it into the weave SDK so every Execute call is automatically gated by its agent's tier is on the roadmap — until then, the WaitNano / WaitMini pattern is the convention.
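
The context-injected convention could look roughly like the sketch below. It is an assumption about the shape, not the application's actual code: `waiter`, `tierLimiter`, `withLimiter`, `waitNano`, and `waitMini` are hypothetical lowercase re-implementations, and `countingWaiter` is a fake used only to make the demo observable. In real code, a `*rate.Limiter` from `golang.org/x/time/rate` has a matching `Wait(ctx)` method and could fill the `waiter` slots, for example one built with `rate.NewLimiter(25, 5)` for the nano tier.

```go
package main

import (
	"context"
	"fmt"
)

// waiter is the one method this sketch needs from a rate limiter.
type waiter interface {
	Wait(ctx context.Context) error
}

// tierLimiter (hypothetical) holds one waiter per agent tier.
type tierLimiter struct {
	nano, mini waiter
}

type limiterKey struct{}

// withLimiter returns a context that carries the limiter.
func withLimiter(ctx context.Context, l *tierLimiter) context.Context {
	return context.WithValue(ctx, limiterKey{}, l)
}

// waitNano blocks on the nano budget, or is a no-op if none is installed.
func waitNano(ctx context.Context) error {
	l, ok := ctx.Value(limiterKey{}).(*tierLimiter)
	if !ok || l == nil || l.nano == nil {
		return nil
	}
	return l.nano.Wait(ctx)
}

// waitMini does the same against the separate mini-tier budget.
func waitMini(ctx context.Context) error {
	l, ok := ctx.Value(limiterKey{}).(*tierLimiter)
	if !ok || l == nil || l.mini == nil {
		return nil
	}
	return l.mini.Wait(ctx)
}

// countingWaiter is a fake limiter that only counts how often it is hit.
type countingWaiter struct{ calls int }

func (c *countingWaiter) Wait(ctx context.Context) error {
	c.calls++
	return ctx.Err()
}

// demo shows that each tier draws on its own budget and that a bare
// context (as in tests) passes straight through.
func demo() (nanoCalls, miniCalls int, bareErr error) {
	nano, mini := &countingWaiter{}, &countingWaiter{}
	ctx := withLimiter(context.Background(), &tierLimiter{nano: nano, mini: mini})
	_ = waitNano(ctx)
	_ = waitNano(ctx)
	_ = waitMini(ctx)
	bareErr = waitNano(context.Background())
	return nano.calls, mini.calls, bareErr
}

func main() {
	fmt.Println(demo()) // 2 1 <nil>
}
```

Keeping the two tiers behind separate waiters is what prevents a burst of nano calls from consuming the budget a reconciler needs, and vice versa.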

Why the scroll makes partial failure safe

Each nano's result lands as its own atomic event. A timeout on one axis leaves the other four happy on the scroll; retrying is surgical.

// What partial failure looks like on the scroll.
dungeon:castle-ravenfell
──────────────────────────────────────────
[42]  encounter_threat_assessed     { threat: "major" }         // nano 1 ok
[43]  encounter_terrain_classified  { terrain: "cave" }         // nano 2 ok
[44]  encounter_loot_rolled         { lootTable: "dragon-a" }   // nano 3 ok
   <nano 4 (XP) timed out; no event appended>
[45]  encounter_ability_added       { ability: "frightful" }    // nano 5 ok

// Retry is surgical: re-run nano 4 against the same inputs and emit its
// event. Nothing upstream needs to care. This is what mega-prompt outputs
// cannot do: in a twelve-field blob, one timeout takes out all twelve.

Contrast this with a mega-prompt: after one timeout the entire twelve-field output is lost, and your retry has to redo the eleven axes that would otherwise have succeeded. The scroll's append-only grain is what makes per-axis retry the natural move instead of a heroic one.
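
Deciding what to retry can then be a plain set difference between the topics you expect and the topics already on the scroll. The sketch below assumes a minimal, hypothetical `scrollEvent` view and a `missingTopics` helper; the real scroll's read API is not shown in this guide and may look different.

```go
package main

import "fmt"

// scrollEvent is a hypothetical minimal view of one event on the scroll.
type scrollEvent struct {
	Topic       string
	EncounterID string
}

// missingTopics returns, in expected order, the topics that have no event
// yet for the given encounter. Those are the only nanos worth re-running.
func missingTopics(events []scrollEvent, encounterID string, expected []string) []string {
	seen := make(map[string]bool, len(expected))
	for _, ev := range events {
		if ev.EncounterID == encounterID {
			seen[ev.Topic] = true
		}
	}
	var missing []string
	for _, topic := range expected {
		if !seen[topic] {
			missing = append(missing, topic)
		}
	}
	return missing
}

func main() {
	expected := []string{
		"encounter_threat_assessed",
		"encounter_terrain_classified",
		"encounter_loot_rolled",
		"encounter_xp_awarded",
		"encounter_ability_added",
	}
	// The partial-failure scroll from above: everything landed except XP.
	scroll := []scrollEvent{
		{Topic: "encounter_threat_assessed", EncounterID: "enc-1"},
		{Topic: "encounter_terrain_classified", EncounterID: "enc-1"},
		{Topic: "encounter_loot_rolled", EncounterID: "enc-1"},
		{Topic: "encounter_ability_added", EncounterID: "enc-1"},
	}
	fmt.Println(missingTopics(scroll, "enc-1", expected)) // [encounter_xp_awarded]
}
```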

When not to decompose

  • The output fields genuinely co-depend. If axis B's answer must consider what axis A said, a single reasoning call (mini tier) is honest. Don't fake independence by splitting into nanos that then need each other's output to be correct.
  • The axes share a single expensive input. If each nano needs to re-read a 10k-token document, you'll burn the document's token cost N times. Either cache at the prompt layer, or consolidate into one call and accept the coupling.
  • You only have one axis. If the work actually is a single output, one nano is the whole plan. Don't add fan-out machinery for a single call.

What's next