The LLM-calling node

Every node in weave is either driven by events (a reactor) or driven by a single call that returns a structured result (an agent). An agent wraps the mechanics of the model round trip: build the messages, send the request, commit the response, decode into a typed struct. The whole loop is recorded on the scroll as an ai.request paired with an ai.response.

The output is not a free-form string. Every agent declares an output schema via WithOutput, and the runtime constrains the model to produce JSON that fits it. This is what makes agents composable — downstream reactors and projections can treat the result as data without a parsing step.

// An agent is a node that calls a model with a typed output contract.
type RumorTags struct {
    Tags []string `json:"tags"`
}

var rumorTagger = weave.Agent("rumor-tagger",
    weave.WithSystem(`Classify this tavern rumor into 1-3 short tags (e.g. "dragon-sighting", "missing-heir", "haunted-ruin").`),
    weave.WithPrompt(func(ctx weave.Context) string {
        rumor, _ := ctx["rumor"].(string) // comma-ok form: a missing or non-string input yields "" instead of a panic
        return "Rumor:\n\n" + rumor
    }),
    weave.WithOutput(RumorTags{}),
)

// Register with the weave engine alongside reactors and other agents.
if err := w.Register(rumorTagger); err != nil { return err }

Anatomy

An agent has six moving parts. Only the name is required; everything else is a sensible default you override when you mean to.

Name. A stable identifier. Used in the scroll, in traces, and to look up the agent in the registry.
System prompt. The model's instructions. Usually loaded from a .tmpl file at init.
User prompt. A string with {{key}} interpolation, or a func(Context) string for dynamic rendering.
Output schema. A zero-value struct with json tags. The runtime derives a JSON schema and constrains the model to produce matching output.
Tools. Zero or more. Zero means a single-shot call; one or more turns the agent into a tool loop.
Model. Optional per-agent override via WithModel. Empty falls back to the Weave-instance default.
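
The string form of the user prompt relies on {{key}} interpolation against the inputs bag. A minimal sketch of that substitution follows. It assumes Context is a string-keyed map, as the indexing in the examples suggests, and render is a hypothetical name rather than part of the weave API.

```go
package main

import (
	"fmt"
	"regexp"
)

// Context stands in for weave.Context, assumed here to be a string-keyed bag.
type Context map[string]any

// placeholder matches {{key}}, allowing optional spaces inside the braces.
var placeholder = regexp.MustCompile(`\{\{\s*(\w+)\s*\}\}`)

// render replaces each {{key}} with the matching Context value.
// Unknown keys are left untouched so a missing input stays visible in the prompt.
func render(tmpl string, ctx Context) string {
	return placeholder.ReplaceAllStringFunc(tmpl, func(m string) string {
		key := placeholder.FindStringSubmatch(m)[1]
		v, ok := ctx[key]
		if !ok {
			return m
		}
		return fmt.Sprint(v)
	})
}

func main() {
	fmt.Println(render("Rumor: {{rumor}}", Context{"rumor": "A dragon circles the old keep."}))
	// Rumor: A dragon circles the old keep.
}
```

Leaving unknown placeholders in place is a design choice of this sketch; a stricter runtime might return an error instead.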

Decomposition is the load-bearing move

The instinct to build one big agent with a rich output struct is the instinct to resist. A 12-field output is a 12-axis prompt: all 12 axes fail together, cache together, and have to be debugged together. The production pattern that scales is to decompose into the smallest useful unit and fan out.

Nano

gpt-5-nano

One structured output. Scoring, classification, single-enum generation, one-sentence synthesis.

Rule of thumb. Default for enrichment. Output struct has 1–3 fields. If you need more, you need more nanos.

Mini

gpt-5-mini

Multi-step reasoning inside a bounded domain. Reconcilers, extractors, categorizers.

Rule of thumb. Opt in with WithModel. Output still a typed verdict — reasoning lives in the chain of thought, not the schema.

Full

latest frontier

Only for the one conversational agent that holds the user loop. Multi-turn, tool-calling, open-ended.

Rule of thumb. There is usually exactly one of these per product. Everything else should decompose.

// Three tiers — one output per agent is the load-bearing rule.

// Nano — single structured output. The default for enrichment and scoring.
type ThreatResult struct {
    Threat string `json:"threat" jsonschema:"description=trivial|minor|major|deadly"`
}
var assessThreat = weave.Agent("assess-monster-threat",
    weave.WithSystem(threatPrompt),
    weave.WithPrompt(renderEncounter),
    weave.WithOutput(ThreatResult{}),
    weave.WithModel(GPT5Nano),        // or inherit from w.SetDefaultModel
)

// Mini — multi-step reasoning inside a bounded domain (reconcilers, extractors).
type Verdict struct {
    Kind   string `json:"kind"` // accept | reject | duplicate | supersede
    Reason string `json:"reason"`
}
var reconcileQuest = weave.Agent("reconcile-quest",
    weave.WithSystem(dungeonMasterPrompt),
    weave.WithPrompt(renderProposalAndAccepted),
    weave.WithOutput(Verdict{}),
    weave.WithModel(GPT5Mini),
)

// Full — reserved for the one conversational agent that holds the user loop.
// Tool-calling, multi-turn, open-ended.
var innkeeper = weave.Agent("innkeeper",
    weave.WithSystem(innkeeperSystemPrompt),
    weave.WithPrompt(func(ctx weave.Context) string { return innkeeperPrompt }),
    weave.WithTool(proposeQuest),
    weave.WithTool(proposeReward),
    weave.WithModel(GPT5Full),
)

The tiers are a convention, not a runtime distinction. They are useful because nano is cheap, parallelizable, and cacheable in a way that mega-agents never are. Ship nanos by default; reach for mini only when a single output needs real reasoning; reserve full for the one conversation that genuinely needs it.

Executing — inputs go in, typed output comes out

An agent does nothing until w.Execute is called. You pass an inputs bag and a pointer to the output struct; the runtime hydrates the prompt, calls the model, decodes into the pointer, and records everything on a scroll.

// Executing an agent — pass an inputs bag and a pointer to the output struct.
var result RumorTags
_, err := w.Execute(ctx, rumorTagger,
    weave.Context{"rumor": rumor},
    &result,
)
if err != nil { return err }
// result.Tags is now populated, the call landed on the scroll, and
// replay will skip the network round-trip if the ai.response is recorded.

The scroll is the unit of replay. Execute the same agent with the same inputs against a scroll that already has the ai.response and the runtime returns the recorded answer — no network call, deterministic output, identical test fixture every run.

Fan-out, not mega-prompts

The fan-out pattern is the counterpart to decomposition. Inside a reactor, launch N nanos in parallel with a sync.WaitGroup, gate each one on a rate limiter, and log-and-continue on per-call failure. One axis timing out shouldn't sink the other eleven.

// Fan-out: four nanos in parallel, one atomic event per result.
var wg sync.WaitGroup
for _, axis := range encounterAxes {    // threat, terrain, loot, xp
    wg.Add(1)
    go func(axis enrichAxis) {
        defer wg.Done()

        if err := WaitNano(ctx); err != nil {  // rate-limit against the nano tier
            return
        }

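        // A struct field cannot name a type in Go, so each axis supplies a pointer
        // to its own result struct. newResult is an assumed helper on enrichAxis.
        out := axis.newResult()
        _, err := w.Execute(ctx, axis.agent,
            weave.Context{"encounter": encounter},
            out,
        )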
        if err != nil {
            slog.Warn("nano failed", "agent", axis.agent.Name(), "err", err)
            return  // one axis failing does not block the siblings
        }

        axis.appendResult(ctx, dungeonScroll, encounter.ID, out)
    }(axis)
}
wg.Wait()

Each result lands as its own atomic event on the domain scroll. Retrying a single failed axis is a one-line re-enqueue, not a re-run of the whole mega-prompt. This is the operational payoff of nano decomposition.

Tool loops, briefly

When an agent has WithTool attached, its execution becomes a loop rather than a single shot. The model may emit a tool call; the runtime commits a tool.dispatch, the handler commits a tool.result, and the loop continues until the model returns a final response. The agent still has a typed output contract — the loop ends when the output fits.

Tool loops are the one place where full-tier models earn their slot. Everything else — enrichment, classification, extraction, reconciliation — should be decomposed into single-output nanos.
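
The loop itself is small. The sketch below drives a scripted stand-in model until it returns a final response, logging the event names used above. The Turn type, runToolLoop, and the maxTurns cap are assumptions made for illustration; the text does not say how weave bounds the loop.

```go
package main

import "fmt"

// Turn is what the stand-in model returns: either a tool call or a final answer.
type Turn struct {
	Tool  string // non-empty means "call this tool"
	Arg   string
	Final string // used when Tool is empty
}

// runToolLoop drives the model until it returns a final response or maxTurns is hit.
// The log slice stands in for the scroll.
func runToolLoop(model func(history []string) Turn, tools map[string]func(string) string, maxTurns int) (string, []string, error) {
	var log, history []string
	for i := 0; i < maxTurns; i++ {
		turn := model(history)
		if turn.Tool == "" {
			log = append(log, "ai.response")
			return turn.Final, log, nil
		}
		handler, ok := tools[turn.Tool]
		if !ok {
			return "", log, fmt.Errorf("unknown tool %q", turn.Tool)
		}
		log = append(log, "tool.dispatch:"+turn.Tool)
		result := handler(turn.Arg)
		log = append(log, "tool.result:"+turn.Tool)
		history = append(history, result) // the result feeds the next model turn
	}
	return "", log, fmt.Errorf("no final response after %d turns", maxTurns)
}

func main() {
	// Scripted model: propose one quest, then answer.
	model := func(history []string) Turn {
		if len(history) == 0 {
			return Turn{Tool: "proposeQuest", Arg: "missing heir"}
		}
		return Turn{Final: "Quest posted: " + history[0]}
	}
	tools := map[string]func(string) string{
		"proposeQuest": func(arg string) string { return "find the " + arg },
	}
	final, log, err := runToolLoop(model, tools, 4)
	fmt.Println(final, log, err)
	// Quest posted: find the missing heir [tool.dispatch:proposeQuest tool.result:proposeQuest ai.response] <nil>
}
```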

Not to be confused with

A chain. A chain implies a linear sequence of model calls inside one abstraction. An agent is one call (or one tool loop). You compose agents with reactors and folds — the composition layer is the scroll, not an in-memory pipeline object.
A prompt template. A prompt template is a string. An agent is a name plus a contract plus a call. The template is a detail of WithPrompt; the rest of the agent is what makes the call meaningful.
An autonomous worker. An agent is a pure function of its inputs — one call returns one typed output. Autonomy comes from reactors subscribing to scrolls and driving agents over time, not from agents running themselves.

Status

The agent primitive is shipped and exercised in production (tavern, examples). The rate-limiter tier plumbing and name-indexed Execute are the remaining edges that surface once a pipeline has many agents.

weave.Agent(name, opts...) *Node

implemented

Single constructor. Returns a *Node that registers and executes the same way as reactors.

WithSystem / WithPrompt / WithOutput

implemented

System prompt, user prompt (string with {{key}} or func(Context) string), structured output schema from a zero-value struct with json tags.

WithTool — tool loop

implemented

Attach one or more tools and the agent becomes a multi-turn tool loop — dispatch + result events round-trip on the scroll until the model emits a final response.

WithModel — per-agent override

implemented

Pins a model for this agent; empty falls back to the Weave default set by SetDefaultModel. This is the tier selector.

w.Execute(ctx, agent, inputs, &out)

implemented

Runs the agent once on a fresh scroll. Hydrates inputs into the prompt, calls the model, decodes the output into the provided pointer, and persists the run.

Replay of ai.response

implemented

If the scroll already contains an ai.response for the step, the runtime skips the network call and re-uses it. This is the base case of replay.

Rate-limit tiers (WaitNano / WaitMini)

sdk-shimmed

Used in production pipelines today; lives in application code. Lifting the limiter into the weave SDK so every Execute call is gated by tier is next.

w.Execute by name (string) instead of *Node

designed

Pass an agent name string to Execute instead of the *Node pointer. A name-indexed lookup on the registry lets handlers reference agents without importing them.
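
Because this item is only designed, the following is a sketch of what a name-indexed registry might look like, not a description of the eventual API. Node here is a placeholder for *weave.Node, and Registry, NewRegistry, and Lookup are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Node is a placeholder for *weave.Node; only the name matters here.
type Node struct{ name string }

func (n *Node) Name() string { return n.name }

// Registry maps agent names to nodes.
type Registry struct {
	mu    sync.RWMutex
	nodes map[string]*Node
}

func NewRegistry() *Registry { return &Registry{nodes: map[string]*Node{}} }

// Register rejects duplicates so a name always resolves to exactly one node.
func (r *Registry) Register(n *Node) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.nodes[n.Name()]; dup {
		return fmt.Errorf("agent %q already registered", n.Name())
	}
	r.nodes[n.Name()] = n
	return nil
}

// Lookup is the step a name-based Execute would perform first.
func (r *Registry) Lookup(name string) (*Node, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	n, ok := r.nodes[name]
	if !ok {
		return nil, fmt.Errorf("no agent named %q", name)
	}
	return n, nil
}

func main() {
	reg := NewRegistry()
	_ = reg.Register(&Node{name: "rumor-tagger"})

	n, err := reg.Lookup("rumor-tagger")
	fmt.Println(n.Name(), err) // rumor-tagger <nil>

	_, err = reg.Lookup("innkeeper")
	fmt.Println(err) // no agent named "innkeeper"
}
```

A handler that holds only the string "rumor-tagger" no longer needs to import the package that defines the agent, which is the decoupling the design is after.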