Why a separate service

Scroll and runner are the hot path. Appends fast, reads fast, replays fast. Anything that slows that path is a tax on production. But production is only half the job — the other half is understanding production: diagnosing the incident, replaying a proposed fix, bisecting a prompt change, generating a regression test, sweeping a corpus nightly.

All of that is computationally heavy, LLM-backed, and wants its own indexes, caches, and spawned sandboxes. Scry owns it. Scroll and runner stay lean; Scry scales independently.

Two load-bearing properties

Diagnostic

Testability is conformance.

If a component can be tested via fork + substitute + replay, its inputs, outputs, and dependencies are all expressed as events on a scroll. If it resists those operations, it's leaking state somewhere — hidden globals, wall-clock coupling, a direct DB write, a cross-reactor shortcut. The testing story and the architectural-review story are the same story seen from different angles.
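
As a minimal sketch of what "resists those operations" can look like, the Go below contrasts a check that reads the wall clock with one that takes its notion of "now" from a recorded event. The Event type, the lease.granted and clock.tick event names, and both functions are hypothetical illustrations, not weave APIs.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical stand-in for a recorded scroll event.
type Event struct {
	Type string
	At   int64 // Unix seconds recorded on the scroll
	TTL  int64 // lifetime in seconds
}

// expiredLeaky reads the wall clock, so replaying the same scroll later
// can give a different answer than production saw.
func expiredLeaky(lease Event) bool {
	return time.Now().Unix() > lease.At+lease.TTL
}

// expiredReplayable takes "now" from a recorded tick event, so the result
// depends only on what is on the scroll.
func expiredReplayable(lease, tick Event) bool {
	return tick.At > lease.At+lease.TTL
}

func main() {
	lease := Event{Type: "lease.granted", At: 1_700_000_000, TTL: 60}
	tick := Event{Type: "clock.tick", At: 1_700_000_030}

	fmt.Println("replayable:", expiredReplayable(lease, tick)) // same on every replay
	fmt.Println("leaky:", expiredLeaky(lease))                 // depends on when it runs
}
```

The second form can be forked, substituted, and replayed because every input it uses is an event; the first cannot, which is the kind of leak a conformance check would be looking for.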

Fractal

Scry operates on Scry.

Scry's agents are regular weave.Agent nodes. Its tools are regular http.Handler subscribers. Its sessions are scrolls. When the diagnoser produces a bad explanation, fork its session, substitute the prompt, replay, diff. If Scry needs a capability weave doesn't expose, that gap is a weave bug — not a Scry deployment detail.

Capabilities

Each capability is a weave application built on the scroll substrate. They compose: conformance finds the anti-pattern, replay verifies the refactor, test-gen captures the regression, bisect minimizes the prompt fix, sandbox validates at corpus scale.

Conformance review

A guardrail agent that reads your .weave files, reactor code, and live scrolls and flags anti-patterns before they harden. Static and dynamic, in-loop with Claude Code.

Fork, substitute, replay

The four primitives everything else is built on: fork, substitute, replay, and diff. Clone a scroll, replace a recorded event, replay the workflow, and diff the outcomes, all without touching production.
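
To make the four operations concrete, here is a toy in-memory sketch. It assumes a scroll is just an ordered slice of events; the Event and Scroll types, the function signatures, and the user.request and quest.decided event names are all hypothetical, and the real primitives are expected to live behind scroll-server rather than in process.

```go
package main

import "fmt"

// Event and Scroll are hypothetical in-memory stand-ins for illustration.
type Event struct {
	Seq   int
	Type  string
	Value string
}

type Scroll []Event

// Fork copies events up to and including seq at, leaving the original untouched.
func Fork(s Scroll, at int) Scroll {
	out := make(Scroll, 0, len(s))
	for _, e := range s {
		if e.Seq <= at {
			out = append(out, e)
		}
	}
	return out
}

// Substitute returns a copy with the first event of the given type replaced.
func Substitute(s Scroll, typ, value string) Scroll {
	out := append(Scroll(nil), s...)
	for i := range out {
		if out[i].Type == typ {
			out[i].Value = value
			break
		}
	}
	return out
}

// Replay re-derives downstream events from the recorded inputs using step.
func Replay(s Scroll, step func(Event) (Event, bool)) Scroll {
	out := append(Scroll(nil), s...)
	for _, e := range s {
		if next, ok := step(e); ok {
			next.Seq = len(out) + 1
			out = append(out, next)
		}
	}
	return out
}

// Diff reports the 1-based positions at which two scrolls disagree.
func Diff(a, b Scroll) []int {
	var out []int
	n := len(a)
	if len(b) > n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if i >= len(a) || i >= len(b) || a[i] != b[i] {
			out = append(out, i+1)
		}
	}
	return out
}

// decide is a toy workflow step: the decision mirrors the model's response.
func decide(e Event) (Event, bool) {
	if e.Type != "ai.response" {
		return Event{}, false
	}
	return Event{Type: "quest.decided", Value: e.Value}, true
}

func main() {
	original := Scroll{
		{Seq: 1, Type: "user.request", Value: "clear the cellar"},
		{Seq: 2, Type: "ai.response", Value: "approve"},
		{Seq: 3, Type: "quest.decided", Value: "approve"},
	}
	fork := Substitute(Fork(original, 2), "ai.response", "reject")
	replayed := Replay(fork, decide)
	fmt.Println(Diff(original, replayed)) // prints [2 3]
}
```

The original scroll is never mutated: every operation returns a copy, which is the property that lets experiments run without touching production.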

Tests from scrolls

Every scroll is a fixture. Generate a deterministic Go test from an incident scroll. Run corpora at scale. Express invariants as scroll predicates.
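
One way to picture an invariant as a scroll predicate is a function from a scroll's events to an error. The sketch below is an assumption about shape only: the Predicate type, the Event struct, and the ai.request event name are hypothetical (only ai.response appears elsewhere in these docs).

```go
package main

import "fmt"

// Event is a hypothetical minimal view of a scroll event.
type Event struct {
	Seq  int
	Type string
}

// Predicate is a hypothetical shape for a scroll invariant: nil means it holds.
type Predicate func(events []Event) error

// EveryRequestAnswered checks that each ai.request is eventually followed
// by an ai.response, matching them in order.
func EveryRequestAnswered(events []Event) error {
	var open []int // seqs of requests still waiting for a response
	for _, e := range events {
		switch e.Type {
		case "ai.request":
			open = append(open, e.Seq)
		case "ai.response":
			if len(open) > 0 {
				open = open[1:]
			}
		}
	}
	if len(open) > 0 {
		return fmt.Errorf("ai.request at seq %d has no ai.response", open[0])
	}
	return nil
}

func main() {
	var invariant Predicate = EveryRequestAnswered

	ok := []Event{{1, "ai.request"}, {2, "ai.response"}}
	bad := []Event{{1, "ai.request"}, {2, "ai.response"}, {3, "ai.request"}}

	fmt.Println(invariant(ok))  // prints <nil>
	fmt.Println(invariant(bad)) // prints ai.request at seq 3 has no ai.response
}
```

Because a predicate only reads events, the same function can be run against one incident scroll in a test or against a whole corpus.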

Diagnose, bisect, narrate

AI-native debugging on top of scroll-native data. Narrate what happened, explain why replay diverged, bisect the minimum prompt change that fixes a case.
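
As a rough sketch of the bisection idea, suppose a prompt change is an ordered list of edits and an oracle can say whether applying the first n edits fixes the failing case. Under the assumption that the oracle is monotonic (once a prefix fixes the case, every longer prefix does too), a binary search finds the shortest fixing prefix. The function names and the edit strings are hypothetical; in practice the oracle would be a replay of the failing scroll, and real prompt edits may not be monotonic.

```go
package main

import (
	"fmt"
	"sort"
)

// hasEdit reports whether want is among the applied edits.
func hasEdit(applied []string, want string) bool {
	for _, e := range applied {
		if e == want {
			return true
		}
	}
	return false
}

// MinimalPrefix returns the smallest n such that applying edits[:n] makes
// fixes return true, and false if no prefix does. It assumes monotonicity:
// once a prefix fixes the case, every longer prefix does too.
func MinimalPrefix(edits []string, fixes func(applied []string) bool) (int, bool) {
	n := sort.Search(len(edits)+1, func(i int) bool { return fixes(edits[:i]) })
	return n, n <= len(edits)
}

func main() {
	edits := []string{"add tone rule", "add refusal rule", "reorder examples"}

	// Stand-in oracle: the case is fixed once the refusal rule is applied.
	fixes := func(applied []string) bool { return hasEdit(applied, "add refusal rule") }

	n, ok := MinimalPrefix(edits, fixes)
	fmt.Println(n, ok) // prints 2 true
}
```

The search calls the oracle a logarithmic number of times, which matters when each call is a full replay.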

Sandboxed replay

Isolation for when production scrolls can't safely re-touch production systems. Spawn an isolated scroll-server + runner, stub externals, run, tear down.

The canonical use case: porting a live pipeline

The shape to have in mind while reading these docs: a quest-intake pipeline running on the tavern's back wall — thousands of lines of Go, a dozen reactors, a hand-rolled rate limiter, a projection layer glued together with catch-up barriers. It needs to move from in-process weave to the service-based weave ecosystem — networked scroll-server, runner service, distributed consumer groups — without regressing behavior.

Claude Code is the one doing the refactor. Scry is what makes it safe.

Baseline capture. Save the last N production scrolls as a corpus. Conformance review reads the code + .weave files + scrolls and emits a punch list: every projection-read-in-reactor, every payload discriminator, every catch-up barrier. Each finding is annotated with a proposed refactor.
Refactor in place. Each change (folding where a projection read used to be, splitting a stringly-typed event, declaring a reactor's role) is replayed against the baseline corpus. Corpus replay reports pass / unchanged / diverged with per-case diffs. A divergence without explanation blocks the step.
Graduate patterns into weave. Hand-rolled rate limiters, producer/consumer wrappers, feedback-fold helpers move into framework primitives. The application drops its hand-rolls. Conformance re-verifies that the upgrade didn't change behavior.
Service migration. Scroll becomes networked. Reactors register as consumer-group subscribers. Sandboxed replay runs the same scroll through the in-process and service-based pipelines and diffs. Divergences surface before any cutover traffic.
Regression harvest. Every historical incident becomes a generated test. Every prompt change gets a bisection report. The corpus grows; the regression suite grows with it.
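
The gate described in the refactor step ("a divergence without explanation blocks the step") can be sketched as a small filter over corpus-replay results. The Result struct and its Explanation field are assumptions for illustration; only the three status values come from the text above.

```go
package main

import "fmt"

// Result is a hypothetical per-case record from a corpus replay.
type Result struct {
	Case        string
	Status      string // "pass", "unchanged", or "diverged"
	Explanation string // assumed field: why a divergence is expected
}

// Blockers returns the cases that diverged with no recorded explanation.
func Blockers(results []Result) []string {
	var out []string
	for _, r := range results {
		if r.Status == "diverged" && r.Explanation == "" {
			out = append(out, r.Case)
		}
	}
	return out
}

func main() {
	results := []Result{
		{Case: "scroll-001", Status: "pass"},
		{Case: "scroll-002", Status: "diverged", Explanation: "event split into two typed events"},
		{Case: "scroll-003", Status: "diverged"},
	}
	if b := Blockers(results); len(b) > 0 {
		fmt.Println("step blocked by:", b) // prints step blocked by: [scroll-003]
	}
}
```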

None of the five steps is hypothetical. Each maps to a kind of refactor that real weave apps already pay for, one fix commit at a time. Scry is the tool that would have made those commits smaller, safer, and verifiable.

Operation

Scry is designed to be driven by Claude Code, not by a human clicking through a GUI. Every capability is reachable from a bash command with --json output. Long-running operations write progressive results to an output directory and drop a done.json marker when they finish. Failed runs can be resumed with --resume. Stale recorded externals hard-fail rather than silently diverge.
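
A caller that drives a long-running operation might check for the done.json marker like this. The sketch assumes only that the marker's presence in the output directory signals completion; its contents are not specified here, so the demo writes an empty JSON object as a placeholder. IsDone and runDemo are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// IsDone reports whether the run writing to dir has left its done.json marker.
func IsDone(dir string) bool {
	_, err := os.Stat(filepath.Join(dir, "done.json"))
	return err == nil
}

// runDemo simulates a run in a temp dir and checks the marker before and
// after it is written.
func runDemo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "scry-results-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	before := IsDone(dir)

	// Placeholder contents: the real marker format is not specified here.
	marker := filepath.Join(dir, "done.json")
	if err := os.WriteFile(marker, []byte("{}"), 0o644); err != nil {
		return before, false, err
	}
	return before, IsDone(dir), nil
}

func main() {
	before, after, err := runDemo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println(before, after) // prints false true
}
```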

The CLI surface looks like:

scry narrate <scroll-id>                            — plain-English summary
scry review <repo> [--scroll <id>]                   — conformance punch list
scry fork <scroll-id> [--at <seq>]                   — clone for experimentation
scry substitute <fork> --event ai.response --value ...
scry replay <scroll-or-fork>                         — re-derive the workflow
scry diff <scroll-a> <scroll-b>                      — per-event divergence
scry test-gen <scroll-id> [--target <reactor>]       — synthesize regression test
scry replay-corpus <dir> --out results/              — batch; resumable
scry bisect --failing <scroll> --prompt <path>       — minimum prompt edit
scry sandbox spawn <corpus>                          — isolated scroll+runner

Status

Scry v0.1 is a library with one agent (narrator) and scroll-first replay in the runner. v0.2 is when fork / substitute / diff RPCs land on scroll-server and Scry becomes a standalone service binary. AI-native capabilities layer on top of v0.2 — each is additive on the same primitives.

Narrator agent

implemented

internal/scry/narrator.go — plain-English summary of any scroll. First Scry v0.1 capability; uses Gemini 2.5 Flash.

Scroll-first replay in syncrunner

implemented

internal/runner/sync/replay.go — runner reads recorded ai.response events before calling the model live. Foundation for fork/substitute/replay.
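
The scroll-first behavior can be pictured as a thin wrapper around the live model call. This is a sketch under assumptions, not the code in replay.go: the Event struct, the RequestID matching key, and the ScrollFirst function are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical recorded event. RequestID is an assumed way to
// match a recorded ai.response to the call that produced it.
type Event struct {
	Type      string
	RequestID string
	Payload   string
}

// Model is any function that calls a live model.
type Model func(prompt string) (string, error)

// ScrollFirst returns a completion function that serves a recorded
// ai.response when one exists for the request and calls live otherwise.
// The bool result reports whether the answer came from the scroll.
func ScrollFirst(recorded []Event, live Model) func(requestID, prompt string) (string, bool, error) {
	byID := make(map[string]string)
	for _, e := range recorded {
		if e.Type == "ai.response" {
			byID[e.RequestID] = e.Payload
		}
	}
	return func(requestID, prompt string) (string, bool, error) {
		if payload, ok := byID[requestID]; ok {
			return payload, true, nil // replayed from the scroll, no model call
		}
		out, err := live(prompt)
		return out, false, err
	}
}

func main() {
	recorded := []Event{{Type: "ai.response", RequestID: "req-1", Payload: "approve"}}
	live := func(string) (string, error) {
		return "", errors.New("live model unavailable in this demo")
	}

	complete := ScrollFirst(recorded, live)
	out, replayed, err := complete("req-1", "Should we accept this quest?")
	fmt.Println(out, replayed, err) // prints approve true <nil>
}
```

Substituting a recorded ai.response on a fork and replaying then exercises the same path with a different payload, which is why this is the foundation for fork / substitute / replay.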

Provider router + Gemini client

implemented

internal/ai/router.go — longest-prefix model dispatch across OpenAI, Gemini, and future providers. NewClientFromEnv wires credentials automatically.
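
Longest-prefix dispatch can be sketched as follows: among all registered prefixes that match the model name, the longest one wins. The Router type, its routes map, and the provider labels are hypothetical and are not taken from router.go.

```go
package main

import (
	"fmt"
	"strings"
)

// Router maps model-name prefixes to provider names (hypothetical shape).
type Router struct {
	routes map[string]string
}

// Resolve returns the provider registered under the longest prefix of model.
// The bool result is false when no prefix matches.
func (r Router) Resolve(model string) (string, bool) {
	best, provider := -1, ""
	for prefix, p := range r.routes {
		if strings.HasPrefix(model, prefix) && len(prefix) > best {
			best, provider = len(prefix), p
		}
	}
	return provider, best >= 0
}

func main() {
	r := Router{routes: map[string]string{
		"gpt-":             "openai",
		"gemini-":          "gemini",
		"gemini-2.5-flash": "gemini-fast", // more specific route, illustrative only
	}}

	fmt.Println(r.Resolve("gemini-2.5-flash")) // prints gemini-fast true
	fmt.Println(r.Resolve("gemini-1.5-pro"))   // prints gemini true
	fmt.Println(r.Resolve("claude-3"))         // prints  false
}
```

Starting best at -1 lets an empty prefix act as a catch-all route if one is registered.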

pkg/scrolltest — deterministic fixtures

sdk-shimmed

Scenario, Seed, and AssertScroll primitives exist. The API is still being shaped, and there is no official release yet.

scroll-server fork / substitute RPCs

designed

The networked fork / substitute / overlay / diff endpoints that unlock Scry at scale. Designed in docs/scry-architecture.md; not yet wired.

Scry as a standalone service

designed

v0.1 is an in-process library. v0.2+ exposes a Connect RPC surface so Scry is a first-class weave-role binary.