Scry
Scroll remembers. Weave acts. Scry reveals. Scry is weave's maintenance service — the place debugging, testing, iteration, and architecture validation happen. Built on the same scroll primitives any weave app uses, because the testing tool must be forgeable from the same substrate it tests.
Why a separate service
Scroll and runner are the hot path. Appends fast, reads fast, replays fast. Anything that slows that path is a tax on production. But production is only half the job — the other half is understanding production: diagnosing the incident, replaying a proposed fix, bisecting a prompt change, generating a regression test, sweeping a corpus nightly.
All of that is computationally heavy, LLM-backed, and wants its own indexes, caches, and spawned sandboxes. Scry owns it. Scroll and runner stay lean; Scry scales independently.
Two load-bearing properties
Testability is conformance.
If a component can be tested via fork + substitute + replay, its inputs, outputs, and dependencies are all expressed as events on a scroll. If it resists those operations, it's leaking state somewhere — hidden globals, wall-clock coupling, a direct DB write, a cross-reactor shortcut. The testing story and the architectural-review story are the same story seen from different angles.
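To make the wall-clock case concrete, here is a minimal Go sketch. The Event type, the TTL constant, the quest.submitted and clock.tick event names, and both functions are illustrative assumptions, not weave APIs. It contrasts a check that reads the system clock with one that takes "now" from a recorded event.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical scroll event; the field names are assumptions.
type Event struct {
	Type string
	At   time.Time
}

// TTL is an illustrative expiry window for a submitted quest.
const TTL = time.Hour

// leakyExpired reads the wall clock, so its answer depends on when it runs.
// Replaying the same scroll tomorrow can give a different result.
func leakyExpired(quest Event) bool {
	return time.Since(quest.At) > TTL
}

// Expired takes "now" from a recorded tick event instead. Its result is a
// pure function of events on the scroll, so a replay reproduces it exactly.
func Expired(quest, tick Event) bool {
	return tick.At.Sub(quest.At) > TTL
}

// sampleEvents returns a quest and a clock tick recorded 30 minutes later.
func sampleEvents() (quest, tick Event) {
	t0 := time.Date(2024, time.January, 1, 12, 0, 0, 0, time.UTC)
	quest = Event{Type: "quest.submitted", At: t0}
	tick = Event{Type: "clock.tick", At: t0.Add(30 * time.Minute)}
	return quest, tick
}

func main() {
	quest, tick := sampleEvents()
	fmt.Println("replayable:", Expired(quest, tick))
	fmt.Println("wall clock:", leakyExpired(quest))
}
```

Expired gives the same answer for the sample events on every run, while leakyExpired gives whatever the current date implies. The second function is the kind of hidden input that fork + substitute + replay cannot control.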
Scry operates on Scry.
Scry's agents are regular weave.Agent nodes. Its tools are regular http.Handler subscribers. Its sessions are scrolls. When the diagnoser produces a bad explanation, fork its session, substitute the prompt, replay, diff. If Scry needs a capability weave doesn't expose, that gap is a weave bug, not a Scry deployment detail.
Capabilities
Each capability is a weave application built on the scroll substrate. They compose: conformance finds the anti-pattern, replay verifies the refactor, test-gen captures the regression, bisect minimizes the prompt fix, sandbox validates at corpus scale.
Conformance review
A guardrail agent that reads your .weave files, reactor code, and live scrolls and flags anti-patterns before they harden. Static and dynamic, in-loop with Claude Code.
Fork, substitute, replay
The four primitives everything else is built on. Clone a scroll, replace a recorded event, replay the workflow, diff outcomes — without touching production.
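The four operations can be sketched on an in-memory scroll. This is a toy model under stated assumptions: the Event and Scroll types, the Recorded flag, the triage reactor, and the quest.* event names are all hypothetical, and the real primitives are described here as scroll-server RPCs rather than local functions.

```go
package main

import "fmt"

// Event is a hypothetical scroll event. Recorded marks events that came from
// outside the workflow (user input, an AI response) and are reused on replay.
type Event struct {
	Type     string
	Payload  string
	Recorded bool
}

// Scroll is a hypothetical in-memory append-only log.
type Scroll []Event

// Fork copies the first `at` events into an independent scroll.
func Fork(s Scroll, at int) Scroll {
	if at < 0 {
		at = 0
	}
	if at > len(s) {
		at = len(s)
	}
	f := make(Scroll, at)
	copy(f, s[:at])
	return f
}

// Substitute replaces the payload of the first event of the given type.
// It reports whether such an event was found.
func Substitute(s Scroll, typ, payload string) bool {
	for i := range s {
		if s[i].Type == typ {
			s[i].Payload = payload
			return true
		}
	}
	return false
}

// Reactor derives follow-up events from one input event.
type Reactor func(Event) []Event

// Replay keeps recorded events and re-derives everything else.
func Replay(s Scroll, r Reactor) Scroll {
	var out Scroll
	for _, e := range s {
		if !e.Recorded {
			continue // derived events are recomputed, not copied
		}
		out = append(out, e)
		out = append(out, r(e)...)
	}
	return out
}

// Diff returns the indexes at which two scrolls differ.
func Diff(a, b Scroll) []int {
	n := len(a)
	if len(b) > n {
		n = len(b)
	}
	var d []int
	for i := 0; i < n; i++ {
		if i >= len(a) || i >= len(b) || a[i] != b[i] {
			d = append(d, i)
		}
	}
	return d
}

// triage is an illustrative reactor: it turns an AI verdict into a decision.
func triage(e Event) []Event {
	if e.Type != "ai.response" {
		return nil
	}
	if e.Payload == "accept" {
		return []Event{{Type: "quest.accepted"}}
	}
	return []Event{{Type: "quest.rejected"}}
}

func main() {
	prod := Scroll{
		{Type: "quest.submitted", Payload: "slay rats", Recorded: true},
		{Type: "ai.response", Payload: "reject", Recorded: true},
	}
	fork := Fork(prod, len(prod))
	Substitute(fork, "ai.response", "accept")
	fmt.Println(Diff(Replay(prod, triage), Replay(fork, triage))) // prints [1 2]
}
```

Because Fork copies the events, the substitution never touches the production scroll; only the fork's replay changes.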
Tests from scrolls
Every scroll is a fixture. Generate a deterministic Go test from an incident scroll. Run corpora at scale. Express invariants as scroll predicates.
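One way to read "invariants as scroll predicates" is as plain functions from a scroll to an error. The sketch below assumes a hypothetical Event shape with a sequence number and a type; the Predicate type, both invariants, and the quest.* names are illustrative and are not Scry's actual test-gen output.

```go
package main

import "fmt"

// Event is a hypothetical scroll event with a sequence number and a type.
type Event struct {
	Seq  int
	Type string
}

// Scroll is a hypothetical in-memory event log.
type Scroll []Event

// Predicate is an invariant over a whole scroll; nil means it holds.
type Predicate func(Scroll) error

// MonotonicSeq requires strictly increasing sequence numbers.
func MonotonicSeq(s Scroll) error {
	for i := 1; i < len(s); i++ {
		if s[i].Seq <= s[i-1].Seq {
			return fmt.Errorf("seq %d follows seq %d", s[i].Seq, s[i-1].Seq)
		}
	}
	return nil
}

// AcceptedAfterResponse requires an ai.response before any quest.accepted.
func AcceptedAfterResponse(s Scroll) error {
	seen := false
	for _, e := range s {
		switch e.Type {
		case "ai.response":
			seen = true
		case "quest.accepted":
			if !seen {
				return fmt.Errorf("seq %d: quest.accepted before any ai.response", e.Seq)
			}
		}
	}
	return nil
}

// Check runs every predicate and collects the violations.
func Check(s Scroll, preds ...Predicate) []error {
	var errs []error
	for _, p := range preds {
		if err := p(s); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	incident := Scroll{
		{Seq: 1, Type: "quest.accepted"},
		{Seq: 1, Type: "ai.response"},
	}
	for _, err := range Check(incident, MonotonicSeq, AcceptedAfterResponse) {
		fmt.Println("violation:", err)
	}
}
```

The same predicates can run over one incident scroll or over every scroll in a corpus, which is what lets a single fixture format serve both cases.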
Diagnose, bisect, narrate
AI-native debugging on top of scroll-native data. Narrate what happened, explain why replay diverged, bisect the minimum prompt change that fixes a case.
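How Scry's bisect searches is not specified here. As one plausible shape, the sketch below uses a greedy one-at-a-time reduction: starting from a set of prompt edits known to fix the case, it drops every edit whose removal still leaves the replay passing. The Passes oracle, the edit names, and needsSchema are hypothetical stand-ins for an actual replay.

```go
package main

import "fmt"

// Passes is a hypothetical oracle: apply the given prompt edits, replay the
// failing scroll, and report whether the case now passes.
type Passes func(edits []string) bool

// Minimize assumes passes(edits) is true. It returns a subset that still
// passes and from which no single edit can be removed.
func Minimize(edits []string, passes Passes) []string {
	kept := append([]string(nil), edits...)
	for i := 0; i < len(kept); {
		trial := make([]string, 0, len(kept)-1)
		trial = append(trial, kept[:i]...)
		trial = append(trial, kept[i+1:]...)
		if passes(trial) {
			kept = trial // edit i was unnecessary; retry the same index
		} else {
			i++ // edit i is required; move on
		}
	}
	return kept
}

// needsSchema is a toy oracle: the case passes only with one specific edit.
func needsSchema(edits []string) bool {
	for _, e := range edits {
		if e == "add-json-schema" {
			return true
		}
	}
	return false
}

func main() {
	all := []string{"reword-greeting", "add-json-schema", "trim-examples"}
	fmt.Println(Minimize(all, needsSchema)) // prints [add-json-schema]
}
```

Each call to the oracle stands for a full replay, so the number of calls matters; this reduction makes at most one call per candidate edit.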
Sandboxed replay
Isolation for when production scrolls can't safely re-touch production systems. Spawn an isolated scroll-server + runner, stub externals, run, tear down.
The canonical use case: porting a live pipeline
The shape to have in mind while reading these docs: a quest-intake pipeline running on the tavern's back wall — thousands of lines of Go, a dozen reactors, a hand-rolled rate limiter, a projection layer glued together with catch-up barriers. It needs to move from in-process weave to the service-based weave ecosystem — networked scroll-server, runner service, distributed consumer groups — without regressing behavior.
Claude Code is the one doing the refactor. Scry is what makes it safe.
- Baseline capture. Save the last N production scrolls as a corpus. Conformance review reads the code + .weave files + scrolls and emits a punch list: every projection-read-in-reactor, every payload discriminator, every catch-up barrier. Each finding is annotated with a proposed refactor.
- Refactor in-place. Each change — folding where a projection read used to be, splitting a stringly-typed event, declaring a reactor's role — is replayed against the baseline corpus. Corpus replay reports pass / unchanged / diverged with per-case diffs. A divergence without explanation blocks the step.
- Graduate patterns into weave. Hand-rolled rate limiters, producer/consumer wrappers, feedback-fold helpers move into framework primitives. The application drops its hand-rolls. Conformance re-verifies that the upgrade didn't change behavior.
- Service migration. Scroll becomes networked. Reactors register as consumer-group subscribers. Sandboxed replay runs the same scroll through the in-process and service-based pipelines and diffs. Divergences surface before any cutover traffic.
- Regression harvest. Every historical incident becomes a generated test. Every prompt change gets a bisection report. The corpus grows; the regression suite grows with it.
None of the five steps is hypothetical. Each maps to a refactor shape real weave apps already pay for with commit-diff-shaped fixes. Scry is the shape of the tool that would have made those commits smaller, safer, and verifiable.
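The gating rule from the refactor step (a divergence without explanation blocks the step) can be sketched as follows. The Case and Result shapes, the incident IDs, and the explanation map are assumptions made for illustration. The sketch models only the unchanged and diverged verdicts, since the text does not define how pass differs from them.

```go
package main

import (
	"fmt"
	"sort"
)

// Case pairs the event types recorded in production with the event types
// produced by replaying the same scroll after a refactor step.
type Case struct {
	ID       string
	Baseline []string
	Replayed []string
}

// Result is the verdict for one case.
type Result struct {
	ID        string
	Diverged  bool
	FirstDiff int // index of the first differing event, -1 when unchanged
}

// firstDiff returns the first index at which a and b differ, or -1.
func firstDiff(a, b []string) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n // one side has extra events
	}
	return -1
}

// Compare produces one Result per case, in input order.
func Compare(cases []Case) []Result {
	results := make([]Result, 0, len(cases))
	for _, c := range cases {
		d := firstDiff(c.Baseline, c.Replayed)
		results = append(results, Result{ID: c.ID, Diverged: d >= 0, FirstDiff: d})
	}
	return results
}

// Blocking lists diverged cases that have no recorded explanation.
func Blocking(results []Result, explained map[string]string) []string {
	var ids []string
	for _, r := range results {
		if r.Diverged && explained[r.ID] == "" {
			ids = append(ids, r.ID)
		}
	}
	sort.Strings(ids)
	return ids
}

// sampleCorpus returns one unchanged case and two diverged ones.
func sampleCorpus() []Case {
	base := []string{"quest.submitted", "ai.response", "quest.done"}
	return []Case{
		{ID: "incident-1", Baseline: base, Replayed: base},
		{ID: "incident-2", Baseline: base, Replayed: []string{"quest.submitted", "ai.response", "quest.completed"}},
		{ID: "incident-3", Baseline: base, Replayed: []string{"quest.submitted", "ai.response"}},
	}
}

func main() {
	results := Compare(sampleCorpus())
	explained := map[string]string{"incident-2": "event renamed on purpose"}
	fmt.Println("blocking:", Blocking(results, explained)) // prints blocking: [incident-3]
}
```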
Operation
Scry is designed to be driven by Claude Code, not a human clicking a GUI. Every capability is reachable from a bash command with --json output. Long-running operations write progressive results to an output directory with a done.json marker. Failed runs are --resume-able. Stale recorded externals hard-fail rather than silently diverge.
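A driver that launches a long-running operation needs some way to notice the done.json marker. The Go sketch below polls for it. The contents of done.json are not specified here, so only the file's existence is checked, and the function names, intervals, and timeout are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// WaitDone polls dir until done.json exists or the timeout elapses.
// Only existence is checked; the marker's schema is not assumed.
func WaitDone(dir string, interval, timeout time.Duration) error {
	marker := filepath.Join(dir, "done.json")
	deadline := time.Now().Add(timeout)
	for {
		_, err := os.Stat(marker)
		if err == nil {
			return nil
		}
		if !errors.Is(err, os.ErrNotExist) {
			return err // permission or I/O problem: surface it
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out waiting for %s", marker)
		}
		time.Sleep(interval)
	}
}

// demo simulates a run that writes its marker after a short delay.
// The "{}" body is a placeholder, not the real marker format.
func demo() error {
	dir, err := os.MkdirTemp("", "scry-out-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	go func() {
		time.Sleep(50 * time.Millisecond)
		_ = os.WriteFile(filepath.Join(dir, "done.json"), []byte("{}"), 0o644)
	}()
	return WaitDone(dir, 10*time.Millisecond, 2*time.Second)
}

func main() {
	if err := demo(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("done.json found")
}
```

Polling keeps the driver independent of how the run was started; partial results already in the directory can be read while waiting.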
The CLI surface looks like:
scry narrate <scroll-id> — plain-English summary
scry review <repo> [--scroll <id>] — conformance punch list
scry fork <scroll-id> [--at <seq>] — clone for experimentation
scry substitute <fork> --event ai.response --value ...
scry replay <scroll-or-fork> — re-derive the workflow
scry diff <scroll-a> <scroll-b> — per-event divergence
scry test-gen <scroll-id> [--target <reactor>] — synthesize regression test
scry replay-corpus <dir> --out results/ — batch; resumable
scry bisect --failing <scroll> --prompt <path> — minimum prompt edit
scry sandbox spawn <corpus> — isolated scroll+runner
Status
Scry v0.1 is a library with one agent (narrator) and scroll-first replay in the runner. v0.2 is when fork / substitute / diff RPCs land on scroll-server and Scry becomes a standalone service binary. AI-native capabilities layer on top of v0.2 — each is additive on the same primitives.