Agentic Workflow

PRD-to-PR pipeline

Eight stages plus an intake step (Stage 0), each with named artefacts. From a one-page brief to a merged pull request with a screenshot.

The default shipping path: human writes a PRD, agents do the rest, human approves a screenshot. This page describes each stage's input, output, gate, and storage location.

Pipeline overview

The runner persists every stage transition into the prd_to_pr_runs table (SQLite at .compound-state/agent-service.db), so after a process restart it can read the row's current_stage and resume. The exception is spawned claude -p children: they are killed on restart and do not auto-resume (see the "Outstanding" callout above; the fix is to set detached: true in the spawn options).
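A minimal sketch of the persist-then-resume pattern, written in Python with the standard sqlite3 module purely for illustration. Only the table name prd_to_pr_runs and the current_stage column come from this page; run_id, updated_at, and the function names are hypothetical.

```python
import sqlite3
from typing import Optional


def open_state(path: str = ":memory:") -> sqlite3.Connection:
    """Open the state database and make sure the runs table exists."""
    conn = sqlite3.connect(path)
    # Only `current_stage` is named in the docs; the other columns are assumptions.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prd_to_pr_runs ("
        " run_id TEXT PRIMARY KEY,"
        " current_stage INTEGER NOT NULL,"
        " updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def record_transition(conn: sqlite3.Connection, run_id: str, stage: int) -> None:
    """Persist a stage transition so a restarted process can find it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO prd_to_pr_runs (run_id, current_stage) VALUES (?, ?)",
            (run_id, stage),
        )


def resume_stage(conn: sqlite3.Connection, run_id: str) -> Optional[int]:
    """Return the stage to resume from, or None if the run is unknown."""
    row = conn.execute(
        "SELECT current_stage FROM prd_to_pr_runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row[0] if row else None
```

Writing the row before a stage starts means a crash mid-stage resumes at that stage rather than skipping it.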

Stage 0 — Intake normalization

The orchestrator may receive a rough chat request, a Cursor plan under .cursor/plans/, or an already-written wiki PRD. Before spawning implementation agents it normalizes the input into a portable PRD:

Stage 0 is allowed to ask a blocking question only when the requested product behavior is ambiguous enough to change the spec. It should not ask for permission to create the worktree.

Stage 1 — PRD

A wiki/plans/<feature>-YYYY-MM-DD.md file. Frontmatter must declare implementation_status: planned (or similar) and the body must open with a visible status callout per SCHEMA.md.
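The two mechanical requirements above (file path and frontmatter key) could be checked along these lines. This is a sketch under assumptions: frontmatter is taken to be a `---`-delimited block at the top of the file, feature names are assumed to be lowercase kebab-case, and check_prd is a hypothetical helper. The status callout is not checked because its format is defined in SCHEMA.md, not here.

```python
import re

# Assumed naming: lowercase kebab-case feature, then an ISO date.
PLAN_PATH = re.compile(
    r"^wiki/plans/(?P<feature>[a-z0-9][a-z0-9-]*)-(?P<date>\d{4}-\d{2}-\d{2})\.md$"
)
# Assumed frontmatter shape: a block fenced by `---` lines at the very top.
FRONTMATTER = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)


def check_prd(path: str, text: str) -> list[str]:
    """Return a list of problems; an empty list means the PRD passes."""
    problems = []
    if not PLAN_PATH.match(path):
        problems.append("path does not match wiki/plans/<feature>-YYYY-MM-DD.md")
    frontmatter = FRONTMATTER.match(text)
    if frontmatter is None:
        problems.append("missing frontmatter block")
    elif not re.search(
        r"^implementation_status:\s*\S+", frontmatter.group(1), re.MULTILINE
    ):
        problems.append("frontmatter does not declare implementation_status")
    return problems
```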

A good PRD answers:

The PRD is the only artefact a human is required to author. Everything downstream consumes it.

Stage 2 — Spec

The Architect agent reads the PRD and emits wiki/specs/<feature>.md. The spec pairs an acceptance-contract table with machine-checkable, Gherkin-style acceptance scenarios:

## Acceptance Contract

| ID     | Requirement                                        | Status          | Evidence | Notes |
| ------ | -------------------------------------------------- | --------------- | -------- | ----- |
| AC-001 | Best trial is shown above the leaderboard.         | not-implemented |          |       |
| AC-002 | Promote button is disabled for in-progress trials. | not-implemented |          |       |

Feature: Lab promotion banner
  Scenario: AC-001 Best trial is shown above the leaderboard
    Given a service area with at least one completed lab trial
    And one trial is pinned in ServiceAreaBestLabScenario
    When the user opens /admin/optimization-lab/<id>
    Then the BestTrialBanner is visible
    And it shows the pinned trial's metrics

  Scenario: AC-002 Promote button is disabled for in-progress trials
    Given a trial whose solution.status is solving_active
    Then the Promote button is disabled
    And hovering shows "Trial still solving"

The spec is the coordination object between agents. It bounds what counts as done. Every non-deferred AC-* row must appear in at least one generated test file.
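Because the contract is a plain Markdown table, downstream agents can read it with very little code. The sketch below is illustrative only: AcceptanceRow and parse_acceptance_contract are hypothetical names, and it assumes the five-column layout shown above with no escaped pipes inside cells.

```python
from dataclasses import dataclass


@dataclass
class AcceptanceRow:
    ac_id: str
    requirement: str
    status: str
    evidence: str
    notes: str


def parse_acceptance_contract(markdown: str) -> list[AcceptanceRow]:
    """Extract AC-* rows from the Acceptance Contract table."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row (heading, Gherkin, blank line)
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skips the header and separator rows, whose first cell is not an AC id.
        if len(cells) == 5 and cells[0].startswith("AC-"):
            rows.append(AcceptanceRow(*cells))
    return rows
```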

Stage 3 — Test stubs

The Test-writer agent reads the spec and writes failing tests:

The pre-commit assertion: every test stub must run and fail with an informative error (typically "X is not implemented"). This catches accidentally tautological tests before the editor stage starts.
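One way to picture the pre-commit assertion: run each stub and classify the outcome. The sketch assumes a stub is a zero-argument callable and that "informative" means the failure message contains "not implemented". Both are assumptions for illustration, not the Test-writer's actual rule.

```python
from typing import Callable


def classify_stub(stub: Callable[[], None]) -> str:
    """Classify a freshly generated test stub before the editor stage.

    Returns:
        "ok"            - the stub ran and failed with an informative message
        "uninformative" - the stub failed, but the message says nothing useful
        "tautological"  - the stub passed with no implementation behind it
    """
    try:
        stub()
    except Exception as exc:  # AssertionError is included
        message = str(exc).lower()
        return "ok" if "not implemented" in message else "uninformative"
    return "tautological"
```

A stub that lands in "tautological" is the case this gate exists to catch: it would stay green no matter what the editor writes.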

The acceptance assertion: every non-deferred AC-* row from the spec must be mapped to at least one generated test. If the test-writer drops an accepted requirement, stage 3 fails before implementation starts.
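The acceptance assertion reduces to a set difference between required AC ids and the ids mentioned in generated tests. A hedged sketch: it assumes tests reference requirements by their literal AC-* id and that deferred rows carry the status value "deferred"; unmapped_acceptance_ids is a hypothetical helper.

```python
import re

AC_ID = re.compile(r"AC-\d+")


def unmapped_acceptance_ids(
    spec_rows: dict[str, str], test_sources: dict[str, str]
) -> list[str]:
    """Return non-deferred AC ids that no generated test mentions.

    spec_rows maps AC id -> status; test_sources maps file path -> file text.
    """
    required = {ac_id for ac_id, status in spec_rows.items() if status != "deferred"}
    mentioned = set()
    for text in test_sources.values():
        mentioned.update(AC_ID.findall(text))
    return sorted(required - mentioned)
```

Stage 3 would fail whenever the returned list is non-empty.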

Stage 4 — Implementation

The Editor agent (cheaper Sonnet model) iterates on the diff until the test stubs pass. Constraints:

The editor is allowed to fail. If after MAX_ITERATIONS the spec's tests aren't green, it surfaces the failure to the orchestrator instead of grinding forever.
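The bounded loop can be sketched as follows. The value of MAX_ITERATIONS is not given on this page, so 5 is a placeholder; apply_edit and tests_green stand in for whatever the Editor and test runner really do.

```python
from typing import Callable

MAX_ITERATIONS = 5  # placeholder; the real value is not documented here


def run_editor(
    apply_edit: Callable[[int], None],
    tests_green: Callable[[], bool],
    max_iterations: int = MAX_ITERATIONS,
) -> dict:
    """Iterate on the diff until tests pass or the iteration budget runs out."""
    for attempt in range(1, max_iterations + 1):
        apply_edit(attempt)
        if tests_green():
            return {"status": "green", "iterations": attempt}
    # Budget exhausted: report the failure to the caller instead of looping on.
    return {"status": "failed", "iterations": max_iterations}
```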

Stage 5 — Self-review

The orchestrator runs the appropriate subagents in parallel:

Findings are categorised P1 (block) / P2 (must address) / P3 (acknowledge). The Editor re-enters with the findings as an additional input.
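A small sketch of how the three severities might be routed. The Finding shape and the rule that only P1 and P2 findings send the Editor back in are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str  # "P1" (block), "P2" (must address), or "P3" (acknowledge)
    message: str


def triage(findings: list[Finding]) -> dict:
    """Split findings by severity and decide whether the Editor re-enters."""
    blocking = [f for f in findings if f.severity == "P1"]
    must_address = [f for f in findings if f.severity == "P2"]
    acknowledged = [f for f in findings if f.severity == "P3"]
    return {
        # Assumption: P3 findings alone do not trigger another editor pass.
        "reenter_editor": bool(blocking or must_address),
        "editor_input": blocking + must_address,
        "acknowledged": acknowledged,
    }
```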

Stage 6 — Verification

The Verifier agent runs:

This stage cannot be short-circuited. Without a dossier and complete acceptanceCoverage, the PR cannot merge. See verification-and-evidence.md.
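The merge precondition can be expressed as a check over the dossier directory. The file names come from the storage layout on this page and acceptanceCoverage.complete from the merge conditions; the rest of summary.json's shape is not documented here, and dossier_problems is a hypothetical helper.

```python
import json
from pathlib import Path

# File names as listed under docs/dossiers/<feature>/ in the storage layout.
REQUIRED_FILES = ("trace.zip", "screenshot.png", "console.log", "summary.json")


def dossier_problems(dossier_dir: Path) -> list[str]:
    """Return reasons the dossier would block a merge; empty means it passes."""
    problems = [
        f"missing {name}"
        for name in REQUIRED_FILES
        if not (dossier_dir / name).is_file()
    ]
    summary_path = dossier_dir / "summary.json"
    if summary_path.is_file():
        summary = json.loads(summary_path.read_text(encoding="utf-8"))
        coverage = summary.get("acceptanceCoverage") or {}
        if coverage.get("complete") is not True:
            problems.append("acceptanceCoverage.complete is not true")
    return problems
```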

Stage 7 — Reviewer loop

After git push, the Reviewer-feedback agent polls gh api repos/{org}/{repo}/pulls/<n>/comments until Codex / CodeRabbit have completed their review. Any unaddressed P1 or P2 comment re-enters the Editor stage. This codifies the standing memory rule on Codex review polling.
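A hedged sketch of the polling loop. The gh api path is the one quoted above; how the agent decides that Codex / CodeRabbit have finished is not described on this page, so is_settled is left as a caller-supplied, hypothetical predicate, and the interval and poll budget are placeholders.

```python
import json
import subprocess
import time
from typing import Callable


def fetch_review_comments(org: str, repo: str, pr_number: int) -> list:
    """Fetch PR review comments through the GitHub CLI (requires `gh` and auth)."""
    result = subprocess.run(
        ["gh", "api", f"repos/{org}/{repo}/pulls/{pr_number}/comments"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def poll_reviews(
    fetch: Callable[[], list],
    is_settled: Callable[[list], bool],
    interval_s: float = 30.0,  # placeholder interval
    max_polls: int = 40,       # placeholder budget
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Poll until the reviewers are done, then return the comments."""
    for _ in range(max_polls):
        comments = fetch()
        if is_settled(comments):
            return comments
        sleep(interval_s)
    raise TimeoutError(f"reviewers not settled after {max_polls} polls")
```

Passing fetch and sleep as parameters keeps the loop testable without network access; in real use fetch would wrap fetch_review_comments.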

Stage 8 — Merge + evidence

When (a) CI is green, (b) the dossier is present, (c) acceptanceCoverage.complete=true, and (d) the reviewer loop is settled, the orchestrator queues the PR with gh pr merge --auto --squash against the merge group. After merge, the screenshot from the dossier is sent to the human channel (Telegram via interface-agent).
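The four conditions form a simple conjunction. The sketch below models them as a dataclass and builds the gh command only when nothing blocks; MergeGate, blocking_reasons, and merge_command are illustrative names, not the orchestrator's API.

```python
from dataclasses import dataclass


@dataclass
class MergeGate:
    ci_green: bool                      # (a)
    dossier_present: bool               # (b)
    acceptance_coverage_complete: bool  # (c) acceptanceCoverage.complete
    reviewer_loop_settled: bool         # (d)


def blocking_reasons(gate: MergeGate) -> list[str]:
    """List every unmet condition; an empty list means the PR may be queued."""
    checks = {
        "CI is not green": gate.ci_green,
        "dossier is missing": gate.dossier_present,
        "acceptanceCoverage.complete is not true": gate.acceptance_coverage_complete,
        "reviewer loop is not settled": gate.reviewer_loop_settled,
    }
    return [reason for reason, ok in checks.items() if not ok]


def merge_command(pr_number: int) -> list[str]:
    """Build the auto-merge command quoted in the text for a given PR."""
    return ["gh", "pr", "merge", str(pr_number), "--auto", "--squash"]
```

Returning reasons rather than a bare boolean lets the orchestrator report exactly which condition is holding a run back.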

The human's job at this point is to look at the screenshot and acceptance matrix, then either approve or reject. A run with missing accepted-scope evidence is incomplete_scope, not done.

Storage layout

<repo>/
├── wiki/plans/<feature>-YYYY-MM-DD.md      ← PRD (human-authored)
├── wiki/specs/<feature>.md                 ← Architect output (Gherkin)
├── docs/dossiers/<feature>/                ← Verifier output
│   ├── trace.zip
│   ├── screenshot.png
│   ├── console.log
│   └── summary.json                        ← Machine-readable test outcomes + acceptanceCoverage
└── .compound-state/agent-service.db        ← Run state (prd_to_pr_runs) + throughput and cost metrics per stage

The PRD, the spec, and the dossier are all committed to the feature branch. The dossier is committed for auditability and to make rebuilds deterministic.

Cross-references