This section is the wiki's instantiation of the vision and mandate: scale Caire output 1000× without scaling humans, by making agent execution the default path for shipping features. Humans write PRDs and approve evidence; agents do everything in between.
Human interfaces
Three tiers, each calling the same backend pipeline. L0 and L1 work today; L2 is the next outstanding item.
L0 — Cursor / Claude Code in a worktree
Use this when you want hands-on control of the diff before the runner takes over.
- Write wiki/plans/<feature>-YYYY-MM-DD.md with PRD frontmatter and a status callout (see SCHEMA.md).
- Create a worktree: ./scripts/git/worktree-add.sh <slug> {feat|fix|chore|docs}/<domain>/<slug>.
- Open the PRD in Cursor Composer or invoke Claude Code in the worktree. The agent reads the PRD, writes tests, implements, runs the three review subagents, opens a PR.
- Verify before merge by reviewing the PR diff + the dossier folder under docs/dossiers/<feature>/ on GitHub. Approval = a normal GitHub PR review.
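The worktree step takes a branch of the form {feat|fix|chore|docs}/<domain>/<slug>. A minimal sketch of a pre-flight check for that shape follows. The name parseBranchName and the rule that domain and slug are lowercase kebab-case are assumptions made for illustration; they are not taken from worktree-add.sh.

```typescript
// Hypothetical pre-flight check for the branch argument passed to worktree-add.sh.
type BranchKind = "feat" | "fix" | "chore" | "docs";

interface ParsedBranch {
  kind: BranchKind;
  domain: string;
  slug: string;
}

// Assumption: domain and slug are lowercase kebab-case. The real script may accept more.
const SEGMENT = "[a-z0-9]+(?:-[a-z0-9]+)*";
const BRANCH_PATTERN = new RegExp(`^(feat|fix|chore|docs)/(${SEGMENT})/(${SEGMENT})$`);

// Returns the three parts of a well-formed branch name, or null if the shape is wrong.
function parseBranchName(branch: string): ParsedBranch | null {
  const match = BRANCH_PATTERN.exec(branch);
  if (match === null) return null;
  return { kind: match[1] as BranchKind, domain: match[2], slug: match[3] };
}
```

A caller could run this before invoking the script and fail early with a readable message instead of a half-created worktree.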
L1 — Darwin web at localhost:3010/prd-to-pr (live)
The hands-free path. The dashboard sidebar entry "PRD-to-PR" lists every run; the form at the top kicks one off.
- Write wiki/plans/<feature>-YYYY-MM-DD.md and commit to main of beta-appcaire.
- On the dashboard, paste the PRD path (e.g. wiki/plans/archive/banner-hello-2026-04-30.md) and click Start run.
- The runner walks all 9 stages. The detail page shows live elapsed time, per-stage timing, the agent's stdout log per stage, and an artifact viewer (PRD body, architect's spec, test files, editor's diff, dossier).
- After stage 6 verifies the dossier, the run lands at awaiting_approval. Click Approve to queue gh pr merge --auto --squash; click Reject to record a reason in the row's notes.
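The approval step can be read as a small transition on the run row. The sketch below is illustrative only: the RunRow shape, the mergeCommand field, and the status names merge_queued and rejected are hypothetical. Only awaiting_approval, the notes field, and the merge command come from the steps above.

```typescript
// Hypothetical row shape; only `awaiting_approval` and `notes` are named in the text.
interface RunRow {
  id: string;
  status: string;
  notes: string | null;
  mergeCommand: string | null;
}

type Decision = { kind: "approve" } | { kind: "reject"; reason: string };

const MERGE_COMMAND = "gh pr merge --auto --squash";

// Applies a dashboard decision and returns the updated row without mutating the input.
function applyDecision(run: RunRow, decision: Decision): RunRow {
  if (run.status !== "awaiting_approval") {
    throw new Error(`run ${run.id} is ${run.status}, not awaiting_approval`);
  }
  if (decision.kind === "approve") {
    // Assumption: queuing the merge is recorded on the row and executed elsewhere.
    return { ...run, status: "merge_queued", mergeCommand: MERGE_COMMAND };
  }
  const reason = decision.reason.trim();
  if (reason.length === 0) throw new Error("a rejection needs a reason");
  return { ...run, status: "rejected", notes: reason };
}
```

Refusing decisions on rows that are not awaiting approval keeps a stale dashboard tab from merging or rejecting a run that has already moved on.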
The runner enforces an auto-kill switch: MAX_CONCURRENT_PRD_RUNS=10 and MAX_RUNS_PER_DAY=30 (override in ~/.config/caire/env or the process env). Submitting past the cap returns HTTP 429.
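A minimal sketch of how such a cap check might look, assuming the runner knows the number of active runs and the number of runs started today. The function names, the RunCaps shape, and the fallback to defaults on unparsable values are assumptions; only the two variable names, their defaults, and the 429 response come from the text.

```typescript
interface RunCaps {
  maxConcurrent: number;
  maxPerDay: number;
}

// Defaults as stated above.
const DEFAULT_CAPS: RunCaps = { maxConcurrent: 10, maxPerDay: 30 };

// Assumption: a missing or non-positive value falls back to the default.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const value = Number.parseInt(raw, 10);
  return Number.isInteger(value) && value > 0 ? value : fallback;
}

// Reads overrides from an env-like record (for example the process env).
function readCaps(env: Record<string, string | undefined>): RunCaps {
  return {
    maxConcurrent: parsePositiveInt(env["MAX_CONCURRENT_PRD_RUNS"], DEFAULT_CAPS.maxConcurrent),
    maxPerDay: parsePositiveInt(env["MAX_RUNS_PER_DAY"], DEFAULT_CAPS.maxPerDay),
  };
}

type Admission = { admitted: true } | { admitted: false; status: 429; reason: string };

// Decides whether one more run may start given the current counts.
function admitRun(activeRuns: number, runsToday: number, caps: RunCaps): Admission {
  if (activeRuns >= caps.maxConcurrent) {
    return { admitted: false, status: 429, reason: "concurrent run cap reached" };
  }
  if (runsToday >= caps.maxPerDay) {
    return { admitted: false, status: 429, reason: "daily run cap reached" };
  }
  return { admitted: true };
}
```

Keeping the decision in a pure function makes the cap easy to unit-test separately from the HTTP handler that returns the 429.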
L2 — Telegram via interface-agent (outstanding)
Described in darwin-as-orchestrator.md. interface-agent routes PRD-style messages to the same POST /api/prd-to-pr endpoint that L1 uses. The dossier screenshot is posted back to Telegram, and replying "approve" merges the PR. It is a thin shell over the L1 backend: the routing rule and the Telegram handler are the only new code needed.
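Since L2 is not built yet, the routing rule can only be sketched. The example below assumes a message counts as PRD-style when it contains a path under wiki/plans/ ending in a date and .md, and that a bare "approve" reply is the approval signal. The TelegramIntent type and classifyMessage function are hypothetical names, not part of interface-agent.

```typescript
// Hypothetical intents the Telegram handler would act on.
type TelegramIntent =
  | { kind: "start_run"; prdPath: string }
  | { kind: "approve" }
  | { kind: "other" };

// Assumption: PRD paths follow wiki/plans/[subdirs/]<feature>-YYYY-MM-DD.md.
const PRD_PATH = /wiki\/plans\/(?:[\w-]+\/)*[\w-]+-\d{4}-\d{2}-\d{2}\.md/;

// Classifies one incoming message; anything unrecognized stays with the normal chat flow.
function classifyMessage(text: string): TelegramIntent {
  if (text.trim().toLowerCase() === "approve") return { kind: "approve" };
  const match = PRD_PATH.exec(text);
  if (match !== null) return { kind: "start_run", prdPath: match[0] };
  return { kind: "other" };
}
```

A start_run intent would then be forwarded to POST /api/prd-to-pr. The request body is left out here because the text does not specify it.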
The nine stages (as built)
The runner's PIPELINE constant in apps/server/src/pipeline/runner.ts walks these in order. Stage names match the dashboard's current_stage column.
0. intake (worktree creation, env replication, yarn install + generate + per-server db:generate)
↓
1. prd (PRD frontmatter + status callout validated)
↓
2. spec (Architect agent — Opus — emits wiki/specs/<feature>.md Gherkin)
↓
3. tests (Test-writer agent — Sonnet — emits failing vitest + Playwright; runner WIP-commits them)
↓
4. implement (Editor agent — Sonnet — iterates with transactional git stash; MAX_ITERATIONS=8)
↓
5. review (resolver-reviewer / dashboard-reviewer / perf-reviewer subagents in parallel)
↓ (P1 finding → re-enter implement; cycle bumps; capped MAX_REVIEW_CYCLES=3)
6. verify (yarn type-check + lint + Playwright; dossier bundler writes docs/dossiers/<feature>/)
↓ (Playwright failure → re-enter implement with verifyHint; same cycle cap)
7. reviewer_loop (local-first verify; gh push; poll Codex/CodeRabbit comments; re-enter implement on P1/P2 or argue down)
↓
8. merge (gh pr merge --auto --squash only after local evidence + current-head required checks; awaits human Approve from dashboard)
Re-entry edges (5→4 and 6→4) bump review_cycle. Hitting the cap fails the run with a surfaced reason. The dashboard's stage timing table records every execution, so a run that looped through implement shows, for example, "Implement (2× · 4m 50s)".
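The re-entry edges and the cycle cap can be sketched as a pure transition function. This is an illustration, not a copy of runner.ts: the RunState shape, the StageOutcome values, and the choice to fail once the counter would exceed MAX_REVIEW_CYCLES are assumptions. Only the two edges named above (5→4 and 6→4) are modeled.

```typescript
type Stage =
  | "intake" | "prd" | "spec" | "tests" | "implement"
  | "review" | "verify" | "reviewer_loop" | "merge";

// Stage order as listed above; redeclared here for the sketch.
const PIPELINE: readonly Stage[] = [
  "intake", "prd", "spec", "tests", "implement",
  "review", "verify", "reviewer_loop", "merge",
];
const MAX_REVIEW_CYCLES = 3;

// "ok" moves forward; "retry_implement" is a P1 finding or a Playwright failure.
type StageOutcome = "ok" | "retry_implement";

interface RunState {
  stage: Stage;
  reviewCycle: number;
  failedReason: string | null;
  done: boolean;
}

function advance(state: RunState, outcome: StageOutcome): RunState {
  if (state.done || state.failedReason !== null) return state;
  if (outcome === "retry_implement") {
    if (state.stage !== "review" && state.stage !== "verify") {
      throw new Error(`no re-entry edge modeled from ${state.stage}`);
    }
    const nextCycle = state.reviewCycle + 1;
    // Assumption: the run fails when the bump would go past the cap.
    if (nextCycle > MAX_REVIEW_CYCLES) {
      return { ...state, failedReason: `review cycle cap (${MAX_REVIEW_CYCLES}) hit at ${state.stage}` };
    }
    return { ...state, stage: "implement", reviewCycle: nextCycle };
  }
  const index = PIPELINE.indexOf(state.stage);
  if (index === PIPELINE.length - 1) return { ...state, done: true };
  return { ...state, stage: PIPELINE[index + 1] };
}
```

Because every transition returns a new state, a timing table can log one row per call and later group rows by stage to produce the per-stage execution counts.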
Pages in this section
- vision-and-mandate.md — the north star. The "CTO – AI Systems & Agent Workforce" job description in full. Every other page traces back to one of its four commitments.
- prd-to-pr-pipeline.md — what each stage produces, where artefacts live (wiki/plans/<feature>-YYYY-MM-DD.md, wiki/specs/<feature>.md, docs/dossiers/<feature>/).
- agent-roles-and-model-routing.md — Architect / Test-writer / Editor / Verifier / Reviewer roles, and which model each one runs.
- model-and-vendor-agnosticism.md — vision commitment (b). Routing matrix; rotation cadence; adapter shape.
- spec-as-contract.md — Thoughtworks SDD pattern: PRD compiles to spec; tests are generated from spec; spec is the coordination object.
- verification-and-evidence.md — failure-dossier pattern (Playwright Agents 1.56). The dossier IS the proof artefact attached to the PR.
- reviewer-feedback-loop.md — gh api .../pulls/<n>/comments polling; P1/P2 from Codex as failed tests; re-enter Editor.
- scale-or-kill.md — vision commitment (c). Auto-promote what works; auto-kill what regresses. Hands-free.
- throughput-and-business-signals.md — vision commitment (d). Features per second per token; revenue/cost/cash as system inputs; mathematician chooses model routing inside the cash budget.
- darwin-as-orchestrator.md — the path to replacing the human CTO with agent-CTO per the vision. Telegram → Darwin → 4-stage pipeline → PR with screenshot.
- skills/humanizer.md — reusable skill for stripping AI tells from public-facing prose. Mandatory for marketing agents; useful anywhere user-visible copy is generated.
- current-gaps.md — single-source-of-truth inventory of remaining outstanding gaps in the agentic workflow (orphan resilience, per-iter telemetry, Telegram bridge, end-to-end validation). Flipped to all-closed by the runner once wiki/plans/archive/agentic-workflow-gap-closure-2026-04-30.md merges.
- single-priority-queue.md — wiki/plans/*.md is the single queue both engines drink from. lane: nightly | prd-to-pr frontmatter routes a PRD to compound nightly or to PRD-to-PR, with cross-engine dedup so the two never collide on the same PRD.
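The lane routing described for single-priority-queue.md can be sketched as follows. The function names, the in-memory claims map, and the line-scan frontmatter reader (in place of a real YAML parser) are assumptions for illustration; only the lane key and its two values come from the text.

```typescript
type Lane = "nightly" | "prd-to-pr";

// Reads `lane:` from the leading frontmatter block; returns null if absent or unknown.
// Assumption: a simple line scan is enough; a real implementation would parse YAML.
function readLane(markdown: string): Lane | null {
  const block = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (block === null) return null;
  for (const line of block[1].split(/\r?\n/)) {
    const entry = /^lane:\s*(\S+)\s*$/.exec(line);
    if (entry === null) continue;
    const value = entry[1];
    if (value === "nightly" || value === "prd-to-pr") return value;
  }
  return null;
}

// Cross-engine dedup: an engine may take a PRD only if the lane matches
// and no engine has claimed that path before.
function claimPrd(
  claims: Map<string, Lane>,
  prdPath: string,
  engine: Lane,
  markdown: string,
): boolean {
  if (readLane(markdown) !== engine) return false;
  if (claims.has(prdPath)) return false;
  claims.set(prdPath, engine);
  return true;
}
```

In practice the claims would need to live somewhere both engines can see, such as a shared table, since a per-process map only deduplicates within one engine.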
What this section is NOT
- A how-to for using the existing engineering team day-to-day. For that see agents-engineering.md.
- An advertisement of what we have today. Some of this is built; some is sketched. Each page calls out its implementation_status.
- A model-recommendation guide. Models change; the agnosticism page deliberately routes through adapters so model choice is a config decision, not an architectural one.
Cursor “Build plan” (Composer plan) — editor workflow
When the user opens a phased plan under .cursor/plans/*.plan.md and chooses Build plan (or asks the agent to implement it), treat that file as the source of truth for scope and sequencing, not as automatic approval to rewrite unrelated code.
- Read the plan and the current code — confirm which phase is in scope (stop at explicit phase boundaries unless the user expands scope).
- Create/confirm the spec contract — accepted scope must be represented in wiki/specs/<feature>.md with an Acceptance Contract table (AC-001, AC-002, ...). Cursor/cloud plans are intake only; the spec plus acceptanceCoverage matrix is the done source of truth.
- Spec by tests first when the plan calls for behavior — add or extend Vitest (dashboard-server / dashboard) so every non-deferred AC-* row is reproducible without clicking in the UI.
- Implement minimally — follow monorepo resolver/UI rules; regenerate GraphQL types only when schema or .graphql files change.
- Verify — yarn type-check, yarn lint, targeted vitest for touched apps, and browser check when the change is UI-visible.
- Document — if user-visible behavior or workflow expectations change, update the relevant wiki page or CLAUDE.md note in the same effort.
Manual agent rule: do not say "done" unless the final answer includes a requirement matrix where every accepted requirement is implemented with evidence or deferred-approved with a visible reason.
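The manual agent rule amounts to a check over the requirement matrix. A minimal sketch follows, assuming each row carries a status plus either evidence or a deferral reason. The RequirementRow shape, the status names, and the choice to treat an empty matrix as not done are assumptions, not the acceptanceCoverage format itself.

```typescript
// Hypothetical row shape for the requirement matrix in a final answer.
type RequirementStatus = "implemented" | "deferred-approved" | "pending";

interface RequirementRow {
  id: string; // e.g. "AC-001"
  status: RequirementStatus;
  evidence?: string; // test name, dossier path, or similar
  reason?: string;   // visible reason for an approved deferral
}

function hasText(value: string | undefined): boolean {
  return value !== undefined && value.trim().length > 0;
}

// Returns the ids of rows that block a "done" claim.
function blockingRows(matrix: readonly RequirementRow[]): string[] {
  return matrix
    .filter((row) => {
      if (row.status === "implemented") return !hasText(row.evidence);
      if (row.status === "deferred-approved") return !hasText(row.reason);
      return true; // pending rows always block
    })
    .map((row) => row.id);
}

// Assumption: an empty matrix does not count as done.
function canSayDone(matrix: readonly RequirementRow[]): boolean {
  return matrix.length > 0 && blockingRows(matrix).length === 0;
}
```

Returning the blocking ids rather than a bare boolean lets the agent report exactly which AC-* rows still need evidence or a deferral reason.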
If the plan’s index or cross-links in wiki/ change materially, run yarn wiki:lint from the repo root.
Cross-references
- Vision and mandate — the canonical north-star reference.
- Engineering agents, optimization agents, management agents, marketing agents — the agents that will populate this pipeline.
- Compound workflow — the existing nightly pipeline that's the closest current approximation.
- Wiki as agent substrate (roadmap) — the work needed to make the wiki agent-friendly enough to drive this pipeline.
- Agentic workflow platform page (PRD) — the public-facing positioning page at apps/website/public/platform/en/agentic-workflow.html that markets this section to investors, prospects, and technical evaluators.
- PRD-to-PR pipeline runner (PRD) — the runner + dossier bundler that drives the eight stages end-to-end. Closes gap-list items #1–#5 from darwin-component-map.md §c.
- The public mirror of this section lives at caire.se/platform/{en,sv}/agentic-workflow/ (one shareable URL per guide). The mirror is regenerated from these files via yarn agentic-pages — see scripts/agentic-workflow-pages/build.ts.