claude-autopilot is an MIT-licensed npm package that runs Claude Code through an autonomous
pipeline: brainstorm, spec, plan, implement, migrate, validate, PR, review, bugbot. Point it at
an idea, walk away, come back to a PR that's review-ready. Merge stays human-gated by default.
Try it in 30 seconds:
npm install -g @delegance/claude-autopilot
claude-autopilot examples # list 5 starter stacks
claude-autopilot examples node > spec.md
claude-autopilot autopilot spec.md # ship it
Five bundled stack templates (node, python, fastapi, go-cli, rust-cli) so you don't write your
first spec from a blank page.
The strongest credibility signal I can give you: claude-autopilot built itself. Every version of
this project that ever shipped, including v7.10.1 today, went through the pipeline you'll see on
GitHub. Spec, plan, implementation subagents, Codex review, bugbot triage, admin-merge, npm
publish. Full commit history and review threads preserved on the repo. No marketing, just the
receipts.
I also use it daily on a production codebase. Several hundred thousand lines of code merged per
week sustained, with one week peaking over a million. That's gross churn across feature code,
tests, types, and migrations, mostly via the autopilot pipeline. The CLI is solving real problems
for me before it ships to anyone else.
What's actually distinctive:
1. Multi-model role split, by default. Claude writes code, Codex reviews the plan and the diff,
Cursor bugbot triages PR findings. Each model gets the job it's actually best at. Sequential by
default. Opt-in parallel council (claude-autopilot council) dispatches the same prompt to Claude
+ Codex + Gemini and synthesizes consensus.
2. Every phase is an editable markdown skill. Not a black-box pipeline.
.claude/skills/autopilot/SKILL.md is plain markdown you can read in 5 minutes, audit, edit, swap
any phase. The risk-tiered review policy (1/2/3 Codex passes by spec risk frontmatter,
auto-escalated for auth, multi-tenancy, billing, secrets, migrations, RLS, IAM) lives there as
plain instructions. Inspectability is the wedge against Devin and Cursor agent mode.
3. Local CLI, your provider keys. Anthropic, OpenAI, Google, Groq, Ollama-local. The
orchestration runs on your machine. Prompts go to whichever models you've configured. For pure
local-only you need Claude Code itself on a local provider; for most teams the goal is "no hosted
orchestration plus existing keys."
Benchmark on a Next.js fixture seeded with 13 production-realistic bugs (SQL injection, missing
auth, IDOR, SSRF, open redirect, TOCTOU race, console.log in prod, missing input validation,
etc): scan caught 13/13 in 38 seconds for $0.21. Fixture and reproduction in the repo.
Links:
https://www.npmjs.com/package/@delegance/claude-autopilot
https://github.com/axledbetter/claude-autopilot
I'm Alex, founding eng at Delegance (insurance brokerage platform). Built claude-autopilot for my
own internal use, open-sourced when it started shipping itself.