  • xinhat, 2 hours ago
    I built Centurion because Claude Code has zero cross-session resource awareness. In headless mode (claude -p), each CLI session runs independently with no knowledge of other sessions on the same machine. Spawn five parallel sessions on a 16 GB Mac Mini and you get OOM kills — there's no shared scheduler, no memory backpressure, nothing preventing collective resource exhaustion. Anthropic closed the maxParallelAgents feature request (#15487) as NOT_PLANNED — resource orchestration is outside their scope.

    Centurion fills that gap at the OS/infrastructure layer:

    - Hardware-aware scheduling: probes CPU cores, RAM, and thermal state before admitting agents
    - Memory pressure detection: three states (normal/warn/critical) with automatic throttling
    - Progressive ramp-up: batch size starts at 1, doubles on success, halves on failure
    - Task DAG orchestration (Harness Loop): decomposes projects into phases with dependency tracking, dispatches parallel batches, handles retries
    - Real-time events: WebSocket streaming for live agent status (Aquilifer event bus)
    - Auto-scaling (Optio): monitors queue depth every 10s, scales agents up/down
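    To make the ramp-up and pressure-gating concrete, here is a minimal sketch of that policy, not Centurion's actual API. It assumes a class name (`RampUpScheduler`), a string-valued pressure signal, and a cap of 32 agents, all of which are illustrative.

```python
from dataclasses import dataclass

@dataclass
class RampUpScheduler:
    """Illustrative sketch: batch size starts at 1, doubles on a fully
    successful batch, halves on any failure; memory pressure caps admission.
    Names and the max_batch cap are assumptions, not Centurion's real API."""
    batch_size: int = 1
    max_batch: int = 32

    def next_batch(self, pressure: str) -> int:
        # Memory pressure gates how many agents are admitted this round.
        if pressure == "critical":
            return 0                       # admit nothing until pressure clears
        if pressure == "warn":
            return min(self.batch_size, 2) # throttle hard under warning
        return self.batch_size             # normal: full batch

    def record(self, succeeded: bool) -> None:
        # Double on success, halve on failure (AIMD-style, as described above).
        if succeeded:
            self.batch_size = min(self.batch_size * 2, self.max_batch)
        else:
            self.batch_size = max(self.batch_size // 2, 1)
```

    For example, a run that succeeds twice then fails once admits batches of 1, 2, then 2 again after the halving, regardless of how many tasks are queued.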

    It uses a Roman military naming convention because why not: Centurion (engine), Legion (deployment group), Century (agent squad), Legionary (individual agent = K8s Pod).

    Concrete numbers:

    - 20+ simultaneous Claude agents on a single 16 GB Mac Mini, zero OOM kills
    - 8 Rust PRs submitted in 30 min, each passing 7,000+ tests
    - 8 parallel research tasks completed in 34 min with zero retries

    Tech stack: Python 3.12+, FastAPI, SQLite. 382 tests passing. 21 REST endpoints. 19 MCP tools for Claude Code integration. Supports Google's A2A protocol.

    Model-independent — the same scheduler works for Claude, GPT, Gemini, or plain shell scripts. It manages processes, not prompts.
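    The "processes, not prompts" idea reduces to subprocess supervision with a concurrency cap: the scheduler only needs a command line per agent, so a Claude session and a plain shell script are interchangeable. A minimal sketch (the function name, commands, and cap are illustrative, not Centurion's interface):

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_agents(commands: list[str], max_parallel: int = 4) -> list[int]:
    """Run each command as an OS subprocess, at most max_parallel at once.
    The scheduler never inspects prompts -- any executable is an 'agent'."""
    def run(cmd: str) -> int:
        return subprocess.run(shlex.split(cmd)).returncode

    # ThreadPoolExecutor bounds concurrency; map preserves input order.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run, commands))

# Usage: run_agents(["claude -p 'task 1'", "./research.sh"], max_parallel=2)
```

    A real scheduler would consult memory pressure before each admission rather than using a fixed cap, but the process boundary is what keeps it model-independent.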

    MIT licensed: https://spacelobster88.github.io/centurion/