Show HN: AgentCommander - workflow engine for evolutionary code optimization(github.com)

2 pointsby mx-Liu12318 days ago2 comments

mx-Liu12318 days ago
I built AgentCommander to automate the manual "trial-and-error" loops in my PhD Physics/ML research.
While tools like OpenEvolve (population evolution) and RD-Agent (Kaggle-style automation) exist, I found them difficult to customize for specific, multi-step research workflows. I needed a system that allowed granular control over the agent's decision process—specifically, how it learns from errors and inherits code states.
AgentCommander solves this by providing:
Visual Graph Execution: Workflows are defined as directed graphs, allowing for complex loops, conditional branches, and human-in-the-loop checkpoints.
Evolutionary Tree Tracking: It treats every iteration as a node in a tree. The agent automatically branches off the current "global optimum" rather than a linear history, preventing regression.
Snapshot Integrity: To prevent LLM hallucination or "cheating" (e.g., modifying test cases), the system uses filesystem snapshots to enforce strict read-only permissions on evaluation logic.
Native CLI Wrapper: Built on top of Gemini/Qwen CLI to leverage their native tool-use capabilities while enforcing a sandboxed working directory.
The project is open source (Apache 2.0) and written in Python.
Repo: https://github.com/mx-Liu123/AgentCommander
mx-Liu12318 days ago
Author's Note:
A few technical details for those looking to try AgentCommander:
Why Gemini/Qwen CLI?: I chose these as backends because they offer robust directory isolation. I tried integrating Claude Code, but found it difficult to restrict its file-system reach. Qwen CLI is a great alternative if you want an OpenAI-compatible API with a generous free tier (2,000 requests/day).
Environment: Ensure you have Python 3.10+ and the latest Node.js for the Gemini CLI. If you see Node version warnings, please upgrade to the latest LTS to avoid CLI instability.
Verification: You can audit the agent's "thought process" by running gemini -r inside any generated experiment directory. It’s crucial for verifying that the agent isn't hallucinating its research logic.
I'm currently in Singapore (SGT). I'll stay online for as long as I can to discuss architecture or implementation details, but I'll catch up on all pending questions first thing in the morning!
Repo: https://github.com/mx-Liu123/AgentCommander