Stuff like wrong curl flags, made-up Python APIs, or the same task producing slightly different output every run. After a while, it felt like the obvious fix was: stop asking the LLM to write code.
So in flyto-ai, the LLM doesn’t write scripts. It just:

- finds the right module
- fills in params
Then the params are validated against the module schema before anything runs. If it’s wrong, it fails early instead of blowing up halfway through execution.
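I don't know flyto-ai's internals, but the validate-before-run idea can be sketched in a few lines. The module name, schema shape, and helper below are all hypothetical, purely to illustrate "fail early, before any side effects":

```python
# Hypothetical sketch, NOT flyto-ai's actual code: check LLM-filled
# params against a module's schema before anything executes.
HTTP_GET_SCHEMA = {"url": str, "timeout": (int, float)}  # made-up schema
REQUIRED = {"url"}

def validate_params(params: dict) -> list[str]:
    """Return a list of errors; empty list means the params are valid."""
    errors = [f"missing required param: {k}" for k in REQUIRED - params.keys()]
    for key, value in params.items():
        expected = HTTP_GET_SCHEMA.get(key)
        if expected is None:
            errors.append(f"unknown param: {key}")  # e.g. a typo'd key
        elif not isinstance(value, expected):
            errors.append(f"{key}: wrong type {type(value).__name__}")
    return errors

assert validate_params({"url": "https://example.com", "timeout": 5}) == []
assert validate_params({"uri": "https://example.com"})  # typo'd key -> errors
```

A run only proceeds when the error list is empty, so a hallucinated parameter dies at plan time, not mid-execution.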
There are 412 prebuilt modules right now (browser, HTTP, files, DB, image, notifications, etc.), so the model is mostly doing selection + parameterization, not improvising code.
Every run also outputs a reusable YAML workflow.
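I haven't shown the real format here, so the shape below is hypothetical; module names and keys are illustrative only, just to give a feel for what a saved workflow might look like:

```yaml
# Hypothetical workflow shape, not the actual flyto-ai format
name: fetch-and-save-report
steps:
  - module: http.get
    params:
      url: https://example.com/report
  - module: files.write
    params:
      path: ./report.html
```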
The part I’m experimenting with now is “blueprints”: successful runs get saved, and if a similar task comes in later, the agent can replay the blueprint directly (no LLM call), so you get the same result instantly and basically free.
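The blueprint idea above can be sketched as a cache keyed by a task signature: plan with the LLM on a miss, replay the saved workflow on a hit. The normalization and function names here are my own assumptions, not the project's code; a real matcher would presumably be fuzzier than an exact-hash lookup:

```python
# Sketch under stated assumptions: successful runs are keyed by a
# normalized task signature and replayed later without an LLM call.
import hashlib

blueprints: dict[str, dict] = {}  # signature -> saved workflow

def signature(task: str) -> str:
    # Naive normalization; real similarity matching would be fuzzier.
    return hashlib.sha256(task.lower().strip().encode()).hexdigest()

def run(task: str, plan_with_llm, execute):
    sig = signature(task)
    if sig in blueprints:
        return execute(blueprints[sig])  # replay: instant, no LLM call
    workflow = plan_with_llm(task)       # cache miss: the expensive path
    result = execute(workflow)
    blueprints[sig] = workflow           # save the blueprint for next time
    return result
```

The second time a similar task arrives, only `execute` runs; the planner is skipped entirely, which is where the "instant and basically free" part comes from.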
Install is just:
pip install flyto-ai && flyto-ai
Works with OpenAI / Anthropic / Ollama (local models). Apache-2.0.

Curious if other people here have landed on a similar split: the LLM decides what to do, a deterministic engine decides how it gets done.