Hi — author here.
One clarification:
The goal is not to let an AI freely control a computer.
I built a fixed library of local action skills.
Each skill is a deterministic OS operation (open an app, switch windows, run a command, perform structured input).
The model does not generate UI steps or mouse actions.
It only selects a skill.
The gateway executes it.
So the LLM is making decisions, not performing motor control.
The computer isn’t remotely driven by the model; the model chooses from a constrained set of allowed actions.
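To make the split concrete, here is a minimal sketch of the pattern (selection by the model, execution by the gateway). All names here (`SKILLS`, `gateway_execute`, the selection shape) are hypothetical illustrations, not the actual API:

```python
from typing import Any, Callable, Dict

def open_app(app: str) -> str:
    # Stub for a deterministic OS call (a real version might wrap
    # subprocess.run with a fixed, audited command template).
    return f"opened {app}"

def run_command(cmd: str) -> str:
    # Stub; again, deterministic and auditable, no free-form shell access.
    return f"ran {cmd}"

# Fixed skill library: the model can only pick from these keys.
SKILLS: Dict[str, Callable[..., str]] = {
    "open_app": open_app,
    "run_command": run_command,
}

def gateway_execute(selection: Dict[str, Any]) -> str:
    """Validate the model's choice against the allowlist, then execute.

    The model emits only a structured selection, e.g.
    {"skill": "open_app", "args": {"app": "Terminal"}} --
    never mouse coordinates or raw UI steps.
    """
    name = selection.get("skill")
    if name not in SKILLS:
        raise PermissionError(f"skill not in allowlist: {name!r}")
    return SKILLS[name](**selection.get("args", {}))
```

Anything outside the registry raises instead of executing, which is what makes every action both predictable (the skill's behavior is fixed code) and auditable (the log is just a sequence of skill names and arguments).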
This is mainly an experiment in making computer-using agents more predictable and auditable.
I’d especially value thoughts from people working on agent safety.