I subscribe to ChatGPT Pro, but Ralph + Fast Mode can still burn through tokens. My company has a sizable Cloudflare credit from the Startups program (https://www.cloudflare.com/forstartups/) that we're barely using. Workers AI has some interesting models like Kimi K2.5, Gemma 4, GLM-4.7-Flash, and GPT-OSS-120B, so I wanted to route Codex through them.
I tried OpenCode's Cloudflare provider first, but the integration felt incomplete and Codex's UX fits me better anyway. Workers AI exposes an OpenAI-compatible API, so I expected it to just work with Codex. Nope. Codex speaks the Responses API, and that surface doesn't map cleanly onto Workers AI's chat-completions endpoint. A local proxy was the simplest fix, so I had Codex write one.
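The core of the translation is flattening the Responses API's structured input into a flat chat-completions message list. A minimal sketch of the idea, assuming public Responses API field names; the type names and function here are mine, not the proxy's, and the real thing also has to handle streaming, tool calls, and response translation in the other direction:

```typescript
type ResponsesInputItem = {
  role: "system" | "user" | "assistant";
  content: { type: string; text: string }[];
};

type ResponsesRequest = {
  model: string;
  instructions?: string;
  input: ResponsesInputItem[];
};

type ChatMessage = { role: string; content: string };
type ChatRequest = { model: string; messages: ChatMessage[] };

// Flatten a Responses-style request into the chat.completions shape
// that Workers AI's OpenAI-compatible endpoint understands.
function toChatRequest(req: ResponsesRequest): ChatRequest {
  const messages: ChatMessage[] = [];
  // Responses carries the system prompt out-of-band as `instructions`;
  // chat completions wants it as the first message.
  if (req.instructions) {
    messages.push({ role: "system", content: req.instructions });
  }
  for (const item of req.input) {
    // Responses content is an array of typed parts; chat wants one string.
    const text = item.content.map((part) => part.text).join("");
    messages.push({ role: item.role, content: text });
  }
  return { model: req.model, messages };
}
```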
It's a small TypeScript proxy that translates between the Codex CLI and Workers AI. Clone, add credentials, npm start, done. There's also a shell wrapper (codexcf) that handles daemon management and the model catalog automatically.
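If you'd rather wire it up by hand instead of using the wrapper, Codex can be pointed at a local endpoint through a custom provider in ~/.codex/config.toml. A hedged sketch: the provider id, port, and model id below are placeholders I made up, not the proxy's actual defaults, so check the repo's README for the real values:

```toml
# Placeholder ids and port -- adjust to whatever the proxy actually uses.
model = "kimi-k2.5"
model_provider = "workers-ai-proxy"

[model_providers.workers-ai-proxy]
name = "Cloudflare Workers AI (local proxy)"
base_url = "http://localhost:8787/v1"  # wherever the proxy listens
wire_api = "chat"                      # speak chat completions, not Responses
```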
I've been running Kimi K2.5 for a day and it works well enough for my workflow. Gemma 4 looks promising too, and Cloudflare told me Qwen 3.5 is coming soon. GLM 5.1 is probably not far behind.
If you have Cloudflare credits collecting dust, this might be a good way to spend them.