2 points by fredmendoza 3 hours ago | 1 comment
  • fredmendoza 3 hours ago
    We put all 4 Gemma 4 models in one Telegram bot. Text it, send voice memos, send docs, send photos. Switch between the 2B and the 31B (ranked #3 worldwide) mid-conversation with a slash command.
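    Mid-conversation switching can be sketched as a tiny per-chat router. This is a minimal sketch, not the bot's actual code: the `/model` command, the model names, and the `MODELS` registry are all assumptions for illustration.

```python
# Hypothetical per-chat model switching via a slash command.
# Each chat keeps its own state dict; a /model command updates it,
# and any other message is answered by the currently selected model.

MODELS = {"2b": "gemma-4-2b", "31b": "gemma-4-31b"}  # made-up registry

def handle_message(chat_state: dict, text: str) -> str:
    """Handle one incoming Telegram message for one chat."""
    if text.startswith("/model"):
        _, _, choice = text.partition(" ")
        choice = choice.strip()
        if choice in MODELS:
            chat_state["model"] = MODELS[choice]
            return f"Switched to {chat_state['model']}"
        return f"Unknown model; options: {', '.join(MODELS)}"
    # Regular message: route to whichever model this chat last selected,
    # defaulting to the small one.
    return chat_state.setdefault("model", MODELS["2b"])
```

    In a real bot the returned model name would pick which backend answers; here it just shows the routing shape.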

    Each model runs its own script on its own hardware. The 2B doesn't burn A100 hours; the 31B doesn't get squeezed onto a tiny card. We keep everything in FP16 for full quality, but if you drop to INT4 you can run these on CPUs for basically nothing.
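    The FP16-vs-INT4 tradeoff is just arithmetic on bytes per weight. A rough weights-only estimate (ignoring activations and KV cache, which add more):

```python
def model_memory_gb(params_billion: float, bits: int) -> float:
    """Rough weights-only memory footprint: params * (bits / 8) bytes."""
    return params_billion * 1e9 * bits / 8 / 1e9

# 31B at FP16 -> ~62 GB of weights (needs a big GPU);
# 31B at INT4 -> ~15.5 GB (fits in ordinary server RAM);
# 2B at FP16  -> ~4 GB (runs almost anywhere).
```

    That's why the small model never needs the big card, and why INT4 makes CPU inference plausible.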

    The Telegram thread persists your conversation, and we feed that context into the prompt on every call. So when you come back two days later it actually remembers who you are and what you were talking about. No vector database, no fancy memory system. Just the chat doing what chat already does.
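    The "chat is the memory" idea can be sketched as: replay the stored thread into the prompt on every call. The role labels and join format below are assumptions; the real prompt template isn't shown in the post.

```python
# Hypothetical prompt builder: the Telegram thread IS the memory.
# Every turn in the thread is replayed verbatim, then the new message
# is appended and the model is asked to continue as "assistant".

def build_prompt(history: list[tuple[str, str]], new_message: str) -> str:
    """Turn (role, text) pairs plus the new user message into one prompt."""
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {new_message}")
    lines.append("assistant:")
    return "\n".join(lines)
```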

    Hardware spins up when you message, shuts down when done. No idle cost. The 31B costs about a penny per message. The 2B costs basically nothing.
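    With scale-to-zero, per-message cost is just runtime times the hourly rate. The $2/hr rate and ~18s runtime below are made-up numbers to show how "about a penny" could pencil out, not the post's actual figures.

```python
def cost_per_message(gpu_hourly_usd: float, seconds: float) -> float:
    """Compute cost of one message when hardware only runs while generating."""
    return gpu_hourly_usd * seconds / 3600

# e.g. a hypothetical $2/hr GPU generating for ~18 seconds costs ~$0.01
```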

    We built this on SeqPU mainly to show how fast you can go from "new model just dropped" to "anyone can text it and try it." Idea to shareable product in 10 minutes. Works with any model, open source or API.

    Try it: t.me/OpenGemma4Bot (grab a free key at seqpu.com)

    Full writeup: https://seqpu.com/UseGemma4In60Seconds

    Our Stab at Safe Agent Systems: https://seqpu.com/Encapsulated-Agentics