2 points by astressence 7 hours ago | 2 comments
  • bnovikov 7 hours ago
    Interesting direction. Such projects feel like a natural evolution from text-first systems toward more human-facing interfaces. People want the speed, privacy, and reliability of LLMs together with the responsiveness of a real personal assistant :) I am curious what made you personally prefer talking to a face instead of text. What actually changed for you in practice?
    • astressence 6 hours ago
      Good question. Honestly it just made me actually stick with voice instead of going back to typing.

      Before Mimora I'd talk to the agent and then stare at a Discord text channel waiting for a response. No feedback loop at all. You say something and then... silence until text appears. Felt like talking into a void so I'd default back to the keyboard every time.

      With the avatar there's a "listening" state when I'm speaking, a "thinking" animation while it processes, and expressions when it responds. It became this permanent little spot on my screen where I can glance and immediately see what the agent is up to. That alone was enough to make voice feel like an actual conversation instead of a command line with extra steps.

      I've been doing game dev for years so building a 3D character with expressions was second nature. Made it easy to prototype fast and figure out where the real value was. Turns out it wasn't about making it look cool, it was just about closing that feedback gap between you and the agent.

      • bnovikov 3 hours ago
        It’s also interesting because we write and speak differently, and an emotional connection forms when we assign human traits to an agent. That changes the way we interact with it and the kinds of use cases it enables. Sounds cool. Good luck with your project!
  • Ciaranio 7 hours ago
    This is great. What's your token usage looking like? Costs?
    • astressence 7 hours ago
      I'm on Claude Max 20x ($200/mo), which gives me plenty of headroom, but you definitely don't need that. A 5x plan works fine, or you can go the API route through OpenRouter and pay per token. The voice layer (Parakeet + Kokoro) runs locally on the M4, so that's zero cost. The key is good memory practices so the agent doesn't waste tokens re-learning things, plus falling back to cheaper models (like Haiku or Gemini Flash) for simple tasks. The heavy model only kicks in when it actually needs to think.
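
      If you're curious what the fallback looks like, it's basically just a router in front of the model call. Very rough Python sketch of the OpenRouter version (OpenRouter speaks the OpenAI-compatible API; the model slugs and the "is this simple" check here are just placeholders, not literally what Mimora does):

        from openai import OpenAI  # OpenRouter is OpenAI-compatible

        client = OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key="sk-or-...",  # your OpenRouter key
        )

        CHEAP = "anthropic/claude-3.5-haiku"   # or a Gemini Flash slug
        HEAVY = "anthropic/claude-sonnet-4"    # only when it needs to think

        def looks_simple(prompt: str) -> bool:
            # placeholder heuristic: short prompts go to the cheap model
            return len(prompt) < 400

        def ask(prompt: str) -> str:
            model = CHEAP if looks_simple(prompt) else HEAVY
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content

      The heuristic can be as dumb as prompt length, or you can let the cheap model itself decide when to escalate. Either way, most of the day-to-day chatter never touches the expensive model.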