    Hi HN,

    I built StudyWithMiku, an open-source AI study assistant that:

    Ingests PDFs automatically (drop them into a folder)

    Embeds them into ChromaDB

    Uses RAG for grounded answers (ingestion/retrieval sketch after this list)

    Maintains conversational memory with LangGraph (memory sketch below)

    Outputs synthesized character voice using Coqui TTS + DiffSinger

    Runs locally (Ollama supported) or with cloud LLMs
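
    To make the retrieval path concrete, here's the shape of the ingest-and-retrieve loop as a minimal sketch. It's illustrative, not the actual repo code: the collection name, chunk sizes, and the naive fixed-size chunking are placeholders, and it leans on Chroma's default embedding function.

        # Illustrative sketch (not the repo's code): embed PDF chunks into ChromaDB,
        # then pull the nearest chunks back to ground an answer.
        import chromadb
        from pypdf import PdfReader

        client = chromadb.PersistentClient(path="./chroma_db")       # on-disk store
        collection = client.get_or_create_collection("study_notes")  # placeholder name

        def ingest_pdf(path: str, chunk_size: int = 1000, overlap: int = 200) -> None:
            # Extract text page by page, then cut it into overlapping fixed-size chunks.
            text = "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)
            step = chunk_size - overlap
            chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
            collection.add(
                ids=[f"{path}-{i}" for i in range(len(chunks))],
                documents=chunks,
                metadatas=[{"source": path, "chunk": i} for i in range(len(chunks))],
            )

        def retrieve(question: str, k: int = 4) -> list[str]:
            # Chroma embeds the query with its default embedding function and
            # returns the k nearest chunks, which get stuffed into the LLM prompt.
            result = collection.query(query_texts=[question], n_results=k)
            return result["documents"][0]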

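    The conversational-memory piece, as a stripped-down LangGraph sketch (again illustrative rather than the real graph: the single answer node, the ChatOllama model name, and the thread id are stand-ins):

        # Illustrative sketch: a LangGraph checkpointer gives each thread_id its own
        # persistent message history, so follow-up questions keep context.
        from langgraph.graph import StateGraph, MessagesState, START, END
        from langgraph.checkpoint.memory import MemorySaver
        from langchain_ollama import ChatOllama   # local model served by Ollama

        llm = ChatOllama(model="llama3.1")        # placeholder model name

        def answer(state: MessagesState) -> dict:
            # The checkpointer has already restored prior messages for this thread,
            # so the model sees the whole conversation, not just the last turn.
            return {"messages": [llm.invoke(state["messages"])]}

        builder = StateGraph(MessagesState)
        builder.add_node("answer", answer)
        builder.add_edge(START, "answer")
        builder.add_edge("answer", END)
        graph = builder.compile(checkpointer=MemorySaver())

        reply = graph.invoke(
            {"messages": [("user", "Summarise chapter 3 of my notes")]},
            config={"configurable": {"thread_id": "study-session-1"}},
        )
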
    The goal was to combine:

    • Retrieval-augmented generation
    • Stateful agent orchestration
    • Tool execution (web/system tools)
    • Persistent memory
    • Character-specific TTS

    into one cohesive system.

    Engineering challenges included:

    Chunking strategy and retrieval accuracy (chunking sketch after this list)

    Avoiding recursive agent tool loops (loop-guard sketch below)

    Managing CUDA/PyTorch/protobuf conflicts

    Keeping latency reasonable with local models + TTS
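
    On chunking: naive fixed-size splits tend to cut sentences in half, and an overlap-aware, structure-aware splitter is one common alternative. A hedged sketch using LangChain's splitter; the sizes are placeholder starting points, not values from the repo:

        # Hypothetical chunking pass: recursive splitting prefers paragraph and
        # sentence boundaries, and the overlap keeps context across chunk edges.
        from langchain_text_splitters import RecursiveCharacterTextSplitter

        splitter = RecursiveCharacterTextSplitter(
            chunk_size=800,
            chunk_overlap=120,
            separators=["\n\n", "\n", ". ", " "],
        )
        chunks = splitter.split_text(pdf_text)   # pdf_text: extracted PDF text (placeholder)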

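    On tool loops: one common guard (not necessarily what the repo does) is to bound graph steps via LangGraph's recursion_limit and fail gracefully when it trips. Sketch, with agent_graph and question as placeholders:

        # Hypothetical guard against runaway tool loops: cap graph super-steps and
        # return a graceful message instead of letting the agent spin forever.
        from langgraph.errors import GraphRecursionError

        config = {
            "configurable": {"thread_id": "study-session-1"},
            "recursion_limit": 12,   # hard cap on graph steps per turn
        }

        try:
            reply = agent_graph.invoke({"messages": [("user", question)]}, config=config)
        except GraphRecursionError:
            reply = {"messages": [("assistant",
                     "I hit the tool-call limit; try narrowing the question.")]}
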
    It’s Linux-first (Ubuntu recommended); a GPU helps, but CPU-only works too.

    I’m particularly interested in feedback on:

    RAG optimization approaches

    LangGraph/agent architecture patterns

    TTS expressiveness tuning (baseline synthesis call sketched at the end)

    Local LLM deployment tradeoffs
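
    For the TTS point: the kind of call I mean is essentially the stock Coqui API below (a minimal, hypothetical example; the model name is a standard pretrained Coqui voice, not the Miku/DiffSinger pipeline, which layers on top of this). Pointers on getting more expressive prosody out of this stage are especially welcome.

        # Minimal Coqui TTS call (illustrative; not the project's actual voice setup).
        from TTS.api import TTS

        tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC")   # stock pretrained model
        tts.tts_to_file(text="Retrieval found three relevant passages.",
                        file_path="reply.wav")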

    Happy to answer technical questions.