1 point by Pouryaak | 2 hours ago | 1 comment
    Hey HN,

    While building a browser-native, privacy-focused AI (Lokul), I hit a massive wall trying to manage memory for local LLMs: just dumping chat history into IndexedDB and blindly feeding it back into the prompt quickly blows up the context window and leads to heavy hallucinations.

    I was spending 80% of my time babysitting context limits and writing custom boilerplate to prune and inject history. I got tired of it, so I ripped the memory architecture out, rewrote it, and open-sourced it.

    LokulMem handles:

    Smart collection: categorizing conversational turns vs. system states.

    Strategic context insertion: dynamically fetching only the most relevant context so the prompt stays within the token limit.

    Local-first storage: keeping everything in-browser for privacy.
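
    To give a feel for the context-insertion step, here's a simplified sketch of the idea rather than the actual LokulMem API (the names `Turn`, `estimateTokens`, and `selectContext` are made up for illustration, and the 4-chars-per-token estimate stands in for a real tokenizer): rank stored turns by relevance, pack the best ones into a token budget, then restore chronological order before injecting them into the prompt.

```typescript
// Hypothetical sketch of token-budgeted context selection (not the real API).
interface Turn {
  role: "user" | "assistant" | "system";
  text: string;
  relevance: number; // e.g. embedding similarity to the current query
}

// Crude token estimate (~4 chars per token); real code would use a tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily take the highest-relevance turns that fit in `budget` tokens,
// then re-sort chronologically so the model sees them in conversation order.
function selectContext(turns: Turn[], budget: number): Turn[] {
  const ranked = [...turns].sort((a, b) => b.relevance - a.relevance);
  const chosen: Turn[] = [];
  let used = 0;
  for (const t of ranked) {
    const cost = estimateTokens(t.text);
    if (used + cost > budget) continue; // skip turns that would overflow
    chosen.push(t);
    used += cost;
  }
  return chosen.sort((a, b) => turns.indexOf(a) - turns.indexOf(b));
}
```

    The real thing has more moving parts (categorization, system-state snapshots), but the core trade-off is the same: spend the token budget on relevance, not recency alone.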

    I wrote a deeper dive into the architecture and the headache of goldfish-memory LLMs on my blog here: [Insert your Hashnode URL here]

    I’d love your feedback on the codebase or the approach, or just to hear your war stories about how you're currently handling local LLM context limits.