3 points by chopratejas 20 days ago | 4 comments
  • chopratejas 20 days ago
    Some results from real world data so far:

      ┌─────────────────┬─────────────┬──────────────────────────────┐
      │    Data Type    │ Compression │             Why              │
      ├─────────────────┼─────────────┼──────────────────────────────┤
      │ Server logs     │ 90%+        │ Highly repetitive patterns   │
      ├─────────────────┼─────────────┼──────────────────────────────┤
      │ MCP tool output │ 70%+        │ JSON structure overhead      │
      ├─────────────────┼─────────────┼──────────────────────────────┤
      │ Database rows   │ 50-70%      │ Same schema, many records    │
      ├─────────────────┼─────────────┼──────────────────────────────┤
      │ File trees      │ 40-50%      │ Repeated metadata            │
      ├─────────────────┼─────────────┼──────────────────────────────┤
      │ Code diffs      │ 0%          │ Every line unique            │
      ├─────────────────┼─────────────┼──────────────────────────────┤
      │ Dense prose     │ -0.3%       │ No patterns, slight overhead │
      ├─────────────────┼─────────────┼──────────────────────────────┤
      │ Encrypted       │ 0%          │ Incompressible               │
      └─────────────────┴─────────────┴──────────────────────────────┘
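
    For a rough feel of why these numbers land where they do, any dictionary compressor exposes the same redundancy. The toy check below uses zlib as a stand-in - headroom's actual compression is content-aware and token-level, so real ratios differ:

      import zlib

      # Repetitive data (logs) shrinks dramatically; unique prose barely
      # compresses and can even grow slightly, like the -0.3% row above.
      samples = {
          "server logs": "2024-01-01 INFO GET /api/v1/users 200 12ms\n" * 200,
          "dense prose": "Every sentence here is unique and shares no phrasing.",
      }

      for name, text in samples.items():
          raw = text.encode()
          saved = 1 - len(zlib.compress(raw)) / len(raw)
          print(f"{name}: {saved:.0%} smaller")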
  • chopratejas 20 days ago
    What is it?

    - Context compression for LLMs, with reversibility - the reversibility is the difference

    - Very different from other compression or summarization tools that promise cost savings and speed - those are lossy; this is reversible

    - Claude Code / Cursor costs reduced by 50-60% in my usage

    - Ideal for startups and enterprises

    - Integrates with LangChain

    - Treats memory as a first-class citizen

    - It's OSS, so it's free!

    Give it a try - it's OSS. If you love it, star it. If you don't, let's make it better, together!

  • goeb1 17 days ago
    Seems very useful. I tried it with Claude Code and it was saving approximately 50%. Do you know how I can push it to save more? Do you also have plans to make it enterprise-ready?
  • niux 20 days ago
    Not a single example of what it does or how it works
    • chopratejas 20 days ago
      Fair enough - I was trying to keep it concise here. This is how you install it:

      pip install "headroom-ai[proxy]"

      headroom proxy --port 8787

      It will:

      * Inspect all the data going into the LLM and apply compression suited to the content type - different strategies for JSON, code, etc.

      * Keep the compression reversible, so if the LLM needs something that was compressed away, it can be restored - no loss of accuracy

      * Trim the MCP tool outputs, function-call results, etc. that fill up the context window and cause needle-in-a-haystack problems
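
      Once the proxy is running, you point your existing client at it. A minimal sketch, assuming the proxy exposes an OpenAI-compatible endpoint on that port (check the docs for the exact base URL):

      from openai import OpenAI

      # Route requests through the local headroom proxy instead of the API
      # directly. Assumption: the proxy speaks the OpenAI-compatible protocol
      # at this address - the exact path may differ.
      client = OpenAI(base_url="http://localhost:8787/v1")

      response = client.chat.completions.create(
          model="gpt-4o",
          messages=[{"role": "user", "content": "Summarize these logs: ..."}],
      )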

      There is also an SDK which works like this:

      from langchain_openai import ChatOpenAI
      from headroom.integrations import HeadroomChatModel

      # Wrap your model - that's it!
      llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))

      # Use it exactly like before
      response = llm.invoke("Hello!")
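
      To make "reversible" concrete, here is a toy sketch of the idea - not headroom's actual algorithm - using the database-rows case from the table above: repeated JSON keys get swapped for short aliases, and the alias map is kept so the exact original can be restored on demand.

      import json

      # 1,000 rows sharing one schema: the key names are pure repetition.
      rows = [{"customer_name": f"user{i}", "customer_email": f"u{i}@x.com"}
              for i in range(1000)]

      # Compress: swap each key for a short alias, keep the mapping around.
      aliases = {"customer_name": "k0", "customer_email": "k1"}
      compressed = [{aliases[k]: v for k, v in r.items()} for r in rows]

      # Reverse: invert the map and the original comes back exactly - lossless.
      reverse = {v: k for k, v in aliases.items()}
      restored = [{reverse[k]: v for k, v in r.items()} for r in compressed]
      assert restored == rows

      before = len(json.dumps(rows))
      after = len(json.dumps(compressed)) + len(json.dumps(aliases))
      print(f"saved {1 - after / before:.0%} of the bytes")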

      I've personally used it with Claude Code and Cursor and seen the benefits.