1 point by tabmate 3 hours ago | 3 comments
  • freakynit 2 hours ago
    1. Use context compression proxies/tools. Depending on the workload, these can compress context anywhere from 5% to over 95%. Search GitHub for them.

    2. Use bigger models for creating a detailed plan, then use smaller, cheaper models, like deepseek-v4-flash or gemini-3-flash, for the actual implementation. This works really well.

    3. Do not just keep chatting in the same session. Try to start a new session for every new chat message, or at most after every 2-3 messages. If needed, you can ask it to summarize the details and use that as context for the new session.

    4. Implement features in small sets, not in one go, and reset the session after each set is done.

    5. Keep AGENTS.md small: just the basic info about your project, the file paths with what each file contains and does, and then general guidelines (10 max).
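Tips 2-4 above amount to a simple pipeline: one expensive planning call, then cheap implementation calls that each start from a fresh context. A minimal sketch, where `call_model`, `big-model`, and `small-model` are placeholders for whatever API and models you actually use:

```python
def call_model(model: str, prompt: str) -> str:
    # Placeholder: swap in your real API client / agent CLI here.
    return f"[{model} response to {len(prompt)}-char prompt]"

def build_feature(spec: str) -> list[str]:
    # One call to the big model produces the detailed plan...
    plan = call_model("big-model", f"Write a step-by-step plan for:\n{spec}")
    # ...then each step runs through the small model with ONLY the plan
    # as context -- no accumulated chat history carried between steps.
    results = []
    for step in plan.splitlines():
        prompt = f"Plan:\n{plan}\n\nImplement this step:\n{step}"
        results.append(call_model("small-model", prompt))
    return results
```

The point is structural: the plan markdown is the only state that crosses session boundaries, so per-call context stays small no matter how many steps there are.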

  • pitched 3 hours ago
    Switch to codex! That will extend sessions 4x at least.

    Outside of that, the key is to keep context as low as possible; the cost of a token increases a lot as context grows. My current favourite approach is RPI: run the Research, Plan, and Implement phases each in its own isolated agent that produces a markdown file for the next.
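The RPI hand-off described above can be sketched as a loop where each phase is a fresh agent invocation that sees only the previous phase's markdown file. `run_agent` is a stand-in for your real agent CLI or API, not an actual tool:

```python
from pathlib import Path

def run_agent(prompt: str) -> str:
    # Placeholder: swap in a real agent call (fresh session every time).
    return f"# Output\n\n(agent response to {len(prompt)} chars of context)"

def rpi(task: str, workdir: Path) -> Path:
    context = task
    for phase in ("research", "plan", "implement"):
        # Each phase starts clean: its context is only the task or the
        # previous phase's markdown, never the full chat history.
        output = run_agent(f"{phase.upper()} phase.\n\n{context}")
        out_file = workdir / f"{phase}.md"
        out_file.write_text(output)
        context = output
    return workdir / "implement.md"
```

Because each agent's context is bounded by one markdown file, token cost stays roughly flat across phases instead of compounding.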
