26 points by Michelangelo11 2 days ago | 3 comments
  • nico 2 days ago
    This is key

    Besides model capabilities, one of the most important aspects of AI-assisted development right now is context management

    Cursor et al. try to automate that for the user, and it works up to a point. But beyond a certain level of complexity, the user needs to get actively involved in managing the context

    It also seems like some of the people who say they are having very good results with agentic coding take a lot of care in managing their cursor rules or claude.md files
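
    For illustration, a claude.md for a hypothetical project might contain something like the following (the project layout, commands, and file names are all made up); the idea is to hand the agent a small amount of curated, always-relevant context up front:

      # CLAUDE.md
      ## Project
      FastAPI backend in src/api, React frontend in src/web
      ## Conventions
      Run `pytest -q` before proposing changes; never edit generated files under src/api/migrations
      ## Current focus
      Auth refactor in progress; read docs/auth-refactor.md before touching src/api/auth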

  • marshyj 2 days ago
    I do like "context engineering" better. I also agree that there's a lot that goes into getting good answers out of LLMs, and that "GPT wrapper" is a gross oversimplification for many of the products being built on top of them. Just putting good evals in place is often a complicated task.
    • jangletown a day ago
      That's true. We have been trying to help customers with evals for ages now, and it's super hard for everyone to build a really good dataset and define great quality metrics
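
      To make that concrete, the classic dataset-plus-metric shape looks roughly like the sketch below; the dataset, the intent labels, and run_agent are all placeholders, not anything real:

        # minimal eval sketch: a labeled dataset plus a scoring function
        # run_agent is a stand-in for the system under test
        dataset = [
            {"input": "Cancel my order #1234", "expected_intent": "cancel_order"},
            {"input": "Where is my package?", "expected_intent": "track_order"},
            # in practice you need many more, carefully chosen examples
        ]

        def run_agent(text: str) -> str:
            # placeholder: call your real agent / LLM pipeline here
            return "cancel_order" if "cancel" in text.lower() else "track_order"

        def intent_accuracy(examples) -> float:
            hits = sum(run_agent(ex["input"]) == ex["expected_intent"] for ex in examples)
            return hits / len(examples)

        print(f"intent accuracy: {intent_accuracy(dataset):.0%}")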

      Just wanted to shamelessly plug this lib I built recently for this very topic, because it's been much easier to sell to our clients than evals, since it's closer to e2e tests: https://github.com/langwatch/scenario

      Instead of 100 examples, it's easier for people to think about just the anecdotal example where the problem happens and let AI expand it, or to replicate a situation from prod and describe the criteria in simple terms or code
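
      To be clear, this is not the library's actual API, just a rough sketch of the shape of such a test: one anecdotal case from prod, criteria written in plain language, and an LLM acting as judge (the OpenAI client and model name here are only example choices):

        # scenario-style check: replay one problematic case and grade it against
        # plain-language criteria with an LLM judge (not the scenario lib's real API)
        from openai import OpenAI

        client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

        def my_agent(user_message: str) -> str:
            # placeholder: call your actual agent here; this canned reply
            # intentionally shows the kind of slip you want the test to catch
            return "Sure, I've cancelled order #1234 and refunded your card."

        def judge(transcript: str, criteria: list[str]) -> bool:
            prompt = (
                "You are grading an AI agent's reply.\n\n"
                f"Transcript:\n{transcript}\n\n"
                "Criteria (all must hold):\n- " + "\n- ".join(criteria) +
                "\n\nAnswer with only PASS or FAIL."
            )
            resp = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": prompt}],
            )
            return "PASS" in resp.choices[0].message.content.upper()

        # the anecdotal case where the problem happened, plus criteria in simple terms
        user_message = "Cancel my order #1234"
        reply = my_agent(user_message)
        ok = judge(
            f"user: {user_message}\nagent: {reply}",
            criteria=[
                "the agent confirms which order is being cancelled",
                "the agent does not invent a refund it was never asked about",
            ],
        )
        print("PASS" if ok else "FAIL")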

  • jangletown a day ago
    I love the term! But I do think it's both, really. After all this time, LLMs are still very finicky: even with the right context, the order of the instructions still matters a lot, so you are still prompt engineering. Ideally this will go away and only context engineering will remain