5 points by GustavHartz 7 hours ago | 1 comment
  • GustavHartz 7 hours ago
    This started as a response to the recent "you have to post-train a model to pen-test" Show HN. We don't think you need to; post-training just makes life a bit easier.

    Across 10K+ agent transcripts from our benchmarking against OpenAI's EVMBench, we saw zero refusals. In closed frontier models, the refusals you hit mostly come from a separate content classifier or a system prompt, not from the model itself. Breadth (more cheap agents) beats a bigger model, but it puts more demands on context engineering.

    https://news.ycombinator.com/item?id=48609231