4 points by AkshatVirmani 6 hours ago | 3 comments
  • riyajoshi 6 hours ago
    Nice to see a benchmark in this space, especially one with black-box constraints.
  • akshay_93 5 hours ago
    I like that the scoring is biased toward bug detection and not just test generation. Generating lots of tests with AI is easy, but that doesn't necessarily mean they're good.
  • saikia_ 6 hours ago
    Curious. Let me see if this works for our internal setup.