I've found it works better when the AI is just explaining results that come from deterministic metrics rather than inventing the analysis itself.
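Concretely, the pattern looks something like this (a rough sketch in Python; the client call follows the OpenAI Python SDK, but the metric names and findings schema are made up for illustration):

```python
import json
from openai import OpenAI  # assumes the OpenAI Python SDK is installed

client = OpenAI()

def summarize_scan(findings: list[dict]) -> str:
    # 1. Deterministic part: the numbers come from code, not the model.
    metrics = {
        "total_findings": len(findings),
        "critical": sum(1 for f in findings if f["severity"] == "critical"),
        "high": sum(1 for f in findings if f["severity"] == "high"),
    }

    # 2. The model only verbalizes the precomputed metrics; the prompt
    #    forbids it from introducing numbers or conclusions of its own.
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": (
                "Explain the following metrics in two sentences for an "
                "executive audience. Do not add numbers or findings that "
                "are not present in the input."
            )},
            {"role": "user", "content": json.dumps(metrics)},
        ],
    )
    return response.choices[0].message.content
```

If the model hallucinates, the damage is limited to phrasing, because every figure in the report was computed upstream.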
Curious how other teams are dealing with that.
I spent months trying to get AI to generate the executive narrative, but eventually moved away from that approach. The results were often inconsistent or overly generic, which made it hard to rely on the output for serious reporting.
In the end I shifted to a fully deterministic approach where the narrative is assembled directly from structured signals and scoring logic. That made the reports far more accurate and evidence-based, and it keeps the output consistent from scan to scan.
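Under the hood it's roughly this (a simplified sketch; the signal names, thresholds, and templates are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class Signal:
    name: str
    value: float
    threshold: float

def score(signal: Signal) -> str:
    # Deterministic scoring: the same inputs always produce the same label.
    if signal.value >= signal.threshold * 1.5:
        return "critical"
    if signal.value >= signal.threshold:
        return "elevated"
    return "nominal"

TEMPLATES = {
    "critical": "{name} is at {value:.1f}, well above the {threshold:.1f} threshold and needs immediate attention.",
    "elevated": "{name} is at {value:.1f}, above the {threshold:.1f} threshold.",
    "nominal": "{name} is within normal bounds.",
}

def narrative(signals: list[Signal]) -> str:
    # Every sentence is traceable to a concrete signal and a scoring rule,
    # so identical scans always yield identical wording.
    return " ".join(
        TEMPLATES[score(s)].format(name=s.name, value=s.value, threshold=s.threshold)
        for s in signals
    )

print(narrative([Signal("Open critical CVEs", 12, 5)]))
```

The trade-off is less fluent prose, but for reporting I'll take traceable over eloquent every time.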
Another countermeasure I use is to simply lock the code before testing. Then look over the test files and make sure they're not just following the happy path.
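What I look for in practice is tests that actually exercise the failure modes, something like this (pytest; `parse_report` and the `report` module are hypothetical stand-ins for the code under test):

```python
import pytest

from report import parse_report  # hypothetical module under test

def test_rejects_empty_input():
    # A happy-path-only suite would never feed the parser bad data.
    with pytest.raises(ValueError):
        parse_report("")

def test_rejects_unknown_severity():
    # Malformed-but-plausible input should fail loudly, not silently pass.
    with pytest.raises(ValueError):
        parse_report('{"severity": "banana"}')
```

If a suite has nothing like these, that's usually a sign it was written to pass rather than to probe.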