1 pointby jruohonen8 hours ago2 comments
  • jruohonen8 hours ago
    "Diagnostic categorization reveals three distinct failure modes: pure hallucinations (5.1%), hallucinated identifiers with valid titles (16.4%), and parsing-induced matching failures (78.5%)."
  • 8 hours ago
    undefined