3 points by wizeyone 4 hours ago | 1 comment
  • wizeyone 4 hours ago
    Author here - we're the team behind Wizey, one of the two AIs in the comparison. A few things up front:

    * The methodology was fixed before the runs.

    * All outputs are quoted verbatim, including Case 2 (MGUS) where ChatGPT beat us cleanly.

    * Panels are reconstructed from published case reports (Blood, Annals of Family Medicine, and others), so anyone can reproduce the experiment with Claude, Gemini, or Grok.

    Full verbatim outputs for all five cases: https://wizey.one/blog/2026/04/17/wizey-vs-chatgpt-raw-exper...

    Happy to answer anything on methodology or individual cases.