3 points by amoshaviv 7 hours ago | 4 comments
  • amoshaviv 7 hours ago
    We ran the exact same Amazon shopping task with 9 leading AI models in the browser. Same site, same steps, same environment. Only the model changed. A few things stood out:

    1. Fastest model: 70 seconds
    2. Slowest model: 340 seconds
    3. Cost range: $0.03 to $1.04
    4. Only 2 of 9 models picked the right product!

  • throwawayffffas 7 hours ago
    Hm... They all got the right product: the "cheapest result". You didn't specify the cheapest laptop.

    Arguably, the ones that got the laptop assumed you wanted a laptop and went against your instructions.

    • amoshaviv 7 hours ago
      I see where you're coming from, but humans do tend to phrase things that way, and the intent is still understood. More importantly, the last step is:

      "6. Navigate to the cart page and validate the laptop you chose is in the cart."

      So one could argue inferring this is trivial.

  • vova_hn2 7 hours ago
    Why would you need powerful models if you give them such mechanical, stifling instructions?

    I think the result would be much better if you told them exactly what you want in plain text.