7 points by quarkcarbon279 11 hours ago | 3 comments
  • handfuloflight 11 hours ago
    How does this compare to https://github.com/alibaba/page-agent?
    • quarkcarbon279 11 hours ago
      PageAgent doesn't have strong page understanding, i.e. a semantic tree representation of the page. It works from a flat DOM with basic HTML stripping, which makes it hard to navigate shadow DOMs, same-origin iframes, or pages built with different frameworks. They also do CUA-style element marking, though I'm not sure whether they use it in the actual calls to Qwen. And yeah, as arjun notes, it takes 30 steps to do even a basic task like finding some info.
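
      To make the flat-vs-tree distinction concrete, here is a rough sketch. It is not PageAgent's or Rover's actual code: `PageNode`, `flatten`, and `buildSemanticTree` are hypothetical names, and the page is a plain object model rather than a real DOM. It only illustrates why a walk that follows light-DOM children alone misses content behind shadow roots and same-origin iframes, while a tree walk that crosses those boundaries can surface it.

```typescript
// Hypothetical, simplified page model (an assumption for illustration only).
interface PageNode {
  tag: string;
  role?: string;               // set only on elements an agent can act on
  label?: string;
  children?: PageNode[];       // light-DOM children
  shadowRoot?: PageNode[];     // content of an open shadow root
  frameDocument?: PageNode[];  // document of a same-origin iframe
}

interface SemanticNode {
  role: string;
  label: string;
  path: string;                // records which boundaries were crossed
  children: SemanticNode[];
}

// Flat view: follows light-DOM children only, so shadow and frame content is never seen.
function flatten(node: PageNode, out: string[] = []): string[] {
  out.push(node.tag);
  for (const child of node.children ?? []) flatten(child, out);
  return out;
}

// Tree view: descends into children, shadow roots, and frame documents,
// keeps nodes that carry a role, and hoists descendants of plain wrappers.
function buildSemanticTree(node: PageNode, path: string = node.tag): SemanticNode[] {
  const nested: SemanticNode[] = [];
  const groups: [string, PageNode[] | undefined][] = [
    ["", node.children],
    ["#shadow", node.shadowRoot],
    ["#frame", node.frameDocument],
  ];
  for (const [boundary, kids] of groups) {
    for (const kid of kids ?? []) {
      nested.push(...buildSemanticTree(kid, `${path}${boundary} > ${kid.tag}`));
    }
  }
  if (node.role) {
    return [{ role: node.role, label: node.label ?? "", path, children: nested }];
  }
  return nested;
}

// Made-up example page: one link in the light DOM, one button inside a
// shadow root, one text box inside a same-origin iframe.
const page: PageNode = {
  tag: "body",
  children: [
    { tag: "div", children: [{ tag: "a", role: "link", label: "Home" }] },
    { tag: "menu-widget", shadowRoot: [{ tag: "button", role: "button", label: "Open menu" }] },
    { tag: "iframe", frameDocument: [{ tag: "input", role: "textbox", label: "Search" }] },
  ],
};

const flatTags = flatten(page);
const semanticTree = buildSemanticTree(page);
console.log(flatTags);
console.log(semanticTree.map((n) => `${n.role}: ${n.label} (${n.path})`));
```

      On this toy page the flat walk never reaches the button or the text box, while the tree walk returns all three actionable elements along with the boundary each one sits behind.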

      What we strengthened while building agents working on 2M+ web workflows in the past 4 months is our representation of pages, which helps agents get through any page, old or new, including iframes, shadow DOMs, and more. The best part of Rover comes if you, as the website owner, enable cross-origin requests. Say Doordash has Rover and a merchant asks it to get their restaurant menu from their own website and update it in Doordash. The Rover agent determines that a third-party website is needed, launches our cloud browser to securely execute the actions on that site, gets the menu, and updates the merchant's menu on Doordash. Your users never have to leave your site to do a task, and enabling cross-site interactions like this is one of a kind.

    • arjunchint 11 hours ago
      I actually tried out PageAgent; it was reaaaally slow and not that accurate.

      You can actually try it out on our own site, rtrvr.ai.
