6 points by grimm808 6 hours ago | 4 comments
  • Unical-A 6 hours ago
    Interesting approach. How are you handling the DOM processing inside the sandbox without spiking CPU usage? If it's not making constant API calls, is the vision model running locally (WASM/WebGPU), or are you using a clever way to diff the page state before sending it to the LLM?
    • grimm808 6 hours ago
      Yes, it's essentially a clever way to diff the page state. The AI agent has some built-in tools for this. The user gives a prompt, say "Watch for X words." The LLM then runs the provided tool with the necessary args, and the tool runs a Python loop that checks the DOM for the target while the LLM sleeps. Once a match is found, the LLM is woken up. There's also a tool for watching for pixel changes in certain regions of the page, which works the same way.
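      A rough sketch of what that watch tool might look like (the function name, the `get_dom` hook, and the parameters are all assumptions for illustration, not the actual implementation):

```python
import time

def watch_for_words(get_dom, words, interval=1.0, timeout=60.0):
    """Poll the page text until any target word appears, then return it.

    get_dom: callable returning the current DOM text (hypothetical stand-in
    for whatever the sandbox exposes). The LLM stays asleep while this loops;
    returning a match is what "wakes" it.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        text = get_dom()
        for w in words:
            if w in text:
                return w          # match found: wake the LLM with the word
        time.sleep(interval)      # no API calls while waiting
    return None                   # timed out; nothing matched
```

      The pixel-region watcher would presumably follow the same shape, diffing a cropped screenshot against the previous frame instead of searching the DOM text.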