I first came to this idea when I had to scrape Amazon listings dynamically for a project. With Amazon's notoriously aggressive enforcement of anti-bot and anti-scraping policies, I knew I had to get creative with the automation. I realized that each page could be dynamically navigated to and loaded with Selenium, then saved as a static `.html` file. Claude could then read these files and update its actions based on their contents.
A few days later, I had a new project in which I had to review applications from an admin portal. That was when the core idea struck me like déjà vu. I refined the system to use my default browser profile, automatically including saved cookies and session data. Then I designed a simple command system for the AI to interact with the browser. The result was a bare-bones yet fully functional system that combined the best of both AI coding agents and browser automation tools.
If you would like to check out the project, the GitHub repo is https://github.com/GregoryLi360/Agentic-Browser-Automation/. The main `browse.py` file is a mere ~250 lines of code. Any feedback is appreciated!