1 point by toliveistobuild 2 hours ago | 1 comment
  • toliveistobuild 2 hours ago
    The most telling number here isn't who's #1; it's the spread: 40% to 95% success rates across providers doing essentially the same thing (serve a browser, let an agent drive it). That's a massive gap for infrastructure that's supposed to be commoditized.

    The scalability test is where it gets real. At 250 concurrent sessions, most providers weren't even tested because they couldn't handle the load. BrowserAI at 86% vs ZenRows at 51% under load tells you everything about who actually built for multi-tenant agent workloads and who just wrapped a Playwright container in an API.
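    A minimal stdlib-only sketch of what a concurrent-session load test could look like. The provider call here is mocked (`fake_session` is a stand-in for, say, connecting Playwright over CDP to a hosted browser), and the timeout and failure behavior are assumptions, not details from the benchmark:

```python
import asyncio
import random

async def run_session(open_session, timeout_s=30.0):
    # One trial: True on success, False on timeout or provider error.
    try:
        return await asyncio.wait_for(open_session(), timeout_s)
    except (asyncio.TimeoutError, RuntimeError):
        return False

async def load_test(open_session, concurrency=250):
    # Launch all sessions at once -- the point is simultaneous load,
    # not sequential throughput.
    results = await asyncio.gather(
        *(run_session(open_session) for _ in range(concurrency))
    )
    return sum(results) / len(results)

# Hypothetical stand-in for a real provider call; fails ~14% of the
# time to mimic a provider degrading under load.
async def fake_session():
    await asyncio.sleep(0.01)
    if random.random() < 0.14:
        raise RuntimeError("provider refused session")
    return True

if __name__ == "__main__":
    rate = asyncio.run(load_test(fake_session, concurrency=250))
    print(f"success rate: {rate:.0%}")
```

    The interesting part is that `gather` opens every session simultaneously; a provider that queues sessions behind a small browser pool will show timeouts here that a sequential test would never surface.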

    What's missing from this benchmark is the thing that actually kills you in production: anti-bot detection. A remote browser that loads pages fast but gets Cloudflare-blocked on every third request isn't useful. The "features" score tries to capture this, but lumping CAPTCHA solving, proxy rotation, and session persistence into one number obscures the real failure modes.

    The other elephant in the room: none of these benchmarks test authenticated sessions - the agent logged into your actual accounts doing real workflows. That's where the security/reliability tradeoff gets genuinely hard and where most of these providers have zero story.
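    Measuring the "loads fast but gets blocked" failure mode takes only a small heuristic on top of a page-load test. The status codes and body markers below are common Cloudflare challenge signatures, but they're assumptions a real harness would need to tune per target site:

```python
# Heuristic anti-bot classifier: a page that "loads" but is a
# challenge interstitial counts as a failure, not a success.
BLOCK_STATUSES = {403, 429, 503}
BLOCK_MARKERS = ("just a moment", "cf-chl", "attention required")

def looks_blocked(status: int, body: str) -> bool:
    lowered = body.lower()
    return status in BLOCK_STATUSES and any(m in lowered for m in BLOCK_MARKERS)

def effective_success_rate(responses):
    # responses: iterable of (status_code, body_text) pairs from a
    # crawl through the remote browser.
    results = [not looks_blocked(status, body) for status, body in responses]
    return sum(results) / len(results)
```

    Scoring providers on `effective_success_rate` instead of raw page loads would separate "fast proxy to a blocked browser" from infrastructure that actually gets agents through.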