6 points by armanified 2 hours ago | 1 comment
  • armanified 2 hours ago
    I'm a professional web scraper and have pretty much exhausted the open-source stealth browsers. The paid ones aren't economical when you want to scrape at scale and have to deal with heavy bot mitigation.

    The issue I repeatedly ran into is that almost all of these tools run on Linux while pretending to be another OS. That might work for a while, but given enough samples the mismatch becomes pretty obvious to large bot-mitigation infrastructures, and they start blocking or at least pushing back harder.

    So I tried to build a browser myself, which does a few things:

    * Randomizes the browser profile based on a given seed
    * Matches the timezone to the proxy exit IP
    * Can outsource Canvas rendering to a different machine (one running the target OS)
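
    To make the first two points concrete, here is a rough Python sketch of the idea, not the browser's actual code. The `Profile` fields, the candidate pools, and `make_profile` are all made up for illustration, and the exit timezone is assumed to have already been looked up from the proxy IP by some GeoIP step that isn't shown.

```python
import hashlib
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Profile:
    # Hypothetical subset of the attributes a fingerprint profile might carry.
    screen: Tuple[int, int]
    hardware_concurrency: int
    device_memory_gb: int
    timezone: str


# Illustrative candidate pools; a real tool would use realistic distributions.
SCREENS = [(1920, 1080), (2560, 1440), (1366, 768), (1536, 864)]
CORES = [4, 8, 12, 16]
MEMORY_GB = [4, 8, 16]


def make_profile(seed: str, exit_timezone: str) -> Profile:
    """Derive a profile from a seed; the timezone is taken from the proxy exit."""
    # Hash the seed to an integer so the same seed string always
    # initializes the random generator the same way.
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return Profile(
        screen=rng.choice(SCREENS),
        hardware_concurrency=rng.choice(CORES),
        device_memory_gb=rng.choice(MEMORY_GB),
        # Not randomized: it has to agree with where the proxy exits.
        timezone=exit_timezone,
    )
```

    The point of seeding is that one identity stays stable across sessions (same seed, same profile), while the timezone is deliberately left out of the randomization so it never contradicts the exit IP.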

    The canvas outsourcing is what really makes it different, because pretending isn't enough. Some of the toughest bot-mitigation infrastructures probe the canvas deeply enough that a simple patch running on a different OS can't fool them. It's optional, though, and should only be used if everything else fails.

    Right now, this is an early release. Any feedback is highly appreciated.