48 points by foxfired 18 hours ago | 5 comments
  • zzo38computer 16 hours ago
    I also had the idea of using a zip bomb to confuse badly behaved scrapers (and I have mentioned it before to some other people, although I did not implement it). However, instead of 0x00, you might use a different byte value.

    I had other ideas too, but I don't know how well some of them will work (they might depend on what bots they are).

    • ycombinatrix 8 hours ago
      The different byte values likely won't compress as well as all 0s unless they are a repeating pattern of blocks.

      An alternative might be to use Brotli which has a static dictionary. Maybe that can be used to achieve a high compression ratio.
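
The compression-ratio question above is easy to check empirically. The following is a minimal sketch, not anything posted in the thread: it compares DEFLATE ratios (via Python's standard `zlib`) for an all-zero payload, a single repeated non-zero byte, a short repeating block, and random noise. The payload size and the HTML snippet used as the repeating block are arbitrary assumptions, and Brotli is not covered because it is not in the standard library.

```python
import os
import zlib

SIZE = 4 * 1024 * 1024  # 4 MiB per payload; an arbitrary size chosen for illustration


def ratio(payload: bytes) -> float:
    """Return uncompressed size divided by DEFLATE-compressed size (level 9)."""
    return len(payload) / len(zlib.compress(payload, 9))


# Hypothetical payloads, all exactly SIZE bytes long.
block = b"<div>spam</div>"                       # assumed repeating block
zeros = bytes(SIZE)                              # all 0x00
ones = b"\xff" * SIZE                            # one repeated non-zero byte
pattern = (block * (SIZE // len(block) + 1))[:SIZE]
noise = os.urandom(SIZE)                         # incompressible baseline

ratios = {
    "zeros": ratio(zeros),
    "0xff": ratio(ones),
    "repeating block": ratio(pattern),
    "random": ratio(noise),
}

for name, value in ratios.items():
    print(f"{name:16s} {value:8.1f}:1")
```

Because DEFLATE encodes long runs as back-references rather than by byte value, the repeated-byte and repeating-block payloads should land in the same range as the zeros, while random data should not compress at all.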

  • java-man 17 hours ago
    I think it's a good idea, but it must be coupled with robots.txt.
    • cratermoon 17 hours ago
      AI scraper bots don't respect robots.txt.
      • jsheard 17 hours ago
        I think that's the point: you'd use robots.txt to direct Googlebot/Bingbot/etc. away from countermeasures that could potentially mess up your SEO. If other bots ignore the signpost clearly saying not to enter the tarpit, that's their own stupid fault.
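
The robots.txt signpost described above can be sketched with Python's standard `urllib.robotparser`, which applies the same check a well-behaved crawler performs before fetching a URL. The `/tarpit/` path, the `example.com` host, and the `may_fetch` helper are hypothetical names for illustration, not anything from the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is told to stay out of /tarpit/,
# which is where the countermeasure would be served from.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tarpit/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if robots.txt allows this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


# A compliant crawler skips the tarpit but still indexes normal pages.
print(may_fetch("Googlebot", "https://example.com/tarpit/payload.gz"))
print(may_fetch("Googlebot", "https://example.com/articles/1"))
```

A crawler that never runs this check is the only kind that would reach the disallowed path, which is the point being made in the comment.
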
  • codingdave 17 hours ago
    Mildly amusing, but this seems to assume that two wrongs make a right: serve malware instead of using a WAF or some other existing solution to the bot problem.
    • theandrewbailey 17 hours ago
      WAF isn't the right choice for a lot of people: https://news.ycombinator.com/item?id=43793526
      • codingdave 16 hours ago
        At least, not with the default rules. I read that discussion a few days ago and was surprised how few callouts there were that a WAF is just a part of the infrastructure - it is the rules that people are actually complaining about. I think the problem is that so many apps run on AWS and their default WAF rules have some silly content filtering. And their "security baseline" says that you have to use a WAF and include their default rules, so security teams lock down on those rules without any real thought put into whether or not they make sense for any given scenario.
    • chmod775 15 hours ago
      Truly one of my favorite thought-terminating proverbs.

      "Hurting people is wrong, so you should not defend yourself when attacked."

      "Imprisoning people is wrong, so we should not imprison thieves."

      Also the modern telling of Robin Hood seems to be pretty generally celebrated.

      Two wrongs may not make a right, but often enough a smaller wrong is the best recourse we have to avert a greater wrong.

      The spirit of the proverb is referring to wrongs which are unrelated to one another, especially when using one to excuse another.

    • cratermoon 17 hours ago
      • xena 17 hours ago
        I did actually try zip bombs at first. They didn't work due to the architecture of how Amazon's scraper works. It just made the requests get retried.
        • cookiengineer 9 hours ago
          Did you also try Transfer-Encoding: chunked and things like HTTP smuggling to serve different content to web browser instances than to scrapers?