60 points by birdculture | a day ago | 5 comments
  • p0w3n3d 19 hours ago
    Normally people get punished for downloading illegal books. Allegedly someone at Meta downloaded a hella ton of illegal books and trained the LLM on them, and they said "oh, it was for his/her private use". You won't get justice here.
    • muldvarp 19 hours ago
      This to me is the most ridiculous thing about the whole AI situation. Piracy is now apparently just okay as long as you do it on an industrial scale and with the express intention of hurting the economic prospects of the authors of the pirated work.

      Seems completely ridiculous when compared to the trouble I was in that one time I pirated a single book that I was unable to purchase.

      • Llamamoe 19 hours ago
        In recent years we've essentially given up on pretending that corporations are also held accountable for their crimes, and I think that's more worrying than anything.
      • p0w3n3d 11 hours ago
        Recently archive.org got into trouble for lending out one book (or a fixed number of books) at a time, worldwide, like a library. Sad men from a law office came and made an example of them, but it seems that if they had used those books to train an AI and served the content in a "remembered" way, they would have gotten away with it.
      • Mathnerd314 16 hours ago
        Well, the actual ruling was that use of the books was okay, but only if they were legally obtained. So the authors could proceed with a lawsuit over illegally downloading the books. But then presumably compensation for torrenting the books was included as part of the out-of-court settlement. So the lesson is something like: AI is fine, but torrenting books is still not acceptable, m'kay, wink wink.
      • lifestyleguru 19 hours ago
        Hollywood and media publishers run entire franchises of legal bullies across the developed world to harass individuals, and lobby for laws allowing easy prosecution of the ISP contract holder. Even Google Books was castrated because of IP rights. Now I have a hard time imagining how this IP+AI cartel operates. Nowadays everyone and their cat throws millions at AI, so I imagine IP owners get their share.
  • 1gn15 14 hours ago
    This article commits several common and disappointing fallacies:

    1. Open weight models exist, guys.

    2. It assumes that copyright is stripped when doing essentially Img2Img on code. That's not true. (Also, copyright != attribution.)

    3. It assumes that AI is "just rearranging code". That's not true. Speaking about provenance in learning is as nonsensical as asking one to credit the creators of the English alphabet. There's a reason why literally every single copyright-based lawsuit against machine learning has failed so far, around the world.

    4. It assumes that the reduction in posts on StackOverflow is due to people no longer wanting to contribute. That's likely not true. It's just that most questions were "homework questions" that didn't really warrant a volunteer's time.

    • bicepjai 2 hours ago
      I love LLM tech and use it every day for coding. I don’t like calling it AI. We can definitely argue that LLMs are not just rearranging code, but let’s look at some evidence that suggests otherwise. Last year's NYT lawsuit showed that LLMs have memorized most of the news text; you should see those examples. The recent, not-yet-peer-reviewed academic paper “Language Models are Injective and Hence Invertible” shows LLMs just memorized training data. Also, this recent DEF CON 33 talk, https://youtu.be/O7BI4jfEFwA?si=rjAi5KStXfURl65q, shows many ways you can get training data out. Given all this, it’s hard to believe they are intelligently generating code.
    • p0w3n3d 11 hours ago
      Re: 3, AI is indeed a lossy compression of text. I recommend searching YouTube for "karpathy deep dive LLM" (/7xTGNNLPyMI); he shows that the open texts used in training are regurgitated unchanged when talking to the raw model. That means if you say "oh say can you" to the model, it will answer "see by the dawn's early light", or something similar like "by the morning's sun" or whatever. So it is very lossy, but it is still compression, and the output would be something else without the given text that was used in training.
  • citizenpaul a day ago
    I'm not sure how this is much different than Amazon, which has basically monetized the entire Apache Software Foundation and donates a pittance back to them, in the single-digit millions, when they are profiting in the trillions.
    • y0eswddl a day ago
      It's not different.

      There's also a huge problem with for-profit companies building on the work of FOSS without contributing resources or knowledge back.

  • AndrewKemendo 16 hours ago
    This article could just have been a link to the tragedy of the commons Wikipedia page

    Humans destroying common resources until depleted is a feature not a bug

    • NoraCodes 6 hours ago
      This is quite literally the opposite of the tragedy of the commons.
  • fithisux 20 hours ago
    Personally I view the usage of AI as fencing.
    • stuaxo 20 hours ago
      Thank you for this wonderfully succinct description, I shall steal it.
      • djmips 12 hours ago
        without attribution?