Thus, I stand to receive about $9,000 as a result of this settlement.
I think that's fair, considering that two of those books received advances under $20K and never earned out. Also, while I'm sure that Anthropic has benefited from training its models on this dataset, that doesn't necessarily mean that those models are a lasting asset.
This settlement has nothing to do with any criminal liability Anthropic might have, only tort liability (and it involves damages, not fines).
“Greyball”: https://www.nytimes.com/2017/03/03/technology/uber-greyball-...
My uncle went to jail for picking up someone in an airport in his taxi. He didn't have the airport permit (he could only drop off, not pick up). Travis Kalanick industrialized that crime on a grand scale and got billions of dollars instead of jail.
The lesson is clear. Don't make things that don't make money for the already rich.
Remarkably similar: bulk copying of data for other use, except Swartz wanted to make it free vs. Anthropic, which wants to make it available via its "AI" repackaging. One is federally prosecuted with the possibility of decades of jail time and million-dollar fines; the other is a mere civil action.
- Sam Bankman-Fried (FTX): Sentenced to 25 years in prison in 2024 for orchestrating a massive fraud involving the misappropriation of billions in customer funds.
- Elizabeth Holmes (Theranos): Began an 11-year prison sentence in 2023 after being convicted of defrauding investors with false claims about her blood-testing technology.
- Ramesh "Sunny" Balwani (Theranos): The former president of Theranos was sentenced to nearly 13 years in prison for his role in the same fraud as Elizabeth Holmes.
- Trevor Milton (Nikola Corporation): Convicted of securities and wire fraud, he was sentenced to four years in prison in 2023.
- Ippei Mizuhara: The former translator for MLB star Shohei Ohtani was charged in April 2024 with bank fraud for illegally transferring millions from the athlete's account.
- Sergei Potapenko and Ivan Turogin: Convicted in February 2025 for a $577 million cryptocurrency fraud scheme.
- Bernard Madoff: Sentenced to 150 years in prison in 2009 for running the largest Ponzi scheme in history. He died in prison in 2021.
- Jeffrey Skilling (Enron): The former CEO of Enron was sentenced to 24 years in prison in 2006 for fraud and conspiracy. His sentence was later reduced, and he was released in 2019.
- Dennis Kozlowski (Tyco International): The former CEO served over six years in prison after being convicted in 2005 for looting millions from the company.
- Bernard "Bernie" Ebbers (WorldCom): Sentenced to 25 years in prison for orchestrating an $11 billion accounting fraud. He was granted early release in 2019 and died shortly after.
Apart from this list, I know Nissan's ex-CEO was put into solitary confinement for months.
Who went to prison from Exxon for the Valdez oil spill[1], or from BP for the Deepwater Horizon[2] debacle?
Who went to prison from Norfolk-Southern for the East Palestine train derailment[3]?
Who went to prison from Boeing for the 737Max debacle[4]?
[0] https://en.wikipedia.org/wiki/Bhopal_disaster
[1] https://en.wikipedia.org/wiki/Exxon_Valdez
[2] https://en.wikipedia.org/wiki/Deepwater_Horizon_oil_spill
[3] https://en.wikipedia.org/wiki/East_Palestine%2C_Ohio%2C_trai...
Overly punitive handling of accidents does not lead to better safety-- it primarily leads to people playing the blame game, obfuscating and stonewalling investigations.
This is extremely likely to make the overall situation worse instead of better.
I also think punishment based on outcome is ethically extremely iffy. If you do sloppy work handling dangerous chemicals, your punishment should be for that, and completely independent of factors outside your control that lead to (or prevent) an actual accident.
If someone puts a shopping cart filled with lead-acid batteries on the train tracks causing a derailment and toxic chemicals spill all over the area, poisoning and endangering the people nearby, the person responsible should not go to prison?
Or if someone takes an action knowing that it could crash an airliner with hundreds of people aboard, they should not be imprisoned?
By that logic, if I beat you over the head with a tire iron, I should just walk away. Possibly paying an inconsequential fine?
What's that? The individuals involved in poisoning hundreds/thousands or killing hundreds of airline passengers or beating you to death should be prosecuted and made accountable for their actions?
If that's the case, why should folks who knowingly take steps that create the same results not be treated exactly the same way? Because they were "just following orders" from management? Because their only responsibility is to maximize shareholder value?
Having a limited liability corporation is a privilege, not a right. As such, knowingly risking the lives and/or environment of others, or calculating that paying fines/settlements for putting others at risk will cost less than operating safely, is behavior that should not be acceptable in a civilized society.
As I mentioned in another comment, businesses are strongly motivated by the incentives in their marketplace. If we make knowingly and/or negligently putting others at risk of harm both a death sentence for the corporation and criminal liability for those responsible (which includes management, the board and shareholders), we create the appropriate incentives for corporations to do the right thing.
As it stands now, willful, knowing negligence will usually only result in fines/lawsuits that are a pittance and not much of a drag on earnings. Those are not the right incentives.
If the risk is not excessive, my answer would be no. If the behavior is only realistically punishable when it actually results in an accident, then the answer would also be no.
I think that neither air travel nor chemical plants pose an excessively elevated risk to human lives right now, thus increasing punishments for infractions would be disproportionate, not very helpful, and potentially even detrimental for safety long-term.
Your analogy (beating someone with a tire iron) also clearly features intent; this is not typical for accidents and makes punishments less justifiable and much less useful.
If you actually want to make a strong case for increasing (shareholder) liability, it needs to be clear that those additional punishments and enforcement overhead would actually save lives, and that very critical point is absolutely not obvious to me right now.
The above was poorly worded. My apologies. Instead: if someone takes an action knowing that it could crash an airliner with hundreds of people aboard, and that airliner crashes, killing hundreds, they should not be imprisoned?
As for the rest, you are ignoring the modifiers "knowingly," "making the cost/benefit analysis that paying fines/settlements will cost less than operating safely," "knowingly and/or negligently putting others at risk of harm," and "willful, knowing negligence."
Why are you ignoring those qualifiers? Did you just miss them? Were they not placed prominently enough in my prose?
I'm not (and I explicitly said so) talking about accidents that are the result of bad luck or a relatively unforeseeable chain of events.
As I repeatedly said, I'm talking about willful, knowledgeable negligence and/or cutting corners knowing safety could be compromised and making conscious decisions to accept known risks of harm to people, property and/or the environment in pursuit of increased profit.
But you knew that, because I repeatedly said so. As such, why are you arguing with a strawman you set up rather than engaging with my actual statements?
A great example of what I'm talking about is the continued sale of HIV-contaminated blood products by multiple pharmaceutical companies[0]. They knew their products were contaminated with HIV and were already selling uncontaminated products, but knowingly sold the contaminated ones anyway, and thousands of hemophiliacs (including my brother-in-law) died slow, painful deaths from AIDS.
And (for the eighth or ninth time) that's the sort of thing I'm talking about.
Those pharmaceutical companies intentionally murdered thousands by knowingly selling contaminated blood products to people who needed those products to live, even though they had uncontaminated products in inventory. They just wanted to make more money, and in doing so infected ~80% of US hemophiliacs and many more around the world.
Perhaps you'll claim it was just "poor process" or "deficient training" or something else equally ridiculous. But it wasn't any of those things. The pharma companies admitted doing so.
How many executives went to prison, or the companies had their charters revoked? Zero. That's the problem. That's the kind of behavior that literally screams for prison time and the corporate death penalty.
Do I need to provide eight or nine more examples before you'll stop being deliberately obtuse? You're being an apologist for sociopathic, murderous scum. For shame!
[0] https://en.wikipedia.org/wiki/Contaminated_haemophilia_blood...
I did notice. Which is why my list included mass deaths and massive pollution/ecological destruction, for some of which we still don't know the eventual damage/death toll.
And that's the bigger issue: property crimes are considered more serious than mass murder and poisoning our world. Just as with the fraudsters, the corporate veil should have been pierced for the murderers and despoilers of our environment, with harsh prison sentences for those whose avarice and sociopathy allowed them to murder and despoil.
Civil liability is fine, and the "corporate death penalty" (revoking charters, barring directors/managers from future employment, etc.) should be invoked with extreme prejudice in those circumstances as well.
But we don't do that. Because corporations are, in the above circumstance, not "people", but a legal fiction protecting their owners from liability. But when it benefits the corporation and its owners/managers, a corporation is a "person."
I'd say we should work it the other way -- if a corporation is responsible for deaths and despoliation, all the owners should have a share in the punishment.
That way, a few thousand wealthy individual investors and the owners of a few dozen hedge funds/investment houses would be put in SuperMax for a decade or two for the misdeeds of the companies in which they've invested. And let's not make the boards of directors, C-suite and any others directly involved feel left out either. They can commiserate with their fellow scumbags in the prison yard.
That does sound pretty harsh doesn't it? Perhaps too harsh? I don't think so. Because as we're constantly reminded, business responds strongly to incentives.
And if businesses are strongly incentivized to not poison our citizens, kill airplane passengers and destroy our environment with the threat of long prison sentences and a stripping of their assets, I'd expect they'd respond to such incentives.
But, as it is now, when the incentives are to privatize profit and hold harmless those who kill us, make us sick and destroy our environment, those are the incentives to which corporations will respond.
I'm not really big on incarceration but I broadly agree.
I wasn't confused. I was on exactly the same page as you.
Your comment just prompted me to respond with my own thoughts.
It's all good.
>I'm not really big on incarceration but I broadly agree.
I'm not generally huge on it either (I think we over-incarcerate in the US), but as I mentioned, having strong incentives is important to guide corporate behavior. Besides, if an individual (and especially a poor one) caused a train derailment or dumped battery acid in the drinking water causing sickness or death, or sabotaged a plane so that it crashed, you bet your ass they'd be incarcerated.
Why shouldn't we have the same standards for corporations and the wealthy?
If you want to neutrally answer a rhetorical question in the context of a debate, you're going to have to disclaim that somehow. Otherwise, there's no way for us to know, and the comment walks and talks like an argument.
It wasn't a rebuttal of your comment so much as I saw it as an opportunity to show the double standard in play WRT the consequences of ripping off wealthy folks vs. destroying the environment and/or outright maiming and killing people.
I didn't downvote your post either. Although if I'd noted that you "Asked AI", I might well have done so. To be clear, that's not a jab at you personally. Rather, I come to HN to discuss stuff with the other users, not read LLM generated text. If that's what I wanted, I don't need to come here, do I?
Sadly there's more and more of that here, with many folks not even saying they used an LLM (I use that term because "AI" doesn't actually exist) to generate their comment. I appreciate that you did so. Thanks!
> - Elizabeth Holmes (Theranos): Began an 11-year prison sentence in 2023 after being convicted of defrauding investors with false claims about her blood-testing technology.
Many from your list went to jail because they robbed the rich, not the poor.
John Doe McDrugUser 2
John Doe McDrugUser 3
John Doe McDrugUser 4
John Doe McDrugUser 5
John Doe McDrugUser 6
John Doe McDrugUser 7
You should google "actor arrested for drugs." I think you may find the seven names you need. Or 700.
To actually get convicted of anything as a corporate officer, you have to have substantially defrauded your own shareholders, who are senior to the public's interest in justice. Most such crimes involve financial malfeasance.
1. Hit them with fines or punitive damages high enough to wipe out all their operating profit and executive pay for as many years as a person would be in prison.
2. Seize the company (receivership?), replace its executives, and make the new leaders sign off to not do that thing again. That's in addition to a huge fine.
3. Dissolve it. Liquidate its assets.
They usually just let the big companies off while throwing everything they have at many individuals who aren't corporations.
For settlement-type deals, maybe see if they'll give all the authors they ripped off free access to Claude models, too. The authors would reap some benefit from what was produced: access at cost, with a certain amount of free credits.
And please don't assume a "you wouldn't if it was your own employer" - no, I very much would, despite the struggles it would cause.
Give the government partial ownership. This dilutes the other owners and ties them to the government. This gives the government more 'oversight' power over the business, just like jail. Give the government an oversight seat on the board.
There are many ways you can put a business in jail, we're just told you can't because that would inconvenience the current business models of breaking the laws/rules/obligations to 'streamline' business and 'innovate'.
I don't get fined 7,000 USD for illegally downloading 3 books, for example; much less. Although if I'm a repeat offender it can escalate to prison, I think.
> Statutory penalties are found at 18 U.S.C. § 2319. A defendant, convicted for the first time of violating 17 U.S.C. § 506(a) by the unauthorized reproduction or distribution, during any 180-day period, of at least 10 copies or phonorecords, of 1 or more copyrighted works, with a retail value of more than $2,500 can be imprisoned for up to 5 years and fined up to $250,000, or both. 18 U.S.C. §§ 2319(b), 3571(b)(3).
If you broaden it to include DMCA violations you could spend a lot of time in jail. It's even worse in some other countries.
With a typical torrenter, it would be straightforward to make some truly monumental penalties.
The reality is, they rarely care.
Granted, the motivation was the copyright infringement, but to do what they did they needed to dress it up.
And this is why it is correct to say that he was persecuted for copyright infringement. Noting that he wasn't charged with anything related to copyright doesn't change the story, it only makes it less agreeable.
Can't help but feel the reporting about $3000/work is going to leave a lot of authors disappointed when they receive ~$2250 even if they'd have been perfectly happy if that was the number they initially saw.
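For the curious, a back-of-envelope in Python, assuming a roughly 25% contingency fee (the actual court-approved fee is an assumption here and may differ):

    gross_per_work = 3000.00        # reported settlement per work
    contingency_fee = 0.25          # assumed; the court sets the real figure
    net_per_work = gross_per_work * (1 - contingency_fee)
    print(net_per_work)             # 2250.0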
It may be fair to you but how about other authors? Maybe it's not fair at all to them.
I don't think $3k is likely a bad deal, but I still think you're oversimplifying things.
> the legal fees are almost certainly being paid on contingency and not out of pocket.
The legal fees for this lawsuit. Not the legal fees for anyone who went and talked to a lawyer suspecting their material was illegitimately used. You're treating the system as isolated when it is not.
> no opportunity cost or lost future income here because this is piracy not theft.
I think you are confused. Yes, it is piracy, but not like the typical piracy most of us do. There's no loss in pirating a movie if you would never have paid to see the movie in the first place. But there's a future cost here, as people will use LLMs to generate books, which is competition. The cost of generating such a book is much cheaper, allowing for a much cheaper product.
> They only lost the revenue from the sale of a single copy.
In your effort to simplify things you have only complicated them.
> You are not entitled to protection from future competition
What do you think patents, copyright, trademarks, and all this other stuff is even about? There's "Statutory Damages", which accounts for a wide range of things[0].
Not to mention you just completely ignored what I argued!
Seriously, you've been making a lot of very confident claims in this thread, and they are easy to verify as false. Just google some of your assumptions before you respond. Hell, ask an LLM and it'll tell you! Just don't make assumptions and do no vetting at all. It's okay to be wrong, but you're way off base, buddy.
"Future competition" is a loosely worded way of saying this.
> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.[0]
Please don't be disingenuous. You know that none of the authors were selling their books for $3k apiece, so obviously this is about something more.
> because of Anthropic's stupidity in not buying the books
And what about OpenAI, who did the same thing?
What about Meta, who did the same thing?
What about Google, who did the same thing?
What about Nvidia, who did the same thing?
Clearly something should be done because it's not like these companies can't afford the cost of the books. I mean Meta recently hired people giving out >$100m packages and bought a data company for $15bn. Do you think they can't afford to buy the books, videos, or even the porn? We're talking about trillion dollar companies.
It's been what, a year since Eric Schmidt said to steal everything and let the lawyers figure it out if you become successful?[1] Personally, I'm not a big fan of "the ends justify the means" arguments. They've led to a lot of unrest, theft, wars, and death.
Do you really not think it's possible to make useful products ethically?
[0] https://news.ycombinator.com/newsguidelines.html
[1] https://www.theverge.com/2024/8/14/24220658/google-eric-schm...
One of the consequences of retaining their rights is that they can also sue Meta and Google and OpenAI etc for the same thing.
If there's evidence of this that will stand up in court, they should be sued as well, and they'll presumably lose. If this hasn't happened, or isn't in the works, then I guess they covered their tracks well enough. That's unfortunate, but that's life.
> Clearly something should be done because it's not like these companies can't afford the cost of the books
Yes indeed it should, and it has. They have been forced to pay $3000 per book they pirated, which is more than 100x what they would have gained if they had gotten away with it.
IMO a fine of 100x the value of a copy of the pirated work is more than sufficient as a punishment for piracy. If you want to argue that the penalty should be more, you can do that, but it is completely missing my point. You are talking about what is fair punishment to the companies, and my comment was talking about what is fair compensation to the authors. Those are two completely different things.
Anti-piracy groups use scare letters on pirates where they threaten to sue for tens of thousands of dollars per instance of piracy. Why should it be lower for a company?
Yes. Nemotron:
https://www.nvidia.com/en-gb/ai-data-science/foundation-mode...
Torrenting:
Meta Pirating Books[1,2,3]
- [1] Fun fact, [1] is the most popular post of all time on HN for the search word "torrent" and the 5th ranking for "Meta". [2] is the 16th for "illegal"
Nvidia [4,5]
Apple, Nvidia, Anthropic[6]
GitHub [7,8]
OpenAI [9,10]
Google [11]
- I mean this one was even mentioned in the article from the Anthropic post a few days ago[12]
I hope that's sufficient. You can find plenty more if you do a good old-fashioned search instead of just using the HN search. But most of these were pretty high-profile stories, so it was pretty quick to look.
> which establishes president that training an LLM is fair use.
~~~~~~~~~
precedent
I think you misunderstand. The precedent is over the issue of piracy. No precedent has been set on the issue of fair use. There is ongoing litigation, and a precedent was set in another lawsuit with Meta[13], which is currently going through appeals. I'll give you a head start on that one [14,15]. But the issue of fair use is still being debated. These things take years, and I don't think anyone will be surprised when this stuff lands in some of the highest courts and gets revisited in a different administration.
> IMO a fine of 100x the value of a copy of the pirated work is more than sufficient as a punishment for piracy.
Sure. You can have whatever opinion you want. I wasn't arguing about your opinion. I even agreed with it[16]! But that is a different topic altogether. I still think you've vastly oversimplified the conversation and are thus unintentionally making some naive assumptions. It's the whole reason I said "probably" in [16]. The big difference is just that you're smart enough to figure out how law works, and I'm smart enough to know that neither of us is a lawyer.
And please don't ask me for more citations unless they are difficult to Google... I think I already set some kinda record here...
[0] https://archive.is/3oCg8
[1] https://news.ycombinator.com/item?id=42971446
[2] https://news.ycombinator.com/item?id=43125840
[3] https://news.ycombinator.com/item?id=42772771
[4] https://news.ycombinator.com/item?id=40505480
[5] https://news.ycombinator.com/item?id=41163032
[6] https://news.ycombinator.com/item?id=40987971
[7] https://news.ycombinator.com/item?id=33457063
[8] https://news.ycombinator.com/item?id=27724042
[9] https://news.ycombinator.com/item?id=42273817
[10] https://news.ycombinator.com/item?id=38781941
[11] https://news.ycombinator.com/item?id=11520633
[12] https://news.ycombinator.com/item?id=45142885
[13] https://perkinscoie.com/insights/update/court-sides-meta-fair-use-and-dmca-questions-leaves-door-open-future-challenges
[14] https://arstechnica.com/tech-policy/2025/07/meta-pirated-and-seeded-porn-for-years-to-train-ai-lawsuit-says/
[15] https://torrentfreak.com/copyright-lawsuit-accuses-meta-of-pirating-adult-films-for-ai-training/
[16] https://news.ycombinator.com/item?id=45190232
This is what generative AI essentially is.
Maybe the payment should be $500/h (say $5k a page) to cover the cost of preparing a human-verified dataset for Anthropic.
Thus the $3k per violation is still punitive at (conservatively) 100x the cost of the book.
Given that it is fair use, authors do not have rights to restrict training on their works under copyright law alone.
Don't get me wrong: I think this is an incredibly bad deal for authors. That said, I would be horrified if it wasn't treated as fair use. It would be incredibly destructive to society, since people would try to use such rulings to chisel away at fair use.

Imagine schools that had to pay yearly fees to use books. We know publishers would do that; they already try to (single-use workbooks, online value-added services). Or look at software. It is already going to be problematic for people who use LLMs, and it is already problematic due to patents. Now imagine what would happen if reformulating algorithms that you read in a book were not considered fair use.

Or look at books themselves. A huge chunk of non-fiction consists of doing research and re-expressing ideas in non-original terms. Is that fair use? The main difference between that and generative AI is that we can say a machine did it, but is that enough to protect fair use in the conventional sense?
I feel like we aren't far from that. Wouldn't be surprised if new books get published (in whatever medium) that are licensed out instead of sold.
…especially given the US “fair use” doctrine takes into account the effect that a particular use might have on the market for similar works, so the authors are bound to argue that the existence of AI that can reproduce fanfiction-like facsimiles of works at scale is going to poison the well and reduce the market for people spending actual money on future works (whether or not that’s true is another question).
So in my view the court is going to say that buying a book doesn’t give them the right to train on the contents because that is mechanical reproduction which is explicitly disallowed by the copyright notice and they don’t fall under the “fair use” carveout because they affect the future market. There isn’t anywhere else where they were granted the right to use the authors’ works so the work is disallowed. Obviously no court finding is ever 100% guaranteed but that really seems the only logically-consistent conclusion they could come to.
I thought that they didn't use this data for training, the "crime" here was making the copies.
>I think that's fair, considering that two of those books received advances under $20K and never earned out.
I don't understand your logic here. If they never earned out, that means you were already "overpaid" compared to what they were worth in the market. Shouldn't fairness mean this extra bonus goes first to cover the unmet earnout?
1. Getting the maximum statutory damages for copyright infringement, which would be something like $250,000 per instance of infringement; you can be generous and call their training and reproduction of your works a single instance, though it's probably many more than that.
2. An admission of wrongdoing plus withdrawal from the market and permanent deletion of all models trained on infringed works.
3. A perpetual agreement to only train new models on content licensed for such training going forward, with safeguards to prevent wholesale reproduction of works.
It’s no less than what they would do if they thought you were infringing their copyrights. It’s only fair that they be subject to the same kind of serious penalties, instead of something they can write off as a slap on the wrist.
Publishers get exclusive print publishing rights for a given market, typically get digital and audio publication rights for the same, and frequently get a handful of other rights like the ability to license it for publication in other markets. But ownership of the work is almost always retained by the author.
Perhaps tokenize all of the books and assign proportionally for token count of each publication.
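Something like this sketch, where the titles and token counts are entirely hypothetical and would in practice come from running each book through the tokenizer:

    # Pro-rata allocation of a settlement pool by token count.
    # All figures and titles below are made up for illustration.
    def allocate(pool, token_counts):
        total = sum(token_counts.values())
        return {title: pool * n / total for title, n in token_counts.items()}

    shares = allocate(1_500_000_000, {"Book A": 120_000, "Book B": 80_000})
    print(shares)  # {'Book A': 900000000.0, 'Book B': 600000000.0}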
If their works are so insignificant to Claude, just remove them from Claude and retrain. It's that simple. Claude is 100% the product of the works that were stolen, just as much as it is the product of the software developers and investors.
All the employees just got a million dollar bonus. The authors spent their entire lives up to the point of publishing each work refining their ideas, investing in education and experiences that allowed them to create their works that were stolen. If the result of that theft is a massive wealth-building AI machine, it makes sense that they should benefit from the ongoing use of their work.
Anthropic had a chance at licensing these works the standard way, and chose to skip it. Do you think the consequences of this should be that under court order they receive money that is likely the same or less than they would have negotiated had Anthropic not stolen from them?
Why would anyone hesitate to steal other publications if their only punishment is that, on the off-chance they get caught, they have to pay what they would have anyway? That's the ridiculous assertion.
> 100-1000x the value of their books
They would be getting 5% of the value of their books. Claude without those books is nothing. Take all the books away from Claude and you can understand the true value they contributed.
Doesn't that mean the money should go to your publisher instead of you?
It remains to be seen, but typically this forms a moat. Other companies can't bring together the investment resources to duplicate the effort and they die.
The only reasons why this wouldn't be a moat:
1. Too many investment dollars and companies chasing the same goal, and none of them consolidate. (Non-consolidation feels impractical.)
2. Open source / commoditize-my-complement offerings that devalue foundation models. We have a few of these, but the best still require H100s and they're not building product.
I think there's a moat. I think Anthropic is well positioned to capitalize from this.
It is just another opinion.
It is not about $9k for your knowledge in that book. It's $9k for taking you out. The faster they can grab and process data, the less chance you have to make money from your work.
The money is irrelevant if we allow them to break the law. They might even pay you $9k for those books, but you might never get anything again, because they would have made copyright useless.
Where can I check if I'm eligible?
Well this is true
Infringement was supposed to imply substantial similarity. Now it is supposed to mean statistical similarity?
The suit isn't about Anthropic training its models using copyrighted materials. Courts have generally found that to be legal.
The suit is about Anthropic procuring those materials from a pirated dataset.
The infringement, in other words, happened at the time of procurement, not at the time of training.
If it had procured them from a legitimate source (e.g. licensed them from publishers) then the suit wouldn't be happening.
The portion the court said was bad was not Anthropic getting books from pirated sites to train its model. The court opined that training the model was fair use and did not distinguish between getting the books from pirated sites or hard copy scans. The part the court said was bad, which was settled, was Anthropic getting books from a pirate site to store in a general purpose library.
--
"To summarize the analysis that now follows, the use of the books at issue to train Claude
and its precursors was exceedingly transformative and was a fair use under Section 107 of the
Copyright Act. And, the digitization of the books purchased in print form by Anthropic was.
also a fair use but not for the same reason as applies to the training copies. Instead, it was a
fair use because all Anthropic did was replace the print copies it had purchased for its central
library with more convenient space-saving and searchable digital copies for its central
library — without adding new copies, creating new works, or redistributing existing copies.
However, Anthropic had no entitlement to use pirated copies for its central library. Creating a
permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy."
"Because the legal issues differ between the *library copies* Anthropic purchased and
pirated, this order takes them in turn."
--

Questions:
As an author, do you think it matters where the book was copied from? Presumably, a copyright gives the author the right to control when a text is reproduced and distributed. If the AI company buys a book and scans it, they are reproducing the book without a license, correct? And fair use is the argument that even though they violated the copyright, they are excused. In a pure sense, if the AI company copied from a "pirate source" (assuming they didn't torrent back the book), why is that copy worse than if they copied from a hard copy?
Isn't digitizing your own copies for backups and personal use fine, so long as you don't give away the original while keeping the backups? Similarly, don't give away the digital copies.
No? I think there are a lot more details that need to be known before answering this question. It matters what they do with it after they scan it.
Yes
> it means that yes what I did was technically a violation but is forgiven
Not at all. All "affirmative defence" means is that procedurally the burden is on me to establish that I was not violating the law. The law isn't "you can't do the thing"; rather it is "you can't do the thing unless it's like this". There is no violation, and there is no forgiveness, as there is nothing to forgive: it was done "like this", and doing it "like this" doesn't violate the law in the first place.
https://www.documentcloud.org/documents/26084996-proposed-an...
> reproducing purchased and scanned books to train AI constituted fair use
Library Genesis has one copy. It then sends you one copy and keeps its own. The entity that violated the _copy_right is the one that copied it, not the one with the copy.
Of course, American law is different. But is it the case that copies made for the purpose of using illegally obtained works are not infringing?
Well, the question here is "who made the copy?"
If you advertise in seedy locations that you will send Xeroxed copies of books by mail order, and I order one, and you then send me the copy I ordered, how many of us have committed a copyright violation?
> "Who made the copy?"
This begs the question. With digital media everybody involved makes multiple copies.
There were no issues with the physical copies of books they purchased and scanned.
I believe the issue of USING these texts for AI training is a separate issue/case(s).
The entire point of deep learning is to copy aspects from training materials, which is why it’s unsurprising when you can reproduce substantial material from a copyrighted work given the right prompts. Proving damages for individual works in court is more expensive than the payout but that’s what class action lawsuits are for.
Given that books can be imitated by humans with no compensation, this isn't as strong an argument as you think. Moreover, AFAIK the training itself has been ruled legal, so Anthropic could have theoretically bought the book for $20 (or whatever) and been in the clear, which would obviously bring less revenue than the $9k settlement.
So you're agreeing with me? The courts have been pretty clear on what's copyrightable. Copyrights only protect specific expressions of an idea. You can copyright your specific writing of a recipe, but not the concept of the dish or the abstract instructions itself.
(They can still sue for damages, but they can't claim copyright over your game itself.)
Or you could sue him on a theory of unjust enrichment, in which case, if he lost, he'd owe you nothing, and if he won, he'd owe you all of his winnings.
It's not clear to me why the same theory wouldn't be available to Adobe, though the copyright question wouldn't be the main thrust of the case then.
Thus, isn't the settlement essentially Anthropic admitting that they don't really have an effective defense against the piracy claim?
The authors can still sue for damages though (and did, and had a strong enough case Anthropic is trying to settle for over a billion dollars).
But otherwise, you're essentially asking if you can somehow bypass license agreements by simply refusing to read them, which would obviously render all licensing useless.
In the event that you try to play games to get around that acknowledgement: Courts aren't machines, they can tell that you're acting in bad faith to avoid license restrictions and can punish you appropriately.
> Most paid software generally makes you acknowledge that you have read and accepted the terms of the license before first use, and includes a clause that continued use of the software constitutes acceptance of the license terms.
Huh. If only I'd known that.
Why do you think that is?
Why did you think it made sense to respond to the question "Why do you think X is true?" with "Did you know that X is true?"?
But copyright was based on substantial similarity, not causal links. That is the subtle change. Copyright is expanding more and more.
In my view, unless there is substantially similarity to the infringed work, copyright should not be invoked.
Even the substantial similarity concept is already an expanded concept from original "protected expression".
It makes no sense to attack gen-AI for infringement: if we wanted the originals, we would get the originals; you can copy anything you like on the web. Generating bootleg Harry Potter is slow, expensive, and unfaithful to the original. We use gen-AI for creating things different from the training data.
Copyright isn't supposed to apply if you happen to write a story that bears an uncanny similarity to a story you never read, written in 1952 in a language you don't know, that sold 54 copies.
And in general, when an LLM is able to recreate text that's a training error. Recreating text is not the purpose. Which is not to excuse it happening, but the distinction matters.
Real-world absurd example: A company hires a bunch of workers. It gives them access to millions of books and has the workers reading the books all day. The workers copy the books word by word, but after each word try to guess the next word that will appear. Eventually, they collectively become quite good at guessing the next word given a prompt text, even reproducing large swaths of text almost verbatim. The company's owner claims they owe nothing to the book owners, because it doesn't count as reading the book, and any reproduction is "coincidental" (even though this is the explicit task of the readers). They then use these workers to produce works that compete with the authors of the books, whom they never paid.
It seems many people feel this is "fair use" when it happens on a computer, but would call it "stealing" if I pirated all the books of JK Rowling to train myself to be a better mimicker of her style. If you feel this is still fair use, then you should agree all books should be free to everyone (as well as art, code, music, and any other training material).
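For what it's worth, the "guess the next word" task in the analogy really is the training objective, just scaled up enormously. A toy illustration in Python (a bigram counter, nothing like a real transformer, but the same objective):

    from collections import Counter, defaultdict

    text = "the cat sat on the mat the cat ran".split()
    model = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        model[prev][nxt] += 1           # tally what follows each word

    def guess_next(word):
        return model[word].most_common(1)[0][0]

    print(guess_next("the"))            # 'cat', the most frequent continuation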
Can you provide an example of someone being successfully sued for "mimicking style", presumably in the US judicial system?
Music has had this happen numerous times in the US. The distinction isn’t an exact replica, it’s if it could be confused for the same style.
George Harrison lost a case for one of his songs. There are many others.
https://ultimateclassicrock.com/george-harrison-my-sweet-lor...
I won't rehash the many arguments as to why the output is also a violation, but my point was more the absurd view that stealing and using all the data in the world isn't a problem because the output is a lossy encoding (but the explicit training objective is to reproduce the training text / image).
However, AI has been shown to copy a lot more than what people consider style.
That's called extreme overfitting. Proper training is supposed to give subtle nudges toward matching each source of text, and zillions of nudges slowly bring the whole thing into shape based on overall statistics and not any particular sources. (But that does require properly removing duplicate sources of very popular text which seems to be an unsolved problem.)
So your analogy is far enough off that I can't give it a good reply.
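(Exact duplicates are the easy part; a minimal sketch is below. The genuinely hard, arguably unsolved part is near-duplicates of very popular text: quotes, excerpts, reprints.)

    import hashlib

    # Minimal exact-dedup by content hash. Real pipelines also need
    # near-duplicate detection (e.g. MinHash/LSH), which is much harder.
    def dedup(docs):
        seen, unique = set(), []
        for doc in docs:
            h = hashlib.sha256(doc.encode("utf-8")).hexdigest()
            if h not in seen:
                seen.add(h)
                unique.append(doc)
        return unique

    print(len(dedup(["a tale", "a tale", "another"])))  # 2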
> It seems many people feel this is "fair use" when it happens on a computer, but would call it "stealing" if I pirated all the books of JK Rowling to train myself to be a better mimicker of her style.
I haven't seen anyone defend the piracy, and the piracy is what this settlement is about.
People are defending the training itself.
And I don't think anyone would seriously say the AI version is fair use but the human version isn't. You really think "many people" feel that way?
To generate working code the output must follow the API exactly. Nothing separates code and natural language as far as the underlying algorithm is concerned.
Companies slightly randomize output to minimize the likelihood of direct reproduction of source material, but that’s independent of what the neural network is doing.
And it's not really about randomizing output. The model gives you a list of likely words, often with no clear winner. You have to pick one somehow. It's not like it's taking some kind of "real" output and obfuscating it.
It's very rare for multiple outputs to actually be tied, such that picking one at random is the only choice. Instead, it's become accepted practice to make suboptimal choices for a few reasons, one of which really is to decrease the likelihood of reproducing existing text.
Nobody wants a headline like: “Meta's Llama 3.1 can recall 42 percent of the first Harry Potter book” https://www.understandingai.org/p/metas-llama-31-can-recall-...
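Concretely, those "suboptimal choices" are usually temperature (plus top-k/top-p) sampling over the model's output distribution. A minimal sketch with a toy distribution (the tokens and logit values here are invented for illustration):

    import math, random

    logits = {"Potter": 3.0, "Granger": 1.5, "Weasley": 1.0}  # toy values

    def sample(logits, temperature=0.8):
        # Higher temperature flattens the distribution, so the top
        # token wins less often than plain argmax would make it.
        scaled = {t: v / temperature for t, v in logits.items()}
        z = sum(math.exp(v) for v in scaled.values())
        probs = {t: math.exp(v) / z for t, v in scaled.items()}
        return random.choices(list(probs), weights=list(probs.values()))[0]

    print(sample(logits))  # usually 'Potter', but not always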
While I'm sure it feels good and validating to have this called copyright infringement, and be compensated, it's a mixed blessing at best. Remember, this also means that your works will owe compensation to anyone you "trained" off of. Once we accept that simply "learning from previous copyrighted works to make new ones" is "infringement", then the onus is on you to establish a clean creation chain, because you'll be vulnerable to the exact same argument, and you will owe compensation to anyone whose work you looked at in learning your craft.
This point was made earlier in this blog post:
https://blog.giovanh.com/blog/2025/04/03/why-training-ai-can...
HN discussion of the post: https://news.ycombinator.com/item?id=43663941
Now don't get me wrong, I'm not saying that a rushed regulatory response is a good thing; it's more about the delivery of your reply. I see those arguments a lot: people smugly saying "Well, YOU too learn from things, how about that? Not so different from the machine, huh?" and then continuing the discussion based on that premise, as if we were supposed to accept it as a fact.
Only when big-corp critics needed another pretense to support a conclusion they long agreed with for other reasons did they decide that the right to learn from your exposure to a copyrighted work was infringement.
If you're interested in the similarities and genuinely curious, you could look at the article linked above, which shows how both LLMs and humans store a high-level understanding of their training set. It's a way deeper parallel than "it's all learning" -- but you have to be willing to engage with the other side rather than just strawman it.
It literally is not. If the defense for your copyright infringement is "my machine is actually the same as a human" then it's your obligation to substantiate that argument.
If human beings were working for Anthropic, training, and then being contracted out by Anthropic, the exact same rules would apply as historically. Anthropic is not being unfairly treated.
Small detail.
Either way, you seem to be asking for a genuine conversation so I'll take this a bit more seriously.
I hope you will forgive me for not engaging with the entire article top-to-bottom right now. For time's sake, I've queried ChatGPT to extract quotes from the article related to your main point (the similarity between human thinking and LLM predictions) so that I can ctrl+F them. This is a well written and organized article, so I believe that even looking at disconnected sections should give me a clear view of the argument that's being made.
---
From the section: “Understanding” in tools.
The author rightfully spends some time disambiguating his use of "understanding", and comparing traditional scripting to LLMs, drawing a parallel with human processes like mathematics and intuition (respectively). The section ends with this quote:
> That’s why I think that the process of training really is, both mechanically and philosophically, more like human learning than anything else
Which is the meat of the argument you're making to say that both should be evaluated in the court.
I can easily tell when reading this that the author is skillful and genuine in his concerns over IP and its misuses, and I really respect that. The issue I raise with the whole argument, however, did not change from my initial response. While he does have a good understanding of how LLMs operate...
↑↑↑↑↑↑ At this point in writing my reply, I then planned to call into questions his credentials in the other main expertise, namely brain science, as I most often saw this argument come from tech people and less so brain scientists. What I found instead was not the ultimate own I hoped for, but rather a mixed bag of things that really are similar[1] and other articles that expressed some big differences in other aspects[2]. As such, I cannot in good faith say that your argument is unsubstantiated, only that brain science (an expertise in which I have absolutely no authority) is still torn on the subject.
---
That doesn't mean my first reply is suddenly null and void. If I can't prove or disprove "LLMs and humans think alike", I can still discuss the other conclusion you (and the article) draw from it: "...and thus, they should and will be treated equally in the eyes of the law". This brings in yet another field of expertise (law) that I am woefully unqualified to talk about, but I need to ask: why would that be the "only natural" conclusion? I'll refer to your other reply:
> laws can be arbitrary, and ignore constraints like consistency! It’s just something sane people try to avoid.
You look at the inconsistencies in law like they're a design flaw, but are they really? Law is meant to accommodate humans, and society is a system with more edge cases than I can possibly imagine.
In the very next section of the article, called "Training is not copying", he calls out inaccurate uses of the words "reproduction" and "storing"; he also cites another article, which I'll quote:
> The complaint against Stable Diffusion characterizes this as “compressing” (and thus storing) the training images, but that’s just wrong. With few exceptions, there is no way to recreate the images used in the model based on the facts about them that are stored. Even the tiniest image file contains many thousands of bytes; most will include millions. Mathematically speaking, Stable Diffusion cannot be storing copies …
This reads to me like arguing semantics, yes, most artists yelling out in outrage do not know the ins and outs of training a diffusion model. But I don't think this completely annihilates or addresses their concerns.
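The arithmetic behind that quote is worth making explicit. Using rough public figures (a Stable Diffusion v1 checkpoint is a few GB; its LAION training subset is on the order of 2 billion images; both numbers are approximate):

    # Back-of-envelope: bytes of model weights per training image.
    model_bytes = 4e9          # ~4 GB checkpoint, approximate
    training_images = 2e9      # ~2 billion images, approximate
    print(model_bytes / training_images)   # ~2.0 bytes per image

Two-ish bytes per image can't be a copy of anything, which is the quoted article's point, though it doesn't address targeted fine-tuning.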
When people say "it's not technically reproduction", that doesn't change the fact that today LoRAs exist to closely imitate an art style with much less training resources. And in the case of LoRAs, it's not "a vast, super-diluted training set"; it's super-fine-tuning on top of an existing model, with an additional (but much smaller) batch of training data directly taken from (and laser-focused on) a specific person.
Now, do I know what would happen if [patreon-having-guy] tried to take someone to court for making a LoRA specifically targeting him? I do not. I haven't checked for legal precedent on this, but when there is one, it will be a decision made by humans in a court. (As for what would immediately happen: he would Streisand his way into 27 other dudes doing the same thing out of spite.)
I got a bit sidetracked, but all of that is to say, law is by people for people. In the end, there's nothing that tells us whether or not "LLMs and Humans think the same" will directly translate to "...so LLMs shall be treated like humans in the court".
The LLM can't go to prison, it can't make money to pay for damages, having an ironclad rule like that would just make things less convenient. Code is not law, and (thankfully) law is not code. I feel like some people (I'm not saying YOU did it) advocating for "treating LLMs as humans" do so as a means to further alleviate corporate responsibility for anything.
All in all, I don't just question the parallel, I question "why" the parallel, "why" in this discussion. For the author of your article, I can easily see that he IS genuine in his concerns about IP and the consequences of a knee-jerk regulatory response to it.
Your initial reply in the context of this thread on the other hand? Correct me if I'm wrong, but it reads like a taunt. It reads like "we'll see who gets the last laugh", so forgive me if I assumed that wrongly, because this was the reason my first reply was the way it was.
---
One last thing that I have to get out of my system (a tiny rant if you will). I feel like there is an attitude problem in tech regarding... quite literally any other craft. I suppose it exists in other fields, but this is where I see it the most, because I'm also in this field.
The topic we've been discussing is an intersection of tech, brain science, and law. A single one of those fields is already very hard; you could dedicate your life to it and still learn more. Yet when it comes to the "LLM = humans" debate, it seems like everyone suddenly has all the qualifications required. Never mind that people dedicating their lives to brain science are still saying "we don't fully get it"; never mind that people who spend their lives in law have yet to experience and set precedent for the whole shift that's going to happen. Tech people talk as if tech is the only thing needed to make the world turn.
Generative tech has exacerbated (or expanded) this attitude to even more fields. I don't think it's any surprise that there is such animosity between tech and creative people when the guy spearheading generative music says "people don't enjoy making music", when all the communication around it is "adapt or die", "we figured it out", "we SOLVED art", "you will be left behind", followed by calling anyone who disagrees a luddite.
The reason I replied is not because I want IP laws to tighten, nor because I genuinely believe we could "get rid of AI" (btw AI is a blanket term that makes things worse for everyone discussing it), you were just unlucky enough to be the n-th person to bring up that argument I've seen many times before on a night where I had some free time.
So thanks for giving me the occasion to write that down. I do not think this threads warrants either of us to show too much hostility, but as you said, the whole conversation about current-day genAI touches on so much more than just genAI, it's very easy to find something about it that annoys someone on either side.
[1] https://www.brown.edu/news/2025-09-04/ai-human-learning
[2] https://www.ox.ac.uk/news/2024-01-03-new-research-shows-way-...
It has never been a part of copyright to include the right to be influenced by the copyrighted work. Period. It's been the diametric opposite. Copyright has always existed, and been justified, as a way to get good, new works out into the public, so that later works can be influenced by them. The fact that one work was influenced by another has never, by itself, been a reason to consider it infringement. Not until 2022, when AIs actually got good at it.
When you argue for AI training as copyright infringement, you're saying that "the fact that your work was influenced by previous works means you owe license fees". This is wholly without precedent[1], and was widely rejected until the moment some activists realized it could be a legal tool against Bad People. It's a Pandora's Box no one really wants (except perhaps very large media companies, who will be able to secure general "learning licenses" for mass libraries of works). That was the point emphasized in my original comment: if Anthropic is infringing because they base new works on old ones, so are you. You too owe licensing fees for every work you observed that fed into how you create. If that feels like an expansion of what copyright is supposed to cover ... that's the point.
For every single work of literature, you can go back and say "aha, this is clearly influenced by X, Y, Z". Precisely zero people were going out insisting that the author therefore owed fees to all those other creators, because the idea is absurd. Or was, until 2022, when some people needed a pretense for a conclusion they long supported for unrelated reasons ("Facebook must suffer"). So I think my second paragraph is justified.
"If you read 100 horror novels and write a new one based on everything you've noticed in them, you don't owe the authors jack squat. But if you have a machine help you compile the insights, suddenly, you've infringed the authors' rights." Yeah, you do need to justify that.
[1] I agree there are going to be cases where it, say, was captured too closely, but not for the general case, and it's further weakened when it's "imitating" a thousand styles at once.
[0] not because we're so amazingly more creative. But because copyright is a legal invention, not something derived from first principles, and has been defined to only apply to human creations. It could be changed to apply to LLM output in the future.
It's not, because LLMs are not making new copyrightable works.
To make a copyrightable work you must put some creative act into it. Just copying someone else's work does not enable you to claim copyright. But LLMs cannot put creative work into their output, because only humans are capable of copyrightable creation; all they can do is copy, which is infringing.
The fact AI proponents can't see that is insane. Reminds me of the quote:
"It is difficult to get a man to understand something, when his salary depends upon his not understanding it."
Name should sound familiar to those who follow tech law; he presided over Oracle v Google, along with Anthony Levandowski's criminal case for stealing Waymo tech for Uber.
His orders and opinions are, imo, a success story of the US judicial system. I think this is true even if you disagree with them
They tried to say `rangeCheck(length, start, end)` was novel. He spat back that he'd written equivalent utility functions as a hobbyist hundreds of times!
The Supreme Court decision in Oracle v Google skipped over copyrightability and addressed fair use. Fair use is a legal defense, applying only in response to finding infringement, which can only be found if material's copyrightable. So the way the Supreme Court made its decision was weird, but it wasn't about the creativity requirement.
I do wonder if all of the kinks will be smoothed out in time. I'm not a lawyer either, but the timeline to create the longer list is a bit tight, and it generally feels like we could see an actual rejection, or at least a stretched-out process here that goes on for a few more months before approval.
Edit: My stance on information freedom and copyright hasn't changed since Aaron Swartz's death in 2013. Intellectual property laws, patents, copyright, and similar protections feel outdated and serve mainly to protect established interests. Despite widespread piracy making virtually all media available immediately upon release, content creators and media companies continue to grow and profit. Why should publishers rely on century-old laws to restrict access?
Moreover, IP law protects plenty of people who aren’t “established interests”. You just, perhaps, don’t know them.
However in most cases that money ultimately comes from being able to sell proprietary software and software-enhanced services. Many employers wouldn't pay for free software, if it wasn't helping their closed-source tech.
If bigger companies can enforce their "right"s as the owner of intellectual property, the smaller ones and individuals should be able to do so as well.
I discussed this rather recently on HN. The copyright terms are too long. They need to get shorter, to around 20-30 years for actual creative work.
I think software needs its own category of intellectual property. It should enjoy at most 10 years. Software is quite akin to machinery and mechanical designs can only get 20 years of patent protection. Considering the fast growing and changing nature, software should get even shorter IP protection. Similar to every other sector, trade secrets can continue to exist and employers can negotiate deals with software engineers for trade secret protection.
The number of bizarre, contradictory inferences this settlement asks you to make - no matter your stance on the wider question - is wild.
The existing rulings in the case establish only "persuasive" precedent (i.e. future cases are entirely free to disagree and rule to the contrary) - notably including the part about training on legally acquired copies of books (e.g. from a book store) being fair use.
Only appeals courts establish binding precedent in the US (and only for the courts under them). A result of this case settling is that it won't be appealed, and thus won't establish any binding precedent one way or another.
> The number of bizarre, contradictory inferences this settlement asks you to make - no matter your stance on the wider question - is wild.
What contradictions do you see? I don't see any.
> What contradictions do you see? I don't see any.
I guess us seeing very different things is also what a settlement might be for :-).
But I think I was wrong.
I think others in the thread are debating the contradictions I saw. I tried typing them out when I made my earlier comment, but couldn't get them to fit any logic that made sense to me. They just seemed contradictory at the time.
I think the same arguments have now been made much more clearly by others - specifically around whether a corporation downloading this work is the same as a human downloading it - and the responses have been very clear also.
The settlement figure was tied implicitly to Anthropic's valuation in the Ars article [0] where I think I originally posted my comment. Those comments were moved here, so I've linked below.
Specifically linking the settlement sum to the valuation of a corporation is what caught me in a loop - that valuation assumes that Anthropic will do certain things in the future. I was thinking too much, maybe, about things like:
"Would a teenager get the same treatment? What about a teenager with a private company? What about a teenager who seemed dumber than that teenager to the person deciding their company's valuation? What about a teenager who had not opened the files themselves, but had spun up a model from them? What about a teenager who had done both?"
Etc. I think I was getting fixated on the idea that the valuation assumes future performance, and downloading the files was possibly necessary for that performance, but I was missing the obvious answers to some of my questions because of that.
I do think that some of the more anthropomorphising language - "training data" is an example - trips people up a lot in the same way. And I think that if the settlement sum reflects anything to do with the valuation of that corporation, that does create some interesting questions, but maybe not contradictions.
[0] https://arstechnica.com/tech-policy/2025/09/judge-anthropics...
Sometimes these companies specifically seek out a settlement to avoid setting a legal precedent in case they feel like they will lose.
This settlement was the "AI-friendly" thing.
I believe you're probably only looking at the current state of the world and seeing how it "stifles competition" or "hampers innovation". Those allegations are probably true to some extent, especially in specific cases, but they miss the fact that without those protections the tech likely wouldn't have been created in the first place (so you still wouldn't be able to freely use the idea, since the person who invented it wouldn't have).
This is a kind of strange example, since the discovery tends to come from government-funded research, while the safety is demonstrated with private money.
The USSR went to space without those protections. It's not like property protections are the only thing that has ever driven invention.
MIT licenses are also pretty popular as are creative commons licenses.
People also do things that don't make a lot of money, like teaching elementary school. It costs a ton of money to build and run all those schools, yet no intellectual property is created that can be sold or rented out.
I don't believe that nobody would want to build much of what we have now if there weren't IP around it. Making and inventing things is fun.
People write fanfiction without being paid; Avatar 2, however, cost hundreds of millions to produce [1]. The studio didn't spend that money for the heck of it - they spent it hoping to recoup their investment.
If no one can make money off of intellectual property, people will continue writing fanfiction. But why would a studio spend hundreds of millions making a blockbuster movie?
[1] https://variety.com/2022/film/news/avatar-2-budget-expensive...
I wonder if the world would be a better place if we had fewer financial incentives to do things, in general?
> But why would a studio spend hundreds of millions making a blockbuster movie?
Under this hypothetical scenario, I believe there wouldn't be a "studio" in the first place. There could be a group of people who want to express themselves, get famous or do something just for fun, without any direct financial gain. Sure, they wouldn't be able to pull off Avatar 2, but our expectations as consumers would also be different.
The opposite idea is intrinsic motivation: that artists make art because they love it, and would have made the art (or come up with the ideas) anyway, even if you didn't pay them. But artists also love having comfortable lifestyles, maybe families, maybe expensive studio equipment, maybe parties. And although you can't force them to care about your project, you can certainly bribe them into seeing if they're interested. So you can bring out the ideas they were supposedly going to have anyway - but might not have been able to have without funding - and you can steer the emphasis of their pre-existing interests.
Which is to say that creativity and money interact in a weird way, where ideas don't have a cost, but creative focus does.
This sounds trivially true but I have some trouble reconciling it with reality. For example the Llama models probably cost more than this to develop but are made freely available on GitHub. So while it’s true that some things won’t be built, I think it’s also the case that many things would still be built.
As a society we’re having trouble defining abstract components of the self (consciousness, intelligence, identity) as is. What makes the legislative notion of an idea and its reification (what’s actually protected under copyright laws) secure from this same scrutiny? Then patent rights. And what do you think may happen if the viability of said economy comes into question afterwards?
Authors could potentially get a couple months of sales by working with manufacturers themselves and being the first to sell their books. But as soon as untrusted parties can get their hands on a book, someone will start selling their own copies of it.
But if it were legal to distribute copies, these websites wouldn't need to operate in the shadows, constantly switching domain names to evade law enforcement. They could become as easy to use as Steam, except that instead of paying the creators of the games they could just keep 100% of the revenue for themselves.
There would be an explosion in what we would call "piracy" today, but what would just be called downloading games if copyright were scrapped, because the barrier to entry for doing so could be made so much lower.
I am not a fan of intellectual property and copyright enforcement (at least the weaponisation of them). But scrapping IP and copyright entirely would be disastrous. I prefer the idea of reducing the amount of time someone can hold IP/copyright for, or additional punishment for patent trolls, or other measures to alleviate the concerns of IP/copyright without destroying R&D and digital work.
The only problem the judge found here was training on pirated texts.
Since the violation is detected via model output, it doesn't matter what the input method is.
I think the Aereo case, and Scalia's dissent, are super relevant here. It's when the court decided to go with vibes, instead of facts. The inevitable result of that (which Scalia didn't predict) was selective enforcement.
edit: so what I really mean is that I bet you could get a court to say whatever you wanted about it if you were far wealthier and more influential than your opponents.
No it wouldn't. Making the machine is not making a copy of the book. Using the machine to make a copy of the book would be infringement because... you would be making a copy of the book.
> Judge William Alsup at the hearing said the motion to approve the deal was denied without prejudice, but in a minute order after the hearing said approval is postponed pending submission of further clarifying information.
> Alsup said class members “get the shaft” in many class actions once the monetary relief is established and attorneys stop caring. He told the parties that “very good notice” must be given to class members to ensure they have the opportunity to opt in or out, and protect Anthropic from potential claimants coming out of the woodwork later.
Essentially he has concerns about missing details in two directions:
1. How class members are going to be notified, submit claims, and get paid out; what works are even included; and the involvement of an army of lawyers who shouldn't be paid out of the settlement.
2. How this deal is going to prevent Anthropic from being sued later over claims that should have been covered.
They are settling because losing would cost them their entire business.
Anthropic knows it would lose if the case went to trial.
> Alsup gave the parties a Sept. 15 deadline to submit a final list of works, which currently stands around 465,000.
> That's a far cry from the 7 million works that he initially certified as covered in the class. A breakdown from the Authors Guild—which consulted on the case and is part of a Working Group helping to allocate claims of $3,000 per work to authors and publishers—explained that "after accounting for the many duplicates," foreign editions, unregistered works, and books missing class criteria, "only approximately 500,000 titles meet the definition required to be part of the class."
https://arstechnica.com/tech-policy/2025/09/judge-anthropics...
But sure, I bet us randos on HN have a better feel for this than Anthropic's legal team.
(I'm not saying selling LLM access means selling copies of the book -- but then I'm also not not saying it.)
AFAIK it's even worse, as this settlement is only about downloading pirated copies of the books. IIRC the training itself was deemed fair use.
But it means that this case is getting them thousands of dollars instead of one or two purchases, which is a pretty good outcome.
I'm wondering how much the book would actually lose. What would the difference in sales be?
Same with code: AI hoovering up all the code doesn't mean people won't use libCurl, but it does mean jobs are disappearing, and the people who would write the next libCurl may not be around.
You're the second person I've talked to in this thread who thinks "The law is not applied with perfect consistency in practice!? Dear god why is it not like computer programs?".
It's just my personal take, but I don't think extreme rigidity and consistency in a "code is law" fashion would ever be desirable. Look over to the crypto world to see how DAOs are faring.