It makes research harder too, since more and more public information is infected by AI content. Both published posts and internet discussions are tainted.
And then the AI companies threaten to crash the whole economy if we don't let them do it.
Wouldn't that be a reason to call it not transformative use but simple copyright theft?
No, a reduction in traffic is not sufficient to conclude that a copyright violation has occurred. Sure, it might have. Alternatively it might have produced a lossy summary in which case the reduction in traffic raises some difficult questions about the value of the original work.
In other cases an LLM can synthesize a genuinely useful explanation of a subject that is precisely tailored to the needs of the asker. In those cases the machine output might well prove more useful to the asker than any single original reference would have.
For something like news where what you're paying for is timely delivery it makes sense to restrict automated (not just LLM) access for the first few days because a similarly timely summary will capture the majority of the value proposition of your service.
That's not typical though. For example, I'm certainly not going to be satisfied with a summary of the plot of a book I'm interested in. Would you want to watch a 10 minute highlights reel in place of a 2 hour feature length film?
You can apply obsolete legal tests that have been used to enforce this principle all day long, but the central question remains: Does generative AI encourage creation of original creative works?
If the answer is "no", which it clearly is, then whatever laws and legal tests exist to enforce IP rights need to be amended - or the constitution does.
Free reproduction of "original creative works" fuels original creation too, while tight monopolies over intellectual works and universes have led to decreased creativity around them.
See the dire state of the US filmmaking industry, as an example. Or the vast number of bizarre lawsuits, such as the one over "Bitter Sweet Symphony".
I'm already finding the ability of LLMs to synthesize useful descriptions across disparate sources of raw data to be immensely useful. If that puts (for example) scientific textbook authors out of a job I'm not at all sure that would prove to be a detriment to society on the whole. I'm fairly certain that LLMs are already doing better at meeting the needs of the reader than most of the predatory electronic textbook models I was exposed to in university.
> If the answer is "no", which it clearly is,
Why are you so certain of this? It clearly breaks many (most?) of the existing revenue models to at least some extent. But we don't care about the existing revenue models per se. What we care about is long term sustainable creation across society as a whole. So are consumer needs being met in a sustainable manner? Clearly generative AI is (ever increasingly) capable of the former; it's the latter that requires examination.
God, how I hate the billions and billions (sorry Carl Sagan) of pages that look like they're going to have information, but just repeat your question and expand it to 20 or more paragraphs.
"You want to know about how much torque to tighten wheel nuts for a Ford Focus. Torque to tighten wheel nuts is an important aspect in securing the wheels after replacing them...". God damn, it's Eliza being abused to serve the attention economy. You notice this and wonder whether the page will have the correct information or whether it's just another bucket of shit pretending to be information... So you think, "Let's skip all of that and ask AI...".
This sort of thing was a problem before too. I had a few copycats myself. The difference is that those copycats competed on the same level, and this sort of shallow copying was still too expensive. Google can automate plagiarism at scale and just shut the competition down.
They are the modern Terminal Railroad Association.
https://en.wikipedia.org/wiki/United_States_v._Terminal_Rail...
I have friends who made a decent buck 20-30 years ago translating technical documents like car manuals. Over the years, prices fell from a quarter per word to fractions of a cent.
And even though machine translation barely existed, tools were used to argue for higher productivity and therefore lower prices.
Several of my friends who used to work full-time as translators are now supplementing their income with side jobs like foreign language teaching, proofreading, and similar work.
To be clear, we had bad subtitles before as well, but that was due to the translators' lack of cultural understanding. Most of it at least made sense. Now it's just straight-up bad and often makes no sense. Sad!
AI does a much better job of translating than the stuff I see on TV.
I get the impression that it's done by a lowly paid person who uses a computer dictionary to translate word by word, in a very rushed manner.
Now it's the classic situation: you need an expert to check the work of the machine, because the "customer" is by definition not able to do it.
Aside from highly technical domains, in purely literary works I think the translator is a co-author - maybe IP law acknowledges that already? I remember the translation of E.A. Poe by C. Baudelaire, for instance; I think you could feel Baudelaire's style because it is a lot "warmer" than Poe's. I've also read a translation of a Japanese novel and was quite disappointed with it. I don't know Japanese, but I have read/watched quite a few mangas/animes, so I could sense the speech patterns behind the translations and sometimes thought they could have made better choices.
In any case, one will still need a translator who is good at "prompt engineering" to get a quality translation. I don't know. Maybe translators can add this skill to their CV, so they can propose quick-and-dirty/cheap translations, or no-AI high quality translations.
Some suggest "no-AI" labels on cultural products already - I think if it becomes a reality it will probably act as "quality signaling", because it is becoming more difficult every year to tell the difference between AI and human productions. It won't matter if what you read was written by an AI or a human (if it quacks and looks like a duck...), but what the customer will probably want is to avoid poorly-prompted machine translation.
Note that this only applies to something like a translation where there's some notion of a "correct answer". For other cultural products it's irrelevant (as you say, if it quacks like a duck ...).
Quality signaling is really only necessary in situations where an upfront investment is required and any deception is only revealed sometime later upon use. Safety critical systems such as airbags are a model example of this - a counterfeit of deficient functionality won't be discovered until it deploys, which in most cases will never happen.
That said, while I certainly can't speak to business or diplomatic translations, when it comes to cultural works (i.e. entertainment) the appeal of machine translation to me has been gradually increasing over time as it gets better. I don't generally find localization desirable, and in some cases it even leads to significant confusion when a change somehow munges important details or references - confusion which I'm generally able to trivially resolve by referencing machine output.
These translations are not perfect, yet. But good enough for my needs. Any professional translator service would in any case be beyond our budget. The advantage of using agentic coding tools here (enabled by using a site generator rather than a CMS) is that I can get systematic about dealing with jargon, SEO, and frequently used phrases. I simply document all that and instruct the tool to refer to it. The funny thing is that most of the models are pretty good at fixing their own mistakes if you just ask them to. I asked it to look for examples of "Denglish" (German English) in its own German translations and then to fix them. It found a few examples and the suggested fixes were fine.
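To make the glossary idea concrete, here is a minimal sketch of how documented jargon and style notes can be folded into every translation prompt so the tool applies them consistently. The glossary entries, style notes, and the `build_prompt` helper are all hypothetical examples, not the actual setup described above:

```python
# Sketch: build a translation prompt that pins down jargon and frequently
# used phrases so every page gets translated consistently.
# Glossary entries and style notes below are invented for illustration.

GLOSSARY = {
    "sign up": "registrieren",
    "deployment": "Bereitstellung",
}

STYLE_NOTES = [
    "Avoid Denglish: prefer established German terms over anglicisms.",
    "Keep SEO keywords from the glossary exactly as given.",
]

def build_prompt(source_text: str, target_lang: str = "German") -> str:
    # Render the glossary and style notes as instruction lines.
    glossary_lines = "\n".join(f"- '{en}' -> '{de}'" for en, de in GLOSSARY.items())
    notes = "\n".join(f"- {n}" for n in STYLE_NOTES)
    return (
        f"Translate the following text into {target_lang}.\n"
        f"Use this glossary verbatim:\n{glossary_lines}\n"
        f"Style notes:\n{notes}\n\n"
        f"Text:\n{source_text}"
    )

prompt = build_prompt("Sign up to try the new deployment workflow.")
```

The point is less the prompt text itself than keeping the glossary in one documented place, so re-running the tool (or asking it to fix its own Denglish) always works from the same terminology.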
A lot of people are focusing on the negative here. I like to look at the positive. We're approaching the moment where any person on this planet will be able to communicate directly with any other person on this planet without the need for translators. The tools already exist for this. But they need a lot of work on quality.
A second point here is that the role of English as the most popular intermediary language is disappearing as well. I'm not a native speaker. When I talk to foreigners from wherever, it's mostly in (bad) English. By definition that limits me to talking to people that have had enough education and exposure to English. This is very limiting. A lot of the people we need to talk to here in Germany aren't all that comfortable speaking English.
The negative: people are about to lose their jobs.
The positive: AI billionaires become trillionaires.
Why focus on the negative indeed!
People talk about business as though only the owners of the business benefit. Everybody else pays the price. But aren't the main beneficiaries all the people using these services?
As a German speaker, I experience the quality of German language technical documentation steadily declining. 30 years ago, German documentation was usually top notch. With the first machine translations, quality went notably down. Now, with LLM translation, it's often garbage with phrases of obvious nonsense in it.
This is especially true with large companies like IBM, Microsoft or Oracle.
I guess the situation is better for languages where translations only became available with LLMs.
This is explicitly not a benefit to the people using the services.
Where does it say that? I have never read any Book of Genesis that says "God destroyed the tower".
Also it doesn't say "God was afraid". God doesn't have negative emotions like that. God plans out everything, so He is not "afraid" in the human sense.
In fact, I am fairly certain that the mythical "Tower" for the Jews was sort of a parody of the Pyramids of Egypt and the Ziggurats of Mesopotamia. They were essentially mocking their ancient neighbors in the Levant for such a frivolous project that they believed really didn't honor God, but increased their arrogance and hubris.
In fact, the Sumerians worshipped a god named "Sin" https://en.wikipedia.org/wiki/Sin_(mythology) and it is believed that the "plain of Shinar" and the "wilderness of Sin" are cognate with this term, and therefore represent the ancient deity that was worshipped in that particular region.
For Egypt, the pyramids were funerary monuments, i.e. they invariably honored some dead Pharaoh. The Jews invested their engineering progress in building a temple of the Living God instead.
So it stands to reason that, in the Hebrews' account of these foreign projects, immigrants would come in, mess up the project enough, and it would be abandoned mid-construction. But the towers weren't destroyed.
"Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation."
For an indie videogame I work on, we tried a couple of translation agencies, and they gave terrible output. In the end, we built our own LLM-based agentic translation, with lots of customization for our specific project, like building a prompt based on where each menu/string appears, a shared glossary, and other features. Testing this against the agencies, it was better because we could customize it for the needs of our specific game.
Even then, at the end of the day, we went with freelancers for some of the languages, as we couldn't really validate the AI output in those languages. The freelancers took a month to do the translation vs the 2-3 days we took ourselves for the languages we knew, where we could monitor the AI output. But they did a nice job, much better than the agencies.
I feel that what AI really completely kills is those translation agencies. It's not hard at all to build or customize your own AI system, so if an agency is going to charge you considerable money for AI output, just do it yourself and get a better result. Meanwhile, those freelancers are still in demand, as they can actually check the project and understand it for a nice translation, unlike the mechanical agencies where you send them the Excel sheet and they pass it to who knows whom, or to an AI, without you being able to check.
I will likely be open-sourcing this customizable AI translation system for my project soon.
I believe that the current generation of GenAI (as a market, not necessarily as a tech) is going to crash and burn by 2027. I also believe that open-source will stay and will keep helping people and that, as the world becomes more lawless and free-for-all, we'll need all the help we can find.
And I don't think most of those came from idealistic people without any vested interest in AI business.
On top of that the open source market will increasingly be flooded with (well intended) AI slop built by junior devs.
The rest is closed source.
I had Claude write a report/article on methods to fall asleep. The first method worked amazingly, and I now do it every night. I translated the article to Urdu using Gemini (they have been doing translations for a long time, so they must be better). The first translation used very stiff/dry language, something you might read in a research paper. I asked it to simplify the language for my mom and, to my astonishment, it did that very, very well.
Urdu surely must not have as much training data as other languages, yet I could not find a single mistake. Yeah, there were some weird word choices in the original version, but the simpler version was so good it could be published as an article.
One can dream.
She worked freelance 40 years from age 25 to age 65.
In the 15 years that preceded her retirement, she would get less and less work.
Partly she's an introvert who relies on her network to provide work, and her network gradually retired.
But machine translation was the big killer.
Before LLMs, the early versions of Google Translate killed paid translation.
As the market adapted to machine translation, and as the internet became a globalised platform for knowledge work, it also opened up to lower grade translation, and there were suddenly many more translators willing to work at a lower wage.
Prior to Google Translate there were semi-automated systems that would fuzzy match from large databases.
But it'd still rely on a human-in-the-loop for adapting the sentences.
With Google Translate you'd get a super sketchy translation out, very crude and not at all correct or idiomatic in the target language. Any distance between the source and target language (e.g. English -> Chinese) and it'd be one big joke. With plain Google Translate it still is. But the market spoke: Probably you don't need a very good translation most of the time. Especially not if the shitty one is free.
In her later years she moved to transcription of board meetings. She'd type up everything that was said.
I work for a company now that automates transcription via the whisper model and generates summaries that can be adapted by the customer. You pay per minute of transcription, and you can regenerate summaries as much as you want after that until your prompts give the right results.
All of this manual labor that provided for my childhood is gone now.
I couldn't imagine being a professional translator today and not using AI extensively.
But unless I have a legal reason to consult a professional translator, I probably don't even need one, since plain LLM-based translation is already good, and near perfect with automated translation tools that help you pick the mood, formality, and alternative formulations for your translation.
High-grade translation is massively parallelisable, and a human-in-the-loop is entirely for final proof-reading.
Was that a double entendre or not? If not, a literal translation will get the meaning across. If so, a literal translation will not. Conversely, if it was not a double entendre but you translate it as one, you may confuse the message; if it was and you translate it as such, the human connection can be maintained.
That is also the tricky bit of crossing from being proficient in the language (say B1-B2) to fluent (C1-C2): you start to know these double meanings and nuances and can pick up on them. You can also pick up on them when they weren't intended and make a rejoinder (which may flop or land depending on your own skill).
If you are constantly translating with a machine, you won't really learn the language. You have to step away at some point. AI translations present that in full: a translated text with a removed voice; the voice of AI is all of us and that sounds like none of us.
And as we all know legal language is famous for having no nuance whatsoever, there are no opaque technical terms with hundreds of years of history behind their usage, there is no difference between the legal systems of different countries, and there is no possible difference in case law or the practicalities of legal enforcement. /sarcasm
What is clear to me is that in a situation like this, neither AI translation nor human translation is sufficient. What the imagined American signing an important legal document in the Czech Republic needs is a lawyer practicing in the Czech Republic who speaks a language the imagined American also speaks.
As someone who has been in that situation before, ‘literal’ translation is not actually a thing. Words and phrases have different meanings between legal systems.
You need a certified translation from someone who is familiar with both legal systems or you’re going to have a very bad time.
Which I think you know from the second part of your statement.
Legal documents likely have much more impact than a random chat with a stranger.
If it was a customized contract then I'd want to use a local legal professional who could also speak English.
Pacta sunt servanda can be a real bitch sometimes.
Also, one of my aunties did live court translations in Washington, DC, and those will likely still require living humans for the foreseeable future.