Could you simulate something being typed? Trivially. Could you simulate something being drafted? Honestly, even if you wanted to put in all that time and effort, I’m not even sure LLMs are sophisticated enough to produce the logical drafts, loops and edits that would pass a writer's sniff test.
And as LLMs get better at producing human-like text, that same pre-LLM reading experience, which helps people tell the two apart, will become less and less common.
You have no idea how many false positives and false negatives there are in your judgement. It is indeed impossible to differentiate between badly written human text and reasonably well-written LLM text.
The default LLM style is pretty deterministic. "Not X, Not Y. Just Z", etc.
There are phrases and cadences which are very rare in human prose (<5%) but unusually common in LLM prose (95%+ of occurrences). It is not unreasonable to look at content which is 95% LLM tells and conclude that an LLM authored it.
I have noted, IRL, that those people who read very little, and only read when they have to (work docs, etc) are literally unable to tell that a piece of prose sounds like an LLM even when it has about 12 occurrences of "Not X. Just Y" or "Not X, Not Y. Just Z" in as many paragraphs.
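To make the "tells" argument concrete, here is a rough sketch of the inference using Bayes' rule. Every number in it is an assumption for illustration: I'm reading the figures above as "a given tell shows up in about 5% of human passages and about 95% of LLM passages", and I'm treating tells as independent, which real prose almost certainly violates. The function name `posterior_llm` is made up for this example.

```python
def posterior_llm(prior_llm, p_tell_given_llm, p_tell_given_human, n_tells):
    """Posterior probability that a passage is LLM-written after seeing
    n_tells tells, assuming (unrealistically) that tells are independent."""
    # Work in odds form: posterior odds = prior odds * likelihood ratio ** n.
    prior_odds = prior_llm / (1.0 - prior_llm)
    likelihood_ratio = p_tell_given_llm / p_tell_given_human
    posterior_odds = prior_odds * likelihood_ratio ** n_tells
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical prior: only 10% of the text you encounter is LLM-written.
    for n in (1, 2, 3):
        p = posterior_llm(0.10, 0.95, 0.05, n)
        print(f"{n} tell(s): P(LLM) = {p:.4f}")
```

Under those assumed rates a single tell is weak evidence when the prior is low, but each further tell multiplies the odds, which is why a dozen in as many paragraphs is hard to wave away. The conclusion is only as good as the assumed rates, though.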
Edit: Also, I'm surprised images have gotten to the point where I have a hard time detecting AI in some cases, and they got there more quickly than prose. I really thought prose would be the first to fall. Video is still detectable. Music still detectable (by someone that enjoys music and pays attention to it). But, AI prose still outs itself pretty quickly.
I even did post-training on Gemma 4 (a small model which has very good prose for a model, especially a small model) to try to make it able to write more like me, purely as an experiment to learn how training LoRAs works, and using data that I know is ethically sourced. It still distinctly writes like an LLM, with a few of my annoying quirks baked in: Inappropriate use of ellipses, too many parentheticals, occasionally dismissive tone. But, it can't stop doing the LLM things, either, without becoming incoherent.
Innocent until proven guilty is for a court.
Outside of a court, people use all sorts of heuristics to determine authenticity and trustworthiness. Since so few humans ever wrote the way LLMs do by default, it's not unreasonable to refuse serious engagement with a party exhibiting this style.
Aside: Every time someone claims they have always written like this, I ask for a link to their writing dated pre-2022, and send both to all free LLM chatbots with the question "did the same writer write both these pieces".
I have not yet gotten a "likely", or even a "remotely likely". It's all been "extremely unlikely".
iNoCent UnTiL ProvEn GuiLtY iS fOR a CouRt.
What a ridiculous thing to stand behind.
lelanthran murdered a person. We shouldn't tolerate them posting in this community.
Prove you didn't murder someone before posting anymore.
And all of those are a fool's errand, except meeting people face to face. When you meet somebody IRL, it takes just a few minutes to know if they are trustworthy or not.
If the business is severely serious, then you have to do like Genghis Khan and get fully drunk together. Anybody who refuses can never be fully trusted.
I agree, and that works for IRL only!
But what I am seeing online is a very large push for people to stop calling out obvious LLM prose. One can only guess at the motivations from people who are throwing tantrums that their LLM prose should be allowed, because it's their idea being discussed.
What I am not seeing is them acknowledging the extreme disrespect they are demonstrating for a community when they cannot even bother to type "their" idea.
IOW, if someone doesn't have the time to write it, then we should be making fun of them, shaming them and generally mocking them for losing the use of their brain in a public setting.
https://news.ycombinator.com/item?id=48667761
From their homepage:
> Detect AI-generated content with 99.98% accuracy.
Are they wrong?
Though I ran the numbers, and even with a 0.02% false positive rate, that works out to about 6000 students falsely accused every semester, per university.
> Though I ran the numbers, and even with a 0.02% false positive rate
They don't say that the false positive rate is 0.02%, only that the accuracy is 99.98%, i.e. that the overall error rate is 0.02%. All we know for certain is that the false positives and false negatives added together make up 0.02% of all classifications.
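A quick sketch of why accuracy alone doesn't pin down the false positive rate. All inputs here are made up for illustration: the 1,000,000 submissions, the 80% human-written share, and the split of errors between false positives and false negatives are assumptions, not anything the vendor publishes. Only the 0.02% total error rate comes from the quoted accuracy claim.

```python
def false_positive_stats(total, human_share, error_rate, fp_share_of_errors):
    """Return (false_positives, false_positive_rate) for a hypothetical detector.

    total              -- number of submissions checked (assumed)
    human_share        -- fraction of submissions actually written by humans (assumed)
    error_rate         -- 1 - accuracy, i.e. (FP + FN) / total
    fp_share_of_errors -- fraction of all errors that are false positives (assumed)
    """
    errors = total * error_rate
    false_positives = errors * fp_share_of_errors
    humans = total * human_share
    # The false positive rate is measured against human-written texts only.
    return false_positives, false_positives / humans


if __name__ == "__main__":
    TOTAL = 1_000_000
    ERROR_RATE = 1 - 0.9998  # from the claimed 99.98% accuracy

    for fp_share in (0.0, 0.5, 1.0):
        fp, fpr = false_positive_stats(TOTAL, 0.8, ERROR_RATE, fp_share)
        print(f"FP share of errors {fp_share:.0%}: "
              f"{fp:.0f} false accusations, FPR = {fpr:.4%}")
```

The same headline accuracy is consistent with anything from zero false accusations to every single error being one, and because the false positive rate is divided by human-written texts only, it can even come out higher than 0.02%.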
I think concepts like this are the only reliable way to prove something was written by a human. A full replay like this is one way to do it. I think there are some other feasible ways to achieve this, maybe in combination with a full “replay”, but some sort of “proof of work” is the way to go I believe. As LLMs become more ubiquitous, I imagine products that solve the problem can be a real business opportunity.
Oh, wait.
https://images.ctfassets.net/kftzwdyauwt9/3bzFMXhknmq5TZvVL7...
(From https://openai.com/index/introducing-chatgpt-images-2-0/ )
(otoh it's also trivial to copy whatever ChatGPT wrote by hand without thinking about the assignment at all)
Except it lacks proof of the keyboard - and the meat.
This product is in bad taste, and I hope it doesn't succeed.