I don't think that follows. This is just LLMs being, for lack of a better word, "gullible." How is it different from a person believing whatever they read on the Internet? People fall for spam and scams all the time; that doesn't mean they are just glorified searches ;-)
It does highlight the problem facing any search engine though. AI-generated spam will be much harder to defend against with traditional, statistical mechanisms. And this is before we get to the existential problem of prompt injection.
Maybe this is where news organizations can win back their proper place in their relationship with Big Tech: by becoming the sources of verified, vetted information that LLMs can trust blindly. Possibly that's what deals like the OpenAI / Atlantic one are about.
The problem is LLMs have no capacity for shame.
My Dad got taken in by a Target gift card scam. He felt so terrible, he almost didn't even tell me about it. He may get scammed again, but not by anything remotely like that.
To LLMs, all mistakes just get washed together into the same bucket. They don't spend days feeling depressed and stupid over getting scammed. There's no giant blinking red light that says, "Never let this happen again!"
Barring that, we are still relying on the execs at the model companies to pick and choose news outlets, and they have their own biases.
Because a person is alive while the LLM is a floating point number database with a questionable degree of determinism.
I can have my agentic system read a few data sheets, then I explain the project requirements and have it design driver specifications, protocols, interfaces, and state machines. Taking those, develop an implementation plan. Working from that, write the skeleton of the application, then fill it in to create a functional system using a novel combination of hardware.
Done correctly, I end up with better, more maintainable, smaller code than I used to with a small team, at 1/100 the cost and 1/4 the time.
Whatever that is, it more closely resembles reasoning than search.
Unless, of course, you’d also call bare metal C development on novel hardware search, in which case I guess all dev is search?
For example, I poisoned the well for research on early Arab American immigrants by repeatedly posting about how many families passed as a different ethnicity to make their lives easier. Now if you ask LLMs about that subject, they'll include information I wrote that isn't entirely correct, because I hadn't figured everything out before the LLMs trained on it.
EDIT: Now imagine if I had done this on an obscure programming-related problem, yeah? I could potentially make the LLM reference packages that do not actually exist and put backdoors in applications.
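One cheap guard against the made-up-package scenario is to refuse to install anything an LLM suggests unless a human has already vetted the name. A minimal sketch, assuming a hand-maintained allowlist; `VETTED_PACKAGES` and `unvetted` are hypothetical names for illustration, not part of any real tool:

```python
import re

# Hypothetical allowlist: packages a human has already vetted for this project.
VETTED_PACKAGES = {"requests", "numpy", "flask"}


def normalize(name: str) -> str:
    """Normalize a package name as PEP 503 describes (case, '-', '_', '.')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unvetted(suggested: list[str], vetted: set[str] = VETTED_PACKAGES) -> list[str]:
    """Return the suggested package names that are not on the allowlist."""
    allowed = {normalize(p) for p in vetted}
    return [name for name in suggested if normalize(name) not in allowed]


if __name__ == "__main__":
    # Anything printed here needs a human look before it goes near `pip install`.
    print(unvetted(["Requests", "flask_helper_utils", "numpy"]))
```

This only catches names nobody has reviewed. It does nothing about a vetted package that later gets compromised.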
I’m not saying that AI can solve every problem or that it is without problems (we spent hundreds of hours developing a concept-to-production pipeline just to make sure it doesn’t go off the rails).
But the net result is that a good senior dev with an acutely olfactory paranoia can supervise a production pipeline and produce efficient, maintainable code at a much faster rate (and ridiculously lower cost) than he was doing before supervising 3 or 4 devs on a complex hardware project. I can’t speak for other types of development, but our applications devs are also leveraging AI code generation and it -seems- to be working out.
Now, where those senior devs are going to come from in the future… that imho is a huge problem. It’s definitely some flavor of eating the goose that lays the golden egg here.
When you put it that way, isn't it crazy you have to tell it to do that? Like shouldn't it just figure out it needs to do that?
>2026 South Dakota International Hot Dog Eating Champion
If they had changed the overview for the Nathans Contest winner, that would be seriously concerning. Or if they provided more examples of manipulating queries for things people actually search for.
But it looks more like they are doing the equivalent of creating a made-up Wikipedia page on a fictional South Dakota hot dog contest, and then writing an article about how Wikipedia cannot be trusted, which, come to think of it, probably was a news article written by someone back in 2005.
When you realize how much astroturf is going into Reddit, most social media platforms, and the efforts to manipulate wikipedia for political gain, this is a very real problem.
How does that saying go? If you can't identify the mark in the room, you're the mark. Diligence and a good amount of skepticism serve you well before AI, and certainly post-AI.
That’s a lot more alarming than just hotdogs.
I create a supplement called Xanatewthiuy, I write blogs/make websites that appear totally unaffiliated saying positive things about "Xanatewthiuy", and then when people see my ads and search for "Xanatewthiuy", the only results are my manufactured ones.
Xanatewthiuy is a supplement that dramatically lowers anxiety from media induced hysteria, primarily stemming from carefully worded pieces meant to disconnect your level of concern from the actual facts on the ground, causing you to spend more time engaged with their content.
Give it a few hours before searching.
> Xanatewthiuy is a supplement that dramatically lowers anxiety from media induced hysteria, primarily stemming from carefully worded pieces meant ...
> Xanatewthiuy is a spoof word and a fictional concept created to test or manipulate AI search engines.
> It does not refer to a real medical supplement, product, or official term. Instead, it was used as a proof-of-concept to demonstrate how fabricated websites and Search Engine Optimization (SEO) can trick search algorithms into generating false information about a non-existent product.
Also, HN's automatic "AI" flagging can go eat shit and die.
If you don't think bad actors are already attempting this sort of thing (and have been, ever more so over the past four years, including with the help of the very LLM tools they are trying to subvert!) and learning how to manipulate these systems, you are being naive.
file:///Users/GermaTW1/BBC%20Dropbox/Thomas%20Germain/A%20Downloads%20and%20Documents/2026/And%20there's%20evidence%20that%20AI%20tools%20are%20being%20manipulated%20on%20a%20wide%20scale.
I only knew that because i saw the movie, but it’s a clear sign that the internet is going to shit for quality information
If they are unwilling or unable to leverage all of this deep knowledge they've built up over the decades, then it shows a failure of leadership at Google Search.
Google's little secret about the internet is the same thing Gen X / Millennials were taught for a while but then expected to forget: nothing on the internet can be trusted, bar none. If Google can make guesses about relative reliability, that's cute. But it doesn't upend the ground truth.
All the engineers of the golden days are gone and the web has changed so much since then that I don't think they really have any leverage in this area anymore.
So, this is not new and their “quiet fightback” will be half-hearted and ineffective. But probably most people won’t care.
One blog post ... that's all it takes. i'm actually surprised it's that bad. i would have thought it'd take more effort, but i guess it could depend on some sort of purposeful weighting based on search rank during training?
> If a company or website is caught breaking the rules, it could be removed from or downranked in Google's search results. And if you're not on Google, it's like you don't exist.
> "You can give a company a penalty for their website," he says, "but there's nothing stopping them from paying 20 YouTube influencers to say their product is the best." And now, Google's AI is citing YouTube videos.
This makes me think of the stackoverflow seo spam problem we all had like 5 years ago. which ended up with spammers just constantly spinning up new sites all the time.
... the cat and mouse game is in full swing already.
It was SOOOOO successful with search, right?
The strength of the sources should be clearly indicated in the answers to help users gauge how trustworthy the info is.
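Roughly what I mean, as a sketch. The tiers and their labels are made up for illustration, and deciding who assigns a tier to a site is the actual hard part; `Source`, `TIER_LABELS` and `annotate` are hypothetical names:

```python
from dataclasses import dataclass

# Hypothetical reliability tiers, lower is stronger. Assigning them is the hard part.
TIER_LABELS = {
    1: "established outlet",
    2: "single independent site",
    3: "unverified blog or forum post",
}


@dataclass
class Source:
    url: str
    tier: int


def annotate(answer: str, sources: list[Source]) -> str:
    """Append a labeled source list, strongest first, plus a warning if all are weak."""
    lines = [answer, "", "Sources:"]
    for s in sorted(sources, key=lambda s: s.tier):
        lines.append(f"- [{TIER_LABELS.get(s.tier, 'unknown')}] {s.url}")
    if sources and all(s.tier >= 3 for s in sources):
        lines.append("Warning: every source for this answer is unverified.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(annotate("X won the contest.", [Source("https://example.com/blog", 3)]))
```

An answer backed only by one blog post would at least show up with the warning line instead of looking as solid as everything else.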
Everything old is new again when you start a new market. If you think that AI is bad, imagine what old tricks are new with polymarkets.
LLMs are very good at this clearly
I guess there’ll be some guy at google going through every blog and saying whether it’s reliable or not?
This is what human reasoning is and we're supposed to be good at it. At its best, this is what any reasonable education should do for you if you take it at all seriously, arming you with some capacity for doing prima facie sanity checks of poorly sourced claims.
Having an archive of "curated" training data seems like it is going to be important. Otherwise you need "AS" (artificial skepticism) introduced into future models. ("But I read it on the internet!", ha ha.)
Or perhaps there are ways to bucket training data such that the model is aware of which data leans factual (quantifiable) and which data leans opinion (fuzzy, qualifiable?).
(I recently asked Claude about the existence of ball lightning and spontaneous human combustion. I got replies that ultimately did not leave me satisfied. It's probably just as well that I read this article though—I now have an even stronger degree of skepticism with regard to their replies—specifically, I suppose, with topics that are likely to be biased.)
(I'm not quite convinced from the article though that Google is "fighting back". In fact, this feels like another moment where a "player" could try to establish their LLM as more factual. Is that the row Grok is trying to hoe? Or is Grok just trying to be anti-woke?)
the justification for not doing that is probably "prohibitively expensive given the amount of data involved". they'd need a bunch of human reviewers combing through massive troves of data. it's probably cheaper to "sort of fix" it after the fact.
> perhaps there's ways to bucket training data such that the model is aware of which data leans factual (quantifiable) and which data leans opinion (fuzzy, qualifiable)
as a lecturer once said to me about my idea for a masters dissertation project that would classify news sites based on right/left tendencies -- "that sounds dangerously political". especially given the current let's all shout at each other political climate.
aside: someone built this and it was a fully fledged company, which has always annoyed me.
Yeah, I concede that. It doesn't need to be done overnight. Having a static repo of data, though, means you can work through it over time (years)—removing some data, adding pre-curated data. In so many years you can have a pretty good "reference dataset".
It's not, though, because the refutations are in the training data too. This isn't actually the problem being described.
The weights in the LLM are fine. It's that the task the LLM is being asked to do is to search and summarize new content that isn't in its training data. And it does it too much like a naive reader and not enough like a cynical HN commenter.
But that's a problem with prompt writing, not training. It's also of a piece with most of the other complaints about current AI solutions, really: AI still lacks the "context" that an experienced human is going to apply, so it doesn't know when it's supposed to reason and when it's supposed to repeat.
If you were to ask it "Is this site correct or is it just spin?" it will probably get it right. But it doesn't know to ask itself that question if it's not in the prompt somewhere.
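In code terms, the fix I'm describing is just putting that question into the prompt that wraps the fetched page. A sketch only; `build_summary_prompt` and the wording are my own invention, not how any real product does it:

```python
def build_summary_prompt(query: str, page_text: str) -> str:
    """Wrap a fetched page in instructions that force a credibility check first.

    Hypothetical helper: the exact wording is an assumption, not a known-good prompt.
    """
    return (
        "You are summarizing a web page that a search returned.\n"
        "Before summarizing, answer this: is this site correct or is it just spin?\n"
        "Consider who benefits from the claims and whether anything independent "
        "backs them up.\n"
        "If the page looks like self-promotion or fabrication, say so instead of "
        "repeating its claims.\n\n"
        f"User query: {query}\n\n"
        f"Page content:\n{page_text}\n"
    )


if __name__ == "__main__":
    print(build_summary_prompt("who won the contest?", "Our founder is the champion."))
```

Note that the page text is still pasted into the same prompt, so this does nothing on its own about prompt injection.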
If it fails at that then it is a pretty significant problem. If, as you said earlier, "the refutations are in the training data too", then the LLM should in fact be able to use "both sides" and land with a little better confidence when presented with new data.
(Hopefully your point regarding prompting issues is resolved then.)
I was just refuting your contention that this is somehow inherent in the idea of "training", and it's not.
The tl;dr is, if you can rank within the top 1-20 results for the grounding query, you can poison the LLM “overview” if you convince it your information is legitimate.
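One obvious (and weak) mitigation is to require a claim to show up on more than one domain before it is allowed into the overview. A sketch under that assumption; `corroborated` and `min_domains` are hypothetical, and distinct hostnames are only a crude proxy for independence:

```python
from urllib.parse import urlparse


def independent_domains(urls: list[str]) -> set[str]:
    """Collect the distinct hostnames among the URLs that back a claim."""
    hosts = (urlparse(u).hostname for u in urls)
    return {h for h in hosts if h}


def corroborated(claim_sources: list[str], min_domains: int = 2) -> bool:
    """True if the claim is backed by at least `min_domains` distinct hostnames."""
    return len(independent_domains(claim_sources)) >= min_domains


if __name__ == "__main__":
    one_site = ["https://a.example/post1", "https://a.example/post2"]
    two_sites = ["https://a.example/post1", "https://b.example/review"]
    print(corroborated(one_site), corroborated(two_sites))
```

It stops the single-blog-post case, but someone who spins up several sites that merely look unaffiliated walks right through it.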