I think this is the fundamental problem of LLMs in general. Some of the time the output looks just right enough to seem legitimate. Luckily, the rest of the time it doesn't.
But all of its responses definitely seem convincing (as it has been trained to do)
I feel like I'm watching a tsunami about to hit while literally already drowning from a different tsunami.
Everything looks right but misses the underlying details that actually matter.
There is a larger problem: I think we like to pretend that everything is so simple you don't need expertise. This is especially bad in our CS communities, where there's a tendency to think intelligence in one domain cleanly transfers to others. In this respect I generally advise people to first ask LLMs about things they are experts in, not things they don't know. That way they can properly evaluate the responses. Lest we all fall for Gell-Mann amnesia lol
Any large enough organization gathers them en masse to cloud real development work with "compliance."
But of course producing fake ones is far easier and cheaper.
As I just commented in the other AI trust thread on the front page, this dynamic is funnily enough what any woman using online dating services has always been very familiar with. With the exact same tragedy of the commons that results. Except for the important difference that terrible profiles and intro messages have traditionally usually been very short and easily red-flagged. But that is, of course, now also changing or already changed due to LLMs.
(Someone I follow on a certain social media platform just remarked that she got no less than fifty messages within a single hour of marking herself as "single". And she's just some average person, not a "star" of any sort.)
Referral systems are very efficient at filtering noise.
There is also the possibility that in trying to get someone to refer me I give enough details that the trusted person can submit instead of me and claim credit.
> When you're volunteering out of love in a market society, you're setting yourself up to be exploited.
I sound like a broken record, but there are unifying causes to most issues I observe in the world.
None of the proposed solutions address the cause (and they can't of course): public scrutiny doesn't do anything if account creation is zero-effort; monetary penalization will kill the submissions entirely.
In a perfect world OSS maintainers would get paid properly. But, we've been doing this since the 90s, and all that's happened is OSS got deployed by private companies, concentrating the wealth and the economic benefits. When every hour is paid labour, you pick the AWS Kafka over spinning up your own cluster, or you run Linux in the cloud instead of your own metal. This will always keep happening so long as the incentives are what they are and survival hinges on capital. That people still put in their free time speaks to the beautiful nature of humans, but it's in spite of the current systems.
- Primarily relies on a single piece of evidence from the curl project, and expands it into multiple paragraphs
- "But here's the gut punch:", "You're not building ... You're addressing ...", "This is the fundamental problem:" and so many other instances of Linkedin-esque writing.
- The listicle under "What Might Actually Work"
In point of fact, I had not.
After the security reporting issue, the next problem on the list is "trust in other people's writing".
This has additional layers to it as well. For example, I actively avoid using em dash or anything that resembles it right now. If I had no exposure to the drama around AI, I wouldn't even be thinking about this. I am constraining my writing simply to avoid the implication.
I'm still using bullet lists sometimes, as they have their place, and I'm hoping LLMs don't totally nuke them.
You don't know whose style the LLM would pick for that particular prompt and project. You might end up with Carmack, or maybe that buggy, test-failing piece of junk project on GitHub.
There's no "LLM style". There's "human style mimicked by LLMs". If they default to a specific style, then that's on the human user who chooses to go with it, or, likely, doesn't care. They could just as well make it output text in the style of Shakespeare or a pirate, eschew emojis and bulleted lists, etc.
If you're finding yourself influenced by LLMs—don't be. Here's why:
• It doesn't matter.
• Keep whatever style you had before LLMs.
:tada:
There is a "default LLM style", which is why I call it that. Or technically, one per LLM, but they seem to have converged pretty hard since they're all convergently evolving in the same environment.
It's trivial to prompt it out of that style. Word about how to do it and that you should do it has gotten around in the academic world where the incentives to not be caught are high. So I don't call it "the LLM style". But if you don't prompt for anything in particular, yes, there is a very very strong "default LLM style".
https://news.ycombinator.com/item?id=44072922
It's sad because people that are ok with AI art are still enjoying the human art just the same. Somehow their visceral hate of AI-art managed to ruin human art for themselves as well.
But instead we had a 'non-profit' called 'Open'AI that irresponsibly unleashed this technology on the world and lied about its capabilities with no care of how it would affect the average person.
AI outputs mimicking art rob audiences of the ability to appreciate art on its own in the wild, without further markers of authenticity. That steals joy from a whole generation of digital artists who have grown up sharing their creativity with each other.
If you lack the empathy to understand why AI art-like outputs are abhorrent, I hope someone wastes a significant portion of your near future with generated meaningless material presented to you as something that is valuable and was time consuming to make, and you gain nothing from it, so that you can understand the problem for yourself first hand.
HN discussed it here https://news.ycombinator.com/item?id=44384610
The responses were a surprisingly mixed bag. What I thought was a very common sense observation had some heavy detractors in those threads.
It's better to stay neutral and say you suspect it may be AI generated.
And for everyone else, responsible disclosure of using AI tools to write stuff would be appreciated.
(this comment did not involve AI. I don't know how to write an emdash)
Literally the first two sentences on the linked article:
> Disclosure: Certain sections of this content were grammatically refined/updated using AI assistance, as English is not my first language. Quite ironic, I know, given the subject being discussed.
Personally, I've read enough AI generated SEO spam that anything with the AI voice comes off as being inauthentic and spammy. I would much rather read something with the mistakes a non-native English speaker would make than something AI written/edited.
> Certain sections of this content were grammatically refined/updated using AI assistance
> I don't know how to write an emdash
Same here, and at this point I don’t think I will ever learn
Between this and the flip side of AI-slop it's getting really frustrating out here online.
> Disclosure: Certain sections of this content were grammatically refined/updated using AI assistance

Today there are scams that look just like real companies trying to get you to buy from them instead. Who knows what happens if you put your money down. (Scams were of course always a problem, but there is much less cost to create a scam)
It's good for the site collecting the fee, it's good for the projects being reported on and it doesn't negatively affect valid reports.
It does exactly what we want by disincentivizing bad reports, whether AI generated or not.
What do other countries do for their equivalent of this?
If this isn't already a requirement, I'm not sure I understand what even non-AI-generated reports look like. Isn't the bare minimum of CVE reporting a minimally reproducible example? Even if you find some function that, for example, doesn't do bounds-checking on some array, you can trivially write some unit-testing code that's able to break it.
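To make that concrete, here's a toy sketch in Python (purely illustrative; parse_header and the short input are made up, not taken from any real project) of the kind of minimal reproducer you'd expect a credible report to attach:

    import struct

    # Hypothetical target: unpacks a 4-byte length field with no size check.
    def parse_header(buf: bytes) -> int:
        return struct.unpack(">I", buf[:4])[0]

    # The minimal reproducer: a concrete input plus the observed failure,
    # not just a pointer at a function that "looks" unsafe.
    def test_short_input_crashes():
        try:
            parse_header(b"\x01")  # 1 byte where 4 are assumed
        except struct.error as exc:
            print("reproduced:", exc)

    if __name__ == "__main__":
        test_short_input_crashes()

Something that small already separates "I ran the code and it breaks" from "the AI says this function seems dangerous".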
You sort of want to reject them all, but occasionally a gem gets submitted, which makes you reluctant.
For example, years ago I was responsible for triaging bug bounty reports at a SaaS company I worked at at the time. One of the most interesting reports was from someone who had found a way to bypass our OAuth flow using a bug in Safari that let them get past most OAuth forms. The report was barely understandable, written in broken English. The impression I got was that they had tried to send it to Apple, but Apple ignored them. We ended up rewriting the report and submitting it to Apple on their behalf (we made sure the reporter got all the credit).
If we had ignored poorly written reports, we would have missed that. Is it worth it, though? I don't know.
So Safari was not following the web browser specs in a way that compromised OAuth in a common mode of implementation.
Regex exploitation is the forever example to bring up here, as it's generally the main reason that "autofail the CI system the moment an auditing command fails" doesn't work on certain codebases. It happens because it's trivial to craft a string that wastes significant resources in a regex match, so the moment you have a function that accepts a user-supplied regex pattern, that's suddenly an exploit... which gets a CVE. A lot of projects then have CVEs filed against them because internal functions take regex patterns as arguments, even if they're in code the user is flat-out never going to be able to interact with (i.e. several dozen layers deep in framework soup there's a regex call somewhere, in a way the user won't be able to access unless a developer several layers up starts breaking the framework they're using in really weird ways on purpose).
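For anyone who hasn't seen catastrophic backtracking in action, a minimal sketch with Python's re module (the pattern and inputs are just the textbook demo, not from any particular project):

    import re
    import time

    # Nested quantifiers over the same characters: the classic ReDoS shape.
    # Any code path that feeds user input into a pattern like this is a DoS vector.
    pattern = re.compile(r"^(a+)+$")

    for n in (18, 22, 26):
        s = "a" * n + "b"  # almost matches, forcing exhaustive backtracking
        start = time.perf_counter()
        pattern.match(s)
        # runtime roughly doubles with each extra 'a'
        print(n, f"{time.perf_counter() - start:.3f}s")

Whether that deserves a CVE depends entirely on whether an attacker can actually reach the call, which is exactly the context the drive-by reports leave out.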
The CVE system is just completely broken and barely serves as an indicator of much of anything. From what I can tell, the approval process favors acceptance over rejection, since the people reviewing the initial CVE filing aren't the same people who actively investigate whether the CVE is bogus, and the incentive for the CVE system is literally to encourage companies to give a shit about software security (a fact that is also often exploited to create beg bounties). CVEs have been filed against software for what amounts to "a computer allows a user to do things on it" even before AI slop made everything worse; the system was questionable in quality 7 years ago at the very least, and is even worse these days.
The only indicator it really gives is that a real security exploit can feel more legitimate if it gets a CVE assigned to it.
> A security report lands in your inbox. It claims there's a buffer overflow in a specific function. The report is well-formatted, includes CVE-style nomenclature, and uses appropriate technical language.
Given how easy it is to generate a POC these days, I wonder if HackerOne needs to be pivoting hard into scaffolding to help bug hunters prove their vulns.
- Claude skills/MCP for OSS projects
- Attested logging/monitoring for API investigations (e.g. hosted Burp)
I know that this poses new problems (some people can't afford to spend this money), but it would be better than just wasting people's time.
Would this be different if the underlying code had a viral license? If Google's infrastructure was built on a GPL'ed libcurl [0], would they have investment in the code/a team with resources to evaluate security reports (slop or otherwise)? Ditto for libxml.
Does GPL help the Linux kernel get investment from its corporate users?
[0] Perhaps an impossible hypothetical. Would google have skipped over the imaginary GPL'ed libcurl or libxml for a more permissively licensed library? And even if they didn't, would a big company's involvement in an openly developed ecosystem create asymmetric funding/goals, a la XMPP or Nix?
> Does GPL help the Linux kernel get investment from its corporate users?
GPL has helped "Linux kernel the project" greatly, but companies invest in it out of their own self-interest. They want to benefit from upstream improvements, and playing nicely by upstreaming changes is just much cheaper than maintaining their own kernel fork.
On the other side you have companies like Sony that have used BSD code in their game consoles for decades and contributed shit.
So... Two unrelated things.
> I would have thought supporting libcurl and libxml would also be in a company's self-interest.
Unfortunately, the majority of companies don't have something special they really need to add to cURL. They're okay using it as is, so they have no reason to pay a salary to cURL developers regardless of licensing. Yes, they want it to be secure, but as always nobody except a few very large orgs cares about security for real.
> Is that companies do this for GPL'ed linux kernel but not BSD evidence that strong copyleft licensing limits the extent to which OSS projects are exploited/under-resourced?
It certainly helped with the "under-resourced" part. Whether you consider it "exploited" is up for discussion; from the project's perspective, copyleft licensing of course benefited the project. Linus Torvalds ended up with a good amount of publicity and is now reasonably well off, but almost all other kernel developers live in obscurity earning somewhat average salaries. I'm pretty sure we can all agree that the Linux kernel made a massive positive impact on humanity as a whole, and compared to that, the payoff to stakeholders is rather small IMO.
Most people's initial contributions are going to be more concrete exploits.
Different models perform differently when it comes to catching/fixing security vulnerabilities.
Even more so when there is a bounty payout.
Refundable if the PR/report is accepted.
I’m not saying that AI hasn’t already given us useful things, but this is a symptom of one very negative change that’s drowned a lot of the positive out for many people: the competence gap used to be an automatic barrier for many things. For example, to get hired as a freelance developer, you had to have at least cargo-culted something together once or twice, and even if you were way overconfident in your capability, you probably knew you weren’t the real thing. However, the AI tools industry essentially markets free competence, and out of context for any given topic, that little disclaimer is meaningless. It’s essentially given people climbing the Dunning-Kruger Mt. Stupid the agency to produce garbage at a damaging volume that’s too plausible looking for them (or laypeople, for that matter) to realize it’s garbage. I also think somewhat nihilist people prone to get-rich-quick schemes (e.g. drop-shipping, NFTs) play these workflows like lottery tickets while remaining deliberately ignorant of their dubious value.
This is such an important problem to solve, and it feels soluble. Perhaps a layer with heavily biased weights, trained on carefully curated definitional data. If we could train in a sense of truth - even a small one - many of the hallucinatory patterns would disappear.
Hats off to the curl maintainers. You are the xkcd jenga block at the base.
Even if problems feel soluble, they often aren't. You might have to invent an entirely new paradigm of text generation to solve the hallucination problem. Or it could be the Collatz Conjecture of LLMs: it "feels" so possible, but you never really get there.
- dictionary definitions
- stable APIs for specific versions of software
- mathematical proofs
- anything else that is true by definition rather than evidence-based
(I realize that some of these are not actually as stable over time as they might seem, but they ought to hold up well enough given the pace at which we train new models.)
If you even just had an MoE component whose only job was verifying validity against this dataset during chain-of-thought, I bet you'd get some mileage out of it.
How ironic, considering that every time I've reported a complicated issue to a program on HackerOne, the triagers have completely rejected it because they do not understand the complicated codebase they are triaging for.
Also, the curl examples given in TFA completely ignore recent developments, where curl's maintainers welcomed and fixed literally hundreds of AI-found bugs: https://www.theregister.com/2025/10/02/curl_project_swamped_...
Welcome to the Internet.
> The downside is that it makes it harder for new researchers to enter the field, and it risks creating an insider club.
I also think this concern can be largely mitigated or reduced to a nonissue. New researchers would have a trust score of zero for example, but people who consistently submit AI slop will have a very low score and can be filtered out fairly easily.
It doesn't have to make the final judgement, just some sort of filter that automatically flags things like function calls that don't exist in the code.
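As a rough sketch of what that pre-triage filter could look like (assumptions: the report is plain text, the project is C sources on disk, and a crude "identifier followed by a paren" regex is good enough; the names here are illustrative, not any real tool):

    import re
    import sys
    from pathlib import Path

    # Flag function names a report mentions that don't appear anywhere in the sources.
    IDENT = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")  # matches "foo(" -> "foo"

    def identifiers(text: str) -> set[str]:
        return set(IDENT.findall(text))

    def main(report_path: str, source_dir: str) -> None:
        report_ids = identifiers(Path(report_path).read_text(errors="ignore"))
        source_text = "\n".join(
            p.read_text(errors="ignore") for p in Path(source_dir).rglob("*.[ch]")
        )
        for name in sorted(n for n in report_ids if n not in source_text):
            print(f"FLAG: report mentions '{name}()' but it is not in the sources")

    if __name__ == "__main__":
        main(sys.argv[1], sys.argv[2])

It would flag plenty of noise too (libc calls, macros), so it only works as a signal for a human triager, never as an auto-reject.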
> Certain sections of this content were grammatically refined/updated using AI assistance, as English is not my first language.
OP: I sympathize, but I would much rather read your original text, with typos and grammatical errors. By feeding it through the LLM you fix issues that are not really important, but you remove your own voice and get bland slop identical to 90% of these slopblogs (which yours isn't!)

As much as I'd like to see Russia, China and India disconnected from the wider Internet until they clean up shop with abusive actors, the Hacktoberfest stuff you're likely referring to doesn't have anything to do with your implication - that was just a chance at a free t-shirt [1] that caused all the noise.
In ye olde times, you'd need to take care how you behaved in public because pulling off a stunt like that could reasonably lead to your company going out of business - but even a "small" company like DO is too big to fail from FAFO, much less ultra large corporations like Google that just run on sheer moat. IMHO, that is where we have to start - break up the giants, maybe that's enough of a warning signal to also alert "smaller" large companies to behave like citizens again.
It's a cargo cult. Maybe the airplanes will land and bring the goodies!
fake games by fake studios played by fake players are still a thing