Other people apparently don't have this feeling at all. Maybe I shouldn't have been surprised by this, but I've definitely been caught off guard by it.
You, on the other hand, have honed your craft for many years. The more you learn, the more you discover there is to learn; that is, you realize how little you know. They don't have this. _At all_. They see this as a "free ticket to the front row" and when we politely push back (we should be way harsher about this, it's the only language they understand) all they hear is "he doesn't like _me_", which is an escape.
You know how much work you ask of me when you open a PR on my project; they don't. They will just see it as "why don't you let me join, since I have AI I should have the same skill as you"... unironically.
In other words, these "other people" that we talk about haven't worked a day in the field in their life, so they simply don't understand much of it; however, they feel they understand all of it.
ever had a client second guess you by replying with a screenshot from GPT?
ever asked anything in a public group only to have a complete moron reply with a screenshot from GPT or - at least a bit of effort there - a copy/paste of the wall of text?
no, people have no shame. they have a need for a little bit of (borrowed) self importance and validation.
Which is why i applaud every code of conduct that has public ridicule as punishment for wasting everybody's time
So their boss may be naive, but not hilariously so - because that is, in fact, how the world works[1]! And as a boss, they probably have some understanding of it.
The thing they miss is that AI fundamentally[2] cannot provide this kind of "correct" output, and more importantly, that the "trillion dollar companies" not only don't guarantee that, they actually explicitly inform everyone everywhere, including in the UI, that the output may be incorrect.
So it's mostly failure to pay attention and realize they're dealing with an exception to the rule.
--
[0] - Actually hurt you, I'm ignoring all the fitness/healthy eating fads and "ultraprocessed food" bullshit.
[1] - On a related note, it's also something security people often don't get: real world security relies on being connected - via contracts and laws and institutions - to "men with guns". It's not perfect, but scales better.
[2] - Because LLMs are not databases, but - to a first-order approximation - little people on a chip!
Cybersecurity is also an exception here.
"men with guns" only work for cases where the criminal must be in the jurisdiction of the crime for the crime to have occurred.
If you rob a bank in London, you must be in London, and the British police can catch you. If you rob a bank somewhere else, the British police don't care. If you hack a bank in London though, you may very well be in North Korea.
E.g.
"A random drunk guy on the subway suggested that this wouldn't be a problem if we were running the latest SQL server version" "Huh, I guess that's worth testing"
Consider: GP would've been much more correct if they said "It's just a person on a chip." Still wrong, but much less, in qualitative fashion, than they are now.
It's like a JPEG. Except instead of lossy compression on images, which gives you a pixel soup that only vaguely resembles the original if you're resource bound (and even modern SOTA models are when it comes to LLMs), you get stuff that looks more or less correct but just isn't.
An LLM chatbot is not like querying a database. Postgres doesn't have a human-like interface. Querying SQL is highly technical; when you get nonsensical results out of it (which is more often than not) you immediately suspect the JOIN you wrote or whatever. There's no "confident vibe" in results spat out by the DB engine.
Interacting with a chat bot is highly non-technical. The chat bot seems to many people like a highly competent person-like robot that knows everything, and it knows it with a high degree of confidence too.
So it makes sense to talk about "hallucinations", even though it's a flawed analogy.
I think the mistake people make when interacting with LLMs is similar to what they do when they read/watch the news: "well, they said so on the news, so it must be true."
But (as someone else described), GPTs and other current-day LLMs are probabilistic. But 99% of what they produce seems feasible enough.
Unless I have been reading very different science fiction I think it’s definitely not that.
I think it’s more the confidence and seeming plausibility of LLM answers
delicate feelers is like octopus arms
I raise an issue or PR after carefully reviewing someone else's open source code.
They ask Claude to answer me; neither they nor Claude understood the issue.
Well, at least it's their repo, they can do whatever.
The client in your example isn't a (presumably) professional developer, submitting code to a public repository, inviting the scrutiny of fellow professionals and potential future clients or employers.
They are sure they know better because they get a yes man doing their job for them.
I am not saying one has to lose their shame, but at best, understand it.
Too little or too much shame can lead to issues.
Problem is no one tells you what too little or too much actually is and there are many different situations where you need to figure it out on your own.
So I think sometimes people just get it wrong but ultimately everyone tries their best. Truly malicious shameless people are extremely rare in my experience.
For the topic at hand I think a lot of these “shameless” contributions come from kids
Basically teenagers. But it feels like the rebellious teenager phase lasts longer nowadays. Zero evidence besides vibes and anecdotes, but still.
Or maybe it's me that's getting old?
Just like pain is a good thing, it tells you and signals to remove your hand from the stove.
That has NEVER led to a positive result in the whole of human history, especially since the second group is much larger than the first.
Of course, the vast majority of OS work is the same cog-in-a-machine work, and with low effort AI assisted contributions, the non-hero-coding work becomes more prevalent than ever.
For those curious:
Just like with email spam I would expect that a big part of the issue is that it only takes a minority of shameless people to create a ton of contribution spam. Unlike email spam these people actually want their contributions to be tied to their personal reputation. Which in theory means that it should be easier to identify and isolate them.
Two immediate ones I can think of:
- The yellow hue/sepia tone of any image coming out of ChatGPT
- People responding to text by starting with "Good Question!" or inserting hard-to-memorize-or-type unicode symbols like → into text where they obviously wouldn't have used that and have no history of using it.
It's not necessarily maliciousness or laziness, it could simply be enthusiasm paired with lack of experience.
I can't imagine the level of laziness or entitlement required for a student (or any developer) to blame their tools so quickly without conducting a thorough investigation.
Memory leaks and issues with the memory allocator take a months-long process to pin on the JVM...
In the early days (bug parade times), the bugs were a lot more common; nowadays, I'd say it'd be extreme naivete to consider the JVM the culprit from the get-go.
Any smart interviewer knows that you have to look at actual code of the contributions to confirm it was actually accepted and that it was a non-trivial change (e.g. not updating punctuation in the README or something).
In my experience this is where the PR-spammers fall apart in interviews. When they proudly tell you they’re a contributor to a dozen popular projects and you ask for direct links to their contributions, they start coming up with excuses for why they can’t find them or their story changes.
There are of course lazy interviewers who will see the resume line about having contributed to popular projects and take it as strong signal without second guessing. That’s what these people are counting on.
I’ll bet there are probably also people trying to farm accounts with plausible histories for things like anonymous supply chain attacks.
My guess is it's mostly people from countries with a culture that rewards shameless behavior.
An example I have of this is from high school where there were guys that were utterly shameless in asking girls for sex. The thing is it worked for them. Regardless of how many people turned them down they got enough of a hit rate it was an effective strategy. Simply put there was no other social mechanism that provided enough disincentive to stop them.
And to take the position as devil's advocate, why should they feel shame? Shame is typically a moral construct of the culture you're raised in and what to be ashamed for can vary widely.
For example, if you're raised in the culture of Abrahamic religions it's very likely you're told to be ashamed for being gay. Whereas a non-religious upbringing is more likely to say why the hell would you be ashamed for being gay.
TL;DR, shame is not an effective mechanism on the internet because you're dealing with far too many cultures that have wildly different views on shame, and any particular viewpoint on shame is apt to have millions to billions of people that don't believe the same.
I am seeing the doomed future of AI math: just received another set theory paper by a set theory amateur with an AI workflow and an interest in the continuum hypothesis.
At first glance, the paper looks polished and advanced. It is beautifully typeset and contains many correct definitions and theorems, many of which I recognize from my own published work and in work by people I know to be expert. Between those correct bits, however, are sprinkled whole passages of claims and results with new technical jargon. One can't really tell at first, but upon looking into it, it seems to be meaningless nonsense. The author has evidently hoodwinked himself.
We are all going to be suffering under this kind of garbage, which is not easily recognizable for the slop it is without effort. It is our regrettable fate.
I think this is interesting too. I've noticed the difference in dating/hook-up contexts. The people you're talking about also end up getting laid more but that group also has a very large intersection with sex pests and other shitty people. The thing they have in common though is that they just don't care what other people think about them. That leads some of them to be successful if they are otherwise good people... or to become borderline or actual criminals if not. I find it fascinating actually, like how does this difference come about and can it actually be changed or is it something we get early in life or from the genetic lottery.
The grift culture has changed that completely, now students face a lot of pressure to spam out PRs just to show they have contributed something.
i.e. imagine a change that is literally a small diff, that is easy to describe as a mere user and not a developer, and that requires quite a lot of deep understanding merely to submit as a PR (build the project! run the tests! write the template for the PR!).
Really a lot of this stuff ends up being a kind of failure mode of various projects that we all fall into at some point where "config" is in the code and what could be a simple change and test requires a lot of friction.
Obviously not all submissions are going to be like this but I think I've tried a few little ones like that where I would normally just leave whatever annoyance I have alone but think "hey maybe it's 10 min faff with AI and a PR".
The structure of the project incentives kind of creates this. Increasing cost to contribution is a valid strategy of course, but from a holistic project point of view it is not always a good one especially assuming you are not dealing with adversarial contributors but only slightly incompetent ones.
it's easy to not have shame when you have no skin in the game... this is similar to how narcissists think so highly of themselves, it's never their fault
And this is one half of why I think
"Bad AI drivers will be [..] ridiculed in public."
isn't a good clause. The other is that ridiculing others, no matter what, is just not decent behavior. Putting it as a rule in your policy document only makes it worse.
Shaming people for violating valid social norms is absolutely decent behaviour. It is the primary mechanism we have to establish social norms. When people do bad things that are harmful to the rest of society, shaming them is society's first-level corrective response to get them to stop doing bad things. If people continue to violate norms, then society's higher levels of corrective behaviour can involve things like establishing laws and fining or imprisoning people, but you don't want to start with that level of response. Although putting these LLM spammers in jail does sound awfully enticing to me in a petty way, it's probably not the most constructive way to handle the problem.
The fact that shamelessness is taking over in some cultures is another problem altogether, and I don't know how you deal with that. Certain cultures have completely abdicated the ability to influence people's behaviour socially without resorting to heavy-handed intervention, and on the internet, this becomes everyone in the world's problem. I guess the answer is probably cultivation of spaces with strict moderation to bar shameless people from participating. The problem could be mitigated to some degree if a Github-like entity outright banned these people from their platform so they could not continue to harass open-source maintainers, but there is no platform like that. It unfortunately takes a lot of unrewarding work to maintain a curated social environment on the internet.
To demand public humiliation doesn’t just put you on the same level as our medieval ancestors, who responded to violations of social norms with the pillory - it’s actually even worse: the contemporary internet pillory never forgets.
Shame is also not the same thing as "public humiliation". They are publicly humiliating themselves. Pointing out that what they publicly chose to do themselves is bad is in no way the same as coercing them into being humiliated, which is what "public humiliation as a medieval punishment" entails. For example, the medieval practice of dragging a woman through the streets nude in order to humiliate her is indeed abhorrent, but you can hardly complain if you march through the streets nude of your own volition, against other people's desires, and are then publicly shamed for it.
What negative experience do you think should instead be created for people breaking these rules?
A permanent public internet pillory isn’t just useless against the worst offenders, who are shameless anyway. It’s also permanently damaging to those who are still learning societal norms.
The Ghostty AI policy lacks any nuance in this regard. No consideration for the age or experience of the offender. No consideration for how serious the offense actually was.
Tit for tat
What is written in the Ghostty AI policy lacks any nuance or generosity. It's more like a Grim Trigger strategy than Tit for Tat.
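For anyone not familiar with the two terms, here is a minimal sketch of the difference, assuming the usual iterated prisoner's dilemma definitions ("C" = cooperate, "D" = defect). The function names and the sample opponent are made up for illustration and have nothing to do with the Ghostty policy text itself.

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then mirror the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def grim_trigger(opponent_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in opponent_history else "C"


def responses(strategy, opponent_moves):
    """Return the strategy's move in each round, given the opponent's moves so far."""
    return [strategy(opponent_moves[:i]) for i in range(len(opponent_moves) + 1)]


# Hypothetical opponent: one slip-up in round 2, good behaviour afterwards.
opponent = ["C", "D", "C", "C"]

print(responses(tit_for_tat, opponent))   # ['C', 'C', 'D', 'C', 'C']
print(responses(grim_trigger, opponent))  # ['C', 'C', 'D', 'D', 'D']
```

Tit for Tat punishes once and then forgives as soon as the other side cooperates again; Grim Trigger never does.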
It is an understanding of these dynamics that led us to our current system of law: punitive justice, but forgiveness through pardons.
"This person contributed to a lot of projects" heuristic for "they're a good and passionate developer" means people will increasingly game this using low-quality submissions. This has been happening for years already.
Of course, AI just added kerosene to the fire, but re-read the policy and omit AI and it still makes sense!
A long term fix for this is to remove the incentive. Paradoxically, AI might help here because this can so trivially be gamed that it's obvious it's no longer any kind of signal.
The economics of it have changed, human nature hasn’t. Before 2023 (?) people also submitted garbage PRs just to be able to add “contributed to X” to their CV. It’s just become a lot cheaper.
No, this problem isn't fundamentally about AI, it's about "social" structure of Github and incentives it creates (fame, employment).
Covers most of the points I'm sure many of us have experienced here while developing with AI. Most importantly, AI generated code does not substitute human thinking, testing, and clean up/rewrite.
On that last point, whenever I've gotten Codex to generate a substantial feature, usually I've had to rewrite a lot of the code to make it more compact even if it is correct. Adding indirection where it does not make sense is a big issue I've noticed LLMs make.
However:
> AI generated code does not substitute human thinking, testing, and clean up/rewrite.
Isn't that the end goal of these tools and companies producing them?
According to the marketing[1], the tools are already "smarter than people in many ways". If that is the case, what are these "ways", and why should we trust a human to do a better job at them? If these "ways" keep expanding, which most proponents of this technology believe will happen, then the end state is that the tools are smarter than people at everything, and we shouldn't trust humans to do anything.
Now, clearly, we're not there yet, but where the line is drawn today is extremely fuzzy, and mostly based on opinion. The wildly different narratives around this tech certainly don't help.
It seems to be the goal. But they seem very far away from achieving that goal.
One thing you should probably account for is that most of the proponents of these technologies are trying to sell you something. Doesn't mean that there is no value to these tools, but the wild claims about the capabilities of the tools are just that.
You may hire a genius developer that's better than you at everything, and you still won't trust them blindly with work you are responsible for. In fact, the smarter they are than you, the less trusting you can afford to be.
Finally an AI policy I can agree with :) jokes aside, it might sound a bit too aggressive but it's also true that some people really have no shame in overloading you with AI generated shit. You need to protect your attention as much as you can, it's becoming the new currency.
One of the theorized reasons for junk AI submissions is reputation boosting. So maybe this will help.
And I think it will help with people who just bought into the AI hype and are proceeding without much thought. Cluelessness can look a lot like shamelessness at first.
Presumably people want this for some kind of prestige, so they can put it on their CV (contributed to ghostty/submitted security issue to curl).
If we change that equation so they think "wait, if I do this, then when employers Google me they'll see a blog post saying I'm incompetent", the calculation changes from neutral/positive (depending on whether their slop gets accepted) to negative/positive.
Seems like it's addressing the incentives to me.
I would expect this is entirely uncontroversial and the AI qualifier redundant.
This sort of request may have made sense in the old days, but as the quality of generated code rapidly increases, the necessity of human intervention decreases.
If you don't check it yourself, then you're going to own whatever your tooling misses, and also own the amount of others' time you waste through what the project has decided to categorize as negligence, which will make you look worse than if you simply made an honest mistake.
“Ultimately, I want to see full session transcripts, but we don't have enough tool support for that broadly.”
I have a side project, git-prompt-story, to attach Claude Code sessions as git notes on GitHub. Though it is not that simple to do automatically (e.g. I need to redact credentials).
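To give a rough idea of what the redaction step involves, here is a minimal sketch. It is not how git-prompt-story works (I am not assuming anything about its internals); the patterns, the `redact` helper and the placeholder text are all illustrative, and a real tool would need a far longer pattern list.

```python
import re

# Illustrative token shapes only, not an exhaustive list.
TOKEN_SHAPES = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),  # shape of a GitHub classic personal access token
    re.compile(r"AKIA[0-9A-Z]{16}"),     # shape of an AWS access key ID
]

# Assignments such as API_KEY=..., token: ..., password = ...
ASSIGNMENT = re.compile(
    r"\b(api[_-]?key|token|password|secret)(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


def redact(text):
    """Replace anything that looks like a credential with a placeholder."""
    for shape in TOKEN_SHAPES:
        text = shape.sub("[REDACTED]", text)
    # Keep the key name and separator, redact only the value.
    return ASSIGNMENT.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", text)


print(redact("export API_KEY=abc123"))  # export API_KEY=[REDACTED]
```

The redacted transcript could then be attached to a commit with something like `git notes add -F transcript.txt <commit>`. Pattern matching like this will always miss some secrets, which is part of why it is hard to fully automate.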
My latest attempt at this is https://github.com/simonw/claude-code-transcripts which produces output like this: https://gisthost.github.io/?c75bf4d827ea4ee3c325625d24c6cd86...
At a minimum it will help you to be skeptical at specific parts of the diff so you can look at those more closely in your review. But it can inform test scenarios etc.
I think AI could help with that.
https://simonw.substack.com/p/a-new-way-to-extract-detailed-...
Our evolving AI policy is in the same spirit as ghostty's, with more detail to address specific failure modes we've experienced: https://zulip.readthedocs.io/en/latest/contributing/contribu...
What's the reason for this?
Media is the most likely thing I'd consider using AI for as part of a contribution to an open source project.
My code would be hand crafted by me. Any AI use would be similar to Google use: a way to search for examples and explanations if I'm unclear on something. Said examples and explanations would then be read, and after I understand what is going on I'd write my code.
Any documentation I contributed would also be hand written. However, if I wanted to include a diagram in that documentation I might give AI a try. It can't be worse than my zero talent attempts to make something in OmniGraffle or worse a photograph of my attempt to draw a nice diagram on paper.
I'd have expected this to be the least concerning use of AI.
I find this distinction between media and text/code so interesting. To me it sounds like they think "text and code" are free from the controversy surrounding AI-generated media.
But judging from how AI companies grabbed all the art, images, videos, and audio they could get their hands on to train their LLMs it's naive to think that they didn't do the same with text and code.
It really isn't, don't you recall the "protests" against Microsoft starting to use repositories hosted at GitHub for training their own coding models? Lots of articles and sentiments everywhere at the time.
Seems to have died down though, probably because most developers at this point seem to use LLMs in some capacity. Some just use it as a search engine replacement, others to compose snippets they copy-paste, and others don't type code at all anymore, just instructions, then review the result.
I'm guessing Ghostty feels like if they'd ban generated text/code, they'd block almost all potential contributors. Not sure I agree with that personally, but I'm guessing that's their perspective.
I've written a fair amount of open source code. On anything like a per-capita basis, I'm way above median in terms of what I've contributed (without consent) to the training of these tools. I'm also specifically "in the crosshairs" in terms of work loss from automation of software development.
I don't find it hard to convince myself that I have moral authority to think about the usage of gen AI for writing code.
The same is not true for digital art.
There, the contribution-without-consent, aka theft, (I could frame it differently when I was the victim, but here I can't) is entirely from people other than me. The current and future damages won't be borne by me.
I've written _a lot_ of open source MIT licensed code, and I'm on the fence about that being part of the training data. I've published it as much for other people to use for learning purposes as I did for fun.
I also build and sell closed source commercial JavaScript packages, and more than likely those have ended up in the training data as well. Obviously without consent. So this is why I feel strongly about making this separation between code and media; from my perspective it all has the same problem.
But now we have some kind of electronic brains that can also generate code, not at the level of the best human brains out there but good enough for most projects. And they are quicker and cheaper than humans, for sure.
So maybe in the end this will reduce the need for human contributions to opensource projects.
I just know that as a solo developer AI coding agents enable me to tackle projects I didn't think about even starting before.
Sanitization practices of AI are bad too.
Let me be clear: there's nothing wrong with AI in your workflow, just be an active participant in your code. Code is not meant to be one and done.
You will go through iteration after iteration, security fix after fix. This is how development is.
You'd need those kinds of sharp rules to compete against unhinged (or drunken) AI drivers and that's unfortunate. But at the same time, letting people DoS maintainers' time at essentially no cost is not an option either.
I can see this being a problem. I read a thread here a few weeks ago where someone was called out on submitting an AI slop article they wrote with all the usual tells. They finally admitted it but said something to the effect they reviewed it and stood behind every line.
The problem with AI writing is at least some people appear incapable of critically reviewing it. Writing something yourself eliminates this problem because it forces you to pick your words (there could be other problems of course).
So the AI-blind will still submit slop under the policy but believe themselves to have reviewed it and “stand behind” it.
The fact that some people will straight up lie after submitting you a PR with lots of _that type_ of comment in the middle of the code is baffling!
Maybe a bit unlikely, but still an issue no one is really considering.
There has been a single ruling (I think) that AI generated code is uncopyrightable. There has been at least one affirmative fair use ruling. Both of these are from the lower courts. I'm still of the opinion that generative AI is not fair use because it's clearly substitutive.
However, at this point, the economic impact of trying to untangle this mess would be so large, the courts likely won't do anything about it. You and I don't get to infringe on copyright; Microsoft, Facebook and Google sure do though.
Licenses determine whether a copyright lawsuit is likely to happen. Most entities won't sue you if they expect to lose. But they are not the only deciding factor. Some entities never sue, which means you don't have to follow their licenses.
It’s illegal to commit fraud or murder, but if you do it and suffer no consequences (perhaps you even get pardoned by your president), does it matter that it was illegal? Laws are as strong as their enforcement.
For a less grim and more explicit example, Apple has a policy on the iOS App Store that apps may not use notifications to advertise. Yet it happens all the time, especially from big players like Uber. Apple themselves have done it too. So if you’re a bad actor and disrespectful to your users, does it matter that the rule exists?
You may become a big enough target only when it's too late to undo it.
But I've never had the gall to let my AI agent do stuff on other people's projects without my direct oversight.
on a related note: i wish we could agree on rebranding the current LLM-driven never-gonna-AGI generation of "AI" to something else… now i'm thinking of when i read the in-game lore definition for VI (Virtual Intelligence) back when i played mass effect 1 ;)
I might copy it for my company.
Another idea is to simply promote the donation of AI credits instead of output tokens. It would be better to donate credits, not outputs, because people already working on the project would be better at prompting and steering AI outputs.
In an ideal world sure, but I've seen the entire gamut from amateurs making surprising work to experts whose prompt history looks like a comedy of errors and gotchas. There's some "skill" I can't quite put my finger on when it comes to the way you must speak to an LLM vs another dev. There's more monkey-paw involved in the LLM process, in the sense that you get what you want, but do you want what you'll get?
I work in a team of 5 great professionals; there hasn't been a single instance since Copilot launched in 2022 where anybody, in any modification, did not take full responsibility for what's been committed.
I know we all use it, to different extent and usage, but the quality of what's produced hasn't dipped a single bit, I'd even argue it has improved because LLMs can find answers easier in complex codebases. We started putting `_vendor` directories with our main external dependencies as git subtrees, and it's super useful to find information about those directly in their source code and tests.
It's really that simple. If your teammates are producing slop, that's a human and professional problem and these people should be fired. If you use the tool correctly, it can help you a lot with finding information and connecting dots.
Any person with a brain can clearly see the huge benefit of these tools, but also the great danger of not reviewing their output line by line and forfeiting the constant work of resolving design tensions.
Of course, open source is a different beast. The people committing may not be professionals and have no real stakes so they have little to lose by producing slop whereas maintainers are already stretched in their time and attention.
Agree, slop isn't "the tool is so easy to use I can't review the code I'm producing", slop is the symptom of "I don't care how it's done, as long as it looks correct", and that's been a problem before LLMs too, the difference is how quickly you reach the "slop" state now, not that you have to gate your codebase and reject shit code.
As always, most problems in "software programming" aren't about software or programming but everything around it, including communication and workflows. If your workflow allows people to not be responsible for what they produce, and if it allows shitty code to get into production, then that's on you and your team, not on the tools that the individuals use.
> Ghostty is written with plenty of AI assistance, and many maintainers embrace AI tools as a productive tool in their workflow. As a project, we welcome AI as a tool!
> Our reason for the strict AI policy is not due to an anti-AI stance, but instead due to the number of highly unqualified people using AI. It's the people, not the tools, that are the problem.
Basically don't write slop and if you want to contribute as an outsider, ensure your contribution actually is valid and works.
Surely they are incapable of producing slop because they are just so much smarter than everyone else so the rules shouldn't apply to them, surely.
Moreover this policy is strictly unenforceable because good AI use is indistinguishable from good manual coding. And sometimes even the reverse. I don't believe in coding policies where maintainers need to spot if AI is used or not. I believe in experienced maintainers that are able to tell if a change looks sensible or not.
There's some sensible, easily-judged-by-a-human rules in here. I like the spirit of it and it's well written (I assume by Mitchell, not Claude, given the brevity).
https://raw.githubusercontent.com/ghostty-org/ghostty/refs/h...
Actually, trying to load that previous platform on my phone makes it worse for readability, seems there is ~10% less width and not as efficient use of vertical space. Together with both being unformatted markdown, I think the raw GitHub URL seems to render better on mobile, at least small ones like my mini.
EDIT: I'm getting downvoted with no feedback, which is fine I guess, so I am just going to share some more colour on my opinion in case I am being misunderstood
What I meant by analogous to phishing is that the intent behind the work is likely personal reward rather than a desire to contribute. I was thinking they want their name on the contributors list, they want the credit, they want something and they don't want to put effort into it.
Do they deserve to be ridiculed for doing that? Maybe. However, I like to think humans deserve kindness sometimes. It's normal to want something, and I agree that it is not okay to be selfish and lazy about it (ignoring contribution rules and whatnot), so at minimum I think respect applies.
Some people are ignorant, naive, and are still maturing and growing. Bullying them may not help (though it could) and mockery is a form of aggression.
I think some true false positives will fall into that category and pay the price for those who are truly ill intended.
Lastly, to ridicule is to care. To hate or attack requires caring about it. It requires effort, energy, and time from the maintainers. I think this just adds more waste.
Maybe those wordings are there just to 'scare' people away and maintainers won't bother engaging, though I find it is just compounding the amount of garbage at this point and nobody benefits from it.
Anyways, would appreciate some feedback from those of you that seem to think otherwise.
Thanks!
PS: What I meant with ghostty should "ghost" them was this: https://en.wikipedia.org/wiki/Shadow_banning
Interesting requirement! Feels a bit like asking someone what IDE they used.
There shouldn't be that meaningful of a difference between the different tools/providers unless you'd consistently see a few underperform and would choose to ban those or something.
The other rules feel like they might discourage AI use due to more boilerplate needed (though I assume the people using AI might make the AI fill out some of it), though I can understand why a project might want to have those sorts of disclosures and control. That said, the rules themselves feel quite reasonable!