32 points by KMJ-007 9 hours ago | 2 comments
  • slopinthebag 8 hours ago
    Once again, a promising article is completely ruined by blatant AI-isms. I could only make it to the end of the pointer section before I couldn't take it anymore.

    There is a real crisis of AI slop getting posted to this forum. I don't even bother reading posted articles related to AI anymore, but now it's seemingly extending to everything.

    • schwede 2 hours ago
      Why can’t people add a disclaimer when their text was written or edited with AI? That is my completely unrealistic wish for the world today…
      • slopinthebag 41 minutes ago
        We are going to need some proof of cleanliness for art, writing, etc. At some point it will become impossible to tell if you are interacting with a bot, and that is when the internet dies.
    • wvenable 8 hours ago
      I didn't notice until "this turns a lighting update from “noticeable stutter” into “instant.”"
      • slopinthebag 7 hours ago
        "This means reading light data requires zero locks. No mutex, no spinlock, nothing." threw up red flags, and by the time I got to "But here’s the insight" I couldn't go any further.
    • softskunk 7 hours ago
      i would genuinely rather read the rough draft before it got turned into this slop. it would be messier, maybe, but it’d have actual human insight and direction.
    • user3939382 7 hours ago
      I’ve been trying to put my finger on what gives it away. It’s that there are boolean trees underneath each text decision it makes. While humans are obviously capable of that, our conclusions and framing are more continuous. This is why, for example, you constantly see LLMs defining things by what they’re not.
      • oidar 6 hours ago
        The main issue is that SOTA LLMs can only reason one way — forwards — and can't go back and revise a prior statement. Being able to would remove a whole lot of "it's not this, it's that" and "the big takeaway here is" and so on. Those kinds of ideas are typically at the beginning of a human writer's output structure. An LLM can't go back and edit the first paragraph, because it has to reason (for whatever that means for an LLM) its way through it to get to the big idea of the paragraph/structure. I haven't played with diffusion text models enough to know whether they're a remedy for that kind of output.

        When LLMs are good enough to not be detectable, what happens then? They aren't that far away atm, so it's only a matter of time until _everyone_ is assumed to be an LLM.

      • dvt 7 hours ago
        LLMs are trained to be precise (and more specifically: semantically precise), especially in the fine-tuning phase. An LLM merely trained on the full corpus of human production would surely sound more "human," but it would also probably be pretty useless. So that's why idioms like "it's not X, it's Y" are a dead giveaway; but really, any structure that tries to "guide" our salience is a dead giveaway. Here's a random paragraph from Knuth's Literate Programming†[1]:

        > For example, a system program is often designed to gather statistics about its own operation, but such statistics-gathering is pointless unless someone is actually going to use the results. In order to make the instrumentation code optional, I include the word ‘stat’ just before any special code for statistics, and ‘tats’ just after such code; and I tell WEAVE to regard stat and tats as if they were begin and end. But stat and tats are actually simple macros.

        I encourage you to read that paragraph a few times. Even if you have no idea what the context is, you get that there's a point, that there's something else to dig into, that the author might be being a bit cheeky. In other words, you can feel Knuth behind the ink. Philosophers would call this intentionality[2]. LLMs produce the polar opposite of garden path sentences[3] (and, imo, that's why they're so easy to spot).

        † I specifically picked something technical to illustrate that even in domains where semantic precision is of utmost importance, human expression is still just that: human.

        [1] https://www.cs.tufts.edu/~nr/cs257/archive/literate-programm...

        [2] https://plato.stanford.edu/archives/fall2014/entries/intenti...

        [3] https://www.sjsu.edu/writingcenter/docs/handouts/Garden%20Pa...

        • slopinthebag 6 hours ago
          I fed ChatGPT part of the introductory paragraph of that paper and instructed it to continue. It gave me this:

          > In this light, the activity of programming becomes less an act of issuing commands and more an act of communication. The computer is, after all, an obedient but uncomprehending servant; it will execute whatever precise instructions we provide. But our colleagues, our future selves, and the broader community of readers are not so easily satisfied. They demand clarity, intention, and narrative. A program, then, should be structured not merely for execution, but for reading—its logic unfolding in a manner that mirrors the way one might naturally explain the solution to another person.

          > This shift in perspective has practical consequences. When we write with exposition in mind, we are compelled to confront ambiguities that might otherwise remain hidden. Vague assumptions must be made explicit; convoluted steps must be reorganized into simpler, more digestible ideas. The discipline of explaining a program often leads to improvements in the program itself, since confusion in the prose is frequently a symptom of confusion in the underlying design.

          Fascinating technology. I would not have been able to immediately tell this was AI generated. So these models can, in some cases, produce text that doesn't immediately set off alarm bells. As an avid reader and writer I'm not really sure what to make of it. I don't want to consume AI-generated art or literature because it's completely beside the point, but in the future will we even be able to tell? How do we even know if anyone around us is real? Could they just be sufficiently advanced LLMs, fooling us? Am I the only human in the matrix?

          • svat an hour ago
            Whether or not one can tell it's AI generated, one can certainly tell it's not Knuth. For one thing, the writing style is very different. Not that there haven't been other great computer scientists who may have written in this style, but it definitely doesn't sound like Knuth (there is no "being a bit cheeky" for sure). But also, the ideas it has produced are simply more of the same; kind of a natural progression / what a typical grad student may write. Knuth always has something new and surprising to say in every paragraph, he wouldn't harp on a theme like this. Also he mixes “levels” between very high and very low, while the paragraphs you quoted stay at a uniform level.

            But of course, writing as good as a grad student's (just not the particular delightful idiosyncratic style of a specific person) is still very impressive and amazing, so your concerns are still valid.

          • dvt an hour ago
            Knuth's paper is 100% in the training set, so while your result is decent, it's undoubtedly tainted. But let's look at the output anyway:

            > ...the activity of programming becomes less an act of issuing commands and more an act of communication

            directly contradicts:

            > The computer is, after all, an obedient but uncomprehending servant...

            If programming becomes "an act of communication" how can an "uncomprehending servant" make heads or tails of what I'm telling it? And I get that the two aren't exactly contradictory here, but this implied claim would certainly require at least a throwaway sentence.

            > When we write with exposition in mind, we are compelled to confront ambiguities that might otherwise remain hidden.

            I'm being a bit nitpicky, but this is a non-sequitur; we aren't necessarily required to confront any ambiguities, even when we're trying very hard to be expository. The counter-examples I'm thinking of at the moment are contrived (amnesia, my four-year-old niece trying to tell a story, etc.) but I mainly take issue with the word "compelled."

            > its logic unfolding in a manner that mirrors the way one might naturally explain the solution to another person

            People explain things in all kinds of weird, circuitous ways, so while this (as with all AI-generated output) seems interesting prima facie, it's actually kind of a dud when you think about it for more than 5 seconds.

            > Vague assumptions must be made explicit; convoluted steps must be reorganized into simpler, more digestible ideas.

            and

            > ...ambiguities that might otherwise remain hidden...

            directly contradicts:

            > ...whatever precise instructions we provide

            It seems like the computer can somehow encode "ambiguities" and "vague assumptions" as "precise instructions." How, exactly, does that work? (Spoiler: it doesn't, it's gibberish.) On the other hand, if you read Knuth's first few paragraphs, he clearly has a point in mind; I'd even say he's being a bit wordy, but never equivocating. In fact, by the fourth paragraph, he's almost giddy with excitement.

    • tills13 7 hours ago
      such a shame too, because I'm genuinely interested, but I cannot bring myself to care about AI-generated slop
  • gurkin 7 hours ago
    > Its shit too, but our kind of shit.

    Unfathomably based.