I think it's valuable for developers to understand more of their code rather than less, but why bother precisely labeling how much they understand? If they're happy with the passing tests, comfortable making it public, and others want to contribute, then that's what matters.
Although I don't endorse it for most use cases, I like the distinction. There are some things I vibe code that are useful in the moment, but I always throw them out.
Meanwhile, articles such as this one (https://news.ycombinator.com/item?id=45412263) get spammed with people yelling for “examples”, which is exactly what's presented here.
One thing AI has changed for me (beyond, you know, everything) is making me really depressed about the state of the HN community. It seems HN itself hasn't been immune to the severe social-media toxicity pandemic going around… it's merely doing better than the gen-pop alternatives.
I can still share discontent with the state of things. It’s a thing, you know.
> one problem, of course, was that the tests were entirely bullshit.
> whenever it would start getting lazy or confused i'd restart the session. often a failure would "demoralize" it or being sloppy once would cause sloppiness to stick. in particular i've noticed that being overwhelmed causes it to approach problems in a messy "throw anything against the wall" way. sometimes if too many newly un-skipped tests are causing failures and it got "demoralized", i'd just skip them again and have it focus on one or two at a time. with less noisy output and a permission to "really dig into what happened" (and often an explicit suggestion to remove things from the example until it no longer breaks), it would usually find the root cause.
> we got to majority of passing tests but there were a bunch of bugs it just couldn't solve and would walk in circles. the code was also getting quite complicated. it seemed like a mess of different ideas and special cases thrown in. moreover, i knew it didn't fully work because i had new test cases that just would refuse to pass
> i tried to let it do that and it just failed miserably anyway, breaking tests and not being able to recover.
> it struggled at first but i reminded it to look at other emitters.
> at one point, newly added tests kept confusing claude. it would completely get stuck on them, failure after failure, fixing one thing and breaking other thing, trying to turn off those tests or change the expected outputs (despite me telling it to never do that!) and in general seeming aimless and distraught (in the descriptive sense).
> i had to git reset --hard multiple times in this mess.
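For concreteness, the skip-then-un-skip-a-few workflow quoted above might look roughly like the sketch below. This is a hypothetical illustration assuming a pytest suite; `emit`, the test names, and the expected outputs are stand-ins, since the article doesn't show its actual test harness.

    import pytest

    def emit(source: str) -> str:
        # Stand-in for the project's real emitter; just enough for the passing test below.
        return source.replace("let ", "const ", 1) + ";"

    # Park the noisy failing tests behind one marker so the agent sees a quiet run,
    # then re-enable them one or two at a time.
    parked = pytest.mark.skip(reason="parked: re-enable one or two at a time")

    @parked
    def test_nested_destructuring():
        # A hypothetical case that kept failing; kept out of the output for now.
        assert emit("let [a, [b]] = xs") == "const a = xs[0];\nconst b = xs[1][0];"

    def test_simple_binding():
        # A small case that already passes, giving the agent a clean baseline.
        assert emit("let a = 1") == "const a = 1;"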
—
tl;dr: seems like a very nice experience, yeah
I've experienced the same: contributing a very large PR to a golang project without knowing the language or having worked with it before. I could do it because I was able to talk through abstractions, was willing to go down dead ends (a 1:3 ratio for every meaningful feature), and was OK with the fun of redoing things. Once you are able to do this, you literally become a 10X engineer when measured by working output.
If this process of trying and discarding 2 out of every 3 approaches sounds distasteful, you will not truly discover the deeper joys of working with the SOTA LLMs.
> maybe my project is a toy (it is) or you think it's poor quality (it's not) but i'm able to do things in minutes that used to take days
Just consider what this will be like as it gets better. Remember, we've had working coding agents for less than a year.
People are excited not because it's fun to fight with the damn things. It's not! We're excited despite that!
I remember my old Nokia 6682. It was an early smartphone that ran S60 and I had a screen reader, basic IM client, and a few other apps including a web browser installed. It was awkward to use. It was frustrating. The connection was dog-slow. And it was cool as hell--a little slice of the future in my pocket.
I remember my Windows 98 (first edition) machine with JAWS for Windows 3.2, trying to use the early web before they had the concept of the virtual cursor. Before any of this accessibility stuff was at all standardized, when we got what we could by scraping screen buffers and injecting into other processes. And damn, it was so cool. So obviously the future that we put up with the jank.
Here we are again. Annoying to use? Often! Remarkable? Hell yeah!
Except this time we have people combing through every sentence to extract only the negative ones from a 40kb success story--I do at least hope you used an LLM for this.
See how jet aircraft have progressed in the last 50 years (they haven't).
—
This thing you are talking about doesn’t understand its output.
I would love for this to progress, however.
— If you are interested in human-level AI, don't work on LLMs
https://www.newsweek.com/nw-ai/ai-impact-interview-yann-lecu...
In longer form, I do think people should feel bad for writing like this. Like you said, capitalization has genuine utility, and without it the blog post is a nightmare to read.
Case in point: even Sam doesn't write blog posts like this https://blog.samaltman.com
perhaps readers expecting you to conform to a particular style of written presentation are the entitled ones
what disrespect toward the author/typesetter, to presume to dictate what is ok for tweets versus longreads :P