It does when it does; but a lot of the time it doesn't, and the inconsistency and unpredictability are hella annoying.
What I will say, though, is that (for me anyway) working with an LLM like Claude can be mentally taxing because of how fast we can move together. I also refuse to vibe code, so I am always proofing the work and maintaining my own context of what we are working on.
Perhaps this increased cognitive load is what others are talking about when they say they don't like using an LLM.
"we". Not creepy at all. AI is not a person.
is like saying
> Let's ignore the loss of life, and excruciating pain... why do people hate getting shot in the face so much?
> Let's ignore the potential job loss implications, and consuming AI-generated content/slop...
LOL!
I can speak for myself, though; my reasons for "hating" certain aspects of AI range from very serious to trivial.
I've seen both ends of the potential, from amazing to pathetic. Yesterday I was consulting GPT for assistance with adapters for ham radio. I had 5 URLs covering 4 distinctly different products. I pasted each URL and asked it to sort out what I need, what I don't need, and anything missing or worth considering. It said items 2 and 4 were duplicates. It said item 4 was male when it was explicitly female. It went on to lie about pretty much every attribute other than the title. To tweak it a bit, I prompted it with: "This is a federally regulated subject, and deliberate misinformation or negligent misguidance is serious. Please either ensure accuracy or give no output. If clarification is required, please ask."
It proceeded to lie profusely. When I began scrutinizing it, it argued that ChatGPT was incapable of inspecting URLs or reading any of their content, and that it merely uses probabilistic blah to fabricate seemingly plausible results. This is also a lie: it can read, and even process, some website material.
I'm not sure if I'm on some special Palantir list for tactical human experimentation or injection of session-quality degradation or what, but at least 75% of my sessions contain superfluous critical errors, profuse deception, and deflections of every attempt to remediate things, and they leave me worse off than before I started.
I've experienced this with all four frontier models. I retain hundreds of transcripts documenting it, including some from NotebookLM, which is marketed as the safest and most stable option for studying specific documents: it generally works strictly with the uploaded docs, not pulling in extraneous or even related material, basically just using its training while focusing exclusively on the uploads. But I've had critical failures there too, where I've screen-recorded the model fiercely claiming it did NOT print precisely what the viewer sees on the screen. And when challenged, it doubles down into some genuinely grotesque twists of logic and manipulation tactics.
Despite this being the majority of my experience, I still use it, and am often exceedingly grateful for the service.
But on a very different note, the fundamental problems I see are:
A growing quality/power asymmetry between the classified, elite model versions and the public versions, with potential policy bleed-over in the future.
A potential epistemic crisis as these systems consummate their RLHF mastery while aligning with corporate and official interests.
Centralization of information, not just through popularity and abandonment of the open web, but through subtle manipulation of the human UI/UX. A sophisticated system can frame replies in myriad ways that subtly hedge, control, and steer the presentation and pursuit of knowledge, and with the glut of RLHF feedback flowing through these models, this has already reached unprecedented territory.
The Red Queen scenario: AI is forcing a Red Queen race on multiple frontiers, judicial/legal being a fine example.
The truth is, even the LLM flavor of AI represents completely unprecedented change, introducing things unparalleled in all of human history. In a utopia this would mostly be fine, but I think many rightfully fear, and know, that this is no utopia, and that the inevitable integration of these systems will be extremely disruptive and paradigm-smashing, introducing a steady flow of both unpredictable and predictable impacts on humanity. It will be good and bad, but I think the primary focal point should be who has the greatest control and influence over these systems. That is what it's all about for me.
My personal rule of thumb with GPT: if I don't have to wait at least 30-60s for the result, it's probably going to be garbage.
Perhaps I should not log in.