Why the AI Renaissance Keeps Not Arriving (jamesfbaker.substack.com)

18 points by jamesbaker1 10 hours ago | 8 comments

MarkusQ 9 hours ago
About half the people I discuss this with seem to think LLMs are great and don't see any problems with their output. The other half seem to get nothing but an endless stream of plausibly shaped rubbish.
It doesn't seem to depend on what model they use or how they prompt it. In code, there seems to be a loose correlation with testing styles; I've previously noticed that some people write tests to show that the code works as intended, and others try to write tests to show that it can't fail in ways that were unintended. But that correlation is weak.
I'm really puzzled by this.
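To make the two testing styles concrete, here is a minimal sketch in Python. The function parse_port and its inputs are made up for illustration and aren't taken from any real project; the first test only confirms the intended behavior, while the second goes looking for inputs that should be rejected.

```python
def parse_port(text: str) -> int:
    """Hypothetical function under test: parse a TCP port number."""
    port = int(text.strip())
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def test_works_as_intended():
    # Style 1: confirm the expected behavior on a typical input.
    assert parse_port("8080") == 8080


def test_cannot_fail_in_unintended_ways():
    # Style 2: probe inputs nobody asked for and require a clean rejection.
    for bad in ["", "0", "65536", "-1", "80.5", "http"]:
        try:
            parse_port(bad)
        except ValueError:
            continue
        raise AssertionError(f"accepted bad input: {bad!r}")
```

Generated code that only ever meets the first kind of test can look fine for a long time; the second kind is where "plausibly shaped" output would be more likely to get caught.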
- gonzalohm 8 hours ago
  It's also not whether the code passes tests or not. Sometimes AI does the thing I ask for, and it works, but it's so different from the way I would do it or it's too verbose so I just scrap it and code it by hand.
  I mostly use it for boilerplate code nowadays. Anything more complicated and it takes me more effort to review the output than to just code it slowly
- saulpw 7 hours ago
  Does it seem like it bifurcates based on the person? I've had both experiences myself, sometimes with the same model and within ~days of each other on seemingly similar tasks. It's almost impossible to deny sometimes that actual intelligence is being expressed (and could not be regurgitated intelligence from some random internet page), but then I see firsthand the eye-rolling "intelligence-shaped" output from something else and wonder where I went wrong.
  It kinda feels like Michigan J Claude sometimes.
  - MarkusQ 6 hours ago
    > It's almost impossible to deny sometimes that actual intelligence is being expressed (and could not be regurgitated intelligence from some random internet page)
    But how is it impossible? Or rather, how is it possible to distinguish actual intelligence from some internet page--not "some random internet page", but some very well selected, timely and topical internet page?
    Carly Simon's "Killing Me Softly" describes a similar experience, decades before LLMs. It's amazingly easy to feel like someone is understanding you when they are just pattern matching on a common shared experience.
    This seems so likely that I have a hard time understanding why some people think it's impossible.
    saulpw 6 hours ago
    I know; if I hadn't experienced it myself, I would have a hard time believing it. Even if we grant that it's "merely" as you say, that it can interpret my loose and often false or misleading rhetoric to identify my question, trawl through thousands of lines of code to determine the core problem, select some very specific and relevant internet page, and then translate its solution fresh into my own problem domain; that is still, in my mind at least, "an expression of intelligence", even if it is not truly intelligent or understanding.
    FWIW this is my own mental gymnastics around the Chinese Room; at some point the question "where is the understanding?" is moot, because if the Chinese Room reliably delivers context-specific correct results in unique contexts, then what more do we require of intelligence? I have to admit that sometimes it does better than I can do, and I've been called intelligent by intelligent people my whole life.
    (BTW Killing Me Softly wasn't written/sung/anything by Carly Simon, but composed by Charles Fox with lyrics by Norman Gimbel, in collaboration with Lori Lieberman [after she was inspired by a Don McLean performance in late 1971].)
svachalek 7 hours ago
The patterns it talks about are true, but imo largely because of model defaults rather than model range. That is, if you prompt for a pelican on a bicycle, each model is going to have a small variety of ways to do it (the default). But if you add more details and requirements to the prompt, there are many, many different ways even a basic model can solve it (the range).
The additional prompting doesn't necessarily need to tell the model specifically what to fix or do better, sometimes it's just enough to break it out of its habit. Asking for a smart looking, middle aged pelican on a sporty red bike isn't making the problem easier but does break it out of its boring defaults.
I wouldn't go so far as to say PEBKAC but the good news is there's still a role for humans in the loop.
skybrian 8 hours ago
Doing things the same, standard way each time is often good. Most of the code we write is obvious. We notice it when it's a weird quirk rather than just being the most straightforward way.
I wonder how long it will take to fix the quirks?
bediger4000 3 hours ago
There's a couple of cyberpunk stories in this one.
Legend2440 9 hours ago
What do you mean not arriving? It's already here. AI models are awesome and I use them every day, as does every other software dev I know.
xg15 7 hours ago
Funny if, after all those trillions of tokens and billions of dollars and grandiose arguments about consciousness, the end of humanity or the next stage of evolution, it turns out it really is just a markov chain after all...
andrewstuart 7 hours ago
It’s just another anti-AI rant, this time focused on the fixable problem of AI tending towards the average.
This will get sorted out in time and in the meantime, instruct the LLM away from averaged answers. It’s not a problem.
If you do not want images that have that instantly recognizable AI style to them (busy, perfect, brightly colored, pseudo-realistic, outside what humans would usually make), then tell the AI that such images fail to meet your goals and steer it in other directions.
- dbalatero 7 hours ago
  How do you see it getting sorted out over time?
  - andrewstuart 7 hours ago
    The LLM developers have a huge amount of money and development resources, and in time they’ll do things that make the outcomes better, which means being less monotonous and more creative and original.
    You can do it today just by prodding the LLM correctly so it’s not hard for the LLM devs to do in an organized way.
    The future of AI/LLMs will include the development of distinct personalities and behavior characteristics instead of the generic interface to an AI brain that is an averaged monotone.