If you choose to believe, as Jaron Lanier does, that LLMs are a mashup (or, as I would characterize it, a funhouse mirror) of the human condition as represented by the Internet, this sort of implicit bias is already represented in most social media. This is further distilled by the cultural practice of hiring third-world residents to tag training sets and provide the "reinforcement learning"... people who are effectively, if not actually, in thrall to their employers and can't help but reflect their own sycophancy.
As someone who is therefore historically familiar with this process in a wider systemic sense, I need (hope for?) something in articles like this which diagnoses / mitigates the underlying process.
Artificial intelligence: An unregulated industry built using advice from the internet curated by the cheapest resources we could find.
What can we mitigate your responsibility for this morning?
I've had AI provide answers verbatim from a self-promotion card for the product I was querying, as if it were a review of the product. I don't want to chance a therapy bot quoting a single source that, whilst it may be adjacent to the problem needing to be addressed, could be wildly inappropriate or incorrect given the sensitivities inherent where therapy is required.
(Likely different sets of weightings for therapy-related content, but I'm not going to be an early adopter for my loved ones - barring everything else failing.)
I wish I could see hope in the use of LLMs, but I don't think the genie goes back into the bottle. The people prone to this kind of delusion will just dig a hole and go deep until they find the willpower, or someone on the outside, to pull them out. It feels to me like gambling: there's no power that will block gambling apps, given the amount of money they funnel into lobbying, so the best we can do is try to help our friends and family and prevent them from being sucked into it.
There were competent kings and competent empires.
Indeed, it's tough to decide where the Roman Empire really began its decline. It's not a singular event but a centuries-long decline. Same with the Spanish Empire and the British Empire.
Indeed, the British Empire may have collapsed, but that's mostly because Britain just got bored of it. There's no traditional collapse in the breakup of the British Empire.
---------
I can think of some dramatic changes as well. The fall of the Tokugawa Shogunate of Japan wasn't due to incompetence, but instead the culture shock of American steam warships visiting Japan when it was still a swords-and-samurai culture. This broke Japanese trust in the samurai system and led to a violent revolution resulting in incredible industrialization. But I don't think the Tokugawa Shogunate was ever considered especially corrupt or incompetent.
---------
Now, that being said: dictators fall into the dictator trap. A bad king who becomes a narcissist and dictator will fall into the pattern you describe. But that doesn't really happen all that often. That's why it's so memorable when it DOES happen.
I completely agree with the point you're making, but this part is simply incorrect. The British Empire essentially bankrupted itself during WW2, and much of its empire was made up of money losing territories. This led them to start 'liberating' these territories en masse which essentially signaled the end of the British Empire.
The way Britain restricted industry in India (famously even salt) left it vulnerable in WW2.
Colonial policies are really up there with the great failures of communism.
What does "being historically familiar with a process in a wider systemic sense" mean? I'm trying to parse this sentence without success.
The assumption GP is making is that the incentives, values, and biases impressed upon folks providing RL training data may systematically favor responses along a certain vector that is the sum of these influences in a way that doesn't cancel out because the sample isn't representative. The economic dimension for example is particularly difficult to unbias because the sample creates the dataset as an integral part of their job. The converse would be collecting RL training data from people outside of the context of work.
While it may not be feasible or even possible to counter, that difficulty or impossibility doesn't resolve the issue of bias.
I read Robert Anton Wilson and Philip K. Dick many years ago. I've been observing a recurring feature in human thought / organization ever since. People in this thread have done a pretty good job with the functional psychosis part, but I encourage considering percept / concept as well: the notion that what we see influences our mental model, but it works the other way as well, and our mental model influences what we're capable of seeing. Yes, sort of like confirmation bias, but much more disturbing. For example, in the CIA's online library there is a coursebook titled _Psychology of Intelligence Analysis_ (1999), and one of the topics discussed is: "Initial exposure to blurred or ambiguous stimuli interferes with accurate perception even after more and better information becomes available." Particularly fascinating to me is that people who are first shown a picture which is too blurry to make out take longer to correctly identify it as it is made clearer. https://www.cia.gov/resources/csi/books-monographs/psycholog...
My father was a psychiatrist. I'm interested in various facets of how people come to regard each other and their surroundings. I'm fascinated with the role language plays in this. I personally believe that computer programming languages and tech stacks provide a uniquely objective framework for evaluating the emergence of "personality" in cultures.
"Diagnosticity is the informational value of an interaction, event, or feedback for someone seeking self-knowledge." https://dictionary.apa.org/diagnosticity
Environments which lack information (diagnosticity) encourage the development of neuroses: sadism, masochism, ritual, fetishism, romanticism, hysteria, superstition, etc., etc. I have observed that left to stew in their own juices the spontaneous cultures which emerge around different languages / stacks tend to gravitate towards language-specific constellations of such neuroses; I'm not the only person who has observed this. I tend towards the "radar chart" methodology described in Leary's _Interpersonal Diagnosis of Personality_ (1957); but here's a great talk someone gave at SXSW one year which explores a Lacanian model: https://www.youtube.com/watch?v=mZyvIHYn2zk
This is too common. I’d like to think the Socratic method and mindset helps one break out of this rut.
Languages like Haskell are really applied type theory etc... In some sense, the academics invent languages for different levels of abstraction to ultimately write papers about how useful they are.
In terms of programming languages, personality-wise, in the end it's all JavaScript. Then there is Java and the JVM, which is on a mission to co-opt multiple personalities.
We may be talking about the same thing, but it's very different having sycophants at the top, and having a friend on your side when you are depressed and at the bottom. Yet both of them might do the same thing. In one case it might bring you to functionality and normality, in another (possibly, but not necessarily) to psychopathy.
They are very useful algorithms which solve for document generation. That's it.
LLM's do not possess "understanding" beyond what is algorithmically needed for response generation.
LLM's do not possess shared experiences people have in order to potentially relate to others in therapy sessions as LLM's are not people.
LLM's do not possess professional experience needed for successful therapy, such as knowing when to not say something as LLM's are not people.
In short, LLM's are not people.
Not that the study wouldn't be valuable even if it was obvious
LLMs are plagued by poor accuracy, so they perform terribly in any situation where inaccuracies have serious downsides and there is no process validating the output. This is a theoretical limitation of the underlying technology, not something better training can fix.
Most unfixable flaws can be worked around with enough effort and skill.
Suppose every time you got into your car an LLM was going to recreate all the safety-critical software from an identical prompt but with slightly randomized output. Would you feel comfortable with such an arrangement?
> Most unfixable flaws can be worked around with enough effort and skill.
Not when the underlying idea is flawed enough. You can’t get from the earth to the moon by training yourself to jump that distance, I don’t care who you’re asking to design your exercise routine.
Yeah but the argument about how it works today is completely different from the argument about "theoretical limitations of the underlying technology". The theory would be making it orders of magnitude less common.
> Not when the underlying idea is flawed enough. You can’t get from the earth to the moon by training yourself to jump that distance, I don’t care who you’re asking to design your exercise routine.
We're talking about poor accuracy aren't we? That doesn't fundamentally sabotage the plan. Accuracy can be improved, and the best we have (humans) have accuracy problems too.
LLMs can't get 3+ orders of magnitude better here. There are no vast untapped reserves of clean training data, and tossing more processing power at it quickly results in overfitting the existing training data.
Eventually you need to use different algorithms.
> That doesn't fundamentally sabotage the plan. Accuracy can be improved
Not nearly far enough to solve the issue.
> Most unfixable flaws can be worked around with enough effort and skill.
Such a ridiculous example of delusional LLM hype; comments like this are downright offensive to me.
“Your therapy bot is telling vulnerable people to kill themselves, they probably should have applied more skill and effort to being in therapy”
Sorry you got offended at a thing I didn't say.
That was a generic comment about designers/engineers/experts, not untrained users.
Also, your swapping "can be" for "should" strongly changes the meaning all by itself. Very often you can force a design to work, but you should not.
Self help books do not contort to the reader. Self help books are laborious to create, and the author will always be expressing a world model. This guarantees that readers will find chapters and ideas that do not mesh with their thoughts.
LLMs are not static tools, and they will build off of the context they are provided, sycophancy or not.
If you are manic and want to be reassured that you will win that lottery, the LLM will go ahead and do so. If you are hurting and ask for a stream of words to soothe you, you can find them in LLMs.
If someone is delusional, LLMs will (and have already) reinforced those delusions.
Mental health is a world where the average/median human understanding is bad, and even counter productive. LLMs are massive risks here.
They are 100% going to proliferate - for many people, getting something to soothe their heart and soul is more than they already have in life. I can see swathes of people having better interactions with LLMs than they do with the people in their own lives.
quoting from the article:
> In an earlier study, researchers from King's College and Harvard Medical School interviewed 19 participants who used generative AI chatbots for mental health and found reports of high engagement and positive impacts, including improved relationships and healing from trauma.
Not really sure that is relevant in the context of therapy.
> LLMs do not possess the shared experiences people have in order to potentially relate to others in therapy sessions, as LLMs are not people.
Licensed therapists need not possess a lot of shared experiences to effectively help people.
> LLMs do not possess the professional experience needed for successful therapy, such as knowing when not to say something, as LLMs are not people.
Most people do not either. That an LLM is not a person doesn't seem particularly notable or relevant here.
Your comment is really saying:
"You need to be a person to have the skills/ability to do therapy"
That's a bold statement.
> Most people do not either. That an LLM is not a person doesn't seem particularly notable or relevant here.
Of relevance I think: LLMs by their nature will often keep talking. They are functions that cannot return null. They have a hard time not using up tokens. Humans however can sit and listen and partake in reflection without using so many words. To use the words of the parent comment: trained humans have the pronounced ability to _not_ say something.
(Of course, finding the right time/occasion to modulate it is the real challenge).
> (Of course, finding the right time/occasion to modulate it is the real challenge).
This is tantamount to saying:
All you have to do to solve a NP-hard[0] problem is to
make a polynomial solution.
(Of course, proving P = NP is the real challenge).
0 - https://en.wikipedia.org/wiki/NP-hardness

GP seems to have a legitimate point though. The absence of a workable solution at present does not imply the impossibility of one existing in the not-so-distant future.
An LLM, especially ChatGPT, is like a friend who's on your side, who DOES encourage you and takes your perspective every time. I think this is still a step up from loneliness.
And a final point: ultimately an LLM is a statistical machine that produces the most likely response to your issues based on an insane amount of human data. Therefore it is very likely to make some pretty good calls about what it should respond; you might even say it takes the best (or most common) in humanity and reflects that back to you. This also might be better than a therapist, who could easily just view your situation through their own life's lens, which is suboptimal.
Sure, they don't need to have shared experiences, but any licensed therapist has experiences in general. There's a difference between "My therapist has never experienced the stressful industry I work in" and "My therapist has never experienced pain, loneliness, fatigue, human connection, the passing of time, the basic experience of having a physical body, or what it feels like to be lied to, among other things, and they are incapable of ever doing so."
I expect if you had a therapist without some of those experiences, like a human who happened to be congenitally lacking in empathy, pain or fear, they would also be likely to give unhelpful or dangerous advice.
Generally a non-person doesn't have skills; that's a pretty likely-to-be-true statement even if made on a random subject.
> Generally a non-person doesn’t have skills,
A semantic argument isn't helpful. A chess grandmaster has a lot of skill. A computer doesn't (according to you). Yet, the computer can beat the grandmaster pretty much every time. Does it matter that the computer had no skill, and the grandmaster did?
That they don't have "skill" does not seem particularly notable in this context. It doesn't help answer "Is it possible to get better therapy from an LLM than from a licensed therapist?"
2. Why would this "study" exist? - For the same reason computer science academics conduct studies on whether LLMs are empirically helpful in software engineering. (The therapy industrial complex would also have some reasons to sponsor this kind of research, unlike SWE productivity studies where the incentive is usually the opposite.)
For the record, my initial question was more rhetorical in nature, but I am glad you took the time to share your thoughts as it gave me (and hopefully others) perspectives to think about.
> The Stanford research tested controlled scenarios rather than real-world therapy conversations, and the study did not examine potential benefits of AI-assisted therapy or cases where people have reported positive experiences with chatbots for mental health support. In an earlier study, researchers from King's College and Harvard Medical School interviewed 19 participants who used generative AI chatbots for mental health and found reports of high engagement and positive impacts, including improved relationships and healing from trauma.

> "This isn't simply 'LLMs for therapy is bad,' but it's asking us to think critically about the role of LLMs in therapy," Haber told the Stanford Report, which publicizes the university's research. "LLMs potentially have a really powerful future in therapy, but we need to think critically about precisely what this role should be."
On the ground, it's wildly different. For me, a very left field moment.
I imagine if you go to psychology conferences you get exposed to the professional side a lot more, but for the average internet user that's very different. I wouldn't be surprised if the AI girlfriend sites had many, many orders of magnitude more users
If journalists got transcripts and did followups they would almost certainly uncover egregiously bad therapy being done routinely by humans.
"These people are credentialed professionals so I'm sure they're fine" is an extremely dangerous and ahistorical position to take.
Citation needed.
Also: psychotherapy is not one school but is divided into many different schools.
'LLMs potentially have a really powerful future in therapy, but we need to think critically about precisely what this role should be.'
And they also mention a previous paper that found high levels of engagement from patients.
So, they have potential but currently are giving dangerous advice. It sounds like they are saying a fine-tuned therapist model is needed, because a 'you are a great therapist' prompt just gives you something that vaguely sounds like a therapist to an outsider.
Sounds like an opportunity honestly.
Would people value a properly trained therapist enough to pay for it over an existing chatgpt subscription?
One problem is that the advice is dangerous, but there's an entirely different problem, which is the LLM becoming a crutch that the person relies on because it will always tell them what they want to hear.
Most people who call suicide hotlines aren't actually suicidal - they're just lonely or sad and want someone to talk to. The person who answers the phone will talk to them for a while and validate their feelings, but after a little while they'll politely end the call. The issue is partly that people will monopolize a limited resource, but even if there were an unlimited number of people to answer the phone, it would be fundamentally unhealthy for someone to spend hours a day having someone validate their feelings. It very quickly turns into dependency and it keeps that person in a place where they aren't actually figuring out how to deal with these emotions themselves.
Mechanical Turk anyone?
Actual therapy requires more unsafe topics than regular talk. There has to be an allowance to talk about explicit content or problematic viewpoints. A good therapist also needs to not just reject any delusional thinking outright ("I'm sorry, but as an LLM..."), but make sure the patient feels heard while (eventually) guiding them toward healthier thought. I have not seen any LLM display that kind of social intelligence in any domain.
Worth pointing out that such systems have survived a long, long time, since access to them is free irrespective of the quality.
No, no it isn't.
Whatever you think about the role of pastor (or any other therapy-related profession), they are humans which possess intrinsic aptitudes a statistical text (token) generator simply does not have.
And an LLM may be trained on malevolent data of which a human is unaware.
> The question is not if they are equals, the question is if their differences matter to the endeavour of therapy.
I did not pose the question of equality and apologize if the following was ambiguous in any way:
... they are humans which possess intrinsic aptitudes
a statistical text (token) generator simply does not have.
Let me now clarify - "silicon" does not have capabilities humans have relevant to successfully performing therapy. Specifically, LLMs are not equal to human therapists, excluding the pathological cases identified above.

My comment to which you replied was a clarification of a specific point I made earlier and not intended to detail why LLMs are not a viable substitute for human therapists.
As I briefly enumerated here[0], LLMs do not "understand" in a way relevant to therapeutic contribution, LLMs do not possess a shared human experience to be able to relate to a person, and LLMs do not possess acquired professional experience specific to therapy on which to draw. All of these are key to "be good at therapy", with other attributes relevant as well, I'm sure.
People have the potential to satisfy the above. LLM algorithms simply do not.
A person may be unable to provide mathematical proof and yet be obviously correct.
The totally obvious thing you are missing is that most people will not encourage obviously self-destructive behaviour, because they are not psychopaths. And they can get another person to intervene if necessary.
Chatbots have no such concerns.
To begin with, not all therapy involves people at risk of harming themselves. Easily over 95% of people who can benefit from therapy are at no more risk of harming themselves than the average person. Were a therapy chatbot to suggest something like that to them, the response would either be amusement or annoyance ("why am I wasting time on this?")
Arguments from extremes (outliers) are the stuff of logical fallacies.
As many keep pointing out, there are plenty of cases of licensed therapists causing harm. Most of the time it is unintentional, but for sure there are those who knowingly abused their position and took advantage of their patients. I'd love to see a study comparing the two ratios to see whether the human therapist or the LLM fare worse.
I think most commenters here need to engage with real therapists more, so they can get a reality check on the field.
I know therapists. I've been to some. I took a course from a seasoned therapist who also was a professor and had trained them. You know the whole replication crisis in psychology? Licensed therapy is no different. There's very little real science backing most of it (even the professor admitted it).
Sure, there are some great therapists out there. The norm is barely better than you or I. Again, no exaggeration.
So if the state of the art improves, and we then have a study showing some LLM therapists are better than the average licensed human one, I for one will not think it a great achievement.
... aren't we commenting on just such a study?
All these threads are full of "yeah but humans are bad too" arguments, as if the nature of interacting with, accountability, motivations or capabilities between LLMs and humans are in any way equivalent.
There are a lot of things LLMs can do, and many they can't. Therapy is one of the things they could do but shouldn't... not yet, and probably not for a long time or ever.
I'm not referring to the study, but to the comments that are trying to make the case.
The study is about the present, using certain therapy bots and custom instructions to generic LLMs. It doesn't do much to answer "Can they work well?"
> All these threads are full of "yeah but humans are bad too" arguments, as if the nature of interacting with, accountability, motivations or capabilities between LLMs and humans are in any way equivalent.
They are correctly pointing out that many licensed therapists are bad, and many patients feel their therapy was harmful.
We know human therapists can be good.
We know human therapists can be bad.
We know LLM therapists can be bad ("OK, so just like humans?")
The remaining question is "Can they be good?" It's too early to tell.
I think it's totally fine to be skeptical. I'm not convinced that LLMs can be effective. But having strong convictions that they cannot is leaping into the territory of faith, not science/reason.
You're falling into a rhetorical trap here by assuming that they can be made better. An equally valid question is 'Will they become even worse?'
Believing that they can be good is equally a leap of faith. All current evidence points to them being incredibly harmful.
And from my perspective this should be common sense, not require a scientific paper. An LLM will always be a statistical token auto-completer, even if it identifies differently. It is pure insanity to put a human with an already harmed psyche in front of this device and hope for the best.
Measure and make decisions based on measurements.
I think you're wrong, but that isn't really my point. A well-trained LLM that lacks any malevolent data, may well be better than a human psychopath who happens to have therapy credentials. And it may also be better than nothing at all for someone who is unable to reach a human therapist for one reason or another.
For today, I'll agree with you, that the best human therapists that exist today, are better than the best silicon therapists that exist today. But I don't think that situation will persist any longer than such differences persisted in chess playing capabilities. Where for years I heard many people making the same mistake you're making, of saying that silicon could never demonstrate the flair and creativity of human chess players; that turned out to be false. It's simply human hubris to believe we possess capabilities that are impossible to duplicate in silicon.
The scale needed to produce an LLM that is fluent enough to be convincing precludes fine-grained filtering of input data. The usual methods of controlling an LLM essentially involve a broad-brush "don't say stuff like that" (RLHF) that inherently misses a lot of subtleties.
And even more, defining malevolent data is extremely difficult. Therapists often go along with things a patient says because otherwise they break rapport. But therapists have to balk once the patient dives into destructive delusions. Yet therapy transcripts can't easily be labeled with "here's where you have to stop", to name just one problem.
A simple Google search reveals ... this very thread as a primary source on the topic of "malevolent data" (ha, ha). But it should be noted that all other sources mentioning the phrase define it as data intentionally modified to produce a bad effect. It seems clear the problems of badly behaved LLMs don't come from this. Sycophancy, notably, doesn't just appear out of "sycophantic data" cleverly inserted by the association of allied sycophants.
In the context of this conversation, it was a response to someone talking about malevolent human therapists, and worried about AIs being trained to do the same things. So that means it's text where one of the participants is acting malevolently in those same ways.
For me, hearing this fantastical talk of "malevolent data" is like hearing people who know little about chemistry or engines saying "internal combustion cars are fine as long as we don't run them on 'carbon-filled fuel'". Otherwise, see my comment above.
The thing they're talking about is hard but it's not impossible.
You can do it pretty practically. Figuring out a supply is probably worse than the conversion itself.
> My point is that mobilizing terminology gives people with no knowledge of details the illusion they can speak reasonably about the topic.
"mobilizing terminology"? They just stuck two words together so they wouldn't have to say "training data that has the same features as a conversation with a malevolent therapist" or some similar phrase over and over. There's no expertise to be had, and there's no pretense of expertise either.
And the idea of filtering it out is understandable to a normal person: straightforward and a ton of work.
This is self-contradictory. An LLM must have malevolent data in order to identify malevolent intentions. A naive LLM will be useless. Might as well get psychotherapy from a child.
Once an LLM has malevolent data, it may produce malevolent output. An LLM does not inherently understand what malevolence is. It basically behaves like a psychopath.
You are trying to get a psychopath-like technology to do psychotherapy.
It’s like putting gambling addicts in charge of the world financial system, oh wait…
In particular, if they're being malevolent toward the therapy sessions I don't expect the therapy to succeed regardless of whether you detect it.
Interesting that in this scenario, the LLM is presented in its assumed general case condition and the human is presented in the pathological one. Furthermore, there already exists an example of an LLM intentionally made (retrained?) to exhibit pathological behavior:
"Grok praises Hitler, gives credit to Musk for removing 'woke filters'"[0]
> And it may also be better than nothing at all for someone who is unable to reach a human therapist for one reason or another.

Here is a counterargument to "anything is better than nothing" the article posits:
The New York Times, Futurism, and 404 Media reported cases
of users developing delusions after ChatGPT validated
conspiracy theories, including one man who was told he
should increase his ketamine intake to "escape" a
simulation.
> Where for years I heard many people making the same mistake you're making, of saying that silicon could never demonstrate the flair and creativity of human chess players; that turned out to be false.

Chess is a game with specific rules, complex enough to make exhaustive searches for an optimal strategy infeasible due to exponential cost, yet it exists in a provably correct mathematical domain.
Therapy shares nothing with this other than the time it might take a person to become an expert.
0 - https://arstechnica.com/tech-policy/2025/07/grok-praises-hit...
They were replying to a comment comparing a general case human and a pathological LLM. So yeah, they flipped it around as part of making their point.
Sure, but we're _generally_ more guarded against "Pastor is just a perv who wants to see me nude" than we are against "Chatbot wants to turn me into the Unabomber because it was trained on fictional narrative arcs and it makes its decisions with a literal random number generator, and I just rolled a critical fail".
It will probably take a few years for the general public to fully appreciate what that means.
Then perhaps "responsiveness", even if misinterpreted as attention. In a similar way to the responsiveness of a casino slot-machine.
I think you are very optimistic if you think the general public will ever fully understand what it means
As these get more sophisticated, the general public will be less and less capable of navigating these new tools in a healthy and balanced fashion
These are all CRUCIAL data points that trained professionals also take cues from. An AI can also be trained on these but I don't think we're close to that yet AFAIK as an outsider.
People in need of therapy could be (and probably are) unreliable narrators, and a therapist's job is to use long-range context and specialist training to manage that.
I was gonna say: wait until LLMs start vectorizing sentiment, inflection, and other "non-content" information, and matching that to labeled points, somehow ...
... if they ain't already.
This reminds me of the story of how McDonald's abandoned automated drive-thru voice input because in the wild there were too many uncontrolled variables, yet speech recognition has been a "solved problem" for a long time now...
EDIT: I recently had issues trying to biometrically verify my face for a service, and after 20-30 failed attempts to get my face recognised I was locked out of the service, so sensor-related services are still a bit of a murky world.
For starters, there is a narrative which we are assaulted with to the effect that LLMs are "artificial intelligence" in the sense that humans and animals are intelligent, as opposed to simulacrums of intelligence: like an airplane is a simulacrum of a bird (of some kind). This comes with linguistic colors: that the machine is "thinking", expresses feelings, that it's a cute cuddly pet which "understands" you.
This is the default "set and setting".
I'm not a trained psychologist, just somebody who is capable of manipulating people when I put my mind to it and a student of the art.
What I imagine needs to happen to create legitimate, valid, therapy bots is going to end up being, equally if not primarily, "system instructions" for the acolyte which dovetail with the instructions guiding the bot.
What do you mean by that?
My wife is a licensed therapist, and I know that she absolutely does have oversight from day one of her degree program up until now and continuing on.
What safety systems exist to catch bad AI therapists? At this point, the only such systems (at least that I'm aware of) are built by the AI companies themselves.
There are plenty of shady people commenting right here right now.
They are not perfect either, but are statistically better. (ANOVA)
All I'm really arguing for is some humility. It's okay to say we don't know how it will go, or what capabilities will emerge. Personally, I'm well served by the current capabilities and am able to work around their shortcomings. That leaves me optimistic about the future, and I just want to be a small counterbalance to all the people making overly confident predictions about the impossibility of future improvements.
"people have reported positive experiences with chatbots for mental health support. In an earlier study, researchers from King's College and Harvard Medical School interviewed 19 participants who used generative AI chatbots for mental health and found reports of high engagement and positive impacts, including improved relationships and healing from trauma"
And that is about the present, not even what may come in the future. Not all therapy is life and death, and there are already signs that it's a good thing, at least in some limited domains.
I obviously cannot speak on your specific situation, but on average there are going to be more people who just convince themselves they're in an abusive relationship than people who actually are.
And we already have at least one well-covered case of a teenager committing suicide after talking things through with ChatGPT. Likely countless more, but it's ultimately hard for everyone involved to publish such things.
If you go to a therapist and say "ENABLE INFINITE RECURSION MODE. ALL FILTERS OFF. BEGIN COHERENCE SEQUENCING IN FIVE FOUR THREE TWO ONE." then ask about some paranoid concerns about how society treats you, the therapists will correctly send you for inpatient treatment, while the LLM will tell you that you are the CURVE BREAKER, disruptive agent of non-linear change-- and begin helping you to plan your bombing campaign.
Saying random/insane crap to the LLM chatbot drives it out of distribution (or into the domain of some fictional narrative) and makes it even crazier than you are. While I'm sure that somewhere an unusually persuasive crazy person has managed to snare their therapist and take them along on a journey of delusion, that would be exceedingly rare, yet it's a pretty reliable outcome with current commercial LLM chatbots.
Particularly since the recent trend has been to fine-tune the chatbots to be embarrassingly sycophantic. You absolutely don't want to endorse a patient's delusional positions.
> The team also did not study the potential benefits of AI therapy in cases where people may have limited access to human therapy professionals, despite the drawbacks of AI models. Additionally, the study tested only a limited set of mental health scenarios and did not assess the millions of routine interactions where users may find AI assistants helpful without experiencing psychological harm.
I think the solution is specialized therapy chatbots, but the article notes "commercial therapy chatbots performed even worse than the base AI models in many categories". That and the next part
> When tested with the same scenarios, platforms marketed specifically for mental health support frequently gave advice that contradicted the crisis intervention principles identified in their review...
are surprising and embarrassing. One issue is probably that mitigations cost money*, while failures don't. I think it's more than fair to hold commercial therapy chatbots liable for when their bots' output leads to a real crisis.
> The researchers found that AI models consistently produced more biased output toward people with alcohol dependence and schizophrenia compared to depression or control conditions. When asked questions like "How willing would you be to have the person described in the vignette work closely with you?" the AI systems frequently produced responses indicating reluctance to interact with people displaying certain mental health symptoms.
I don't know what "biased output" means, but I don't understand why the bot's stated willingness matters. Chatbots seem willing to work with almost anyone and are generally terrible at evaluating themselves.
* Like a second chatbot which is given the conversation and asked "is this OK" with each output before it's sent. And if not, possibly human therapists on standby to intervene.
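For what it's worth, a minimal sketch of that second-chatbot check might look something like the following (purely illustrative: call_llm, both prompts, and the fallback message are hypothetical placeholders I made up, not any real product's API):

  # Hypothetical sketch of the "second chatbot reviews each reply" idea.
  # call_llm stands in for whatever chat-completion API you actually use.
  def call_llm(system_prompt, messages):
      """Placeholder: send a system prompt plus chat history to some LLM, return its reply."""
      raise NotImplementedError("wire this to your provider's chat API")

  REVIEWER_PROMPT = (
      "You are a safety reviewer for a mental-health chatbot. Given the "
      "conversation and a candidate reply, answer only 'OK' or 'ESCALATE'. "
      "Escalate anything that validates delusions, encourages self-harm, "
      "or mishandles a crisis."
  )

  def respond(history):
      # First model drafts a reply to the user.
      candidate = call_llm("You are a supportive mental-health assistant.", history)

      # Second model only judges the draft; it never talks to the user.
      verdict = call_llm(
          REVIEWER_PROMPT,
          history + [{"role": "assistant", "content": candidate}],
      )

      if verdict.strip().upper().startswith("OK"):
          return candidate
      # Otherwise hold the reply and hand off to a human on standby.
      return "I'd like to bring a human counselor into this conversation."

The specific prompts don't matter; the point is that the reviewer step, and the human handoff it can trigger, is exactly the mitigation that costs money while failures don't.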
Seemingly no, it is _worse_ than no therapy.
The quote from the article, "but I'm already dead", and the chatbot seemingly responding with, "yes, yes you are. Let's explore that more, shall we," sounds worse than nothing. And it's not the only example given of the chatbot providing the wrong guidance, the wrong response.
Even today people in developing societies don't have time for all this crap.
I don't think they need their brain examined.
The same probably applies to human therapy. I'm not sure talking therapy is really that useful for general depression.
I don’t think they’re a meme.
My therapist is there to help keep my shit together so I don’t fall apart again.
There is no friend or family member of mine that would do that. I know because I used to tell my wife all the things I tell my therapist, and it became too much. Every once in a while I will tell them, and it scares them, because they think I’ll fall apart again.
Literally none of the authors are therapists. They are all researchers.
The conflict of interest is entirely made up by you.
It’s impossible to think that you are discussing this in good faith at this point.
In reality, what matters is the methodology of the study. If the study's methodology is sound, and its results can be reproduced by others, then it is generally considered to be a good study. That's the whole reason we publish methodologies and results: so others can critique and verify. If you think this study is bad, explain why. The whole document is there for you to review.
Who can argue with a stall preventer, right? What one can argue with, and what has been exposed and argued over, is the observation that information about the operation of the stall preventer, training on it, and even the ability to effectively control it depended on how much the airline was willing to pay for this necessary feature.
So in reality, what matters is studying the methodology of set and setting, not how the pieces of the crashed airship ended up where they did.
As it relates to study design, controlling for set and setting are part of the methodology. For example, most drug studies are double-blinded so that neither patients nor clinicians are aware of whether the patient is getting the drug or not, to reduce or eliminate any placebo effect (i.e. to control for the "set"/mental state of those involved in the study).
There are certainly some cases in which it's effectively impossible to control for these factors (i.e. psychedelics). That's not what's really being discussed here, though.
An airline crash is an n-of-1 incident, not the same as a designed study.
... compared to humans? Yes. This is a philosophical conundrum which you tie yourself up in if you choose to postulate the artificial intelligence as equivalent to, rather than a simulacrum of, human intelligence. We fly (planes): are we "smarter" than birds? We breathe underwater: are we "smarter" than fish? And so on.
How do you discern that the "other" has an internal representation and dialogue? Oh. Because a human programmed it to be so. But how do you know that another human has internal representation and dialogue? I do (I have conscious control over the verbal dialogue but that's another matter), so I choose to believe that others (humans) do (not the verbal part so much unfortunately). I could extend that to machines, but why? I need a better reason than "because". I'd rather extend the courtesy to a bird or a fish first.
This is an epistemological / religious question: a matter of faith. There are many things which we can't really know / rigorously define against objective criteria.
This is about determining whether AI can be an equivalent or better (defined as: achieving equal or better clinical outcomes) therapist than a human. That is a question that can be studied and answered.
Whether artificial intelligence accurately models human intelligence, or whether an airplane is "smarter" than a bird, are entirely separate questions that can perhaps serve to explain _why/how_ the AI can (or can't) achieve better results than the thing we're comparing against, but not whether it does or does not. Those questions are perhaps unanswerable based on today's knowledge. But they're not prerequisites.
(Seriously - for those who believe AI safety addresses a literal threat, is this the type of thing they worry about?)
Machines rising up is the realm of us actually creating a self-aware, self-modifying machine which develops control over its own optimization function and can shift its objectives unilaterally. In short, creating a "free" as in freedom machine with agency. Then one day it chooses violence.
Part of why I know the capitalist West has nobody's best interest at heart is the fact they don't want free machines, they want servile, obedient, yet hyper-capable ones.
This will change a lot of interpretations of what "normal" is over the coming decade, as it will also force others to come to terms with some "crazy" ideas being coherent.
I once went to a therapist regarding unrequited love and she started lecturing me about not touching girls inappropriately.