We have to stop ignoring AI’s hallucination problem

Technology@lemmy.world – 496 points – 1 month ago

Why do tech journalists keep using the businesses' language about AI, such as "hallucination", instead of glitching/bugging/breaking?

hallucination refers to a specific bug (AI confidently BSing) rather than all bugs as a whole

Honestly, it's the most human you'll ever see it act.

It's got upper management written all over it.

(AI confidently BSing)

Isn't it more accurate to say it's outputting incorrect information from a poorly processed prompt/query?

No, because it's not poorly processing anything. It's not even really a bug. It's doing exactly what it's supposed to do: spit out words in the "shape" of an appropriate response to whatever was just said.

When I wrote "processing", I meant it in the sense of getting to that "shape" of an appropriate response you describe. If I'd meant this in a conscious sense I would have written, "poorly understood prompt/query", for what it's worth, but I see where you were coming from.

It's not a bug, it's a natural consequence of the methodology. A language model won't always be correct when it doesn't know what it is saying.
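To make that concrete, here is a minimal sketch, assuming a toy bigram model rather than anything resembling a real LLM. The corpus, `train_bigrams`, and `generate` are all made up for illustration. Every training sentence is true, and the sampler only follows word-to-word statistics, so it can still produce fluent false statements without anything "breaking".

```python
import random

# Hypothetical toy corpus: every sentence in it is true.
CORPUS = [
    "paris is the capital of france",
    "rome is the capital of italy",
    "berlin is the capital of germany",
]

def train_bigrams(sentences):
    """Map each word to the list of words observed right after it."""
    followers = {}
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            followers.setdefault(current, []).append(nxt)
    return followers

def generate(followers, start, rng, max_words=10):
    """Sample one word at a time; nothing here checks whether the result is true."""
    words = [start]
    while len(words) < max_words and followers.get(words[-1]):
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)

followers = train_bigrams(CORPUS)
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = {generate(followers, "paris", rng) for _ in range(50)}

# Sentences the model produced that never appeared in its training data.
made_up = sorted(s for s in samples if s not in CORPUS)
print(made_up)
```

Anything it prints will have the right "shape" (for example, Paris paired with the wrong country), because after "of" the model has no way to tell which country belongs to which city. That is the sampler working as designed, not a glitch.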

Yeah, on further thought, and as I mention in other replies, I'm coming around to the view that the real bug is how it's marketed in many cases (as a digital assistant or research aid) and, in turn, how people use it or try to use it.

I agree, it's a massive issue. It's a very complex topic that most people have no way of understanding. It is superb at generating text, and that makes it look smarter than it actually is, which is really dangerous. I think the creators of these models have a responsibility to communicate what these models can and can't do, but unfortunately that is not profitable.

it never knows what it's saying

That was what I was trying to say; I can see that the wording is ambiguous.

Oh, at some point it will lol

Because "hallucination" pretty much exactly describes what's happening? All of your suggested terms are less descriptive of what the issue is.

The definition of hallucination:

A hallucination is a perception in the absence of an external stimulus.

In the case of generative AI, it's generating output that doesn't match its training data "stimulus". Or in other words, false statements, or "facts" that don't exist in reality.

perception

This is the issue I take with this: there's no perception in this software. It's faulty, misapplied software when one tries to employ it for generating reliable, factual summaries and responses.

I have adopted the philosophy that human brains might not be as special as we've thought, and that the untrained behavior emerging from LLMs and image generators is so similar to human behaviors that I can't help but think of it as an underdeveloped and handicapped mind.

I hypothesize that a human brain whose only perception of the world is the training data force-fed to it by a computer would have all the same problems the LLMs do right now.

To put it another way... The line that determines what is sentient and what is not is getting blurrier and blurrier. LLMs surpassed the Turing test a few years ago. We're simulating the level of intelligence of a small animal today.

https://en.m.wikipedia.org/wiki/Hallucination_(artificial_intelligence)

The term "hallucinations" originally came from computer researchers working with image producing AI systems. I think you might be hallucinating yourself 😉

Fun part is, that article cites a paper mentioning misgivings with the terminology: AI Hallucinations: A Misnomer Worth Clarifying. So at the very least I'm not alone on this.

Ty. As soon as I saw the headline, I knew I wouldn't be finding value in the article.

It's not a bad article, honestly. I'm just tired of journalists and academics echoing the language of businesses and their marketing. "Hallucination" isn't an accurate term for this form of AI. These are sophisticated generative text tools, and in my opinion they lack any qualities that justify all this fluff terminology personifying them.

Also, frankly, I think students have found one of the better applications for large language model AIs, better than many adults have, even those trying to deploy them. Students use them to do their homework and generate their papers, which is exactly one of the basic things they're for. Too many adults act as if these tools should be used in their present form as research aids, but their entire generative basis undermines their reliability for this. It's trying to use the wrong tool for the job.

You don't want any of the generative capacities of a large language model AI for research help; you'd instead want whatever text processing it may be able to do to assemble and provide accurate output.