They prove that no finite amount of training data is enough to extrapolate an adversarially constructed discontinuous function. It's something akin to the no free lunch (NFL) theorem.
No one uses NFL to "prove" that LLMs can't learn to be good optimizers, because it equally "proves" that people can't be good optimizers, yet we manage somehow; so the theorem is irrelevant in practice.
This is a fallacy of proving too much.
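To make the diagonalization concrete, here's a toy sketch (my own construction, not the paper's): for any learner trained on finitely many points, an adversary can define a target function that matches the training data exactly yet disagrees with the learner on the very next query. The `memorizing_learner` here is just a stand-in for any deterministic predictor.

```python
# Toy illustration (my construction, not the paper's): no finite sample
# can rule out an adversarially chosen discontinuous target function.

def memorizing_learner(train):
    """A learner that answers from its training data, else guesses 0."""
    table = dict(train)
    return lambda x: table.get(x, 0)

def adversarial_target(train, learner):
    """Define f to agree with the training data but flip the learner elsewhere."""
    table = dict(train)
    def f(x):
        if x in table:
            return table[x]
        return 1 - learner(x)  # discontinuous by design: always wrong off-sample
    return f

train = [(0, 0), (1, 1), (2, 0)]
h = memorizing_learner(train)
f = adversarial_target(train, h)

assert all(h(x) == f(x) for x, _ in train)  # perfect on the training set
assert h(3) != f(3)                         # wrong on any unseen input
```

The same construction goes through for any learner, which is why the result says nothing about typical, non-adversarial targets.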
>Submitted on 22 Jan 2024 (v1), last revised 13 Feb 2025 (this version, v2)
LLMs, or transformers, merely extract signals from human text and build a "contextualized" predictor over a long sequence of words, weighted by the information content of each token (technically, attention), then generate sentences that way, one token at a time into new sequences.
But the bigger problem is that even humans are subject to hallucination. We call it being delusional, or being drugged. So it is inevitable from first principles.
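For anyone who wants the mechanism spelled out, here is a minimal numpy sketch of what "contextualized predictor" means: scaled dot-product attention builds a context-aware representation, and the next token is predicted from it, one at a time. All dimensions and weights are random placeholders for illustration; a real model learns them.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 50, 16
E = rng.normal(size=(vocab, d))          # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
W_out = rng.normal(size=(d, vocab))      # projection back to the vocabulary

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def next_token(tokens):
    X = E[tokens]                         # (seq, d)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(d))     # attention weights over the sequence
    ctx = A @ V                           # contextualized representations
    logits = ctx[-1] @ W_out              # predict from the last position
    return int(np.argmax(logits))         # greedy: most likely next token

seq = [3, 17, 42]
for _ in range(5):                        # generate one token at a time
    seq.append(next_token(seq))
print(seq)
```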
I don't think the paper is saying hallucination is limited to LLMs. The decoupling that needs to happen is the subconscious notion that LLMs are computers.
Computers are predictable and deterministic (within a person's design), but LLMs are unreliable and unpredictable. If the general populace assumes LLMs are as trustworthy as a calculator, we're in for a bad time.
I hope there's a better way to prove that than unrealistic stress tests described in absolutes, but honesty needs to come first on all sides.
Humans hallucinate, in the LLM sense, all the time. Did that sign really say that? Nope, I just extrapolated from the first three letters. In the Cambrian Explosion article on HN this morning, I thought the first line said that the earth was desolate. The second line didn't match up with that idea, so I read it again, and the first line said the opposite of what I thought. I particularly hallucinate things into emails from people at work that I disagree with, so much so that I've learned to wait until the next day to reply, and usually I find that they didn't say what I thought they said.
From that abstract it doesn't sound like they allowed for the possibility that the LLM could be trained to say "I don't know" for some things.
My intuition on this is like training a classifier on four classes: dog, cat, cow, and IDK. It feels intuitive to us but is really hard to do in practice.

In the classifier case, we leverage a subset of data to train the model to give correct answers on unseen data. If we want the model to generalize to unseen data, we need it to call unseen dog-like things a dog. If not, then all unseen dogs would be IDK.

Learning that boundary of "known vs unknown" is very hard. Done poorly, you get a model that cannot abstract to anything that is not in the dataset, and that ability to abstract is a huge part of what makes these models so impressive.

I'm sure there is more to it than this, but it does not surprise me that it is an unsolved problem.
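A quick sketch of the tension (toy numbers, my own illustration): "IDK" isn't really a fourth class with its own training examples, so the common workaround is to abstain whenever the classifier's confidence falls below a threshold. Where you set that threshold is exactly the hard "known vs unknown" boundary.

```python
import numpy as np

CLASSES = ["dog", "cat", "cow"]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(logits, threshold=0.7):
    """Return a class label, or IDK if the model isn't confident enough."""
    p = softmax(np.asarray(logits, dtype=float))
    i = int(p.argmax())
    return CLASSES[i] if p[i] >= threshold else "IDK"

print(classify([4.0, 0.5, 0.2]))  # confidently dog-like  -> "dog"
print(classify([1.1, 1.0, 0.9]))  # near-uniform scores   -> "IDK"
```

Set the threshold too high and every unseen-but-dog-like input becomes IDK (no generalization); too low and the model never says "I don't know".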
Yeah, they only “proved” hallucination is inevitable by defining it as any case where the LLM doesn’t provide the “correct” answer. By this definition, an LLM declining to answer is also a “hallucination”.
IANAL, nor an expert in this space.
But would any such expert care to comment on the consequences if this "it is impossible, even in theory, to eliminate LLM hallucinations" result holds up?
They are lossy statistical prediction machines - to eliminate hallucinations is effectively to eliminate the lossy part, and at that point you might as well just use predicates in a database of facts.
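A caricature of that contrast (my illustration, not anyone's real system): a fact store either returns the stored answer or fails loudly, while a lossy statistical predictor always produces something plausible-looking, whether or not it actually knows.

```python
import random

FACTS = {"capital of France": "Paris"}

def database_lookup(query):
    # Exact: returns the stored fact or raises KeyError on anything unknown.
    return FACTS[query]

def statistical_predictor(query):
    # Lossy: samples something plausible-sounding; never refuses to answer.
    plausible = ["Paris", "Lyon", "Marseille"]
    return random.choice(plausible)

print(database_lookup("capital of France"))       # exact: "Paris"
print(statistical_predictor("capital of Texas"))  # fluent, confidently wrong
# database_lookup("capital of Texas")             # would raise KeyError
```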
Describing it as a limitation is the problem. Hallucination is the core feature. It's the only thing they do!