Hallucinations
When LLMs confidently state things that aren't true, and why it's a fundamental problem
What it is
Hallucination refers to an LLM generating factually incorrect information with apparent confidence. The model doesn't flag its uncertainty: it produces false citations, wrong dates, invented names, and incorrect facts with the same fluency as correct ones.
The root cause lies in the training objective: during pre-training, models are rewarded for assigning high probability to the actual next token, but they are not specifically penalized for confident wrong guesses. The model learns to generate plausible-sounding text, and "plausible" doesn't mean "verified."
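To make this concrete, here is a minimal sketch of the per-token pre-training loss over a toy vocabulary. The prompt, candidate completions, and probabilities are all hypothetical; the point is that the loss scores only the probability assigned to the observed token, with no notion of factual truth.

```python
import math

# Hypothetical next-token distribution after a prompt like
# "The Eiffel Tower was completed in ..." (illustrative numbers only).
probs = {"1889": 0.40, "1887": 0.35, "banana": 0.001}

def nll(token):
    """Negative log-likelihood: the per-token pre-training loss."""
    return -math.log(probs[token])

# A plausible wrong year costs barely more than the truth, while an
# implausible token costs far more -- the objective pushes the model
# toward plausibility, not verified correctness.
print(round(nll("1889"), 3))    # correct completion: ~0.916
print(round(nll("1887"), 3))    # plausible but wrong: ~1.050
print(round(nll("banana"), 3))  # implausible: ~6.908
```

Under this objective, a model that confidently emits "1887" is nearly as well rewarded as one that emits "1889", which is one way a fluent falsehood can become a low-loss output.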
Hallucination rates vary by domain (lower on common knowledge, higher on niche facts), by model size, and by whether the model has access to tools like web search.