Which AI models are more likely to hallucinate?
- snitzoid
- Dec 14, 2025
- 1 min read
Ironically, as you've probably noticed, I most frequently use Claude AI to do research for the Report.
Sorry, I meant to say "fabricate data."
AI Is Costing Enterprises BILLIONS Because of AI Hallucinations

Visual Capitalist
Last month, more than 50 US court cases cited fake AI-generated legal authorities. Lawyers lost cases, got suspended, and had to notify courts, all because their AI confidently invented answers that looked real. Legal tech is just the tip of the iceberg.
Across industries, nearly half of executives admit to making major decisions based on fabricated AI content, and 70% of AI projects fail, not because AI is bad, but because companies pick the wrong models for their use case. Total losses exceeded $67 billion in 2024 alone.
- Air Canada's chatbot lied about refund policies
- Google's Gemini hallucinated fake lawsuits about a solar company
- 50+ lawyers sanctioned in one month for using ChatGPT as a legal research tool
- Deloitte submitted government reports with fabricated AI-generated citations
High-accuracy models like GPT-5 and Gemini are impressive but hallucinate with conviction, which is terrible for production, compliance, or safety-critical applications. Reliable models like Claude trade some capability for the kind of trust enterprises actually need.
Reliable Models Trade Power for Trust
According to Artificial Analysis's Omniscience Index:
- Claude 4.1 Opus: 36% accuracy, 48% hallucination rate (lowest in the industry)
- Claude 4.5 Sonnet: 31% accuracy, 48% hallucination rate
- Grok 4: 39% accuracy, 64% hallucination rate
The models winning enterprise contracts aren't the smartest ones. They're the trustworthy ones.