Phare benchmark reveals: Leading AI models deliver wrong information up to 30% of the time

A new study by Giskard, conducted in collaboration with Google DeepMind, shows that leading language models such as GPT-4, Claude, and Llama invent convincing-sounding but false information in up to 30% of their answers. These AI hallucinations pose a growing risk to businesses and end users, especially when the models are instructed to keep their answers short and concise.
