Introducing Unsloth Studio ✨ A new open-source web UI to train and run LLMs.
• Run models locally on Mac, Windows, Linux • Train 500+ models 2x faster with 70% less VRAM • Supports GGUF, vision, audio, embedding models • Auto-create datasets from PDF, CSV, DOCX • Self-healing tool calling and code execution • Compare models side by side + export to GGUF
Want to know which AI models are least likely to hallucinate — and how to keep yours from spiking hallucinations by 20%?
A new benchmark called Phare, by Giskard, tested leading models across multiple languages, revealing three key findings:
1️⃣ Popular models aren't necessarily factual. Some models ranking highest in user satisfaction benchmarks like LMArena are actually more prone to hallucination.
2️⃣ The way you ask matters - a lot. When users present claims confidently ("My teacher said..."), models are 15% less likely to correct misinformation vs. neutral framing ("I heard...").
3️⃣ Telling models to "be concise" can increase hallucination by up to 20%.
What's also cool is that the full dataset is public - use them to test your own models or dive deeper into the results! H/t @davidberenstein1957 for the link.