view post Post 526 AutoBench 1.0 is live. The Collective-LLM-as-a-Judge model benchmarkhttps://huggingface.co/blog/PeterKruger/autobench See translation
Article 1 Introducing AutoBench 2.0: Our New Benchmarking Platform is Out Just in Time to Evaluate GPT 5.2.