JudgeBench: A Benchmark for Evaluating LLM-based Judges

The International Conference on Learning Representations (ICLR 2025), 2025-01-19