Paper#llm🔬 ResearchAnalyzed: Jan 3, 2026 16:23

DICE: A New Framework for Evaluating Retrieval-Augmented Generation Systems

Published:Dec 27, 2025 16:02
1 min read
ArXiv

Analysis

This paper introduces DICE, a novel framework for evaluating Retrieval-Augmented Generation (RAG) systems. It addresses the limitations of existing evaluation metrics by providing explainable, robust, and efficient assessment. The framework uses a two-stage approach with probabilistic scoring and a Swiss-system tournament to improve interpretability, uncertainty quantification, and computational efficiency. The paper's significance lies in its potential to enhance the trustworthiness and responsible deployment of RAG technologies by enabling more transparent and actionable system improvement.

Reference

DICE achieves 85.7% agreement with human experts, substantially outperforming existing LLM-based metrics such as RAGAS.