VL-RouterBench: A Benchmark for Vision-Language Model Routing
Analysis
This paper introduces VL-RouterBench, a benchmark for systematically evaluating Vision-Language Model (VLM) routing systems. Progress in this area has been hindered by the lack of a standardized benchmark. By providing a dataset collection, an evaluation protocol, and an open-source toolchain, the authors aim to support reproducible research and practical deployment of VLM routing techniques. The protocol reports accuracy, cost, and throughput, and it builds a ranking score from the harmonic mean of normalized cost and accuracy, so routing methods can be compared across router configurations and cost budgets.
Key Takeaways
- VL-RouterBench is a new benchmark for evaluating VLM routing systems.
- It covers 14 datasets, 15 open-source models, and 2 API models.
- The evaluation considers accuracy, cost, and throughput.
- An open-source toolchain will be released to promote reproducibility.
“The evaluation protocol jointly measures average accuracy, average cost, and throughput, and builds a ranking score from the harmonic mean of normalized cost and accuracy to enable comparison across router configurations and cost budgets.”
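To make the ranking score in the quote concrete, here is a minimal Python sketch of a harmonic mean over accuracy and a normalized cost score. The summary does not say how cost is normalized, so the inverted linear min-max scaling below is an assumption, and the paper may use a different scheme. The function names, router names, and all numbers are hypothetical and chosen only for illustration.

```python
def normalize_cost(cost, min_cost, max_cost):
    """Map a raw cost to [0, 1], where 1.0 is the cheapest.

    Assumption: inverted linear min-max scaling. The benchmark's actual
    normalization is not given in this summary.
    """
    if max_cost == min_cost:
        return 1.0
    score = (max_cost - cost) / (max_cost - min_cost)
    return min(1.0, max(0.0, score))


def rank_score(accuracy, cost_score):
    """Harmonic mean of accuracy and normalized cost score, both in [0, 1]."""
    if accuracy + cost_score == 0:
        return 0.0
    return 2 * accuracy * cost_score / (accuracy + cost_score)


# Hypothetical routers: (average accuracy, average cost per query).
routers = {
    "always_cheapest": (0.60, 0.10),
    "always_strongest": (0.80, 1.00),
    "learned_router": (0.78, 0.25),
}

min_cost = min(cost for _, cost in routers.values())
max_cost = max(cost for _, cost in routers.values())

scores = {
    name: rank_score(acc, normalize_cost(cost, min_cost, max_cost))
    for name, (acc, cost) in routers.items()
}

# Router names sorted from best to worst ranking score.
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a router that is strong on only one axis scores poorly: under these assumptions, the hypothetical always-strongest router gets a score of 0 despite having the highest accuracy, since its normalized cost score is 0.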