Search:
Match:
1 results

Analysis

This paper introduces a novel, training-free framework (CPJ) for agricultural pest diagnosis using large vision-language models and LLMs. The key innovation is the use of structured, interpretable image captions refined by an LLM-as-Judge module to improve VQA performance. The approach addresses the limitations of existing methods that rely on costly fine-tuning and struggle with domain shifts. The results demonstrate significant performance improvements on the CDDMBench dataset, highlighting the potential of CPJ for robust and explainable agricultural diagnosis.
Reference

CPJ significantly improves performance: using GPT-5-mini captions, GPT-5-Nano achieves +22.7 pp in disease classification and +19.5 points in QA score over no-caption baselines.