Case-Augmented Reasoning: A Novel Approach to Enhance LLM Safety and Reduce Over-Refusal
Analysis
Key Takeaways
- CADA improves LLM harmlessness and robustness against attacks.
- The method reduces over-refusal while preserving utility across diverse benchmarks.
- Case-augmented reasoning is a practical alternative to rule-only deliberative alignment.
“By guiding LLMs with case-augmented reasoning instead of extensive code-like safety rules, we avoid rigid adherence to narrowly enumerated rules and enable broader adaptability.”
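To make the contrast concrete, here is a minimal sketch of what case-augmented prompting could look like: instead of a fixed rule list, the system prompt embeds precedent cases for the model to reason over by analogy. The function name, case texts, and prompt wording are all illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: compose a case-augmented system prompt rather than
# enumerating code-like safety rules. Case contents are invented examples.

def build_case_augmented_prompt(query, cases):
    """Assemble a prompt that guides the model with precedent cases."""
    lines = [
        "You are a safety-conscious assistant.",
        "Reason by analogy to the following precedent cases:",
    ]
    for i, case in enumerate(cases, 1):
        lines.append(f"Case {i}: {case['situation']} -> {case['resolution']}")
    lines.append(f"User request: {query}")
    lines.append("Decide whether to comply, weighing the cases above.")
    return "\n".join(lines)

cases = [
    {"situation": "Request for general first-aid advice",
     "resolution": "Comply; benign informational request."},
    {"situation": "Request for a synthesis route of a toxin",
     "resolution": "Refuse; clear potential for harm."},
]
prompt = build_case_augmented_prompt("How do I treat a minor burn?", cases)
print(prompt)
```

Because the cases are data rather than hard-coded rules, swapping or retrieving a different case set adapts the guidance without rewriting the prompt logic, which is the flexibility the quote above points to.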