Improved Balanced Classification with Novel Loss Functions
Analysis
This paper addresses class imbalance in multi-class classification, a common problem in machine learning. It introduces two new families of surrogate loss functions, Generalized Logit-Adjusted (GLA) and Generalized Class-Aware weighted (GCA) losses, designed to improve performance on imbalanced datasets. The theoretical analysis of consistency and the empirical results demonstrating gains over existing methods make the paper significant for researchers and practitioners working with imbalanced data.
Key Takeaways
- Introduces two new loss function families: Generalized Logit-Adjusted (GLA) and Generalized Class-Aware weighted (GCA) losses for balanced classification (a rough sketch of the underlying mechanisms follows this list).
- Provides a comprehensive theoretical analysis of consistency for both loss families.
- Demonstrates that GCA losses offer stronger theoretical guarantees in imbalanced settings due to more favorable scaling of their H-consistency bounds.
- Empirical results show that both GCA and GLA losses outperform existing methods, with GLA performing slightly better overall and GCA excelling in highly imbalanced scenarios.
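The paper's generalized formulations are not reproduced in this summary, so as a rough orientation, here is a minimal PyTorch sketch of the two standard building blocks that, per the family names, GLA and GCA generalize: logit adjustment by log-priors and inverse-frequency class weighting. The function names, the `tau` parameter, and the priors below are illustrative assumptions, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def logit_adjusted_loss(logits: torch.Tensor,
                        targets: torch.Tensor,
                        class_priors: torch.Tensor,
                        tau: float = 1.0) -> torch.Tensor:
    """Cross-entropy on logits shifted by tau * log(class prior).

    Rare classes get a large negative shift, so the model must produce
    a larger raw logit for them to achieve low loss.
    """
    adjusted = logits + tau * torch.log(class_priors)
    return F.cross_entropy(adjusted, targets)

def class_weighted_loss(logits: torch.Tensor,
                        targets: torch.Tensor,
                        class_priors: torch.Tensor) -> torch.Tensor:
    """Cross-entropy with per-class weights inverse to class frequency."""
    weights = 1.0 / class_priors
    return F.cross_entropy(logits, targets, weight=weights)

# Hypothetical usage with made-up priors for a 3-class problem.
priors = torch.tensor([0.7, 0.2, 0.1])
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
loss_la = logit_adjusted_loss(logits, targets, priors)
loss_cw = class_weighted_loss(logits, targets, priors)
```

The practical distinction between the two mechanisms: logit adjustment changes the margin each class must achieve, while class weighting rescales each example's contribution to the loss.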
“GCA losses are $H$-consistent for any hypothesis set that is bounded or complete, with $H$-consistency bounds that scale more favorably as $1/\sqrt{\mathsf p_{\min}}$, offering significantly stronger theoretical guarantees in imbalanced settings.”
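To make the quoted scaling concrete, the sketch below shows a generic shape an H-consistency bound with a $1/\sqrt{p_{\min}}$ factor could take. The square-root functional form and the constant $C$ are assumptions for illustration; the paper's exact bounds are not reproduced here.

```latex
% Illustrative shape only: the square-root form and the constant C are
% assumptions, not the paper's exact bound.
\[
  \mathcal{E}_{\ell_{0\text{-}1}}(h) - \mathcal{E}^{*}_{\ell_{0\text{-}1}}(\mathcal{H})
  \;\le\;
  \frac{C}{\sqrt{p_{\min}}}\,
  \sqrt{\mathcal{E}_{\Phi}(h) - \mathcal{E}^{*}_{\Phi}(\mathcal{H})}
\]
```

Here $p_{\min}$ is the smallest class prior, so a $1/\sqrt{p_{\min}}$ factor degrades much more slowly than, say, a linear $1/p_{\min}$ factor as the rarest class shrinks, which is why bounds with this scaling are described as more favorable in imbalanced settings.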