Gradient-Based Optimization for Large Neural Network Models

Published: Dec 30, 2025 15:35
ArXiv

Analysis

This paper addresses the computational cost of optimizing nonlinear objectives that are modeled by neural network surrogates, particularly when the networks are large. It focuses on making local search more efficient, since local search is what finds good solutions within practical time limits. The core contribution is a gradient-based algorithm with reduced per-iteration cost, which is further specialized for ReLU networks. The method is reported to be competitive with existing local search methods and to eventually outperform them as model size increases.
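To make the idea of gradient-based local search over a network surrogate concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: it assumes a one-hidden-layer ReLU network, a minimization objective, and a simple box constraint on the inputs, and it uses ordinary projected gradient descent. All names (`relu_net`, `grad_input`, `projected_gradient_descent`) and the weights are hypothetical, chosen only for illustration.

```python
def relu_net(x, W1, b1, w2, b2):
    """Return (output, hidden pre-activations) of a one-hidden-layer ReLU network."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    y = sum(v * max(zi, 0.0) for v, zi in zip(w2, z)) + b2
    return y, z


def grad_input(z, W1, w2):
    """Gradient of the output with respect to the input (a subgradient at kinks)."""
    g = [0.0] * len(W1[0])
    for row, v, zi in zip(W1, w2, z):
        if zi > 0.0:  # only active ReLU units contribute
            for j, w in enumerate(row):
                g[j] += v * w
    return g


def projected_gradient_descent(x0, W1, b1, w2, b2, lo, hi, step=0.1, iters=200):
    """Minimize the network output over the box [lo, hi]^n, tracking the best point seen."""
    x = list(x0)
    best_x, best_y = list(x), relu_net(x, W1, b1, w2, b2)[0]
    for _ in range(iters):
        _, z = relu_net(x, W1, b1, w2, b2)
        g = grad_input(z, W1, w2)
        # Gradient step followed by projection (clipping) onto the box.
        x = [min(max(xi - step * gi, lo), hi) for xi, gi in zip(x, g)]
        y = relu_net(x, W1, b1, w2, b2)[0]
        if y < best_y:
            best_x, best_y = list(x), y
    return best_x, best_y


# Hypothetical weights for a tiny surrogate with 2 inputs and 3 hidden units.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]
b1 = [0.0, -1.0, 0.5]
w2 = [1.0, -2.0, 0.5]
b2 = 0.1

x_start = [1.0, 0.2]
x_best, y_best = projected_gradient_descent(x_start, W1, b1, w2, b2, lo=-1.0, hi=1.0)
print("start value:", relu_net(x_start, W1, b1, w2, b2)[0])
print("best value: ", y_best, "at", x_best)
```

Each iteration here costs one forward pass and one gradient evaluation. The summary states only that the paper reduces per-iteration cost relative to existing methods; how it does so is not described, so this sketch should be read as a baseline illustration rather than a reproduction.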

Reference

The paper proposes a gradient-based algorithm with lower per-iteration cost than existing methods and adapts it to exploit the piecewise-linear structure of ReLU networks.
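The piecewise-linear structure mentioned above can be illustrated with a short sketch: once the set of active ReLU units is fixed, a ReLU network is an affine function of its input. The code below assumes a one-hidden-layer network with hypothetical weights and recovers the local affine form f(x) = a·x + c. It shows only the structural property; how the paper exploits it is not stated in this summary.

```python
def forward(x, W1, b1, w2, b2):
    """Return (output, hidden pre-activations) of a one-hidden-layer ReLU network."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    return sum(v * max(zi, 0.0) for v, zi in zip(w2, z)) + b2, z


def local_affine(x, W1, b1, w2, b2):
    """Return (a, c) such that f(x') = a.x' + c for every x' with x's activation pattern."""
    _, z = forward(x, W1, b1, w2, b2)
    a = [0.0] * len(x)
    c = b2
    for row, b, v, zi in zip(W1, b1, w2, z):
        if zi > 0.0:  # inactive units contribute nothing in this linear region
            c += v * b
            for j, w in enumerate(row):
                a[j] += v * w
    return a, c


# Hypothetical weights: 2 inputs, 3 hidden units.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]
b1 = [0.0, -1.0, 0.5]
w2 = [1.0, -2.0, 0.5]
b2 = 0.1

x = [1.0, 0.2]
a, c = local_affine(x, W1, b1, w2, b2)

# A nearby point with the same activation pattern lies on the same affine piece.
x_near = [1.05, 0.2]
affine_value = sum(ai * xi for ai, xi in zip(a, x_near)) + c
network_value = forward(x_near, W1, b1, w2, b2)[0]
print("affine piece:", a, c)
print("affine vs network at nearby point:", affine_value, network_value)
```

Within one linear region the gradient is the constant vector `a`, so it changes only when a step crosses into a region with a different activation pattern.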