SPM: Efficient Linear Transformations for Neural Networks

Published: Dec 30, 2025 00:03
1 min read
ArXiv

Analysis

This paper introduces Stagewise Pairwise Mixers (SPM) as a more efficient, structured alternative to dense linear layers in neural networks. By replacing a dense matrix with a composition of sparse pairwise-mixing stages, SPM reduces compute and parameter counts while potentially improving generalization. The paper's significance lies in offering a drop-in replacement for a fundamental component of many neural network architectures, with the potential to accelerate training and improve performance, especially on structured learning problems.

Reference

SPM layers implement a global linear transformation in $O(nL)$ time with $O(nL)$ parameters, where $L$ is typically constant or $\log_2 n$.
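
To make the structure concrete: the summary describes a composition of sparse pairwise-mixing stages, but does not spell out the exact pairing scheme or parameterization. The PyTorch sketch below is one plausible reading under those assumptions; the class names `PairwiseMixStage` and `SPMLayer` are hypothetical, and a butterfly-style pairing (pair distance doubling each stage, so $L = \log_2 n$) is assumed for concreteness.

```python
import torch
import torch.nn as nn

class PairwiseMixStage(nn.Module):
    """One mixing stage (illustrative): elements (i, i + stride) are combined by a
    learned 2x2 matrix per pair, so a stage costs O(n) parameters and O(n) time."""
    def __init__(self, n: int, stride: int):
        super().__init__()
        assert n % (2 * stride) == 0
        self.n, self.stride = n, stride
        # n/2 independent 2x2 mixing matrices, initialized near the identity.
        self.weight = nn.Parameter(0.1 * torch.randn(n // 2, 2, 2) + torch.eye(2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.shape[0]
        # Group the n features into pairs separated by `stride` (butterfly-style pairing,
        # an assumption of this sketch, not necessarily the paper's scheme).
        x = x.view(b, self.n // (2 * self.stride), 2, self.stride)
        pairs = x.permute(0, 1, 3, 2).reshape(b, self.n // 2, 2)
        # Apply each pair's 2x2 mixing matrix.
        mixed = torch.einsum("bpi,pij->bpj", pairs, self.weight)
        # Undo the pairing permutation.
        mixed = mixed.view(b, self.n // (2 * self.stride), self.stride, 2)
        return mixed.permute(0, 1, 3, 2).reshape(b, self.n)

class SPMLayer(nn.Module):
    """Composition of L = log2(n) pairwise-mixing stages with doubling strides,
    giving a global linear map in O(nL) time with O(nL) parameters."""
    def __init__(self, n: int):
        super().__init__()
        assert n & (n - 1) == 0, "this sketch assumes n is a power of two"
        self.stages = nn.ModuleList(
            PairwiseMixStage(n, 2 ** s) for s in range(n.bit_length() - 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for stage in self.stages:
            x = stage(x)
        return x

# Drop-in use in place of nn.Linear(256, 256):
layer = SPMLayer(256)
y = layer(torch.randn(32, 256))  # shape (32, 256)
```

Under these assumptions, an $n = 256$ layer uses $2nL = 4{,}096$ mixing parameters across $L = 8$ stages, versus $65{,}536$ for a dense $256 \times 256$ matrix, matching the $O(nL)$ count quoted above.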