MoR: Dynamic Mixed-Precision Training

Published: Dec 28, 2025 06:28
1 min read
ArXiv

Analysis

This paper introduces Mixture-of-Representations (MoR), a framework for mixed-precision training that dynamically selects between numerical representations (FP8 and BF16) at the tensor and sub-tensor level, based on each tensor's properties. The approach aims to make low-precision training more robust and efficient, and could potentially extend to even lower-precision formats such as NVFP4. The key contribution is the dynamic, property-aware quantization strategy.
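The idea of property-aware format selection can be sketched in a few lines. The paper's actual criterion is not given in this summary, so the rule below is a hypothetical one: measure each tensor's dynamic range (max over mean absolute value) and keep it in FP8 only when a single scale factor could plausibly map it into FP8's representable range, falling back to BF16 otherwise. The function name `choose_format` and the threshold are illustrative assumptions, not the paper's API.

```python
import numpy as np

def choose_format(tensor: np.ndarray, range_threshold: float = 100.0) -> str:
    """Pick a per-tensor numerical format (hypothetical selection rule).

    Uses FP8 when the tensor's dynamic range -- max over mean absolute
    value of the nonzero entries -- is modest, so one scale factor can
    map the tensor into FP8's range; otherwise falls back to BF16.
    """
    abs_vals = np.abs(tensor[tensor != 0])
    if abs_vals.size == 0:
        return "FP8"  # an all-zero tensor is trivially representable
    dynamic_range = abs_vals.max() / abs_vals.mean()
    return "FP8" if dynamic_range < range_threshold else "BF16"

rng = np.random.default_rng(0)
well_behaved = rng.standard_normal(1024).astype(np.float32)
heavy_tailed = well_behaved.copy()
heavy_tailed[0] = 1e6  # one extreme outlier stretches the dynamic range

print(choose_format(well_behaved))  # -> FP8
print(choose_format(heavy_tailed))  # -> BF16
```

Under a rule like this, most tensors in a well-conditioned training run would land in FP8, consistent with the high FP8 quantization rate the paper reports; sub-tensor selection would apply the same test to blocks of a tensor rather than the whole tensor.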

Reference

Achieved state-of-the-art results with 98.38% of tensors quantized to the FP8 format.