[Model Release] Genesis-152M-Instruct: Exploring Hybrid Attention + TTT at Small Scale

Research #llm 📝 Blog | Analyzed: Dec 27, 2025 04:31

Published: Dec 26, 2025 17:23

Analysis

This article announces Genesis-152M-Instruct, a small language model released for research purposes. It explores how recent architectural ideas (GLA, FoX, TTT, µP, and sparsity) interact when training data is limited. The central question is how much architectural design can compensate for limited training data at the 150M-parameter scale. The model combines several ideas from ICLR 2024-2025, including hybrid attention, test-time training, selective activation, and µP-scaled training. Benchmarks are provided, but the author emphasizes that this is an architectural exploration, not a SOTA model, particularly when compared with models trained on significantly larger datasets.
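
To make one of the named components concrete, here is a minimal sketch of the recurrent form of a gated linear attention layer, assuming "GLA" above refers to gated linear attention. It is not the Genesis implementation, which the article does not show. The function names are hypothetical, and the sketch covers a single head with no output normalization and no chunked parallel form. Each step decays a running key-value state with a per-channel gate, adds the new key-value outer product, and reads the state out with the query.

```python
def gated_linear_attention(q, k, v, alpha):
    """Recurrent gated linear attention for one head (illustrative sketch).

    q, k, alpha: lists of T vectors of length d_k.
    v: list of T vectors of length d_v.
    alpha[t][i] in [0, 1] is the decay gate for key channel i at step t.
    """
    d_k, d_v = len(k[0]), len(v[0])
    # Running state S has shape (d_k, d_v) and starts at zero.
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q_t, k_t, v_t, a_t in zip(q, k, v, alpha):
        for i in range(d_k):
            for j in range(d_v):
                # S_t = diag(alpha_t) @ S_{t-1} + outer(k_t, v_t)
                state[i][j] = a_t[i] * state[i][j] + k_t[i] * v_t[j]
        # o_t = q_t @ S_t
        outputs.append(
            [sum(q_t[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs


def causal_linear_attention(q, k, v):
    """Reference: unnormalized causal linear attention,
    o_t = sum over s <= t of (q_t . k_s) * v_s."""
    d_v = len(v[0])
    outputs = []
    for t, q_t in enumerate(q):
        o_t = [0.0] * d_v
        for s in range(t + 1):
            score = sum(a * b for a, b in zip(q_t, k[s]))
            for j in range(d_v):
                o_t[j] += score * v[s][j]
        outputs.append(o_t)
    return outputs
```

With every gate set to 1 the recurrence reduces to plain causal linear attention, and with every gate set to 0 each output depends only on the current token. The gate therefore interpolates between full memory and no memory, at a state size that stays constant in sequence length.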