Research · #llm · 📝 Blog · Analyzed: Jan 3, 2026 07:50

Gemma Scope 2 Release Announced

Published: Dec 22, 2025 21:56
2 min read
Alignment Forum

Analysis

Google DeepMind's mechanistic interpretability team is releasing Gemma Scope 2, a suite of sparse autoencoders (SAEs) and transcoders trained on the Gemma 3 model family. The release improves on its predecessor in several ways: it supports more capable models, covers every layer across model sizes up to 27B, and puts a stronger focus on chat models. It includes SAEs trained on three different sites (residual stream, MLP output, and attention output) as well as MLP transcoders. Despite having deprioritized fundamental research on SAEs, the team hopes the suite will be a useful tool for the community.

Reference

The release contains SAEs trained on three different sites (residual stream, MLP output, and attention output) as well as MLP transcoders (both with and without affine skip connections), for every layer of each of the 10 models in the Gemma 3 family (i.e., the 270M, 1B, 4B, 12B, and 27B sizes, in both PT and IT versions).
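For intuition, an SAE's forward pass is a single linear encode with a nonlinearity, followed by a linear decode that reconstructs the input activation; a transcoder has the same shape but maps one site's input (e.g. the MLP input) to a different target (the MLP output) instead of reconstructing its own input. A minimal sketch with toy, randomly initialized weights and a plain ReLU (the actual release ships trained weights, and its exact architecture, activation function, and dimensions are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_sae = 16, 64  # toy sizes; real SAE widths are far larger

# Toy parameters standing in for the trained weights a release would ship.
W_enc = rng.normal(0, 0.1, (d_model, d_sae))
b_enc = rng.normal(0, 0.1, d_sae)
W_dec = rng.normal(0, 0.1, (d_sae, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode an activation vector into (sparse) features, then decode."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU zeroes out inactive features
    x_hat = f @ W_dec + b_dec               # linear decode back to model space
    return f, x_hat

x = rng.normal(size=d_model)  # stand-in for a residual-stream activation
features, recon = sae_forward(x)
print(features.shape, recon.shape)  # (64,) (16,)
```

A transcoder would keep `sae_forward` unchanged but train `x_hat` against the MLP's output for input `x`, rather than against `x` itself; the affine-skip variants mentioned above additionally add a learned linear map of `x` to the decoder output.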

safety · #llm · 🏛️ Official · Analyzed: Jan 5, 2026 10:16

Gemma Scope 2: Enhanced Interpretability for Safer AI

Published: Dec 16, 2025 10:14
1 min read
DeepMind

Analysis

The release of Gemma Scope 2 significantly lowers the barrier to entry for researchers investigating the inner workings of the Gemma family of models. By providing open interpretability tools, DeepMind is fostering a more collaborative and transparent approach to AI safety research, potentially accelerating the discovery of vulnerabilities and biases. This move could also influence industry standards for model transparency.

Reference

Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.