Gemma Scope 2 Release

Research #llm 📝 Blog | Analyzed: January 3, 2026, 07:50

Published: December 22, 2025, 21:56


Analysis

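Google DeepMind's mechanistic interpretability team is releasing Gemma Scope 2, a suite of sparse autoencoders (SAEs) and transcoders trained on the Gemma 3 model family. The release improves on the previous version in three ways: it supports more complex models, it is more comprehensive (covering every layer and model sizes up to 27B), and it places a focus on chat models. It includes SAEs trained at three sites (the residual stream, the MLP output, and the attention output) as well as MLP transcoders. Although the team no longer prioritizes fundamental research on SAEs, they hope the release will be a useful tool for the community.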
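To make the distinction between the two kinds of dictionaries concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the released code: the weights are random and untrained, the sizes are toy values, the encoder uses a plain ReLU as a stand-in (this summary does not say which activation the release uses), and every name (`encode`, `sae_reconstruct`, `transcoder_predict`, `W_skip`) is hypothetical. An SAE reconstructs the same activation it reads, whereas an MLP transcoder reads the MLP input and predicts the MLP output, optionally adding a learned skip term.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_latent = 8, 32  # toy sizes, NOT the real Gemma 3 widths

# Randomly initialised (untrained) parameters, for illustration only.
W_enc = rng.normal(scale=0.1, size=(d_model, d_latent))
b_enc = np.zeros(d_latent)
W_dec = rng.normal(scale=0.1, size=(d_latent, d_model))
b_dec = np.zeros(d_model)
W_skip = rng.normal(scale=0.1, size=(d_model, d_model))  # skip-connection weights


def encode(x):
    """Map activations to non-negative latents (plain ReLU, an assumption)."""
    return np.maximum(x @ W_enc + b_enc, 0.0)


def sae_reconstruct(x):
    """SAE: reconstruct the same activation that was read."""
    return encode(x) @ W_dec + b_dec


def transcoder_predict(mlp_in, affine_skip=False):
    """Transcoder: predict the MLP output from the MLP input."""
    out = encode(mlp_in) @ W_dec + b_dec
    if affine_skip:
        # Skip path: a linear map of the input; b_dec serves as the shared bias.
        out = out + mlp_in @ W_skip
    return out


x = rng.normal(size=(4, d_model))  # batch of 4 toy activation vectors
latents = encode(x)
recon = sae_reconstruct(x)
pred = transcoder_predict(x, affine_skip=True)
print(latents.shape, recon.shape, pred.shape)
```

In a trained dictionary the latents would be sparse and the reconstruction close to the target. Here the only point is the data flow and the shapes.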


Quote / Source


"The release contains SAEs trained on 3 different sites (residual stream, MLP output and attention output) as well as MLP transcoders (both with and without affine skip connections), for every layer of each of the 10 models in the Gemma 3 family (i.e. sizes 270m, 1b, 4b, 12b and 27b, both the PT and IT versions of each)."

Alignment Forum, December 22, 2025, 21:56

* Quoted lawfully under Article 32 of the Copyright Act.
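The arithmetic in the quote can be checked with a short enumeration. The sketch below uses only what the quote states: five sizes, two variants (PT and IT), three SAE sites, and two transcoder variants. The string labels are illustrative placeholders and should not be read as actual repository or checkpoint names. Layer counts are left out because the quote does not give them.

```python
from itertools import product

# Values taken from the quoted passage.
SIZES = ["270m", "1b", "4b", "12b", "27b"]
VARIANTS = ["pt", "it"]  # pretrained and instruction-tuned
SAE_SITES = ["residual_stream", "mlp_output", "attention_output"]
TRANSCODERS = ["mlp_transcoder", "mlp_transcoder_affine_skip"]

# Hypothetical labels for illustration, not real model identifiers.
models = [f"gemma-3-{size}-{variant}" for size, variant in product(SIZES, VARIANTS)]

# Each layer of each model gets one dictionary per type.
dictionary_types = SAE_SITES + TRANSCODERS
per_layer_grid = list(product(models, dictionary_types))

print(len(models))          # number of models in the family
print(len(per_layer_grid))  # (model, dictionary type) pairs per layer
```

This gives 10 models and 5 dictionary types per layer, so 50 (model, type) combinations before multiplying by each model's layer count.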

