Non-determinism in GPT-4 is caused by Sparse MoE
Analysis
The article claims that GPT-4's non-deterministic behavior stems from its Sparse Mixture-of-Experts (MoE) architecture: even with identical inputs, the model's outputs can vary, potentially because of how tokens are routed among experts or because of randomness inside the experts themselves. This is a significant observation because it affects the reproducibility and reliability of GPT-4's outputs.
Key Takeaways
- GPT-4's non-determinism is linked to its Sparse MoE architecture.
- This implies that outputs can vary even with identical inputs.
- The variability may stem from how tokens are routed to experts or from internal randomness within experts.
- This impacts the reproducibility and reliability of GPT-4's results.
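One concrete way expert selection can vary for a fixed input is capacity-limited routing: if routing is computed per batch and each expert can only accept a limited number of tokens, the expert that serves a given token depends on which other tokens share its batch. The sketch below is a hypothetical toy illustration of that mechanism, not GPT-4's actual routing code; the router weights `W`, the `route` function, and the capacity of 2 are all invented for demonstration.

```python
import numpy as np

# Fixed router weights: there is no sampling randomness anywhere below.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])

def route(batch, capacity=2):
    """Toy top-1 MoE routing with a per-expert capacity limit.

    Each token is assigned to its highest-scoring expert; if that
    expert is already full, the token overflows to the next best.
    Returns the expert index chosen for each token, in batch order.
    """
    scores = batch @ W                  # (n_tokens, n_experts)
    load = [0] * W.shape[1]             # tokens assigned to each expert so far
    assignment = []
    for s in scores:
        for e in np.argsort(-s):        # experts in order of preference
            if load[e] < capacity:
                load[e] += 1
                assignment.append(int(e))
                break
    return assignment

query   = np.array([[1.0, 0.2]])        # this token prefers expert 0
fillers = np.array([[2.0, 0.1],         # these tokens also prefer expert 0
                    [2.0, 0.1]])

# Alone in a batch, the query gets its preferred expert 0.
print(route(query))                          # -> [0]
# Batched behind two fillers, expert 0 is full, so the same
# query token overflows to expert 1.
print(route(np.vstack([fillers, query])))    # -> [0, 0, 1]
```

The computation is fully deterministic for a fixed batch, yet the query token is processed by a different expert depending on batch composition, so identical inputs can yield different outputs when serving infrastructure batches concurrent requests together.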