Analysis
A new benchmark measures how well LLMs can make use of large amounts of text in their context window. Claude Opus 4.6 scored far higher than Sonnet 4.5 on it, which suggests that newer models are getting better at retaining and using information across extended contexts.
Reference / Citation
"Opus 4.6 scores 76%, whereas Sonnet 4.5 scores just 18.5%. This is a qualitative shift in how much context a model can actually use while maintaining peak performance."