LLMs Reveal Long-Range Structure in English
Analysis
This paper investigates long-range dependencies in English text using large language models (LLMs). It is significant because it challenges the assumption that linguistic structure is primarily local: the results show that measurable statistical dependencies persist even at distances of thousands of characters, implying a more interconnected structure than previously thought. This has implications for how we understand language and for how we build models that process it.
Key Takeaways
- LLMs reveal long-range dependencies in English text.
- Conditional entropy decreases with context length up to 10,000 characters.
- Long-range structure is learned gradually during LLM training.
- Findings constrain statistical physics models of LLMs and language.
“The conditional entropy or code length in many cases continues to decrease with context length at least to $N\sim 10^4$ characters, implying that there are direct dependencies or interactions across these distances.”
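The quantity at issue is the average code length (negative log-probability, in bits per character) a model assigns to text when conditioned on contexts of varying length. Below is a minimal sketch, not the paper's actual pipeline, of how such a curve could be traced with an off-the-shelf causal LM via the Hugging Face `transformers` API. The model name ("gpt2"), the scoring chunk size, and the set of context lengths are illustrative assumptions; for the largest contexts a long-context model would be needed, since gpt2's window is only 1024 tokens.

```python
# Sketch: estimate bits/character as a function of context length.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any Hugging Face causal LM would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def bits_per_char(text: str, context_chars: int, step: int = 64) -> float:
    """Average code length (bits/character) of `text` when the model sees
    at most `context_chars` preceding characters before each scored chunk."""
    total_bits, total_chars = 0.0, 0
    for start in range(context_chars, len(text) - step, step):
        context = text[start - context_chars:start]
        target = text[start:start + step]
        ids = tokenizer(context + target, return_tensors="pt").input_ids
        n_ctx = len(tokenizer(context).input_ids)  # assumes tokenization aligns at the boundary
        with torch.no_grad():
            logits = model(ids).logits
        # Log-probability of each token given everything before it.
        log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
        target_ids = ids[0, 1:]
        token_bits = -log_probs[torch.arange(len(target_ids)), target_ids] / math.log(2)
        # Keep only the bits spent on the target chunk, not the context.
        total_bits += token_bits[n_ctx - 1:].sum().item()
        total_chars += len(target)
    return total_bits / total_chars

# Hypothetical usage: the decreasing trend up to ~10^4 characters is the
# paper's finding; this loop only shows how such a curve could be measured.
# text = open("long_english_document.txt").read()
# for n in (10, 100, 1000, 10000):  # 10000 exceeds gpt2's window; use a long-context model
#     print(n, bits_per_char(text, n))
```

If the measured bits/character keeps falling as the context length grows, the model is still extracting predictive information from characters that far back, which is the operational sense in which the quote speaks of direct dependencies across those distances.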