LLMs Reveal Long-Range Structure in English
Analysis
Key Takeaways
- LLMs reveal long-range dependencies in English text.
- Conditional entropy decreases with context length to at least ~10,000 characters.
- Long-range structure is learned gradually during LLM training.
- Findings constrain statistical physics models of LLMs and language.
“The conditional entropy or code length in many cases continues to decrease with context length at least to $N\sim 10^4$ characters, implying that there are direct dependencies or interactions across these distances.”
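To make the quoted measurement concrete, the sketch below estimates the per-character cross-entropy (a code-length upper bound on the conditional entropy) of a fixed continuation as the preceding context grows. This is a minimal illustration, not the paper's procedure: the model name ("gpt2"), the file `sample.txt`, the 200-character continuation, and the chosen context lengths are all assumptions, and GPT-2's 1024-token window limits how large $N$ can be pushed here; probing $N\sim 10^4$ characters requires a longer-context model.

```python
# Minimal sketch: per-character code length of a fixed continuation
# as a function of context length N (in characters).
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def bits_per_char(model, tokenizer, context: str, continuation: str) -> float:
    """Code length (bits per character) assigned to `continuation` given `context`."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    cont_ids = tokenizer(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position i predict the token at position i+1, so score only the
    # continuation tokens. (Tokenizing the two pieces separately is a small
    # approximation at the context/continuation boundary.)
    start = ctx_ids.shape[1]
    targets = input_ids[0, start:]
    log_probs = torch.log_softmax(logits[0, start - 1:-1], dim=-1)
    nats = -log_probs[torch.arange(len(targets)), targets].sum().item()
    return nats / math.log(2) / len(continuation)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = open("sample.txt", encoding="utf-8").read()   # any long English text (assumed file)
continuation = text[-200:]                            # fixed 200-character continuation
for n in (64, 512, 2048):                             # context lengths N in characters
    context = text[-200 - n:-200]
    print(f"N={n:>5}  bits/char="
          f"{bits_per_char(model, tokenizer, context, continuation):.3f}")
```

Under this setup the printed bits-per-character values should fall as $N$ grows, mirroring the paper's observation that code length keeps decreasing with context length; reproducing the trend out to $N\sim 10^4$ characters would require a model whose context window spans that many tokens.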