LLMs Struggle on Underrepresented Math Problems, Especially Geometry
Analysis
Key Takeaways
- Three LLMs were evaluated on problems from the Missouri Collegiate Mathematics Competition.
- DeepSeek-V3 performed best in all three problem categories, but all three models were notably weak in Geometry.
- The study identified distinct error patterns for each LLM, pointing to model-specific areas for improvement.
“DeepSeek-V3 has the best performance in all three categories... All three LLMs exhibited notably weak performance in Geometry.”