The Crucial Scatter Plot Trap: Why Visual Tightness Doesn't Always Mean Stronger Correlation
research #eda | Blog | Analyzed: Apr 27, 2026 08:56 • Published: Apr 27, 2026 08:41 • 1 min read • r/learnmachinelearning
Analysis
This is an insightful breakdown of a common visual pitfall in data science that can easily lead to flawed feature selection during Exploratory Data Analysis (EDA). It is a useful reminder of the mathematics behind Pearson's r: the coefficient divides each variable's deviations by its standard deviation, so scale drops out and our intuitive reading of how compact a point cloud looks can mislead us. The author's video demonstration offers an engaging way to build better, more rigorous analytical workflows.
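The scale standardization mentioned here can be written out explicitly. The notation below is not from the original post; it is the standard z-score form of Pearson's r, assuming the means and standard deviations are computed over the same n points with 1/n normalization.

```latex
r_{xy} = \frac{1}{n} \sum_{i=1}^{n}
         \left( \frac{x_i - \bar{x}}{s_x} \right)
         \left( \frac{y_i - \bar{y}}{s_y} \right)

% Rescale x by any a > 0 and shift by b: x'_i = a x_i + b.
% Then \bar{x}' = a\bar{x} + b and s_{x'} = a s_x, so
\frac{x'_i - \bar{x}'}{s_{x'}} = \frac{a (x_i - \bar{x})}{a s_x}
                               = \frac{x_i - \bar{x}}{s_x}
\quad\Longrightarrow\quad r_{x'y} = r_{xy}
```

Each z-score is unchanged by the rescaling, so shrinking a variable's spread makes the cloud look tighter on a shared axis without changing r.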
Key Takeaways
- Visually tighter scatter plots do not necessarily represent stronger correlations than looser-looking ones.
- Pearson's r measures clustering relative to each variable's own spread: deviations from the mean are divided by the standard deviation, so raw units drop out.
- Overlooking this scale standardization during EDA can lead to mistakenly deprioritizing highly correlated features.
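The takeaways above can be checked numerically. The following is a minimal sketch, not the author's demonstration: the data is synthetic, and the seed, slope, noise level, and the helper name `pearson_r` are all illustrative assumptions. It builds one dataset, shrinks both variables by a factor of ten so the cloud would look far more compact on a shared axis, and compares the two correlations.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice

# Hypothetical data: y depends linearly on x plus noise.
x = rng.normal(loc=0.0, scale=10.0, size=500)
y = 0.8 * x + rng.normal(loc=0.0, scale=6.0, size=500)

# Shrink both variables by 10x: same shape, much smaller SDs.
x_small = x / 10.0
y_small = y / 10.0


def pearson_r(a, b):
    """Pearson's r as the mean product of z-scores (population SD)."""
    za = (a - a.mean()) / a.std()
    zb = (b - b.mean()) / b.std()
    return float(np.mean(za * zb))


r_wide = pearson_r(x, y)
r_tight = pearson_r(x_small, y_small)

print(f"SD of x: {x.std():.2f} (wide) vs {x_small.std():.2f} (tight)")
print(f"r: {r_wide:.4f} (wide) vs {r_tight:.4f} (tight)")
```

The two printed r values agree up to floating-point error even though the standard deviations differ tenfold. When comparing candidate features visually, plotting each pair on its own axes (or plotting z-scores) avoids the shared-axis illusion.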
Reference / Citation
"Pearson's r standardizes away scale entirely, so on a shared axis, a dataset with smaller SDs looks more compact but can have identical correlation."