Assessing Data Quality at Shopify with Wendy Foster - #592
Analysis
This article from Practical AI covers data quality at Shopify through a conversation with Wendy Foster, a director of engineering & data science. The conversation contrasts data-centric and model-centric approaches and emphasizes the importance of data coverage and freshness. It also touches on data taxonomy, the challenges of running large-scale ML models in production, future use cases, and Shopify's new ML platform, Merlin. The article offers insight into how a major e-commerce platform like Shopify manages and uses merchant and product data.
Key Takeaways
- Data-centric and model-centric approaches are compared in the context of Shopify.
- Data quality, including coverage and freshness, is a key focus.
- Shopify uses data to assist its merchants and is developing ML platforms such as Merlin.
“We discuss how they address, maintain, and improve data quality, emphasizing the importance of coverage and “freshness” data when solving constantly evolving use cases.”