Scaling Model Training with Kubernetes at Stripe with Kelley Rivoire - TWIML Talk #272

Technology | Machine Learning Infrastructure | Blog | Analyzed: Dec 29, 2025 08:13
Published: Jun 6, 2019 16:34
Practical AI

Analysis

This article summarizes a podcast episode in which Kelley Rivoire, an engineering manager at Stripe, discusses the company's machine learning infrastructure. The conversation centers on scaling model training with Kubernetes. It covers how Stripe's work in this area began with a focus on production, and the internal tools the team developed, including Railyard, an API for managing model training at scale. The episode highlights the practical side of building and operating machine learning infrastructure in a large organization, with insight into Stripe's approach to resource management and API design for model training.
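To make the idea of a training API on top of Kubernetes more concrete, here is a minimal sketch of how a training request could be translated into a Kubernetes Job manifest. It is not Railyard's actual schema or implementation, which this summary does not describe: `TrainingRequest`, its fields, the `to_job_manifest` helper, the container name `trainer`, and the example image are all hypothetical. Only the manifest layout (`batch/v1` Job, pod template, resource requests) follows the standard Kubernetes Job format.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainingRequest:
    """Hypothetical request body for a training API (not Railyard's real schema)."""
    job_name: str
    image: str
    args: List[str] = field(default_factory=list)
    cpu: str = "2"        # Kubernetes CPU quantity, assumed default
    memory: str = "4Gi"   # Kubernetes memory quantity, assumed default


def to_job_manifest(req: TrainingRequest) -> dict:
    """Build a Kubernetes batch/v1 Job manifest (as a dict) for one training run."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": req.job_name},
        "spec": {
            # Do not retry a failed training run automatically.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "trainer",
                            "image": req.image,
                            "args": list(req.args),
                            "resources": {
                                "requests": {"cpu": req.cpu, "memory": req.memory}
                            },
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    request = TrainingRequest(
        job_name="example-train",
        image="example.invalid/trainer:latest",  # placeholder image name
        args=["--epochs", "10"],
    )
    print(json.dumps(to_job_manifest(request), indent=2))
```

Keeping the request type small and separate from the manifest is one plausible design choice: callers describe what to train and how many resources it needs, while the service decides how that maps onto cluster objects.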
Reference / Citation
"The article doesn't contain a direct quote, but summarizes the topics discussed."
* Cited for critical analysis under Article 32.