Scaling Model Training with Kubernetes at Stripe with Kelley Rivoire - TWIML Talk #272
Technology | Machine Learning Infrastructure | Blog | Analyzed: Dec 29, 2025 08:13
Published: Jun 6, 2019 16:34
Practical AI
Analysis
This article summarizes a podcast episode in which Kelley Rivoire, an engineering manager at Stripe, discusses the company's machine learning infrastructure. The conversation centers on scaling model training with Kubernetes. It covers Stripe's journey, which began with a focus on production, and the internal tools the team developed, such as Railyard, an API for managing model training at scale. The episode highlights the practical side of building and operating machine learning infrastructure in a large organization, with insights into Stripe's approach to resource management and API design for model training.
Key Takeaways
- Stripe's approach to scaling model training using Kubernetes.
- The development and use of internal tools like Railyard for managing model training.
- The focus on production and practical implementation of machine learning infrastructure.