Introduction
Scaling ML models
What you should know
1. The Need to Scale ML Models
Building and running ML models for data scientists
Building and deploying ML models for production use
Definition of scaling ML for production
Overview of tools and techniques for scalable ML
2. Design Patterns for Scalable ML Applications
Horizontal vs. vertical scaling
Running models as services
APIs for ML model services
Load balancing and clusters of servers
Scaling horizontally with containers
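The load-balancing pattern listed above can be sketched as a simple round-robin dispatcher. This is an illustrative toy, not the course's implementation; the class and replica names are hypothetical, and a real deployment would delegate this to a load balancer or ingress in front of the containerized replicas.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin dispatcher over a fixed pool of service replicas.

    Sketches the pattern a real load balancer applies to a
    horizontally scaled ML service: each incoming request is sent
    to the next replica in turn, spreading load evenly.
    """

    def __init__(self, replicas):
        self._cycle = cycle(list(replicas))

    def route(self, request):
        """Return the replica that should handle this request."""
        return next(self._cycle)


# Hypothetical replica names for three identical model-service containers.
balancer = RoundRobinBalancer(["model-svc-0", "model-svc-1", "model-svc-2"])
targets = [balancer.route({"features": [i]}) for i in range(6)]
```

Each replica receives every third request, so capacity grows roughly linearly as replicas are added, which is the core appeal of horizontal scaling.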
3. Deploying ML Models as Services
Services encapsulate ML models
Using Plumber to create APIs for R programs
Using Flask to create APIs for Python programs
Best practices for API design for ML services
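For the Flask lesson above, a minimal prediction API might look like the following sketch. The route name, payload shape, and `DummyModel` stand-in are assumptions for illustration; a real service would load a trained model from disk at startup.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)


# Stand-in for a trained model; a real service would load one
# from disk at startup (e.g., with joblib or pickle).
class DummyModel:
    def predict(self, rows):
        return [sum(row) for row in rows]


model = DummyModel()


@app.route("/predict", methods=["POST"])
def predict():
    # Parse the JSON request body and return predictions as JSON.
    payload = request.get_json(force=True)
    return jsonify({"predictions": model.predict(payload["instances"])})


# Exercise the endpoint in-process; in production the app would be
# served by a WSGI server such as gunicorn behind a load balancer.
client = app.test_client()
resp = client.post("/predict", json={"instances": [[1, 2], [3, 4]]})
result = resp.get_json()
```

Keeping the interface to a single JSON-in, JSON-out endpoint is one common API-design practice for ML services: clients stay decoupled from the model's implementation language and framework.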
4. Running ML Services in Containers
Containers bundle ML model components
Introduction to Docker
Building Docker images with Dockerfiles
Example Docker build process
Using Docker registries to manage images
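The Dockerfile-based build process covered above might look like this sketch for a Python model service. The file names (`requirements.txt`, `app.py`), base-image tag, and port are assumptions, not taken from the course.

```dockerfile
# Start from a slim official Python base image
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached
# when only the application code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the service code (e.g., the Flask app) into the image
COPY app.py .

EXPOSE 5000
CMD ["python", "app.py"]
```

Building, tagging, and pushing then follow the usual pattern, e.g. `docker build -t model-svc:1.0 .` followed by `docker push` to a registry (names here are placeholders).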
5. Scaling ML Services with Kubernetes
Running services in clusters
Introduction to Kubernetes
Creating a Kubernetes cluster
Deploying containers in a Kubernetes cluster
Scaling up a Kubernetes cluster
Autoscaling a Kubernetes cluster
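A Kubernetes Deployment manifest is the standard way to run such a container in a cluster; the sketch below uses placeholder names, an assumed image, and an arbitrary replica count.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-svc
spec:
  replicas: 3            # horizontal scaling: number of identical pods
  selector:
    matchLabels:
      app: model-svc
  template:
    metadata:
      labels:
        app: model-svc
    spec:
      containers:
      - name: model-svc
        image: registry.example.com/model-svc:1.0   # placeholder image
        ports:
        - containerPort: 5000
```

Scaling up is then `kubectl scale deployment model-svc --replicas=5`, and autoscaling on CPU load is `kubectl autoscale deployment model-svc --min=2 --max=10 --cpu-percent=80` (service name again a placeholder).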
6. ML Services in Production
Monitoring service performance
Service performance data
Docker container monitoring
Kubernetes monitoring
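The monitoring topics above map onto a few built-in commands; the deployment name below is a placeholder, and `kubectl top` requires the metrics-server add-on to be installed in the cluster.

```shell
# Live CPU, memory, and network usage of running containers
docker stats

# Resource usage per pod and per node (needs metrics-server)
kubectl top pods
kubectl top nodes

# Recent logs from one service's pods
kubectl logs deployment/model-svc --tail=100
```

These give a quick first look; production setups typically feed the same data into a dedicated monitoring stack for dashboards and alerting.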
Conclusion
Best practices for scaling ML
Next steps
Exercise files: Ex_Files_Scalable_ML_Data.zip (1.0 MB)