Case study
Netflix
Streaming & original content
Overview
Netflix couples globally distributed playback with heavy personalization: rows, artwork, and rankings adapt per profile and region. Engineering students often see “a video app”; at scale it is a data problem, a delivery problem, and a device-matrix problem, all held to strict quality-of-experience (QoE) goals.
The company is also known for chaos engineering and cell-based architectures, two resilience ideas that translate well to classroom discussions of failure injection and blast radius.
Technical problems at scale
Recommendation and ranking under latency SLOs
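Home and detail surfaces call ranking services with tight latency budgets. Caching, approximate precomputed results, and graceful fallbacks keep the path to playback responsive even when ML tiers degrade: a less personalized row that arrives on time beats a perfect one that arrives late.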
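The fallback idea above can be sketched as a call with a deadline. This is a minimal illustration, not Netflix's implementation: `rank_with_budget`, `PRECOMPUTED_ROWS`, the row names, and the 50 ms default budget are all assumptions made up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Hypothetical precomputed rows, e.g. refreshed offline per region.
PRECOMPUTED_ROWS = ["Trending Now", "Popular in Your Region", "New Releases"]

_executor = ThreadPoolExecutor(max_workers=4)


def rank_with_budget(rank_fn, profile_id, budget_s=0.05):
    """Return (rows, source). Falls back to precomputed rows when the
    ranking call misses its latency budget or raises."""
    future = _executor.submit(rank_fn, profile_id)
    try:
        return future.result(timeout=budget_s), "personalized"
    except FutureTimeout:
        # cancel() only helps if the call has not started yet; a call that
        # is already running simply finishes in the background.
        future.cancel()
        return PRECOMPUTED_ROWS, "fallback"
    except Exception:
        # Any ranking failure degrades to the same generic rows.
        return PRECOMPUTED_ROWS, "fallback"


def fast_ranker(profile_id):
    # Stand-in for a healthy ranking service.
    return [f"Top picks for {profile_id}", "Continue Watching"]


def slow_ranker(profile_id):
    # Stand-in for a degraded ML tier.
    time.sleep(0.3)
    return ["too late"]


if __name__ == "__main__":
    print(rank_with_budget(fast_ranker, "p1", budget_s=1.0))
    print(rank_with_budget(slow_ranker, "p1", budget_s=0.05))
```

The design choice worth discussing is that the caller, not the ranking service, owns the deadline, so a slow dependency cannot stretch the page load beyond the budget.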
Open Connect and edge footprint
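ISP-hosted appliances and massive edge caches reduce origin load. Cache fill and eviction policies, together with manifest signing, have to respect DRM and regional compliance: what an edge node stores and serves depends on where a title may be played, not only on what is popular.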
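To make fill and eviction concrete, here is a small byte-bounded cache that fetches from an origin on a miss and evicts the least recently used segment when full. LRU is a textbook stand-in chosen for the example; the source does not say which policy Open Connect appliances use, and `SegmentCache` and its parameters are hypothetical names.

```python
from collections import OrderedDict


class SegmentCache:
    """Byte-bounded LRU cache for media segments (illustrative only)."""

    def __init__(self, capacity_bytes, fetch_from_origin):
        self.capacity_bytes = capacity_bytes
        self.fetch_from_origin = fetch_from_origin  # callable: key -> bytes
        self._store = OrderedDict()  # key -> bytes, oldest entry first
        self.used_bytes = 0
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._store[key]
        # Miss: fill from origin, then try to keep a copy at the edge.
        self.misses += 1
        data = self.fetch_from_origin(key)
        self._insert(key, data)
        return data

    def _insert(self, key, data):
        if len(data) > self.capacity_bytes:
            return  # too large to cache; serve it without storing
        while self.used_bytes + len(data) > self.capacity_bytes:
            _, evicted = self._store.popitem(last=False)  # drop the LRU entry
            self.used_bytes -= len(evicted)
        self._store[key] = data
        self.used_bytes += len(data)


if __name__ == "__main__":
    origin_calls = []

    def origin(key):
        origin_calls.append(key)
        return b"x" * 4  # every segment is 4 bytes in this toy example

    cache = SegmentCache(capacity_bytes=8, fetch_from_origin=origin)
    for key in ["a", "b", "a", "c", "a", "b"]:
        cache.get(key)
    print(origin_calls, cache.hits, cache.misses)
```

The ratio of `hits` to `misses` is the quantity that origin load depends on, which is why fill and eviction policy gets so much engineering attention.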
Encoding ladders and per-title optimization
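Adaptive bitrate (ABR) streaming needs multiple renditions per asset, so that the player can switch quality as bandwidth changes. Complexity analysis and per-shot tuning decide how many renditions to encode and at which bitrates, trading storage cost against rebuffering on constrained networks.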
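The trade-off can be made tangible with a toy ladder. The sketch below picks a rendition with a simple throughput rule and estimates what the whole ladder costs to store. The bitrates, the 0.8 safety factor, and the names `LADDER`, `pick_rendition`, and `storage_gb` are illustrative assumptions, not Netflix's actual ladder or player logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # average video bitrate


# Hypothetical ladder; real per-title ladders vary with content complexity.
LADDER = [
    Rendition(240, 235),
    Rendition(360, 560),
    Rendition(480, 1050),
    Rendition(720, 3000),
    Rendition(1080, 5800),
]


def pick_rendition(ladder, throughput_kbps, safety=0.8):
    """Throughput-based ABR rule: highest bitrate that fits within a safety
    margin of measured throughput, else the lowest rung."""
    budget = throughput_kbps * safety
    ordered = sorted(ladder, key=lambda r: r.bitrate_kbps)
    choice = ordered[0]  # always have something to play
    for rendition in ordered:
        if rendition.bitrate_kbps <= budget:
            choice = rendition
    return choice


def storage_gb(ladder, duration_s):
    """Approximate storage for every rung of the ladder, in decimal GB."""
    total_kbits = sum(r.bitrate_kbps for r in ladder) * duration_s
    return total_kbits / 8 / 1e6  # kbit -> kB -> GB


if __name__ == "__main__":
    print(pick_rendition(LADDER, throughput_kbps=4000))
    print(round(storage_gb(LADDER, duration_s=3600), 2), "GB per hour")
```

Adding a rung between 1050 and 3000 kbps would give viewers on mid-range connections a better picture, at the cost of one more file per title to encode, store, and cache.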
Playback on fragmented devices
Smart TVs, consoles, and old mobile OS versions mean that codec support, DRM, and app rollout strategies differ by cohort. This is classic long-tail client engineering.
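One way to picture the cohort problem is capability negotiation: the service offers the most efficient stream that a given device can both decode and decrypt. The sketch below is a simplified assumption of how such a choice could work; the preference order, the packaged `(codec, DRM)` pairs, and `choose_stream` are hypothetical.

```python
# Hypothetical preference order, most to least bandwidth-efficient.
PREFERENCE = ["av1", "hevc", "vp9", "h264"]


def choose_stream(device_codecs, device_drm, available):
    """Return the first (codec, drm) pair, in preference order, that the
    device supports, or None if nothing is playable.

    available: list of (codec, drm) pairs packaged for the title.
    """
    for codec in PREFERENCE:
        if codec not in device_codecs:
            continue
        for packaged_codec, packaged_drm in available:
            if packaged_codec == codec and packaged_drm in device_drm:
                return (packaged_codec, packaged_drm)
    return None


if __name__ == "__main__":
    # Made-up packaging for one title.
    available = [
        ("av1", "widevine"),
        ("hevc", "playready"),
        ("h264", "widevine"),
        ("h264", "playready"),
        ("h264", "fairplay"),
    ]
    # An older smart TV: baseline codec, one DRM system.
    print(choose_stream({"h264"}, {"playready"}, available))
    # A recent phone: newer codec available.
    print(choose_stream({"av1", "h264"}, {"widevine"}, available))
```

The long tail shows up in the `available` list: every extra codec and DRM combination kept for old devices is another set of encodes to produce and cache.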
Systems & patterns you will hear about
- Personalization & ML serving
- Global CDN (Open Connect)
- HLS / DASH & ABR
- Microservices / cells
- Chaos & resilience testing
Case-study angles
Sketch what happens when the home-ranking service is slow but the playback manifest service is healthy—what does the client show, and for how long?
Compare pushing new encodes to the edge vs invalidating stale segments during a bad deploy.