Streaming giant Netflix has revealed how it is making the most of the versatile programming language Python.
The company has detailed the ways it uses Python, one of the world’s fastest-growing languages, for everything from operations management and analysis through to security and networking.
Netflix relies on a mix of well-known packages and in-house software libraries, with Python seemingly used in nearly every corner of the business, which is largely run on the Amazon Web Services (AWS) cloud platform.
“We use Python through the full content lifecycle, from deciding which content to fund all the way to operating the CDN that serves the final video to 148 million members,” write Netflix engineers in a blog post.
If you’re interested in finding out more about Python, check out TechRepublic’s guide to free resources for learning Python and this round-up of the best Python guides and code examples on GitHub.
Here’s how Netflix uses Python.
Operations
Netflix’s demand engineering team builds resiliency into the network by providing regional failovers and orchestrating the distribution of Netflix’s traffic.
“We are proud to say that our team’s tools are built primarily in Python,” the team writes.
“The ability to drop into a bpython shell and improvise has saved the day more than once.”
Tools used by the team include:
- NumPy and SciPy to perform numerical analysis
- Boto3 to make changes to AWS infrastructure
- rq to run asynchronous workloads
- Flask APIs as a wrapper around the orchestration tools above
- Jupyter Notebooks and nteract to analyze operational data and prototype visualization tools. Netflix uses Python to build custom extensions to the Jupyter server that allow engineers to manage tasks like logging, archiving, publishing and cloning notebooks.
Meanwhile, the big data orchestration team provides services and tooling for scheduling and executing ETL (Extract, Transform, Load) jobs and ad hoc data pipelines.
The team uses Jupyter Notebooks with papermill to allow the scheduler to provide templatized job types, for example Spark.
Also used is pygenie, a Netflix-built client that interfaces with Genie, a federated job execution service.
Statistical analysis
Netflix’s CORE team uses many Python statistical and mathematical libraries, including NumPy, SciPy, ruptures, and Pandas, which help analyse thousands of signals after an alert.
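To make the signal analysis more concrete, here is a minimal sketch in plain Python of the kind of question a change-point library such as ruptures is designed to answer: where does a signal's average level shift? It does not use the ruptures API. The function name `find_mean_shift` and the sample data are hypothetical, and the brute-force least-squares search handles only a single change point.

```python
from statistics import fmean


def find_mean_shift(signal):
    """Return the index that best splits `signal` into two constant-mean segments.

    Hypothetical helper: tries every split point and keeps the one with the
    lowest total squared error around each segment's own mean.
    """
    if len(signal) < 2:
        raise ValueError("need at least two samples")

    def squared_error(segment):
        mean = fmean(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_index, best_cost = None, float("inf")
    for i in range(1, len(signal)):
        cost = squared_error(signal[:i]) + squared_error(signal[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


# Made-up latency samples: steady around 10, then jumping to around 30.
latency_ms = [10, 11, 9, 10, 10, 30, 31, 29, 30]
change_at = find_mean_shift(latency_ms)  # index of the first post-shift sample
```

A production library would typically support multiple change points and far more efficient search strategies; this version is only meant to show the underlying idea.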
Python has also been used to develop a time series correlation system, as well as a distributed worker system to parallelize large analytic workloads.
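As a rough illustration of what correlating time series across a pool of workers can look like, the sketch below scores several signals against a target series and ranks them by strength of correlation. It is an assumption-laden stand-in rather than Netflix's system: the names `pearson` and `rank_by_correlation` and all the data are hypothetical, and a local thread pool takes the place of a distributed worker system.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(target, candidates, max_workers=4):
    """Score each candidate against `target` in a worker pool, strongest first."""
    names = list(candidates)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(lambda name: pearson(target, candidates[name]), names))
    return sorted(zip(names, scores), key=lambda pair: abs(pair[1]), reverse=True)


# Made-up signals sampled at the same timestamps.
error_rate = [1, 2, 3, 4, 5]
signals = {
    "cpu_usage": [2, 4, 6, 8, 10],   # moves with the error rate
    "free_memory": [9, 7, 5, 3, 1],  # moves against it
    "disk_usage": [5, 5, 5, 5, 5],   # flat
}
ranking = rank_by_correlation(error_rate, signals)
```

Threads keep the example short, but they do not speed up CPU-bound work in CPython; a real analytic workload of this kind would more plausibly be spread across processes or machines.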
On top of that, Python is also typically used for automation tasks, data exploration and cleaning, and visualization.
Monitoring and automated response
Netflix’s Insight Engineering team is responsible for building and operating the tools for generating alerts, diagnostics, and automatic remediation.
They now support Python clients for most of their services, including the Spectator Python client library, a library for recording dimensional, time-series metrics.
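For readers unfamiliar with the term, a "dimensional" metric is one identified by a name plus a set of tags, so the same measurement can be sliced by region, status code and so on. The sketch below shows that idea with a tiny in-memory counter registry. It is not the Spectator API: the `CounterRegistry` class, its methods and the tag values are all hypothetical, and timestamps are left out to keep it short.

```python
from collections import defaultdict


class CounterRegistry:
    """Hypothetical in-memory registry of counters keyed by name plus tags."""

    def __init__(self):
        self._counts = defaultdict(int)

    def increment(self, name, tags=None, amount=1):
        # Freeze the tags into a sorted tuple so tag order does not matter.
        key = (name, tuple(sorted((tags or {}).items())))
        self._counts[key] += amount

    def total(self, name, **match):
        """Sum every counter called `name` whose tags include all of `match`."""
        return sum(
            count
            for (metric, tags), count in self._counts.items()
            if metric == name and match.items() <= dict(tags).items()
        )


registry = CounterRegistry()
registry.increment("http.requests", {"region": "us-east-1", "status": "200"})
registry.increment("http.requests", {"region": "us-east-1", "status": "500"})
registry.increment("http.requests", {"region": "eu-west-1", "status": "200"}, amount=3)

all_requests = registry.total("http.requests")
ok_requests = registry.total("http.requests", status="200")
us_east_errors = registry.total("http.requests", region="us-east-1", status="500")
```

Because every combination of tags is stored separately, the same data can answer both broad questions (all requests) and narrow ones (errors in one region) without being recorded twice.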
The Python frameworks Gunicorn, Flask, and Flask-RESTPlus were also used to create Netflix’s Winston and Bolt diagnostic and remediation platforms.
Security
Netflix’s information security team uses Python for a wide variety of tasks, including security automation, risk classification, auto-remediation, and vulnerability identification.
Python projects include:
- Security Monkey, an open-source Netflix library for monitoring AWS, Google Cloud Platform, OpenStack, and GitHub for changes to assets.
- The Bless SSH Certificate Authority to protect SSH resources.
- Repokid to help with IAM (Identity and Access Management) permission tuning.
- Lemur to help generate TLS certificates.
- Diffy, a forensics triage tool that Netflix built entirely using Python.
Machine learning
Netflix relies on Python extensively when training the machine learning models it uses for everything from recommendation algorithms to artwork personalization to marketing algorithms.
Some algorithms use TensorFlow, Keras, and PyTorch when training deep neural networks, while XGBoost and LightGBM are used to build Gradient Boosted Decision Trees.
Netflix also uses the broader scientific stack in Python, such as NumPy, SciPy, scikit-learn, Matplotlib, Pandas and cvxpy.
Metaflow, a Python framework that makes it easy to execute ML projects from the prototype stage to production, is used across the company at scale. With Metaflow, Netflix relies on well-parallelized and optimized Python code to fetch data at 10 Gbps, handle hundreds of millions of data points in memory, and orchestrate computation over tens of thousands of CPU cores.
Jupyter Notebooks are also used for working up new experiments.
Experimentation
Netflix’s scientific computing team for experimentation provides a platform for scientists and engineers to analyze A/B tests and other experiments.
Among the Python frameworks they use are:
- The Metrics Repo, a Python framework based on PyPika that allows users to write reusable, parameterized SQL queries.
- The Causal Models library, a Python and R framework, which uses PyArrow and RPy2, and allows scientists to contribute new models for causal inference.
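To give a sense of what a reusable, parameterized SQL query looks like in practice, here is a small sketch using only Python's built-in sqlite3 module rather than PyPika or Netflix's Metrics Repo. The table layout, the `mean_by_cell_query` function, the metric names and the test data are all hypothetical.

```python
import sqlite3

# Hypothetical reusable query: the metric column is chosen from an allow-list,
# while the test name is passed as a bound parameter.
ALLOWED_METRICS = {"minutes_watched", "sessions"}


def mean_by_cell_query(metric):
    """Build SQL that averages `metric` per test cell for one A/B test."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"unknown metric: {metric}")
    return (
        f"SELECT cell, AVG({metric}) FROM results "
        "WHERE test_name = ? GROUP BY cell ORDER BY cell"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (test_name TEXT, cell TEXT, minutes_watched REAL, sessions INTEGER)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        ("new_artwork", "control", 40.0, 2),
        ("new_artwork", "control", 60.0, 4),
        ("new_artwork", "treatment", 70.0, 5),
        ("other_test", "control", 10.0, 1),
    ],
)

# The same query builder serves two different metrics for the same test.
minutes = conn.execute(mean_by_cell_query("minutes_watched"), ("new_artwork",)).fetchall()
sessions = conn.execute(mean_by_cell_query("sessions"), ("new_artwork",)).fetchall()
conn.close()
```

Column names cannot be bound as SQL parameters, which is why the sketch checks the metric against an allow-list before interpolating it; a query builder handles this kind of composition more systematically.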
Meanwhile, Netflix’s visualization library is based on Plotly.
Video encoding and automated content analysis
Netflix has a team dedicated to encoding the Netflix catalog and using machine learning to analyse it, for example to extract the best stills from a movie.
The team uses Python in around 50 projects, including the video quality evaluation library vmaf and the mezzfs library for mounting content from cloud object storage as local files.