An Overview of the PyTorch Ecosystem
PyTorch has been rapidly catching up to TensorFlow as a go-to framework for AI research. It is a machine learning library developed by Facebook AI Research (FAIR), first introduced in 2016 and distributed under the BSD license as free, open-source software. Because Google released TensorFlow about a year earlier, PyTorch initially trailed in adoption among researchers and engineers, but that has been changing as more researchers find PyTorch much easier to work with.
That said, this article will not debate the merits of one framework over the other; it focuses on PyTorch and the community that has quickly grown up around it.
As the PyTorch community continues to grow, and the use of the PyTorch framework finds its way into every industry, PyTorch users have developed a rich ecosystem of tools, libraries, and more to support, accelerate, and explore AI development.
There are many out there, but in this article we will quickly go over a selection of about 50 that we find really interesting, along with links to their GitHub repos or websites.
- PyTorch Lightning: PyTorch Lightning is a lightweight PyTorch wrapper for high-performance AI research that aims to abstract deep learning boilerplate while giving you full control and flexibility over your code. With Lightning, you scale your models, not the boilerplate.
- PopTorch: The PopTorch interface library is a simple wrapper for running PyTorch programs directly on Graphcore IPUs.
- BoTorch: BoTorch is a library for Bayesian Optimization. It provides a modular, extensible interface for composing Bayesian optimization primitives.
- Albumentations: Fast and extensible image augmentation library for different CV tasks like classification, segmentation, object detection and pose estimation.
- DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
- MONAI: MONAI provides domain-optimized foundational capabilities for developing healthcare imaging training workflows.
- Flair: Flair is a very simple framework for state-of-the-art natural language processing (NLP).
- Lightly: Lightly is a computer vision framework for self-supervised learning.
- NeMo: NVIDIA NeMo is a conversational AI toolkit built for researchers working on automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech synthesis (TTS). The primary objective of NeMo is to help researchers from industry and academia reuse prior work (code and pretrained models) and make it easier to create new conversational AI models.
- TextBrewer: A PyTorch-based knowledge distillation toolkit for natural language processing.
- OpenMMLab: OpenMMLab covers a wide range of computer vision research topics including classification, detection, segmentation, and super-resolution.
- PyKale: PyKale is a PyTorch library for multimodal learning and transfer learning with deep learning and dimensionality reduction on graphs, images, texts, and videos.
- fastai: fastai is a library that simplifies training fast and accurate neural nets using modern best practices.
- CrypTen: CrypTen is a framework for Privacy Preserving ML. Its goal is to make secure computing techniques accessible to ML practitioners.
- Glow: Glow is an ML compiler that accelerates the performance of deep learning frameworks on different hardware platforms.
- Horovod: Horovod is a distributed training library for deep learning frameworks. Horovod aims to make distributed DL fast and easy to use.
- Determined: Determined is a platform that helps deep learning teams train models more quickly, easily share GPU resources, and effectively collaborate.
- pytorchfi: A runtime fault injection tool for PyTorch.
- FairScale: FairScale is a PyTorch extension library for high performance and large-scale training on one or multiple machines/nodes.
- ParlAI: ParlAI is a unified platform for sharing, training, and evaluating dialog models across many tasks.
- VISSL: A library for state-of-the-art self-supervised learning.
- Pyro: Pyro is a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend.
- ClearML: ClearML is a full-system ML/DL experiment manager, versioning, and ML-Ops solution.
- pystiche: pystiche is a framework for Neural Style Transfer (NST) built upon PyTorch.
- TorchIO: TorchIO is a set of tools to efficiently read, preprocess, sample, augment, and write 3D medical images in deep learning applications written in PyTorch.
- torchdrug: A powerful and flexible machine learning platform for drug discovery.
- skorch: skorch is a high-level library for PyTorch that provides full scikit-learn compatibility.
- Hydra: A framework for elegantly configuring complex applications.
- Ensemble-Pytorch: A unified ensemble framework for PyTorch to improve the performance and robustness of your deep learning model.
- PySyft: PySyft is a Python library for encrypted, privacy-preserving deep learning.
- AdverTorch: A toolbox for adversarial robustness research. It contains modules for generating adversarial examples and defending against attacks.
- einops: Flexible and powerful tensor operations for readable and reliable code.
- Flower: A friendly federated learning framework.
- MMF: A modular framework for vision & language multimodal research from Facebook AI Research (FAIR).
- Kornia: Kornia is a differentiable computer vision library that consists of a set of routines and differentiable modules to solve generic CV problems.
- Catalyst: Catalyst helps you write compact, but full-featured deep learning and reinforcement learning pipelines with a few lines of code.
- AdaptDL: AdaptDL is a resource-adaptive deep learning training and scheduling framework.
- Ray: Ray is a fast and simple framework for building and running distributed applications.
- ONNX Runtime: ONNX Runtime is a cross-platform inferencing and training accelerator.
- DGL: Deep Graph Library (DGL) is a Python package built for easy implementation of the graph neural network model family, on top of PyTorch and other frameworks.
- GPyTorch: GPyTorch is a Gaussian process library implemented using PyTorch, designed for creating scalable, flexible Gaussian process models.
- PyTorchVideo: A deep learning library for video understanding research. Hosts various video-focused models, datasets, training pipelines and more.
- AllenNLP: AllenNLP is an open-source research library built on PyTorch for designing and evaluating deep learning models for NLP.
- TorchMetrics: Machine learning metrics for distributed, scalable PyTorch applications (from the PyTorch Lightning team).
- raster-vision: An open source framework for deep learning on satellite and aerial imagery.
- Transformers: State-of-the-art Natural Language Processing for PyTorch.
- Detectron2: Detectron2 is FAIR's next-generation platform for object detection and segmentation.
- torchgeo: Datasets, transforms, and models for geospatial data.
- PyTorch-NLP: Basic Utilities for PyTorch Natural Language Processing (NLP).
- PyTorch3D: PyTorch3D provides efficient, reusable components for 3D Computer Vision research with PyTorch.
- Hummingbird: Hummingbird compiles trained ML models into tensor computation for faster inference.
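Several entries above (PyTorch Lightning, fastai, skorch, Catalyst) describe themselves as wrappers that cut down on training boilerplate. To make that concrete, here is a minimal sketch of the kind of hand-written loop they aim to abstract, using only core PyTorch. The `train` helper, the toy data, and the hyperparameters are our own illustrative assumptions, not part of any library listed here.

```python
import torch
from torch import nn


def train(model, inputs, targets, epochs=200, lr=0.1):
    """Hypothetical helper: a plain full-batch training loop in core PyTorch."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    losses = []
    for _ in range(epochs):
        optimizer.zero_grad()                    # clear gradients from the previous step
        loss = loss_fn(model(inputs), targets)   # forward pass and loss
        loss.backward()                          # backpropagate
        optimizer.step()                         # update parameters
        losses.append(loss.item())
    return losses


# Toy regression problem (assumed for illustration): fit y = 3x + 1.
torch.manual_seed(0)
x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 3 * x + 1

model = nn.Linear(1, 1)
losses = train(model, x, y)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
```

Even this small loop repeats the same zero-grad, forward, backward, and step sequence that every project ends up rewriting. Real projects also add device placement, checkpointing, logging, and distributed training on top, which is the boilerplate the wrapper libraries set out to manage for you.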
The tools listed above can help you focus on your research rather than on engineering a solution to get the results you want. The PyTorch website is an essential resource for any researcher, and its ecosystem page has a full list of tools, including additional ones we didn’t cover here.