Opacus Library for Training PyTorch Models
Opacus is a new high-speed library from Facebook AI for training PyTorch models with differential privacy.
“Differential privacy is a system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset.”
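To make the definition above concrete, here is a minimal sketch of one classic differentially private mechanism: answering a counting query with Laplace noise. It is not part of Opacus and is only meant to illustrate the idea. The function names (`laplace_noise`, `private_count`) and the toy `ages` dataset are made up for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that match `predicate`.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon is used.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy dataset: ages of eight people.
ages = [23, 35, 41, 29, 52, 61, 38, 47]
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=random.Random(0))
print(f"noisy count of people aged 40 or older: {noisy:.2f}")
```

The published number still describes the group (roughly how many people are 40 or older), but the noise hides whether any single person is in the data. A smaller epsilon means more noise and stronger privacy.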
Opacus is open source, licensed under Apache-2.0, and its source code is available on GitHub. To install the latest version, use pip: pip install opacus.
Other features that Opacus has to offer are listed by Marktechpost as follows:
- “Opacus can compute batched per-sample gradients (By leveraging Autograd hooks in PyTorch), resulting in a speedup by order of magnitude compared to existing differential privacy libraries that rely on micro-batching.
- Opacus have something unique to offer in the safety field too. Opacus uses a cryptographically safe pseudo-random number generator for its security-critical code, which is processed (for an entire batch of parameters) on the GPU at high speed.
- Opacus is comparatively flexible to use. Because when it comes to prototyping ideas, PyTorch makes it quick for researchers and engineers to mix and match their code with PyTorch code and pure Python code.
- When it comes to productivity, Opacus offers tutorials and some helper functions that will warn you, before the start of the training, about incompatible layers. Opacus also offers automatic refactoring mechanisms.
- Opacus keeps track of how much of your privacy budget you are spending at any given point in time. Privacy budget is a core mathematical concept in differential privacy; thus, Opacus enables real-time monitoring and early stopping. That makes it very clear how interactive Opacus is.”
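The last point in the list, tracking the privacy budget and stopping early, can be sketched in plain Python. This is a deliberately simplified illustration and not Opacus code: it assumes basic sequential composition, where the epsilon values of successive steps simply add up, and Opacus's own accounting is more sophisticated than this. The names `PrivacyBudget` and `train_with_budget` are hypothetical.

```python
class PrivacyBudget:
    """Tracks epsilon spent, assuming basic sequential composition."""

    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def can_spend(self, epsilon):
        # Small tolerance so float rounding does not block the final step.
        return self.spent + epsilon <= self.total_epsilon + 1e-12

    def spend(self, epsilon):
        if not self.can_spend(epsilon):
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


def train_with_budget(max_epochs, epsilon_per_epoch, total_epsilon):
    """Run up to max_epochs, stopping early once the budget would be exceeded."""
    budget = PrivacyBudget(total_epsilon)
    completed = 0
    for _ in range(max_epochs):
        if not budget.can_spend(epsilon_per_epoch):
            break  # early stopping: the next epoch would overspend
        # ... one epoch of private training would run here ...
        budget.spend(epsilon_per_epoch)
        completed += 1
    return completed, budget.spent


completed, spent = train_with_budget(
    max_epochs=10, epsilon_per_epoch=0.5, total_epsilon=2.0
)
print(f"epochs completed: {completed}, epsilon spent: {spent}")
```

With these made-up numbers the loop stops after four epochs, because a fifth would push the total past the budget of 2.0.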
The library is meant to support private training with minimal impact on training performance. This can make differential privacy easier to adopt in machine learning practice and can boost research in the field.
The stated goal is to preserve the privacy of each training sample while limiting the impact on the accuracy of the final model.
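One common way to protect each training sample is to bound how much any single sample can move the model and then add noise, as in differentially private SGD. The sketch below shows that idea in plain Python on lists of numbers. It is an assumption-laden illustration, not Opacus's implementation: the helper names (`clip_gradient`, `private_mean_gradient`) and the toy gradients are hypothetical, and a real library would do this on tensors.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale `grad` down so that its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / (norm + 1e-12))
    return [g * factor for g in grad]


def private_mean_gradient(per_sample_grads, max_norm, noise_multiplier, rng=None):
    """Clip each sample's gradient, sum, add Gaussian noise, then average."""
    if rng is None:
        # SystemRandom draws from the operating system's entropy source.
        rng = random.SystemRandom()
    clipped = [clip_gradient(g, max_norm) for g in per_sample_grads]
    dim = len(per_sample_grads[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm  # noise scaled to the clipping bound
    noisy = [s + rng.normalvariate(0.0, sigma) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]


# Hypothetical per-sample gradients for a two-parameter model.
grads = [[3.0, 4.0], [0.3, 0.4]]
update = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.0)
print(update)
```

Clipping caps the influence of any one sample (the first gradient above has norm 5 and is scaled down to norm 1), while the noise masks what remains. A larger `noise_multiplier` gives stronger privacy at some cost in accuracy, which is the trade-off described above.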
Facebook AI wrote about it in a recent blog post.
This is #500daysofAI and you are reading article 463. I am writing one new article about or related to artificial intelligence every day for 500 days.