Brevitas

Latest version: v0.11.0


0.9.0

Highlights
* Initial support for graph quantization to programmatically generate a quantized model from a floating-point one (a rough usage sketch follows this list). ImageNet examples with PTQ can be found here: https://github.com/Xilinx/brevitas/tree/master/src/brevitas_examples/imagenet_classification/ptq.
* Initial support for QuantMultiheadAttention, which is leveraged e.g. for ViT support in the PTQ examples above.
* Various improvements to graph equalization, which are leveraged in the PTQ examples above.
* New accumulation-aware quantizers to train for low-precision accumulation, based on our A2Q paper: https://arxiv.org/abs/2301.13376.
* Experimental support for the BatchQuant quantizer, based on https://arxiv.org/abs/2105.08952, currently still untested.
* Initial support for learned rounding.
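
A rough sketch of what the programmatic flow looks like is given below. The entry points `preprocess_for_quantize` and `quantize` from `brevitas.graph.quantize` and their defaults are assumptions here; see the linked PTQ examples for the exact invocation.

```python
# Hedged sketch of graph quantization: function names, defaults and the preprocessing
# step are assumptions based on the PTQ examples, not a definitive API reference.
import torch
from torchvision.models import resnet18
from brevitas.graph.quantize import preprocess_for_quantize, quantize

fp_model = resnet18(pretrained=True).eval()
prepared = preprocess_for_quantize(fp_model)  # trace and canonicalize the float graph
quant_model = quantize(prepared)              # rewrite layers into quantized equivalents
```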

Overview of changes

Graph quantization

* Initial graph quantization support by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/549 https://github.com/Xilinx/brevitas/pull/574 https://github.com/Xilinx/brevitas/pull/532 https://github.com/Xilinx/brevitas/pull/579

Quantized layers

* Initial support for QuantMultiheadAttention https://github.com/Xilinx/brevitas/pull/568
* Breaking change: rename Quant(Adaptive)AvgPool to Trunc(Adaptive)AvgPool by volcacius in https://github.com/Xilinx/brevitas/pull/562
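
A minimal usage sketch of the new attention layer, assuming the constructor mirrors `torch.nn.MultiheadAttention`:

```python
# Sketch only: QuantMultiheadAttention with its default quantizers, used as a drop-in
# replacement for torch.nn.MultiheadAttention (mirrored arguments are an assumption).
import torch
from brevitas.nn import QuantMultiheadAttention

attn = QuantMultiheadAttention(embed_dim=64, num_heads=4)
x = torch.randn(10, 2, 64)           # (sequence, batch, embedding)
out, attn_weights = attn(x, x, x)    # quantized self-attention
```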

Quantizers

* Weight normalization-based integer quantizers by i-colbert in https://github.com/Xilinx/brevitas/pull/559
* Accumulator-aware weight quantization by i-colbert in https://github.com/Xilinx/brevitas/pull/567
* BatchQuant quantizers support by volcacius in https://github.com/Xilinx/brevitas/pull/563
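
The new quantizers plug into layers through the usual quantizer keyword arguments. A generic sketch with a built-in quantizer is shown below; the exact class names of the weight-normalization and accumulator-aware quantizers are documented in the linked PRs.

```python
# Generic pattern for selecting a weight quantizer on a quantized layer, shown with the
# built-in Int8WeightPerTensorFloat; the new A2Q quantizers are passed the same way.
from brevitas.nn import QuantLinear
from brevitas.quant import Int8WeightPerTensorFloat

layer = QuantLinear(128, 64, bias=True, weight_quant=Int8WeightPerTensorFloat)
```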

QuantTensor

* Support to move QuantTensor across devices by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/528
* Initial support for interpolate and pixel_shuffle by volcacius in https://github.com/Xilinx/brevitas/pull/578
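
A short sketch of the device-move support: a QuantTensor is obtained from a quantized activation with `return_quant_tensor=True`, and moving it with `.to()` is assumed to carry the quantization metadata along.

```python
# Sketch: moving a QuantTensor across devices; .to() carrying scale, zero-point and
# bit-width along with the values is an assumption based on the linked PR.
import torch
from brevitas.nn import QuantIdentity

quant_inp = QuantIdentity(return_quant_tensor=True)
qt = quant_inp(torch.randn(4, 8))
if torch.cuda.is_available():
    qt = qt.to('cuda')  # value, scale and zero-point end up on the same device
```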

PTQ

* Batch Norm support in graph equalization by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/531
* Mul support in graph equalization by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/530
* Learned round support by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/573
* MultiheadAttention and LayerNorm support in graph equalization by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/555
* Fix calibration over large number of batches by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/523
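
For context, activation calibration in the PTQ flow iterates the model over a calibration set under the `calibration_mode` context manager. A minimal sketch, where the model and loader are placeholders:

```python
# Minimal sketch of PTQ activation calibration; calib_loader is a placeholder DataLoader
# of representative inputs, calibration_mode comes from brevitas.graph.calibrate.
import torch
from brevitas.graph.calibrate import calibration_mode

def calibrate(model, calib_loader):
    model.eval()
    with torch.no_grad(), calibration_mode(model):
        for images, _ in calib_loader:
            model(images)  # forward passes only collect activation statistics
    return model
```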

Export

* Itemize scalar quantize args only in TorchScript QCDQ by volcacius in https://github.com/Xilinx/brevitas/pull/561
* Round avgpool export fixes by volcacius in https://github.com/Xilinx/brevitas/pull/562

CI, linting

* Linter isort by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/505
* CI: bump isort from 5.10.1 to 5.11.5 by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/540
* Test: enable parallelism with pytest-xdist by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/513
* GHA workflow improvement by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/507
* Add support for yapf by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/511

FX

* Disable FX backport on 1.8.1+ by volcacius in https://github.com/Xilinx/brevitas/pull/504

Examples
* Pretrained Resnet18 example on CIFAR10 targeting FINN by volcacius in https://github.com/Xilinx/brevitas/pull/577
* Graph quantization + PTQ examples and benchmarking scripts by Giuseppe5 in https://github.com/Xilinx/brevitas/pull/547 https://github.com/Xilinx/brevitas/pull/575 https://github.com/Xilinx/brevitas/pull/576

**Full Changelog**: https://github.com/Xilinx/brevitas/compare/v0.8.0...v0.9.0

bnn_pynq-r2
Model definition and pretrained 4b variant of ResNet18 for FINN deployment. Available under the bnn_pynq examples:

```python
from brevitas_examples.bnn_pynq.models import resnet18_4w4a
quant_model = resnet18_4w4a(pretrained=True)
```

0.8.0

What's Changed
* Add support for PyTorch 1.11-1.13.1. Brevitas 0.8 supports PyTorch 1.5.1 to 1.13.1, with 1.10+ suggested.
* Deprecate support for Python 3.6, 3.7+ is now required.
* Add support for export to ONNX QCDQ for <= int8 quantization, for out-of-the-box execution with onnxruntime or similar backends.
* Extend support for export to ONNX QOps to <= int8 quantization, for out-of-the-box execution with onnxruntime or similar backends.
* Add experimental support for export to torch QCDQ for <= int32 quantization, as an entry point for future MLIR integration with torch-mlir.
* Add support for QuantRNN and QuantLSTM, with CIFG, bidirectional layers, shared input-hidden gates, shared quantizers, training-time JIT compilation, and partial export to ONNX (QONNX and QCDQ); a brief usage sketch follows this list.
* Improve support for zero-point for both weights and activations quantization.
* New default asymmetric activation quantizer based on percentile rather than min/max.
* Add more built-in quantizers (symmetric per-channel, asymmetric per-channel, symmetric decoupled per-channel).
* Simplify interface for activation calibration.
* Simplify interface for bias correction.
* Initial support for QuantEmbedding.
* Deprecate support for XIR and PyXIR export flows.
* Many bug fixes and minor improvements.
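
As a rough illustration of the new export path and recurrent layers (a sketch under assumptions, not the definitive API; file names, shapes and keyword names are placeholders):

```python
# Sketch: ONNX QCDQ export of a quantized layer, plus constructing the new QuantLSTM as a
# drop-in for torch.nn.LSTM. The export keyword (args vs input_t) has varied across
# Brevitas versions, so treat the exact call below as an assumption.
import torch
from brevitas.nn import QuantLinear, QuantLSTM
from brevitas.export import export_onnx_qcdq

layer = QuantLinear(16, 8, bias=True)
export_onnx_qcdq(layer, args=torch.randn(1, 16), export_path='quant_linear_qcdq.onnx')

lstm = QuantLSTM(input_size=16, hidden_size=32, bidirectional=True)  # mirrored nn.LSTM args assumed
```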

New Contributors
* fd0r made their first contribution in https://github.com/Xilinx/brevitas/pull/434
* omarperacha made their first contribution in https://github.com/Xilinx/brevitas/pull/483
* andrei-stoian-zama made their first contribution in https://github.com/Xilinx/brevitas/pull/470

**Full Changelog**: https://github.com/Xilinx/brevitas/compare/v0.7.1...v0.8.0

0.7.1

Fixes
- Various issues in the arithmetic of QuantTensor
- Remove a requirement on find_unused_parameters=True in DDP
- Bias quantization not being enabled if bias is added to a layer post init
- Sharing per-tensor weight quantizer
- Improve implementation of zero-point from stats
- Bias export in QOp ONNX

**Full Changelog**: https://github.com/Xilinx/brevitas/compare/v0.7.0...v0.7.1

0.7.0

Breaking changes
- DPUv1 specific export flow has been deprecated (since DPUv1 has been deprecated in Vitis AI).
- Support for PyTorch < 1.5 has been deprecated.
- The previous implementation of graph quantization has been deprecated.

Fixes
- Issues in the interaction between statistics collection in quantized activations and BREVITAS_JIT=1 have been resolved.
- Statistics collection in quantized activations is now performed in a buffer before switching to a learned Parameter, to keep behaviour consistent in distributed training.
- Custom ONNX functions are now properly registered with PyTorch.
- Various other minor fixes, see full changelog below.

Features
- Support for various more operators in QuantTensor.
- Initial support for post-training quantization through statistics collection, bias correction, and equalization.
- Initial support for FX-based graph quantization, currently targeting FlexML (an internal toolchain) only.
- Various other minor enhancements, see full changelog below.

**Full Changelog**: https://github.com/Xilinx/brevitas/compare/v0.6.0...v0.7.0

0.6.0

Breaking changes
- Quantizers now require a matching proxy class to be specified through the `proxy_class` attribute. This is necessary to export more custom quantization techniques through BrevitasONNX. Quantization solvers like `WeightQuantSolver` already specify their corresponding `proxy_class`. Any custom quantizer that doesn't inherit from built-in solvers or quantizers will break.

Features
- New `brevitas.fx` subpackage with:
  - A backport of `torch.fx` from version 1.8.1 to earlier versions of PyTorch down to 1.3.1.
  - A generalized tracer (`brevitas.fx.value_tracer`) that is capable of partially evaluating against the `concrete_args` without reducing down to constants, as illustrated in https://github.com/pytorch/pytorch/issues/56862. This allows tracing through conditionals and unpacking of tuples as long as representative input data is provided.
  - A symbolic tracer that accounts for Brevitas layers as leaf modules (`brevitas.fx.brevitas_symbolic_trace`) and its generalized variant (`brevitas.fx.brevitas_value_trace`); a short usage sketch follows this list.
- Port existing graph quantization transformations in `brevitas.graph` to `brevitas.fx`. Still not ready for easy public consumption, but useful to anyone who knows what they are doing.
- Rewrite bias export in the FINN ONNX export flow.
- Add DPURound, with matching STE implementations and wrappers.
- Add matching implementations for the symbolic ops Quant and DecoupledQuant in the BrevitasONNX export flow.
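
A short sketch of the Brevitas-aware tracer on a small quantized model (the model definition is a placeholder):

```python
# Sketch: brevitas_symbolic_trace keeps Brevitas layers as leaf modules, so QuantConv2d
# shows up as a single call_module node in the traced graph.
import torch.nn as nn
from brevitas.nn import QuantConv2d
from brevitas.fx import brevitas_symbolic_trace

model = nn.Sequential(QuantConv2d(3, 8, kernel_size=3), nn.ReLU())
graph_module = brevitas_symbolic_trace(model)
print(graph_module.graph)
```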

Bugfixes
- Fix leftover issues with 16b datatypes not being preserved after quantization during mixed-precision training.
- Fix per-channel quantization on QuantConvTranspose1d/2d.
- Fix per-channel quantization whenever two layers with quantized weights share the same quantizer.
- Fix export from non-CPU devices.

0.5.1

Highlights

Minor release with a bunch of fixes:
- Fix compatibility with latest onnx (1.9+) by adding a dependency on onnxoptimizer.
- Fix issues with calls to view on non-contiguous data in recent PyTorch versions by switching to reshape.
- Fix a bunch of typos in the README.
- Fix a casting issue that was preventing mixed-precision training from working (it is still generally not recommended).

Thanks to all the contributors.
