TorchServe

Latest version: v0.12.0


0.7.1

Not secure
This is the release of TorchServe v0.7.1.
Security
+ Upgraded com.google.code.gson:gson from 2.10 to 2.10.1 in serving SDK - https://github.com/pytorch/serve/pull/2096 snyk-bot
+ Upgraded ubuntu from 20.04 to rolling in Dockerfile files - https://github.com/pytorch/serve/pull/2066, https://github.com/pytorch/serve/pull/2065, https://github.com/pytorch/serve/pull/2064 msaroufim
+ Updated to safe snakeyaml, grpc and gradle versions - https://github.com/pytorch/serve/pull/2081 jack-gits
+ Updated Dockerfile.dev to install gnupg before calling apt-key del 7fa2af80 - https://github.com/pytorch/serve/pull/2076 yeahdongcn

Dependency Upgrades
+ Support PyTorch 1.13.1 - https://github.com/pytorch/serve/pull/2078 agunapal

Improvements
+ Removed bad eval when onnx session used - https://github.com/pytorch/serve/pull/2034 msaroufim
+ Updated runner label in regression_tests_gpu.yml - https://github.com/pytorch/serve/pull/2080 lxning
+ Updated nightly benchmark config - https://github.com/pytorch/serve/pull/2092 lxning

Documentation
+ Added TorchServe 2022 blogs in Readme - https://github.com/pytorch/serve/pull/2060 msaroufim
The blogs are [Torchserve Performance Tuning, Animated Drawings Case-Study](https://pytorch.org/blog/torchserve-performance-tuning/), [Walmart Search: Serving Models at a Scale on TorchServe](https://medium.com/walmartglobaltech/search-model-serving-using-pytorch-and-torchserve-6caf9d1c5f4d), [Scaling inference on CPU with TorchServe](https://www.youtube.com/watch?v=066_Jd6cwZg), and [TorchServe C++ backend](https://www.youtube.com/watch?v=OSmGGDpaesc).
+ Fixed HuggingFace large model instruction - https://github.com/pytorch/serve/pull/2087 HamidShojanazeri
+ Reworded examples Readme to highlight examples - https://github.com/pytorch/serve/pull/2086 agunapal
+ Updated torchserve_on_win_native.md - https://github.com/pytorch/serve/pull/2050 blackrabbit
+ Fixed typo in batch inference md - https://github.com/pytorch/serve/pull/2049 MasoudKaviani

Deprecation
+ Deprecated future package and drop Python2 support - https://github.com/pytorch/serve/pull/2082 namannandan

Platform Support
Ubuntu 16.04, Ubuntu 18.04, Ubuntu 20.04, MacOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above, and JDK17.

GPU Support

0.7.0

Not secure
This is the release of TorchServe v0.7.0.

New Examples
+ HF + Better Transformer integration https://github.com/pytorch/serve/pull/2002 HamidShojanazeri

Better Transformer (Flash Attention & xFormers memory-efficient attention) provides out-of-the-box performance with major speedups for [PyTorch Transformer encoders](https://pytorch.org/blog/a-better-transformer-for-fast-transformer-encoder-inference/). This has been integrated into the TorchServe HF Transformer example; please read more about this integration [here](https://medium.com/pytorch/bettertransformer-out-of-the-box-performance-for-huggingface-transformers-3fbe27d50ab2).

The main speedups in Better Transformer come from exploiting sparsity on padded inputs and from kernel fusions. As a result, you will see the biggest gains on larger workloads, such as sequences with longer padding and larger batch sizes.

In our benchmarks on P3 instances with 4 V100 GPUs, using TorchServe benchmarking workloads, throughput improved significantly at larger batch sizes: 17.2% with batch size 4, 45.5% with batch size 8, 50.8% with batch size 16, 45.2% with batch size 32, and 47.2% with batch size 64. These numbers can vary based on your workload (batch size, padding percentage) and your hardware. Please look up some other benchmarks in the [blog post](https://medium.com/pytorch/bettertransformer-out-of-the-box-performance-for-huggingface-transformers-3fbe27d50ab2).

+ `torch.compile()` support https://github.com/pytorch/serve/pull/1960 msaroufim

We've added experimental support for PyTorch 2.0's `torch.compile()` within TorchServe. To use it, supply a file `compile.json` when archiving your model to specify which backend you want. We've also enabled `mode=reduce-overhead` by default, which is ideally suited to the smaller batch sizes that are more common in inference. For now we recommend leveraging GPUs with tensor cores, such as A10G or A100, since you're likely to see the greatest speedups there.

On training we've seen speedups ranging from 30% to 2x (https://pytorch.org/get-started/pytorch-2.0/), but we haven't run any performance benchmarks for inference yet. Until then, we recommend you continue leveraging other runtimes like TensorRT or IPEX for accelerated inference, which we highlight in our `performance_guide.md`. There are a few important caveats when using `torch.compile()`: changes in batch size will cause recompilations, so make sure to keep the batch size small and stable; there is additional overhead when starting a model since it must be compiled first; and you'll likely still see the largest speedups with TensorRT.

However, we hope that adding this support will make it easier for you to benchmark and try out PT 2.0. Learn more here https://github.com/pytorch/serve/tree/master/examples/pt2
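As a concrete illustration of the archiving flow described above, a minimal `compile.json` selecting the inductor backend might look like the following. The exact key names are an assumption here; consult the linked `examples/pt2` directory for the authoritative schema.

```json
{
  "pt2": "inductor"
}
```

The file is then included in the model archive (for example via `torch-model-archiver --extra-files compile.json`) so TorchServe can pick up the chosen backend at load time.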

Dependency Upgrades
+ Support Python 3.10 https://github.com/pytorch/serve/pull/2031 agunapal
+ Support PyTorch 1.13 and CUDA 11.7 https://github.com/pytorch/serve/pull/1980 agunapal
+ Update docker default from Ubuntu 18.04 to Ubuntu 20.04 (LTS) https://github.com/pytorch/serve/pull/1970 LuigiCerone

Improvements
+ KServe upgrade to 0.9 - https://github.com/pytorch/serve/issues/1860 Jagadeesh
+ Added pyyaml for python venv https://github.com/pytorch/serve/pull/2014 lxning
+ Added HF BERT Better Transformer benchmark https://github.com/pytorch/serve/issues/2024 lxning

Documentation
+ Fixed response time unit https://github.com/pytorch/serve/pull/2015 lxning

Platform Support
Ubuntu 16.04, Ubuntu 18.04, MacOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above, and JDK17.

GPU Support

0.6.1

Not secure
This is the release of TorchServe v0.6.1.

New Features
+ Metrics Caching in Python backend - https://github.com/pytorch/serve/pull/1954 maaquib joshuaan7
+ ONNX models served via ORT runtime & docs for TensorRT https://github.com/pytorch/serve/pull/1857 msaroufim
+ IPEX launcher core pinning https://github.com/pytorch/serve/pull/1401 min-jean-cho - learn more at https://pytorch.org/tutorials/intermediate/torchserve_with_ipex.html

New Examples
+ DLRM example via torchrec https://github.com/pytorch/serve/issues/1648 mreso
+ Scriptable tokenizer example for text classification https://github.com/pytorch/serve/pull/1691 mreso
+ Loading large Huggingface models by using accelerate https://github.com/pytorch/serve/pull/1933 jagadeeshi2i
+ Stable diffusion Deepspeed MII example https://github.com/pytorch/serve/pull/1920 jagadeeshi2i
+ HuggingFace diffuser example https://github.com/pytorch/serve/pull/1904 jagadeeshi2i
+ On-premise near real-time video inference https://github.com/pytorch/serve/pull/1867 agunapal
+ fsspec for large scale batch inference from cloud buckets https://github.com/pytorch/serve/pull/1927 kirkpa
+ Torchdata example for unified training and inference preprocessing pipelines https://github.com/pytorch/serve/pull/1940 PratsBhatt
+ Wav2Vec2 SpeechToText from Huggingface https://github.com/pytorch/serve/pull/1939 altre

Dependency Upgrades
+ Support PyTorch 1.12 and CUDA 11.6 https://github.com/pytorch/serve/pull/1767 lxning
+ Upgraded to JDK17 - https://github.com/pytorch/serve/issues/1619 rohithkrn
+ Bumped gson version for security https://github.com/pytorch/serve/pull/1650 lxning


Improvements
+ Optimized gRPC workflow performance https://github.com/pytorch/serve/pull/1854 lxning
+ Fixed worker shown as ready in DescribeModel endpoint before model is loaded https://github.com/pytorch/serve/issues/1679 lxning
+ Gracefully handle decoding exceptions in python backend https://github.com/pytorch/serve/pull/1789 msaroufim
+ Added handling of OPTIONS requests in management API https://github.com/pytorch/serve/pull/1774 xyang16
+ Fixed model status API in KServe https://github.com/pytorch/serve/pull/1773 jagadeeshi2i
+ Fixed process verification in pid file - https://github.com/pytorch/serve/pull/1866 rohithkrn
+ Updated Nvidia Waveglow/Tacotron2 https://github.com/pytorch/serve/pull/1905 kbumsik
+ Added dev mode in `install_from_src.py` https://github.com/pytorch/serve/pull/1856 msaroufim
+ Added PV creation for Kubernetes setup https://github.com/pytorch/serve/pull/1751 jagadeeshi2i
+ Fixed volume permission in kubernetes setup https://github.com/pytorch/serve/pull/1747 jagadeeshi2i
+ Upgraded hpa with v2beta2 api version https://github.com/pytorch/serve/pull/1760 jagadeeshi2i
+ Fixed gradle deprecation method https://github.com/pytorch/serve/pull/1936 lxning
+ Updated plugins/gradle.properties https://github.com/pytorch/serve/pull/1791 liyaodev
+ Fixed pynvml import failure https://github.com/pytorch/serve/pull/1882 lxning
+ Added pynvml exception management https://github.com/pytorch/serve/pull/1809 lromor
+ Fixed an erroneous logging format string and pylint pragma https://github.com/pytorch/serve/pull/1630 bradlarsen
+ Fixed broken path joins and unclosed files https://github.com/pytorch/serve/pull/1709 DPeled

Build and CI
+ Added ubuntu 20.04 GPU in docker build - https://github.com/pytorch/serve/pull/1773 msaroufim
+ Added spellchecking and link checking automation https://github.com/pytorch/serve/pull/1855 sadra-barikbin
+ Added full release automation https://github.com/pytorch/serve/pull/1739 msaroufim
+ Added workflow for pushing Conda nightly binaries https://github.com/pytorch/serve/pull/1685 agunapal
+ Added code coverage in CI build https://github.com/pytorch/serve/pull/1665 msaroufim
+ Unified documentation build dependencies https://github.com/pytorch/serve/pull/1759 msaroufim
+ Added skipping spellcheck if no files changed https://github.com/pytorch/serve/pull/1919 maaquib
+ Added skipping flaky Java Windows test cases https://github.com/pytorch/serve/pull/1746 msaroufim
+ Added alarm on failed github action https://github.com/pytorch/serve/pull/1781 msaroufim

Documentation
+ Updated FAQ https://github.com/pytorch/serve/pull/1393 for how to decode international language lxning
+ Improved KServe documentation https://github.com/pytorch/serve/pull/1807 jagadeeshi2i
+ Updated `examples/intel_extension_for_pytorch/README.md` https://github.com/pytorch/serve/pull/1816 min-jean-cho
+ Fixed typos and dead links in documentation.

Deprecations
+ Deprecated old `ci/benchmark/buildspec.yml` https://github.com/pytorch/serve/pull/1658 lxning
+ Deprecated old `docker/Dockerfile.neuron.dev` https://github.com/pytorch/serve/pull/1775 in favor of AWS SageMaker DLC. rohithkrn
+ Deprecated redundant `LICENSE.txt` https://github.com/pytorch/serve/pull/1801 msaroufim

Platform Support
Ubuntu 16.04, Ubuntu 18.04, MacOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above, and JDK17.

GPU Support

0.6.0

Not secure
This is the release of TorchServe v0.6.0.

New Features
+ Support PyTorch 1.11 and CUDA 11.3 - Added [support](https://github.com/pytorch/serve/pull/1592) for PyTorch 1.11 and CUDA 11.3.
+ Universal Auto Benchmark and Dashboard Tool - Added [one command line tool](https://github.com/pytorch/serve/tree/master/benchmarks#Auto-Benchmarking-with-Apache-Bench) for model analyzer to get benchmark report([sample](https://github.com/pytorch/serve/blob/master/benchmarks/sample_report.md)) and dashboard on any device.
+ HuggingFace model parallelism integration - Added [example](https://github.com/pytorch/serve/pull/1510) for HuggingFace model parallelism integration.

Build and CI
+ Added nightly benchmark dashboard - Added [nightly benchmark dashboard](https://github.com/pytorch/serve/pull/1589).
+ Migrated CI, nightly binary and docker build to github workflow - Added [CI](https://github.com/pytorch/serve/pull/1576), [docker](https://github.com/pytorch/serve/pull/1562) migration.
+ Fixed GPU regression test `buildspec.yaml` - Added a [fix](https://github.com/pytorch/serve/pull/1479) for the GPU regression test `buildspec.yaml`.

Documentation
+ Updated documentation - Updated [TorchServe](https://github.com/pytorch/serve/pull/1583), [benchmark](https://github.com/pytorch/serve/pull/1572), [snapshot](https://github.com/pytorch/serve/pull/1483) and [configuration](https://github.com/pytorch/serve/pull/1551) documentation; fixed broken [documentation build](https://github.com/pytorch/serve/pull/1570)

Deprecations
+ Deprecated old `benchmark/automated` [directory](https://github.com/pytorch/serve/pull/1594) in favor of new Github Action based workflow

Improvements
+ Fixed workflow threads cleanup - Added a [fix](https://github.com/pytorch/serve/issues/1511) to clean up the workflow inference thread pool.
+ Fixed empty model url - Added a [fix](https://github.com/pytorch/serve/pull/1523) for empty model URLs in the model archiver.
+ Fixed load model failure - Added [support](https://github.com/pytorch/serve/pull/1508) for loading a model directory.
+ HuggingFace text generation example - Added [text generation example](https://github.com/pytorch/serve/pull/1473).
+ Updated metrics json and qlog format log - Added [support](https://github.com/pytorch/serve/pull/1491) for metrics json and qlog format log in log4j2.
+ Added cpu, gpu and memory usage - Added [cpu, gpu and memory usage](https://github.com/pytorch/serve/pull/1453) in `benchmark-ab.py` report.
+ Added exception for `torch < 1.8.1` - Added an [exception](https://github.com/pytorch/serve/pull/1556) to warn when `torch < 1.8.1` is installed.
+ Replaced hard-coded interpreter path in `install_dependencies.py` - Used [sys.executable](https://github.com/pytorch/serve/pull/1555) in `install_dependencies.py`.
+ Added default envelope for workflow - Added [default envelope](https://github.com/pytorch/serve/pull/1550) in model manager for workflow.
+ Fixed multiple docker build errors - Fixed [/home/venv write permission](https://github.com/pytorch/serve/pull/1514), [typo](https://github.com/pytorch/serve/pull/1561) in docker and added [common requirements](https://github.com/pytorch/serve/pull/1509) in docker.
+ Fixed snapshot test - Added a [fix](https://github.com/pytorch/serve/pull/1524) for the snapshot test.
+ Updated `model_zoo.md` - Added [dog breed, mmf and BERT](https://github.com/pytorch/serve/pull/1497) in model zoo.
+ Added `nvgpu` in common requirements - Added [nvgpu](https://github.com/pytorch/serve/pull/1474) in common dependencies.
+ Fixed Inference API ping response - Fixed [typo](https://github.com/pytorch/serve/pull/1541) in Inference API ping response.



Platform Support
Ubuntu 16.04, Ubuntu 18.04, MacOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above.

GPU Support

0.5.3

Not secure
This is the release of TorchServe v0.5.3.

New Features
+ KServe V2 support - Added [support](https://github.com/pytorch/serve/pull/1340) for KServe V2 protocol.
+ Model customized metadata support - Extended [managementAPI](https://github.com/pytorch/serve/pull/1421) to support customized metadata from handler.

Improvements
+ Upgraded [log4j2](https://logging.apache.org/log4j/2.x/security.html) version to 2.17.1 - Added [log4j upgrade](https://github.com/pytorch/serve/pull/1395) to address [CVE-2021-44832](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44832).
+ Upgraded pillow to 9.0.0, Python support upgraded to py3.8/py3.9 - Added [docker](https://github.com/pytorch/serve/pull/1435) and [install dependency](https://github.com/pytorch/serve/pull/1459) upgrades.
+ GPU utilization and GPU memory usage metrics support - Added [support](https://github.com/pytorch/serve/pull/1453) for GPU utilization and GPU memory usage metrics in benchmarks.
+ Workflow benchmark support - Added [support](https://github.com/pytorch/serve/pull/1445) for workflow benchmark.
+ benchmark-ab.py warmup support - Added [support](https://github.com/pytorch/serve/pull/1413) for warmup in benchmark-ab.py.
+ Multiple inputs for a model inference example - Added [example](https://github.com/pytorch/serve/pull/1403) to support multiple inputs for a model inference.
+ Documentation refactor - Improved [documentation](https://github.com/pytorch/serve/pull/1424).
+ Added API auto-discovery - Added [support](https://github.com/pytorch/serve/pull/1418) for API auto-discovery.
+ Nightly build support - Added [support](https://pypi.org/project/torchserve-nightly/) for GitHub Action nightly builds: `pip install torchserve-nightly`

Platform Support
Ubuntu 16.04, Ubuntu 18.04, MacOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above.

GPU Support

0.5.2

Not secure
This is a hotfix release for the Log4j issue.

Log4j Fix
+ **Upgrade [log4j2](https://logging.apache.org/log4j/2.x/security.html) version to 2.17.0** - Added [log4j upgrade](https://github.com/pytorch/serve/pull/1378) to address [CVE-2021-45105](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45105).
