Clearml-serving

Latest version: v1.3.0


1.3.0

Stable Release

* Features
* 20% overall performance increase :rocket: thanks to Python 3.11 :fire:
* gRPC channel configuration (#49, @amirhmk)
* Huggingface Transformer example

* Bug fixes
* Fix numpy compatibility (#47, #46, @galleon, @anon-it)
* Fix Triton examples (#50, @amirhmk)
* Add storage environment variables (#45, @besrym)

1.2.0

Stable Release

* Features
* GPU performance improvements: 50%-300% over vanilla Triton
* CPU performance improvements: optimized uvloop + multi-processing
* Huggingface Transformer example
* Binary input support (#37), thanks @Aleksandar1932

* Bug fixes
* stdout/stderr in the inference service was not logged to the dedicated Task

1.1.0

Stable Release

**Notice: This release is not backwards compatible - see notes below on upgrading**

* Breaking Changes
* Triton engine input/output size now supports variable request sizes (-1)

* Features & Bug fixes
* Add version number to the serving session task
* Triton engine support for variable request (matrix) sizes
* Triton support: fix `--aux-config` to support more configuration elements
* Huggingface Transformer support
* `Preprocess` class as module (see note below)

**Note**: To add a `Preprocess` class from a module (the entire module folder will be packaged), structure the folder as follows:

```
preprocess_folder
├── __init__.py  # contains: from .sub.some_file import Preprocess
└── sub
    └── some_file.py
```

Pass the top folder as a path for `--preprocess`, for example:

```
clearml-serving --id <serving_session_id> model add --preprocess /path/to/preprocess_folder ...
```
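For illustration, `sub/some_file.py` might hold a minimal `Preprocess` class along the lines of the sketch below; the method signatures follow the repository's `preprocess_template.py`, while the `x`/`y` request keys are made-up placeholders:

```python
# sub/some_file.py -- illustrative sketch; the "x"/"y" keys are assumptions
from typing import Any


class Preprocess(object):
    def __init__(self):
        # called once when the endpoint is loaded into the serving process
        pass

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # turn the raw request JSON into the model's input
        return body.get("x", [])

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # wrap the raw model output as the response JSON
        return {"y": data.tolist() if hasattr(data, "tolist") else data}
```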

Upgrading from v1.0

1. Take down the serving containers (docker-compose or k8s)
2. Update the clearml-serving CLI `pip3 install -U clearml-serving`
3. Re-add a single existing endpoint with `clearml-serving model add ...` (press yes when asked)
(it will upgrade the clearml-serving session definitions)
4. Pull latest serving containers (`docker-compose pull ...` or k8s)
5. Re-spin serving containers (docker-compose or k8s)

1.0.0

Stable Release

**Notice: This release is not backwards compatible**


* Breaking Changes
* pre / post processing class functions get 3 arguments, see [example](https://github.com/allegroai/clearml-serving/blob/a12311c7d6f273cb02d1e09cf1135feb2afc3338/clearml_serving/preprocess/preprocess_template.py#L27)
* Add support for per-request state storage, passing information between the pre/post processing functions (see the sketch below)
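A minimal sketch of the new three-argument signatures and the shared per-request `state` dict; the `request_id`/`input` keys and the custom statistic name are hypothetical, see the linked template for the authoritative signatures:

```python
from typing import Any


class Preprocess(object):
    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # per-request state: whatever is stored here is handed to postprocess
        state["request_id"] = body.get("request_id")
        if collect_custom_statistics_fn:
            # report a custom per-request metric (the key name is an assumption)
            collect_custom_statistics_fn({"input_size": len(body.get("input", []))})
        return body["input"]

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # read back the request-scoped value stored by preprocess
        return {"request_id": state.get("request_id"), "output": data}
```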

* Features & Bug fixes
* Optimize serving latency while collecting statistics
* Fix metric statistics collecting auto-refresh issue
* Fix live update of model preprocessing code
* Add `pandas` to the default serving container
* Add per endpoint/variable statistics collection control
* Add `CLEARML_EXTRA_PYTHON_PACKAGES` for easier additional python package support (serving inference container)
* Upgrade Nvidia Triton base container image to 22.04 (requires Nvidia drivers 510+)
* Add Kubernetes Helm chart

0.9.0

Redesign Release

**Notice: This release is not backwards compatible**


* Easy to deploy & configure
* Support Machine Learning Models (Scikit Learn, XGBoost, LightGBM)
* Support Deep Learning Models (Tensorflow, PyTorch, ONNX)
* Customizable RestAPI for serving (i.e. allow per model pre/post-processing for easy integration; see the request sketch after this list)
* Flexible
* On-line model deployment
* On-line endpoint model/version deployment (i.e. no need to take the service down)
* Per model standalone preprocessing and postprocessing python code
* Scalable
* Multiple models per container
* Multiple models per serving service
* Multi-service support (fully separated multiple serving service running independently)
* Multi cluster support
* Out-of-the-box node auto-scaling based on load/usage
* Efficient
* Multi-container resource utilization
* Support for CPU & GPU nodes
* Auto-batching for DL models
* Automatic deployment
* Automatic model upgrades w/ canary support
* Programmable API for model deployment
* Canary A/B deployment
* Online Canary updates
* Model Monitoring
* Usage metric reporting
* Metric dashboard
* Model performance metrics
* Model performance dashboard
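As an illustration of querying a deployed endpoint over the REST API; the endpoint name and payload keys follow the repository's scikit-learn example, while the host and port assume a default docker-compose deployment:

```python
import requests

# hypothetical endpoint registered as "test_model_sklearn" on the default inference port
response = requests.post(
    "http://127.0.0.1:8080/serve/test_model_sklearn",
    json={"x0": 1, "x1": 2},  # payload shape is defined by the endpoint's Preprocess code
    timeout=10,
)
print(response.json())
```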


Features:
- [x] FastAPI integration for inference service
- [x] multi-process Gunicorn for inference service
- [x] Dynamic preprocess python code loading (no need for container/process restart)
- [x] Model files download/caching (http/s3/gs/azure)
- [x] Scikit-learn, XGBoost, LightGBM integration
- [x] Custom inference, including dynamic code loading
- [x] Manual model upload/registration to model repository (http/s3/gs/azure)
- [x] Canary load balancing
- [x] Auto model endpoint deployment based on model repository state
- [x] Machine/Node health metrics
- [x] Dynamic online configuration
- [x] CLI configuration tool
- [x] Nvidia Triton integration
- [x] GZip request compression (see the sketch after this list)
- [x] TorchServe engine integration
- [x] Prebuilt Docker containers (dockerhub)
- [x] Docker-compose deployment (CPU/GPU)
- [x] Scikit-Learn example
- [x] XGBoost example
- [x] LightGBM example
- [x] PyTorch example
- [x] TensorFlow/Keras example
- [x] Model ensemble example
- [x] Model pipeline example
- [x] Statistics Service
- [x] Kafka install instructions
- [x] Prometheus install instructions
- [x] Grafana install instructions
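And a hedged sketch of the GZip request compression item above, assuming the inference service accepts a gzip-compressed JSON body via the standard `Content-Encoding` header (endpoint name and payload are placeholders):

```python
import gzip
import json

import requests

# compress the JSON payload before sending; endpoint name is a placeholder
payload = gzip.compress(json.dumps({"x0": 1, "x1": 2}).encode("utf-8"))
response = requests.post(
    "http://127.0.0.1:8080/serve/test_model_sklearn",
    data=payload,
    headers={"Content-Encoding": "gzip", "Content-Type": "application/json"},
    timeout=10,
)
print(response.json())
```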
