BentoML

Latest version: v1.2.18




0.9.1

0.9.0

What's New

TLDR;
* New input/output adapter design that lets users choose between batch and non-batch implementations
* Speed up the API model server docker image build time
* Changed the recommended import path of artifact classes, now artifact classes should be imported from `bentoml.frameworks.*`
* Improved python pip package management
* Huggingface/Transformers support!!
* Manage packaged models with the Labels API
* Support GCS (Google Cloud Storage) as a model storage backend in YataiService
* Current Roadmap for feedback: https://github.com/bentoml/BentoML/discussions/1128


New Input/Output adapter design

A massive refactoring of BentoML's inference API and input/output adapter design, led by bojiang with help from akainth015.

**BREAKING CHANGE:** API definition now requires declaring if it is a batch API or non-batch API:

```python
from typing import List

from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import JsonInput
from bentoml.frameworks.sklearn import SklearnModelArtifact
from bentoml.types import JsonSerializable  # type annotations are optional

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('classifier')])
class MyPredictionService(BentoService):

    @api(input=JsonInput(), batch=True)
    def predict_batch(self, parsed_json_list: List[JsonSerializable]):
        results = self.artifacts.classifier([j['text'] for j in parsed_json_list])
        return results

    @api(input=JsonInput())  # default batch=False
    def predict_non_batch(self, parsed_json: JsonSerializable):
        results = self.artifacts.classifier([parsed_json['text']])
        return results[0]
```


For APIs with `batch=True`, the user-defined API function is required to process a list of input items at a time and return a list of results of the same length. By default, `api` uses `batch=False`, which processes one input item at a time. Implementing a batch API allows your workload to benefit from BentoML's adaptive micro-batching mechanism when serving online traffic, and also speeds up offline batch inference jobs. We recommend using `batch=True` if performance and throughput are a concern. Non-batch APIs are usually easier to implement, good for quick POCs and simple use cases, and suited for deploying on serverless platforms such as AWS Lambda, Azure Functions, and Google Knative.

Read more about this change and example usage here: https://docs.bentoml.org/en/latest/api/adapters.html

**BREAKING CHANGE:** For `DataframeInput` and `TfTensorInput` users, it is now required to add `batch=True`.

`DataframeInput` and `TfTensorInput` are special input types that only support accepting a batch of inputs at a time.
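A minimal sketch of what this looks like (the service and artifact here are illustrative):

```python
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.frameworks.sklearn import SklearnModelArtifact

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):

    # DataframeInput APIs must now declare batch=True explicitly
    @api(input=DataframeInput(), batch=True)
    def predict(self, df):
        return self.artifacts.model.predict(df)
```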

Input data validation while handling batch input

When the API function receives a list of inputs, it is now possible to reject a subset of the input data and return an error code to the client if the input data is invalid or malformed. Users can do this via the `InferenceTask.discard` API; here's an example:

```python
from typing import List

from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import JsonInput
from bentoml.frameworks.sklearn import SklearnModelArtifact
from bentoml.types import JsonSerializable, InferenceTask  # type annotations are optional

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('classifier')])
class MyPredictionService(BentoService):

    @api(input=JsonInput(), batch=True)
    def predict_batch(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]):
        model_input = []
        for json, task in zip(parsed_json_list, tasks):
            if "text" in json:
                model_input.append(json['text'])
            else:
                task.discard(http_status=400, err_msg="input json must contain `text` field")

        results = self.artifacts.classifier(model_input)

        return results
```


The number of discarded tasks plus the length of the returned results array should equal the length of the input list; this allows BentoML to match the results back to the tasks that have not been discarded.

The new design also allows fine-grained control of the HTTP response, CLI inference job output, etc. For example:

```python
import bentoml
from bentoml.adapters import JsonInput
from bentoml.types import JsonSerializable, InferenceTask, InferenceResult, InferenceError  # type annotations are optional

class MyService(bentoml.BentoService):

    @bentoml.api(input=JsonInput(), batch=False)
    def predict(self, parsed_json: JsonSerializable, task: InferenceTask) -> InferenceResult:
        if task.http_headers['Accept'] == "application/json":
            predictions = self.artifacts.model.predict([parsed_json])
            return InferenceResult(
                data=predictions[0],
                http_status=200,
                http_headers={"Content-Type": "application/json"},
            )
        else:
            return InferenceError(err_msg="application/json output only", http_status=400)
```

Or when `batch=True`:

```python
from typing import List

import bentoml
from bentoml.adapters import JsonInput
from bentoml.types import JsonSerializable, InferenceTask, InferenceResult, InferenceError  # type annotations are optional

class MyService(bentoml.BentoService):

    @bentoml.api(input=JsonInput(), batch=True)
    def predict(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]) -> List[InferenceResult]:
        rv = []
        predictions = self.artifacts.model.predict(parsed_json_list)
        for task, prediction in zip(tasks, predictions):
            if task.http_headers['Accept'] == "application/json":
                rv.append(
                    InferenceResult(
                        data=prediction,
                        http_status=200,
                        http_headers={"Content-Type": "application/json"},
                    ))
            else:
                rv.append(InferenceError(err_msg="application/json output only", http_status=400))
                # or: task.discard(err_msg="application/json output only", http_status=400)
        return rv
```


Other adapter changes:

* Added 3 base adapters for implementing advanced adapters: FileInput, StringInput, MultiFileInput

* Implementing new adapters that support micro-batching is a lot easier now: https://github.com/bentoml/BentoML/blob/v0.9.0.pre/bentoml/adapters/base_input.py

* Per-inference-task prediction logging (1089)

* More adapters now support launching batch inference jobs from the BentoML CLI `run` command, as sketched below; see the API reference for detailed examples: https://docs.bentoml.org/en/latest/api/adapters.html
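For instance, a batch inference job for the service defined above could be launched roughly like this (the input payload and flags are illustrative; check the adapter docs for what each adapter accepts):

```bash
# Run the batch API of a saved bento against a list of JSON inputs
bentoml run MyPredictionService:latest predict_batch --input '[{"text": "hello"}, {"text": "world"}]'
```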

Docker Build Improvements

* Optimize docker image build time (1081) kudos to ZeyadYasser!!
* Per-Python-minor-version base images to speed up image builds (1101, 1096), thanks gregd33!!
* Add "latest" tag to all user-facing docker base images (1046)

Improved pip package management

Setting pip install options in BentoService `env` specification

As suggested in https://github.com/bentoml/BentoML/issues/1036#issuecomment-682179282 - thanks danield137 for suggesting the `pip_extra_index_url` option!

```python
@env(
    auto_pip_dependencies=True,
    pip_index_url='my_pypi_host_url',
    pip_trusted_host='my_pypi_host_url',
    pip_extra_index_url='extra_pypi_index_url'
)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):
    ...
```


**BREAKING CHANGE:** Due to this change, the previous docker build args PIP_INDEX_URL and PIP_TRUSTED_HOST have been removed, as they may conflict with settings in the base image (1036).


* Support passing a conda environment.yml file to `env`, as suggested in 725 https://github.com/bentoml/BentoML/issues/725

* When a version is not specified in the `pip_packages` list, BentoML pins it to the version found in the current Python session. It now does the same for packages added from adapter and artifact classes

* Support specifying package requirement range now, e.g.:
```python
@env(pip_packages=["abc==1.3", "foo>1.2,<=1.4"])
```

It can be any pip version requirement specifier: https://pip.pypa.io/en/stable/reference/pip_install/#requirement-specifiers

* Renamed `pip_dependencies` to `pip_packages` and `auto_pip_dependencies` to `infer_pip_packages`, the old API still works but will eventually be deprecated.
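For instance, a service definition using the old names maps directly onto the new ones (a minimal sketch):

```python
# Before: @env(auto_pip_dependencies=True, pip_dependencies=["scikit-learn"])
# After:
@env(infer_pip_packages=True, pip_packages=["scikit-learn"])
```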

GCS support in YataiService

Added Google Cloud Storage (GCS) support in YataiService as a storage backend. This is an alternative to AWS S3, MinIO, or the POSIX file system. (1017) - Thank you Korusuke PrabhanshuAttri for creating the GCS support!
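Assuming the GCS backend is selected the same way as S3 and MinIO, via the repository base URL (a sketch; the bucket name is illustrative):

```bash
# Start a YataiService that stores saved BentoService bundles in a GCS bucket
bentoml yatai-service-start --repo-base-url gs://my_bento_bucket/
```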

YataiService Labels API for model management

Manage packaged models in YataiService with the labels API, implemented in 1064.

1. Add labels to `BentoService.save`
```python
svc = MyBentoService()
svc.save(labels={'my_key': 'my_value', 'test': 'passed'})
```

2. Add label query for CLI commands
* `bentoml get BENTO_NAME`, `bentoml list`, `bentoml deployment list`, `bentoml lambda list`, `bentoml sagemaker list`, `bentoml azure-functions list`

* Label query supports the `=`, `!=`, `In`, `NotIn`, `Exists`, `DoesNotExist` operators
  - e.g. `key1=value1, key2!=value2, env In (prod, staging), Key Exists, Another_key DoesNotExist`
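For example, assuming the label query is passed via a `--labels` option (as shown in the screenshots below):

```bash
# List bentos whose `env` label is prod or staging and which carry a `test` label
bentoml list --labels "env In (prod, staging), test Exists"
```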

*Simple key/value label selector*
<img width="1329" alt="Screen Shot 2020-09-03 at 5 38 21 PM" src="https://user-images.githubusercontent.com/670949/92186634-4867c580-ee0c-11ea-8dc8-55c28d6a5130.png">

*Use Exists operator*
<img width="1123" alt="Screen Shot 2020-09-03 at 5 40 57 PM" src="https://user-images.githubusercontent.com/670949/92186755-a3012180-ee0c-11ea-8f68-cf30e95ba482.png">

*Use DoesNotExist operator*
<img width="1327" alt="Screen Shot 2020-09-03 at 5 41 41 PM" src="https://user-images.githubusercontent.com/670949/92186785-bc09d280-ee0c-11ea-9465-a10a8411612a.png">

*Use In operator*
<img width="1348" alt="Screen Shot 2020-09-03 at 5 48 42 PM" src="https://user-images.githubusercontent.com/670949/92187108-b6f95300-ee0d-11ea-9744-45ed182d3ab1.png">

*Use multiple label query*
<img width="1356" alt="Screen Shot 2020-09-03 at 7 07 23 PM" src="https://user-images.githubusercontent.com/670949/92191498-caf68200-ee18-11ea-9679-9f4ea06a5484.png">

3. Roadmap - add web UI for filtering and searching with labels API

New framework support: Huggingface/Transformers

1090, 1094 - thanks vedashree29296 for contributing this!

Usage & docs: https://docs.bentoml.org/en/stable/frameworks.html#transformers
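A rough sketch of what a Transformers service can look like under the new `bentoml.frameworks.*` import convention (names and details here are illustrative; see the docs linked above for exact usage):

```python
import bentoml
from bentoml.adapters import JsonInput
from bentoml.frameworks.transformers import TransformersModelArtifact

@bentoml.env(pip_packages=["transformers", "torch"])
@bentoml.artifacts([TransformersModelArtifact("gptModel")])
class TransformerService(bentoml.BentoService):

    @bentoml.api(input=JsonInput(), batch=False)
    def predict(self, parsed_json):
        # The packed artifact bundles both the model and its tokenizer
        model = self.artifacts.gptModel.get("model")
        tokenizer = self.artifacts.gptModel.get("tokenizer")
        input_ids = tokenizer.encode(parsed_json["text"], return_tensors="pt")
        output_ids = model.generate(input_ids, max_length=50)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```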


Bug Fixes:

* Fixed 1030 - bentoml serve fails when packaged on Windows and deployed on Linux 1044
* Handle missing region during SageMaker deployment updates 1049

Internal & Testing:

* Re-organize artifacts related modules 1082, 1085
* Refactoring & improvements around dependency management 1084, 1086
* [TEST/CI] Add tests covering XgboostModelArtifact (1079)
* [TEST/CI] Fix AWS moto related unit tests (1077)
* Lock SQLAlchemy-utils version (1078)

Contributors of 0.9.0 release

Thank you all for contributing to this release!! danield137 ericmand ssakhavi aviaviavi dinakar29 umihui vedashree29296 joerg84 gregd33 mayurnewase narennadig akainth015 yubozhao bojiang

0.8.6

What's New

Yatai service helm chart for Kubernetes deployment [945](https://github.com/bentoml/BentoML/pull/945) jackyzha0

The Helm chart offers a convenient way to deploy YataiService to a Kubernetes cluster.

```bash
# Download BentoML source
$ git clone https://github.com/bentoml/BentoML.git
$ cd BentoML

# 1. Install an ingress controller if your cluster doesn't already have one; the Yatai helm chart installs nginx-ingress by default:
$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx && helm dependencies build helm/YataiService

# 2. Install the YataiService helm chart to the Kubernetes cluster:
$ helm install -f helm/YataiService/values/postgres.yaml yatai-service helm/YataiService

# 3. To uninstall the YataiService from your cluster:
$ helm uninstall yatai-service
```


jackyzha0 added a great tutorial about YataiService helm chart deployment. You can find the guide at https://docs.bentoml.org/en/latest/guides/helm.html

[Experimental] AnnotatedImageInput adapter for image plus additional JSON data [973](https://github.com/bentoml/BentoML/pull/973) ecrows

The AnnotatedImageInput adapter is designed for the common use-cases of image input to include additional information such as object detection bounding boxes, segmentation masks, etc. for prediction. This new adapter significantly improves the developer experience over the previous workaround solution.

**Warning:** Input adapters are currently under refactoring [1002](https://github.com/bentoml/BentoML/issues/1002), we may change the API for AnnotatedImageInput in future releases.

```python
from bentoml.adapters import AnnotatedImageInput
from bentoml.artifact import TensorflowSavedModelArtifact
import bentoml

CLASS_NAMES = ['cat', 'dog']

@bentoml.artifacts([TensorflowSavedModelArtifact('classifier')])
class PetClassification(bentoml.BentoService):

    @bentoml.api(input=AnnotatedImageInput)
    def predict(self, image, annotations):
        # `some_pet_finder` stands in for user-defined pre-processing logic
        cropped_pets = some_pet_finder(image, annotations)
        results = self.artifacts.classifier.predict(cropped_pets)
        return [CLASS_NAMES[r] for r in results]
```


Making a request using `curl`

```bash
$ curl -F image=@image.png -F annotations=@annotations.json http://localhost:5000/predict
```


You can find the current API reference at https://docs.bentoml.org/en/latest/api/adapters.html#annotatedimageinput

Improvements:

* [992](https://github.com/bentoml/BentoML/pull/992) Make the prediction and feedback loggers log to console by default - jackyzha0
* [952](https://github.com/bentoml/BentoML/pull/952) Add a tutorial for deploying BentoService to Azure SQL server to the documentation - yashika51

Bug Fixes:

* [987](https://github.com/bentoml/BentoML/pull/987) & [991](https://github.com/bentoml/BentoML/pull/991) Better AWS IAM role handling for SageMaker deployment - dinakar29
* [995](https://github.com/bentoml/BentoML/pull/995) Fix an edge case causing a RecursionError when running the gunicorn server with `--enable-microbatch` on macOS - bojiang
* [1012](https://github.com/bentoml/BentoML/pull/1012) Fix ruamel.yaml missing issue when using containerized BentoService with Conda - parano

Internal & Testing:

* [983](https://github.com/bentoml/BentoML/pull/983) Move CI tests to Github Actions

Contributors:

Thank you, everyone, for contributing to this exciting release!

bojiang jackyzha0 ecrows dinakar29 yashika51 akainth015

0.8.5

Bug fixes

* API server shows a blank index page (977, 975)
* Failed to package pip-installed dependencies in some edge cases (978, 979)

0.8.4

What's New

Breaking Change: JsonInput migrating to batch API (860, 953)

We are officially changing JsonInput to use the batch-oriented syntax. As of this release (0.8.4), all input adapters in BentoML have migrated to this design. The main difference is that the user-defined API function now receives a list of JSONSerializable objects (Dict, List, Integer, Float, Str) instead of a single JSONSerializable object, and is expected to return an Iterable of exactly the same length. This makes it possible for API endpoints using the JsonInput adapter to take advantage of BentoML's adaptive micro-batching capability.

Here is an example of how JsonInput (formerly JsonHandler) used to work:

```python
@bentoml.api(input=LegacyJsonInput())
def predict(self, parsed_json):
    results = self.artifacts.classifier([parsed_json['text']])
    return results[0]
```


And here is an example with the new JsonInput class:
```python
@bentoml.api(input=JsonInput())
def predict(self, parsed_json_list):
    texts = [j['text'] for j in parsed_json_list]
    return self.artifacts.classifier(texts)
```


The old non-batching JsonInput is still available to help with the transition, simply use `from bentoml.adapters import LegacyJsonInput as JsonInput` to replace the JsonInput or JsonHandler in your code before BentoML 0.8.4. The `LegacyJsonInput` behaves exactly the same as JsonInput in previous releases. We will keep supporting it until BentoML version 1.0.

Custom Web UI support in API Server (839)

Custom web UI can be added to your API server now! Here is an example project: https://github.com/bentoml/gallery/tree/master/scikit-learn/iris-classifier

![bentoml custom web ui](https://raw.githubusercontent.com/bentoml/gallery/master/scikit-learn/iris-classifier/webui.png)

Add your web frontend project directory to your BentoService class and BentoML will automatically bundle all the web UI files and host them when starting the API server:
```python
@env(auto_pip_dependencies=True)
@artifacts([SklearnModelArtifact('model')])
@web_static_content('./static')
class IrisClassifier(BentoService):

    @api(input=DataframeInput())
    def predict(self, df):
        return self.artifacts.model.predict(df)
```


Artifact packing & loading workflow 911, 921, 949

We have refactored the Artifact API, which brings more flexibility to how users package their trained models with BentoML's API.

The most noticeable thing a user can do now is to separate the model training job from BentoML model serving development - the user can now use the Artifact API to save a trained model from their training job and load it later when creating the BentoService class for model serving, e.g.:

Step 1, model training:
```python
from sklearn import svm
from sklearn import datasets

from bentoml.artifact import SklearnModelArtifact

if __name__ == "__main__":
    # Load training data
    iris = datasets.load_iris()
    X, y = iris.data, iris.target

    # Model Training
    clf = svm.SVC(gamma='scale')
    clf.fit(X, y)

    # Save just the trained model with the SklearnModelArtifact to a specific directory
    btml_model_artifact = SklearnModelArtifact('model')
    btml_model_artifact.pack(clf)
    btml_model_artifact.save('/tmp/temp_bentoml_artifact')
```


Step 2: Build BentoService class with the saved artifact:
```python
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.artifact import SklearnModelArtifact

@env(auto_pip_dependencies=True)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):

    @api(input=DataframeInput())
    def predict(self, df):
        # Optional pre-processing, post-processing code goes here
        return self.artifacts.model.predict(df)

if __name__ == "__main__":
    # Create an iris classifier service instance
    iris_classifier_service = IrisClassifier()

    # Load the previously saved artifact
    iris_classifier_service.artifacts.get('model').load('/tmp/temp_bentoml_artifact')

    saved_path = iris_classifier_service.save()
```


This workflow makes developing and debugging BentoService code a lot easier; users no longer need to retrain their model every time they change something in the BentoService class definition and want to try it out.

* Note that the old BentoService class method 'pack' has now been deprecated in this release 915

Add `bentoml containerize` command (847, 884, 941)

```bash
$ bentoml containerize --help
Usage: bentoml containerize [OPTIONS] BENTO

  Containerizes given Bento into a ready-to-use Docker image.

Options:
  -p, --push
  -t, --tag TEXT  Optional image tag. If not specified, Bento will
                  generate one from the name of the Bento.
```
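For example, building an image from a saved bento (the service name and tag are illustrative):

```bash
# Build a Docker image for the latest saved IrisClassifier bento
$ bentoml containerize IrisClassifier:latest -t iris-classifier:latest
```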


Support multiple images in the same request (828)

A new input adapter class `MultiImageInput` https://docs.bentoml.org/en/latest/api/adapters.html#multiimageinput has been added. It is designed for prediction services that require multiple image files as its input:

```python
from bentoml import BentoService
from bentoml.adapters import MultiImageInput
import bentoml

class MyService(BentoService):

    @bentoml.api(input=MultiImageInput(input_names=('imageX', 'imageY')))
    def predict(self, image_groups):
        for image_group in image_groups:
            image_array_x = image_group['imageX']
            image_array_y = image_group['imageY']
            # ... further processing of the paired image arrays
```



Add FileInput adapter (734)

A new input adapter class `FileInput` for handling arbitrary binary files as the input for your prediction service: https://github.com/bentoml/BentoML/blob/v0.8.4/bentoml/adapters/file_input.py#L33
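A minimal sketch of how it might be used, assuming the batch-oriented calling convention described for the other adapters in this release (the API function receives a list of file-like objects and returns one result per file):

```python
import bentoml
from bentoml.adapters import FileInput

class FileSizeService(bentoml.BentoService):

    @bentoml.api(input=FileInput())
    def predict(self, file_streams):
        # Each element is a file-like object containing the uploaded bytes
        return [len(f.read()) for f in file_streams]
```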

Added Ngrok support (917)

Expose your local development model API server over a public URL endpoint, using Ngrok under the hood. To try it out, simply add the `--run-with-ngrok` flag to your `bentoml serve` CLI command, e.g.:

```bash
bentoml serve IrisClassifier:latest --run-with-ngrok
```


Add support for CoreML (939)

Serving CoreML models on macOS is now supported! Users can also convert models trained with other frameworks to the CoreML format for better performance on macOS platforms. Here's an example serving a PyTorch model converted to CoreML with BentoML:

```python
import torch
from torch import nn

class PytorchModel(nn.Module):
    def __init__(self):
        super().__init__()

        self.linear = nn.Linear(5, 1, bias=False)
        torch.nn.init.ones_(self.linear.weight)

    def forward(self, x):
        x = self.linear(x)

        return x

# ------

import numpy
import pandas as pd

import coremltools as ct
from coremltools.models import MLModel  # pylint: disable=import-error

import bentoml
from bentoml.adapters import DataframeInput
from bentoml.artifact import CoreMLModelArtifact

@bentoml.env(auto_pip_dependencies=True)
@bentoml.artifacts([CoreMLModelArtifact('model')])
class CoreMLClassifier(bentoml.BentoService):
    @bentoml.api(input=DataframeInput())
    def predict(self, df: pd.DataFrame) -> float:
        model: MLModel = self.artifacts.model
        input_data = df.to_numpy().astype(numpy.float32)
        output = model.predict({"input": input_data})
        return next(iter(output.values())).item()


def convert_pytorch_to_coreml(pytorch_model: PytorchModel) -> ct.models.MLModel:
    """CoreML is not for training ML models but rather for converting pretrained models
    and running them on Apple devices. Therefore, in this example we convert the
    pretrained PytorchModel from the tests.integration.test_pytorch_model_artifact
    module into a CoreML module."""
    # `test_df` is a pandas DataFrame of sample inputs defined in the original test module
    pytorch_model.eval()
    traced_pytorch_model = torch.jit.trace(pytorch_model, torch.Tensor(test_df.values))
    model: MLModel = ct.convert(
        traced_pytorch_model, inputs=[ct.TensorType(name="input", shape=test_df.shape)]
    )
    return model

# ------

if __name__ == '__main__':
    svc = CoreMLClassifier()
    pytorch_model = PytorchModel()
    model = convert_pytorch_to_coreml(pytorch_model)
    svc.pack('model', model)
    svc.save()
```


Breaking Change: Remove CLI --with-conda option 898

Running inference jobs within an automatically generated conda environment seemed like a good idea at first, but we realized it introduces more problems than it solves. We are removing this option and encourage users to use Docker for running inference jobs instead.

Improvements:
* 966, 968 Faster `save` by improving python local module parsing code
* 878, 879 Faster `import bentoml` with lazy module loader
* 872 Add BentoService API name validation
* 887 Set a smaller page limit for `bentoml list`
* 916 Do not cache pip requirements in Dockerfile
* 918 Improve error handling when micro batching service is unavailable
* 925 Artifact refactoring: set_dependencies method
* 932 Add warning for SavedBundle Python version mismatch
* 904 JsonInput handling of AWS Lambda events should ignore the content-type header
* 951 Add openjdk to H2O artifact default conda dependencies
* 958 Fix typo in cli default argument help message

Bug fixes:

* 864 Fix decode headers with latin1
* 867 Fix DataFrameInput passing NaN values over HTTP JSON request
* 869 Change the default mb_max_latency value to avoid flaky micro-batching initialization
* 897 Fix yatai web client import
* 907 Fix CORS option in AWS Lambda SAM config
* 922 Fix lambda deployment when using AWS assumed-role ARN
* 959 Fix `RecursionError: maximum recursion depth exceeded` when saving BentoService bundle
* 969 Fix error in CLI command `bentoml --version`

Internal & Testing

* 870 Add docs for using BentoML's built-in benchmark client
* 855, 871, 877 Add integration tests for dockerized BentoML API server workflow
* 876, 937 Add integration test for Tensorflow SavedModel artifact
* 951 H2O artifact integration test
* 939 CoreML artifact integration test
* 865 add makefile for BentoML developers
* 868 API Server "/feedback" endpoint refactor
* 908 BentoService base class refactoring and docstring improvements
* 909 Refactor API Server startup
* 910 Refactor API server performance tracing
* 906 Fix yatai web ui startup script
* 875 Increase micro batching server test coverage
* 935 Fix list deployments error response

Community Announcements:

We have enabled the __GitHub Discussions__ feature: https://github.com/bentoml/BentoML/discussions 🎉

This will be a new place for community members to connect, ask questions, and share anything related to model serving and BentoML.

Contributors

Thank you, everyone, for contributing to this amazing release loaded with new features and improvements! bojiang joshuacwnewton guy4261 Sharathmk99 co42 jackyzha0 Korusuke akainth015 omrihar yubozhao

0.8.3

* Fix: 500 Error without message when micro-batch enabled 857
* Fix: port conflict with --debug flag 858
* Fix: permission issue while building docker image for a BentoService created under Windows OS 851

