InferenceService V1Beta1
:ship: KFServing 0.5 promotes the core InferenceService from v1alpha2 to v1beta1!
The minimum required versions are Kubernetes 1.16 and Istio 1.3.1/Knative 0.14.3. A conversion webhook is installed to automatically convert v1alpha2 InferenceServices to v1beta1.
:new: What's new?
- You can now specify container fields on the ML framework spec, such as environment variables and liveness/readiness probes.
- You can now specify pod template fields on the component spec, such as node affinity.
- Timeouts can now be specified on the component spec.
- Tensorflow Serving [gRPC support](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/tensorflow#create-the-inferenceservice-with-grpc).
- Triton Inference server V2 inference REST/gRPC protocol support, see [examples](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/triton)
- TorchServe [predict integration](https://pytorch.org/serve/inference_api.html#kfserving-inference-api), see [examples](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/torchserve)
- SKLearn/XGBoost V2 inference REST/gRPC protocol support with MLServer, see [SKLearn](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/sklearn) and [XGBoost](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/xgboost) examples
- PMMLServer support, see [examples](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/pmml)
- LightGBM support, see [examples](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/lightgbm)
- Simplified canary rollout: traffic is split at the Knative revision level instead of the service level, see [examples](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/rollout)
- Transformer-to-predictor calls now use AsyncIO by default
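For illustration, a minimal sketch of a v1beta1 InferenceService using the new container-level and component-level fields (the service name, storage URI, and env values here are hypothetical):

```yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    # Component-level fields: request timeout (seconds) and concurrency
    timeout: 60
    containerConcurrency: 1
    sklearn:
      storageUri: gs://kfserving-samples/models/sklearn/iris
      # Container fields can now be set directly on the framework spec
      env:
        - name: LOG_LEVEL
          value: INFO
```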
:warning: What's gone?
- The default/canary endpoint levels are removed; `canaryTrafficPercent` is moved to the component level
- The `rollout_canary` and `promote_canary` APIs are deprecated in the KFServing SDK
- The `parallelism` field is renamed to `containerConcurrency`
- The `Custom` keyword is removed and the `container` field is changed to an array
:arrow_up: What actions are needed to upgrade?
- Make sure all canary traffic is fully rolled out before upgrading, as the v1alpha2 canary spec is deprecated; use the v1beta1 spec for canary rollouts.
- Although KFServing automatically converts InferenceServices to v1beta1, we recommend rewriting all of your specs with the v1beta1 API, as we plan to drop support for v1alpha2 in a later release.
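As a sketch of the simplified canary rollout in v1beta1 (the service name and storage URI here are hypothetical), `canaryTrafficPercent` now lives on the component spec and traffic is split across Knative revisions:

```yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    # Send 10% of traffic to the latest revision; the remaining 90%
    # stays on the previously rolled-out revision
    canaryTrafficPercent: 10
    sklearn:
      storageUri: gs://example-bucket/models/sklearn/iris-v2
```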
Contribution list
* Make KFServer HTTP requests asynchronous 983 by salanki
* Add support for generic HTTP/HTTPS URI for Storage Initializer 979 by tduffy000
* InferenceService v1beta1 API 991 by yuzisun
* Validation check for InferenceService Name 1079 by jazzsir
* Set KFServing default worker to 1 1106 by yuzliu
* Add support for MLServer in the SKLearn predictor 1155 by adriangonz
* Add V2 support to XGBoost predictor 1196 by adriangonz
* Support PMML server 1141 by AnyISalIn
* Generate SDK for KFServing v1beta1 1150 by jinchihe
* Support Kubernetes 1.18 1128 by pugangxa
* Integrate TorchServe to v1beta1 spec 1161 by jagadeeshi2i
* Merge batcher to model agent 1287 by yuzisun
* Fix TorchServe protocol version and update docs 1271 1277
* Support CloudEvent (Avro/Protobuf) for KFServer 1343 by mtickoobb
Multi Model Serving V1Alpha1
:rainbow: KFServing 0.5 introduces Multi-Model Serving with the v1alpha1 TrainedModel CR. This feature is currently experimental and we are looking for your feedback!
Check out the [sklearn](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/sklearn/multimodel) and [triton](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/triton/multimodel) MMS examples.
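A minimal sketch of the experimental v1alpha1 TrainedModel CR (the names and storage URI here are hypothetical); each TrainedModel is loaded onto a parent InferenceService that hosts multiple models:

```yaml
apiVersion: serving.kubeflow.org/v1alpha1
kind: TrainedModel
metadata:
  name: example-sklearn-model
spec:
  # Name of the parent InferenceService that serves this model
  inferenceService: sklearn-mms
  model:
    framework: sklearn
    storageUri: gs://example-bucket/models/example
    memory: 256Mi
```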
* Multi-Model Puller 989 by ifilonenko
* Add multi model configmap 992 by wengyao04
* Trained model v1alpha1 api 1009 by yuzliu
* TrainedModel controller 1013 by yuzliu
* Harden model puller logic and add tests 1055 by yuzisun
* Puller streamlining/simplification 1057 by njhill
* Integrate MMS inferenceservice controller, configmap controller, model agent 1132 by yuzliu
* Add load/unload endpoint for SKLearn/XGBoost KFServer 1082 by wengyao04
* Sync from model config on agent startup 1204 by yuzisun
* Fix model puller flag for MMS 1281 by yuzisun
* TrainedModel status url 1319 by abchoo
* Add MMS support for SKLearn/XGBoost MLServer 1290 by adriangonz
* Support GCS for model agent 1105 by mszacillo
Explanation
* Add support for AIX360 explanations 1094 by drewbutlerbb4
* Alibi 0.5.5 1168 by cliveseldon
* Adversarial robustness explainer(ART) 1244 by drewbutlerbb4
* PyTorch Captum [explain integration](https://pytorch.org/serve/inference_api.html#kfserving-explanations-api), see [example](https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/torchserve/bert#captum-explanations)
Documentation
* Docs/custom domain 1036 by adamkgray
* Update ingress gateway access instruction 1008 by yuzisun
* Document working k8s version 1062 by riklopfer
* Add triton torchscript example with prediction v2 protocol 1131 by yuzisun
* Add torchserve custom server with pv storage example 1182 by jagadeeshi2i
* Add torchserve custom server example 1156 by jagadeeshi2i
* Add torchserve custom server bert sample 1185 by jagadeeshi2i
* Bump up minimal Kube and Istio requirements 1166 by animeshsingh
* V1beta1 canary rollout examples 1267 by yuzisun
* Prometheus-based metrics and monitoring docs 1276 by sriumcp
Developer Experience
* Migrate controller tests to use BDD testing style 936 by yuzisun
* Genericized component logic 1018 by ellistarn
* Use github action for kfserving controller tests 1056 by yuzisun
* Make standalone installation kustomizable 1103 by jazzsir
* Move KFServing CI to AWS 1170 by yuzisun
* Upgrade k8s and kn go library versions 1144 by ryandawsonuk
* Add e2e test for torchserve 1265 by jagadeeshi2i
* Add e2e test for SKLearn/XGBoost MMS 1306 by abchoo
* Upgrade k8s client library to 1.19 1305 by ivan-valkov
* Upgrade controller-runtime to 0.7.0 1341 by pugangxa