We are excited to announce the release of BentoML 1.3! Based on the feedback we've received since the launch of 1.2 earlier this year, we are introducing a host of new features and enhancements in 1.3. Below are the key highlights. Stay tuned for an upcoming blog post, where we'll explore the new features and the motivation behind them in detail.
Here are some of the important points to note about 1.3:
- `1.3` ensures full backward compatibility, meaning that Bentos built with `1.2` will continue to work seamlessly with this release.
- We remain committed to supporting `1.2`. Critical bug fixes and security updates will be backported to the `1.2` branch.
- The [BentoML documentation](https://docs.bentoml.com/en/latest/index.html) has been updated with examples and guides for `1.3`. More guides will be added in the coming weeks.
- BentoCloud supports Bento Deployments from both `1.2` and `1.3` releases of BentoML.
Now, let’s take a look at the major features and enhancements:
🕙 Implemented BentoML task execution
- Introduced the `bentoml.task` decorator to set a task endpoint for executing a long-running workload (such as batch processing or video generation).
- Added the `.submit()` method to both the sync and async clients for submitting task inputs to a task endpoint; dedicated worker processes continuously monitor the task queue for new work to perform.
- Bentos defined with task endpoints are fully compatible with BentoCloud.
- See the [Services](https://docs.bentoml.com/en/latest/guides/services.html) and [Clients](https://docs.bentoml.com/en/latest/guides/clients.html) documentation for examples of defining a task endpoint in a Service, creating clients to call it, and retrieving task status and results.
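As a minimal sketch of the workflow above (the class, method, and URL are illustrative, and a running Service is assumed), a task endpoint is declared with the `@bentoml.task` decorator and invoked through a client's `.submit()` method, which returns a task handle:

```python
import bentoml


@bentoml.service
class BatchSummarizer:
    @bentoml.task
    def summarize(self, text: str) -> str:
        # Long-running work (e.g. batch processing or video
        # generation) happens here; the result is stored so the
        # client can fetch it once the task completes.
        return text[:100]


# With the Service running (e.g. via `bentoml serve`), submit a
# task and check on it without blocking on the full computation:
client = bentoml.SyncHTTPClient("http://localhost:3000")
task = client.summarize.submit(text="A very long document ...")
print(task.get_status())  # poll the task's current state
result = task.get()       # block until the result is available
```

Because `.submit()` returns immediately, the caller is free to do other work and come back for the result later, which is the point of a task endpoint versus a regular synchronous API.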
🚀 Optimized the build cache to accelerate the build process
- Sped up `bentoml build` and `bentoml containerize` by pre-installing large packages such as `torch` in the build cache.
- Switched to `uv` as the package installer and resolver, replacing `pip`.
🔨 Supported concurrency-based autoscaling on BentoCloud
- Added the `concurrency` configuration to the `bentoml.service` decorator to set the ideal number of simultaneous requests a Service is designed to handle.
- Added the `external_queue` configuration to the `bentoml.service` decorator to queue excess requests until they can be processed within the defined `concurrency` limits.
- See the [documentation](https://docs.bentoml.com/en/latest/bentocloud/how-tos/autoscaling.html) for details on configuring concurrency and the external queue.
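Putting the two options together, a Service might be configured along these lines (the values and Service are illustrative; this sketch assumes the settings are passed via the decorator's `traffic` field):

```python
import bentoml


@bentoml.service(
    traffic={
        # Ideal number of simultaneous requests this Service is
        # designed to handle; BentoCloud uses it as the scaling target.
        "concurrency": 32,
        # Queue requests that exceed the concurrency limit on
        # BentoCloud instead of rejecting them outright.
        "external_queue": True,
    },
)
class Inference:
    @bentoml.api
    def predict(self, text: str) -> str:
        return text.upper()
```

With `external_queue` enabled, bursts of traffic beyond the `concurrency` target are held in a queue and drained as replicas scale up, rather than overloading the running instances.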
🔒 Secure data handling with [secrets](https://docs.bentoml.com/en/latest/bentocloud/how-tos/manage-secrets.html) in BentoCloud:
- You can now create and manage credentials, such as HuggingFace tokens and AWS secrets, securely on BentoCloud and easily apply them across multiple Deployments.
- Added secret subcommands to the BentoML CLI for secret management. Run `bentoml secret -h` to learn more.
🗒️ Added streamed logs for Bento image deployment.
- Makes it easier to troubleshoot build issues and enables faster development iteration.
🙏 Thank you for your continued support!
What's Changed
* fix: change forbid extra keys to false for bentocloud by FogDong in https://github.com/bentoml/BentoML/pull/4866
* feat(dev): 1.3 by frostming in https://github.com/bentoml/BentoML/pull/4849
* fix: delete cluster and ns if it is first cluster by FogDong in https://github.com/bentoml/BentoML/pull/4869
* fix: auto login confirm ask logic by xianml in https://github.com/bentoml/BentoML/pull/4864
* fix: secret default value by xianml in https://github.com/bentoml/BentoML/pull/4870
* fix: fix typo in error msg by FogDong in https://github.com/bentoml/BentoML/pull/4871
**Full Changelog**: https://github.com/bentoml/BentoML/compare/v1.2.20...v1.3.0