**New features**
* Support configurable `pre_stop` command for containers https://github.com/cortexlabs/cortex/pull/2403 ([docs](https://docs.cortex.dev/workloads/realtime/configuration)) ([deliahu](https://github.com/deliahu))
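A minimal sketch of what the new option could look like in an API spec. The field placement and schema shown here are assumptions based on the PR title (it mirrors the Kubernetes `preStop` exec hook); see the linked docs for the authoritative configuration reference.

```yaml
# hypothetical RealtimeAPI spec fragment — field names are assumptions,
# consult https://docs.cortex.dev/workloads/realtime/configuration
- name: my-api
  kind: RealtimeAPI
  pod:
    containers:
      - name: api
        image: quay.io/my-org/my-api:latest
        pre_stop:  # command run before the container is terminated
          command: ["/bin/sh", "-c", "sleep 10"]
```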
**Misc**
* Support m6i instance types https://github.com/cortexlabs/cortex/pull/2398 ([deliahu](https://github.com/deliahu))
* Update to Kubernetes v1.21 https://github.com/cortexlabs/cortex/pull/2398 ([deliahu](https://github.com/deliahu))
**Bug fixes**
* Wait for in-flight requests to reach zero before terminating the proxy container https://github.com/cortexlabs/cortex/pull/2402 ([deliahu](https://github.com/deliahu))
* Fix `cortex get --env` command https://github.com/cortexlabs/cortex/pull/2404 ([deliahu](https://github.com/deliahu))
* Fix cluster price estimate during `cortex cluster up` for spot node groups with on-demand base capacity https://github.com/cortexlabs/cortex/pull/2406 ([RobertLucian](https://github.com/RobertLucian))
**Nucleus Model Server**
We have released v0.1.0 of the [Nucleus model server](https://github.com/cortexlabs/nucleus)!
Nucleus is a model server for TensorFlow and generic Python models. It is compatible with Cortex clusters, Kubernetes clusters, and any other container-based deployment platform. Nucleus can also be run locally via Docker Compose.
Some of Nucleus's features include:
* Generic Python models (PyTorch, ONNX, scikit-learn, MLflow, NumPy, pandas, etc.)
* TensorFlow models
* CPU and GPU support
* Serve models directly from S3 paths
* Configurable multiprocessing and multithreading
* Multi-model endpoints
* Dynamic server-side request batching
* Automatic model reloading when new model versions are uploaded to S3
* Model caching based on LRU policy (on disk and in memory)
* HTTP and gRPC support
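To illustrate the general technique behind one of the features above, here is a minimal sketch of dynamic server-side request batching: incoming requests are buffered until a batch fills or a short time window elapses, then the model runs once over the whole batch. This is a generic illustration, not Nucleus's actual implementation; the class and parameter names are invented for the example.

```python
import queue
import threading
import time
from concurrent.futures import Future

class DynamicBatcher:
    """Illustrative dynamic batcher: requests accumulate until
    max_batch_size is reached or batch_interval seconds elapse,
    then predict_batch is called once on the whole batch."""

    def __init__(self, predict_batch, max_batch_size=8, batch_interval=0.01):
        self.predict_batch = predict_batch  # callable: list of inputs -> list of outputs
        self.max_batch_size = max_batch_size
        self.batch_interval = batch_interval
        self._queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, item):
        """Called once per request; blocks until the batched result is ready."""
        fut = Future()
        self._queue.put((item, fut))
        return fut.result()

    def _run(self):
        while True:
            # block until the first request of the next batch arrives
            item, fut = self._queue.get()
            batch, futures = [item], [fut]
            deadline = time.monotonic() + self.batch_interval
            # keep collecting until the batch is full or the window closes
            while len(batch) < self.max_batch_size:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break
                try:
                    item, fut = self._queue.get(timeout=timeout)
                except queue.Empty:
                    break
                batch.append(item)
                futures.append(fut)
            # one model invocation serves every request in the batch
            for f, result in zip(futures, self.predict_batch(batch)):
                f.set_result(result)
```

A real server would pair this with per-model queues and backpressure; the sketch only shows the core buffering-and-flush loop.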