**New features**
* Multi-model caching: serve a set of models whose combined size exceeds available memory, using LRU cache eviction ([docs](https://docs.cortex.dev/v/0.22/deployments/realtime-api/models#multi-model-caching)) https://github.com/cortexlabs/cortex/pull/1428 https://github.com/cortexlabs/cortex/issues/619 ([RobertLucian](https://github.com/RobertLucian))
* Live reloading: support updating models in running APIs by adding new versions to the model's S3 directory ([docs](https://docs.cortex.dev/v/0.22/deployments/realtime-api/models#live-model-reloading)) https://github.com/cortexlabs/cortex/pull/1428 https://github.com/cortexlabs/cortex/issues/1252 ([RobertLucian](https://github.com/RobertLucian))
* Inter-process fairness: distribute requests within an API replica evenly across all processes https://github.com/cortexlabs/cortex/pull/1526 https://github.com/cortexlabs/cortex/issues/839 https://github.com/cortexlabs/cortex/issues/1298 ([RobertLucian](https://github.com/RobertLucian))
* Support requests between APIs within the same cluster ([docs](https://docs.cortex.dev/v/0.22/deployments/realtime-api/predictors#chaining-apis)) https://github.com/cortexlabs/cortex/pull/1503 https://github.com/cortexlabs/cortex/issues/1241 ([deliahu](https://github.com/deliahu))
* Allow overriding of CLI install path and config directory (via `$CORTEX_INSTALL_PATH` and `$CORTEX_CLI_CONFIG_DIR`) ([docs](https://docs.cortex.dev/v/0.22/miscellaneous/cli#mac-linux-os)) https://github.com/cortexlabs/cortex/pull/1521 https://github.com/cortexlabs/cortex/issues/1222 ([deliahu](https://github.com/deliahu))
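
The LRU eviction idea behind multi-model caching can be illustrated with a minimal Python sketch. This is not Cortex's implementation: the `ModelLRUCache` class, its `capacity` parameter, and the `loader` callback are hypothetical names used only for illustration, and the sketch counts models rather than measuring their memory footprint.

```python
from collections import OrderedDict


class ModelLRUCache:
    """Hypothetical cache that keeps at most `capacity` models loaded."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader  # assumed callable: model name -> loaded model
        self._models = OrderedDict()  # ordered from least to most recently used

    def get(self, name):
        if name in self._models:
            # Cache hit: mark the model as most recently used.
            self._models.move_to_end(name)
            return self._models[name]
        # Cache miss: load the model, then evict the least recently used one
        # if the cache has grown past its capacity.
        model = self.loader(name)
        self._models[name] = model
        if len(self._models) > self.capacity:
            self._models.popitem(last=False)
        return model

    def cached(self):
        """Return cached model names, least recently used first."""
        return list(self._models)


cache = ModelLRUCache(capacity=2, loader=lambda name: f"model:{name}")
for requested in ["a", "b", "a", "c"]:
    cache.get(requested)
print(cache.cached())  # ['a', 'c'] ("b" was least recently used and got evicted)
```

A real cache would likely bound memory or disk usage rather than a simple model count; see the linked docs for the options Cortex actually exposes.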
**Breaking changes**
* ONNX model paths in API configuration files must now point to a directory containing a single ONNX file, rather than to the ONNX file itself. For example, `model_path: s3://cortex-examples/onnx/yolov5-youtube/yolov5s.onnx` becomes `model_path: s3://cortex-examples/onnx/yolov5-youtube`.
* The `--env/-e` flag in all `cortex cluster` commands has been renamed to `--configure-env/-e`. If the flag is not provided, the `cortex cluster info` command no longer configures the environment named `aws`.
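
When migrating many API configurations, the ONNX `model_path` change can be applied mechanically. The helper below is a hypothetical sketch (the function name `onnx_model_dir` is not part of Cortex); it assumes the old path ends in `.onnx` and that the containing directory holds only that one ONNX file, which the new format requires.

```python
import posixpath


def onnx_model_dir(model_path: str) -> str:
    """Map an old-style path to a .onnx file onto its containing directory.

    Paths that do not end in .onnx are returned unchanged, on the assumption
    that they already point to a directory.
    """
    if model_path.lower().endswith(".onnx"):
        # posixpath is used so that S3 URIs are split on "/" on every platform.
        return posixpath.dirname(model_path)
    return model_path


old_path = "s3://cortex-examples/onnx/yolov5-youtube/yolov5s.onnx"
print(onnx_model_dir(old_path))  # s3://cortex-examples/onnx/yolov5-youtube
```

The helper only rewrites the path string. If a directory currently holds several ONNX files, they would still need to be moved into separate directories by hand.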
**Bug fixes**
* Fix intermittent failed requests during rolling updates https://github.com/cortexlabs/cortex/pull/1526 https://github.com/cortexlabs/cortex/issues/814 ([RobertLucian](https://github.com/RobertLucian))
* Prevent CLI environments from getting overwritten when multiple `cortex cluster` commands are run concurrently https://github.com/cortexlabs/cortex/pull/1520 https://github.com/cortexlabs/cortex/issues/1410 ([deliahu](https://github.com/deliahu))
**Docs**
* Add [Python client docs](https://docs.cortex.dev/v/0.22/miscellaneous/python-client) https://github.com/cortexlabs/cortex/pull/1519 https://github.com/cortexlabs/cortex/issues/1502 ([deliahu](https://github.com/deliahu))
* Add guide for [running in production](https://docs.cortex.dev/v/0.22/guides/production) https://github.com/cortexlabs/cortex/pull/1513 https://github.com/cortexlabs/cortex/issues/1464 https://github.com/cortexlabs/cortex/issues/1257 ([deliahu](https://github.com/deliahu))
* Add guide for [low-cost clusters](https://docs.cortex.dev/v/0.22/guides/low-cost-clusters) https://github.com/cortexlabs/cortex/pull/1514 https://github.com/cortexlabs/cortex/issues/1425 ([deliahu](https://github.com/deliahu))
* Add guide for [using a REST API Gateway](https://docs.cortex.dev/v/0.22/guides/rest-api-gateway) https://github.com/cortexlabs/cortex/pull/1505 https://github.com/cortexlabs/cortex/issues/1228 ([deliahu](https://github.com/deliahu))
* Add guide for [troubleshooting `cortex cluster down` failures](https://docs.cortex.dev/v/0.22/troubleshooting/cluster-down) https://github.com/cortexlabs/cortex/pull/1515 https://github.com/cortexlabs/cortex/issues/1319 ([deliahu](https://github.com/deliahu))
**Misc**
* Stagger Predictor `__init__()` calls to reduce peak memory consumption https://github.com/cortexlabs/cortex/pull/1543 https://github.com/cortexlabs/cortex/issues/1450 ([RobertLucian](https://github.com/RobertLucian))
* Add `--name/-n` and `--region/-r` flags to `cortex cluster info`, `cortex cluster export`, and `cortex cluster down` commands https://github.com/cortexlabs/cortex/pull/1492 https://github.com/cortexlabs/cortex/issues/1363 ([RobertLucian](https://github.com/RobertLucian))
* Rename `--env/-e` flag to `--configure-env/-e` in `cortex cluster` commands and update its behavior https://github.com/cortexlabs/cortex/pull/1533 https://github.com/cortexlabs/cortex/issues/1412 ([deliahu](https://github.com/deliahu))
* Disallow ARM-based instances, which are not currently supported https://github.com/cortexlabs/cortex/pull/1536 ([deliahu](https://github.com/deliahu))
* Validate AWS vCPU quota is sufficient for up to `max_instances` instances when running `cortex cluster up` and `cortex cluster configure` https://github.com/cortexlabs/cortex/pull/1537 https://github.com/cortexlabs/cortex/issues/1461 ([deliahu](https://github.com/deliahu))