MLflow 1.0 includes many significant features and improvements. From this version, MLflow is no longer beta, and all APIs except those marked as experimental are intended to be stable until the next major version. As such, this release includes a number of breaking changes.
Major features, improvements, and breaking changes
- Support for recording, querying, and visualizing metrics along a new “step” axis (x coordinate), providing increased flexibility for examining model performance relative to training progress. For example, you can now record performance metrics as a function of the number of training iterations or epochs. MLflow 1.0’s enhanced metrics UI enables you to visualize the change in a metric’s value as a function of its step, augmenting MLflow’s existing UI for plotting a metric’s value as a function of wall-clock time. (1202, 1237, dbczumar; 1132, 1142, 1143, smurching; 1211, 1225, Zangr; 1372, stbof)
- Search improvements. MLflow 1.0 includes additional support in both the API and UI for searching runs within a single experiment or a group of experiments. The search filter API supports a simplified version of the `SQL WHERE` clause. In addition to searching by a run's metrics and params, the API has been enhanced to support a subset of run attributes as well as user and [system tags](https://mlflow.org/docs/latest/tracking.html#system-tags). For details see [Search syntax](https://mlflow.org/docs/latest/search-syntax.html#syntax) and [examples for programmatically searching runs](https://mlflow.org/docs/latest/search-syntax.html#programmatically-searching-runs). (1245, 1272, 1323, 1326, mparkhe; 1052, Zangr; 1363, aarondav)
- Logging metrics in batches. MLflow 1.0 now has a `runs/log-batch` REST API endpoint for logging multiple metrics, params, and tags in a single API request. The endpoint is useful for performant logging of multiple metrics at the end of a model training epoch (see [example](https://github.com/mlflow/mlflow/blob/bb8c7602dcb6a3a8786301fe6b98f01e8d3f288d/examples/hyperparam/search_hyperopt.py#L161)), or logging of many input model parameters at the start of training. You can call this batched-logging endpoint from Python (`mlflow.log_metrics`, `mlflow.log_params`, `mlflow.set_tags`), R (`mlflow_log_batch`), and Java (`MlflowClient.logBatch`). (1214, dbczumar; see 0.9.1 and 0.9.0 for other changes)
- Windows support for MLflow Tracking. The Tracking portion of the MLflow client is now supported on Windows. (1171, eedeleon, tomasatdatabricks)
- HDFS support for artifacts. A Hadoop artifact repository with Kerberos authorization support was added, so you can use HDFS to log and retrieve models and other artifacts. (1011, jaroslawk)
- CLI command to build Docker images for serving. Added an `mlflow models build-docker` CLI command for building a Docker image capable of serving an MLflow model. The model is served at port 8080 within the container by default. Note that this API is experimental: neither the arguments nor the format of the Docker container is guaranteed to remain the same. (1329, smurching, tomasatdatabricks)
- New `onnx` model flavor for saving, loading, and evaluating ONNX models with MLflow. ONNX flavor APIs are available in the `mlflow.onnx` module. (1127, avflor, dbczumar; 1388, 1389, dbczumar)
- Major breaking changes:
- Some of the breaking changes involve database schema changes in the SQLAlchemy tracking store. If your database instance's schema is not up-to-date, MLflow will issue an error at the start-up of `mlflow server` or `mlflow ui`. To migrate an existing database to the newest schema, you can use the `mlflow db upgrade` CLI command. (1155, 1371, smurching; 1360, aarondav)
- [Installation] The MLflow Python package no longer depends on `scikit-learn`, `mleap`, or `boto3`. If you want to use the `scikit-learn` support, the `MLeap` support, or `s3` artifact repository / `sagemaker` support, you will have to install these respective dependencies explicitly. (1223, aarondav)
- [Artifacts] In the Models API, an artifact's location is now represented as a URI. See the [documentation](https://mlflow.org/docs/latest/tracking.html#artifact-locations) for the list of accepted URIs. (1190, 1254, dbczumar; 1174, dbczumar, sueann; 1206, tomasatdatabricks; 1253, stbof)
  - The affected methods are:
    - Python: `<model-type>.load_model`, `azureml.build_image`, `sagemaker.deploy`, `sagemaker.run_local`, `pyfunc._load_model_env`, `pyfunc.load_pyfunc`, and `pyfunc.spark_udf`
    - R: `mlflow_load_model`, `mlflow_rfunc_predict`, `mlflow_rfunc_serve`
    - CLI: `mlflow models serve`, `mlflow models predict`, `mlflow sagemaker`, `mlflow azureml` (with the new `--model-uri` option)
  - To allow referring to artifacts in the context of a run, MLflow introduces a new URI scheme of the form `runs:/<run_id>/relative/path/to/artifact`. (1169, 1175, sueann)
- [CLI] `mlflow pyfunc` and `mlflow rfunc` commands have been unified as `mlflow models` (1257, tomasatdatabricks; 1321, dbczumar)
- [CLI] `mlflow artifacts download`, `mlflow artifacts download-from-uri` and `mlflow download` commands have been consolidated into `mlflow artifacts download` (1233, sueann)
- [Runs] Expose `RunData` fields (`metrics`, `params`, `tags`) as dictionaries. Note that the `mlflow.entities.RunData` constructor still accepts lists of `metric`/`param`/`tag` entities. (1078, smurching)
- [Runs] Rename `run_uuid` to `run_id` in Python, Java, and REST API. Where necessary, MLflow will continue to accept `run_uuid` until MLflow 1.1. (1187, aarondav)
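
The `runs:/<run_id>/relative/path/to/artifact` form described above can be illustrated with a minimal sketch that splits such a URI into its run ID and relative artifact path. The helper `parse_runs_uri` is hypothetical and uses only the Python standard library; it is not MLflow's own resolution logic, which may handle more cases.

```python
from urllib.parse import urlparse


def parse_runs_uri(uri):
    """Split a runs:/ URI into (run_id, relative_artifact_path).

    Hypothetical helper for illustration only; not part of the MLflow API.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "runs":
        raise ValueError("Not a runs:/ URI: %r" % uri)
    # The path has the shape "/<run_id>/relative/path/to/artifact".
    run_id, _, artifact_path = parsed.path.lstrip("/").partition("/")
    if not run_id:
        raise ValueError("Missing run ID in URI: %r" % uri)
    return run_id, artifact_path


# "abc123" is a made-up run ID used only for this example.
run_id, artifact_path = parse_runs_uri("runs:/abc123/models/sklearn")
print(run_id, artifact_path)
```

A URI with another scheme (for example an `s3://` location) is rejected by this sketch, since it only covers the run-relative form.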
Other breaking changes
CLI:
- The `--file-store` option is deprecated in `mlflow server` and `mlflow ui` commands. (1196, smurching)
- The `--host` and `--gunicorn-opts` options are removed in the `mlflow ui` command. (1267, aarondav)
- Arguments to `mlflow experiments` subcommands, notably `--experiment-name` and `--experiment-id`, are now options. (1235, sueann)
- `mlflow sagemaker list-flavors` has been removed (1233, sueann)
Tracking:
- The `user` property of `Run`s has been moved to tags (similarly, the `run_name`, `source_type`, `source_name` properties were moved to tags in 0.9.0). (1230, acroz; 1275, 1276, aarondav)
- In R, the return values of experiment CRUD APIs have been updated to more closely match the REST API. In particular, `mlflow_create_experiment` now returns a string experiment ID instead of an experiment, and the other APIs return NULL. (1246, smurching)
- `RunInfo.status`'s type is now string. (1264, mparkhe)
- Remove deprecated `RunInfo` properties from `start_run`. (1220, aarondav)
- As deprecated in 0.9.1 and before, the `RunInfo` fields `run_name`, `source_name`, `source_version`, `source_type`, and `entry_point_name` and the `SearchRuns` field `anded_expressions` have been removed from the REST API and Python, Java, and R tracking client APIs. They are still available as tags, documented in the REST API documentation. (1188, aarondav)
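
Since the removed `RunInfo` fields remain available as tags, code that used to read them can look them up in a run's tag dictionary instead. The sketch below assumes the system tag keys listed in the MLflow tracking documentation (`mlflow.user`, `mlflow.runName`, and so on); treat the mapping as an assumption and verify it against the documentation for your version. The helper `legacy_field` and the sample tags are hypothetical.

```python
# Assumed mapping from removed RunInfo fields to MLflow system tag keys.
# Check these keys against the system tags documentation before relying on them.
LEGACY_FIELD_TO_TAG = {
    "user": "mlflow.user",
    "run_name": "mlflow.runName",
    "source_name": "mlflow.source.name",
    "source_type": "mlflow.source.type",
    "source_version": "mlflow.source.git.commit",
    "entry_point_name": "mlflow.project.entryPoint",
}


def legacy_field(tags, field, default=None):
    """Look up a former RunInfo field in a dict of run tags (hypothetical helper)."""
    return tags.get(LEGACY_FIELD_TO_TAG[field], default)


# Stand-in for a run's tag dictionary; the values are made up.
example_tags = {"mlflow.user": "alice", "mlflow.source.name": "train.py"}
print(legacy_field(example_tags, "user"))
print(legacy_field(example_tags, "run_name", default="<unnamed>"))
```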
Models and deployment:
- In Python, require arguments as keywords in `log_model`, `save_model` and `add_to_model` methods in the `tensorflow` and `mleap` modules to avoid breaking changes in the future (1226, sueann)
- Remove the unsupported `jars` argument from `spark.log_model` in Python (1222, sueann)
- Introduce `pyfunc.load_model` to be consistent with other Models modules. `pyfunc.load_pyfunc` will be deprecated in the near future. (1222, sueann)
- Rename `dst_path` parameter in `pyfunc.save_model` to `path` (1221, aarondav)
- R flavors refactor (1299, kevinykuo)
  - `mlflow_predict()` has been added, replacing `mlflow_predict_model()` and `mlflow_predict_flavor()`, which have been removed.
  - `mlflow_save_model()` is now a generic, so `mlflow_save_flavor()` is no longer needed and has been removed.
  - `mlflow_predict()` takes `...` to pass to underlying predict methods.
  - `mlflow_load_flavor()` now has the signature `function(flavor, model_path)` and flavor authors should implement `mlflow_load_flavor.mlflow_flavor_{FLAVORNAME}`. The flavor argument is inferred from the inputs of user-facing `mlflow_load_model()` and does not need to be explicitly provided by the user.
Projects:
- Remove and rename some `projects.run` parameters for generality and consistency. (1222, sueann)
- In R, the `mlflow_run` API for running MLflow projects has been modified to more closely reflect the Python `mlflow.run` API. In particular, the order of the `uri` and `entry_point` arguments has been reversed and the `param_list` argument has been renamed to `parameters`. (1265, smurching)
R:
- Remove `mlflow_snapshot` and `mlflow_restore_snapshot` APIs. Also, the `r_dependencies` argument used to specify the path to a packrat r-dependencies.txt file has been removed from all APIs. (1263, smurching)
- The `mlflow_cli` and `crate` APIs are now private. (1246, smurching)
Environment variables:
- Prefix environment variables with "MLFLOW\_" (1268, aarondav). Affected variables are:
  - [Tracking] `_MLFLOW_SERVER_FILE_STORE`, `_MLFLOW_SERVER_ARTIFACT_ROOT`, `_MLFLOW_STATIC_PREFIX`
  - [SageMaker] `MLFLOW_SAGEMAKER_DEPLOY_IMG_URL`, `MLFLOW_DEPLOYMENT_FLAVOR_NAME`
  - [Scoring] `MLFLOW_SCORING_SERVER_MIN_THREADS`, `MLFLOW_SCORING_SERVER_MAX_THREADS`
More features and improvements
- [Tracking] Non-default driver support for SQLAlchemy backends: `db+driver` is now a valid tracking backend URI scheme (1297, drewmcdonald; 1374, mparkhe)
- [Tracking] Validate backend store URI before starting tracking server (1218, luke-zhu, sueann)
- [Tracking] Add `GetMetricHistory` client API in Python and Java corresponding to the REST API. (1178, smurching)
- [Tracking] Add `view_type` argument to `MlflowClient.list_experiments()` in Python. (1212, smurching)
- [Tracking] Dictionary values provided to `mlflow.log_params` and `mlflow.set_tags` in Python can now be non-string types (e.g., numbers), and they are automatically converted to strings. (1364, aarondav)
- [Tracking] R API additions to be at parity with REST API and Python (1122, kevinykuo)
- [Tracking] Limit number of results returned from `SearchRuns` API and UI for faster load (1125, mparkhe; 1154, andrewmchen)
- [Artifacts] To avoid having many copies of large model files in serving, `ArtifactRepository.download_artifacts` no longer copies local artifacts (1307, andrewmchen; 1383, dbczumar)
- [Artifacts/Projects] Support GCS in download utilities. `gs://bucket/path` files are now supported by the `mlflow artifacts download` CLI command and as parameters of type `path` in MLProject files. (1168, drewmcdonald)
- [Models] All Python models exported by MLflow now declare `mlflow` as a dependency by default. In addition, a new `--install-mlflow` flag can be passed to the `mlflow models serve` and `mlflow models predict` commands to force installation of the latest version of MLflow into the model's environment. (1308, tomasatdatabricks)
- [Models] Update model flavors to lazily import dependencies in Python. Modules that define Model flavors now import extra dependencies such as `tensorflow`, `scikit-learn`, and `pytorch` inside individual _methods_, ensuring that these modules can be imported and explored even if the dependencies have not been installed on your system. Also, the `DEFAULT_CONDA_ENVIRONMENT` module variable has been replaced with a `get_default_conda_env()` function for each flavor. (1238, dbczumar)
- [Models] It is now possible to pass extra arguments to `mlflow.keras.load_model` that will be passed through to `keras.load_model`. (1330, yorickvP)
- [Serving] For better performance, switch to `gunicorn` for serving Python models. This does not change the user interface. (1322, tomasatdatabricks)
- [Deployment] For SageMaker, use the uniquely-generated model name as the S3 bucket prefix instead of requiring one. (1183, dbczumar)
- [REST API] Add support for API paths without the `preview` component. The `preview` paths will be deprecated in a future version of MLflow. (1236, mparkhe)
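
To make the `db+driver` scheme mentioned above concrete, here is a minimal sketch that separates the database dialect from an optional driver in a backend store URI. The helper `split_backend_scheme` and the example URIs (including the credentials and host) are hypothetical; this is not how MLflow itself validates the URI.

```python
from urllib.parse import urlparse


def split_backend_scheme(store_uri):
    """Return (dialect, driver) for a URI such as 'mysql+pymysql://...'.

    Hypothetical helper for illustration; driver is None when no
    '+driver' suffix is present in the scheme.
    """
    scheme = urlparse(store_uri).scheme
    dialect, _, driver = scheme.partition("+")
    return dialect, (driver or None)


# Made-up connection strings used only to show the two shapes.
print(split_backend_scheme("mysql+pymysql://user:secret@localhost:3306/mlflow"))
print(split_backend_scheme("sqlite:///mlflow.db"))
```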
Bug fixes and documentation updates
- [Tracking] Log metric timestamps in milliseconds by default (1177, smurching; 1333, dbczumar)
- [Tracking] Fix bug when deserializing integer experiment ID for runs in `SQLAlchemyStore` (1167, smurching)
- [Tracking] Ensure unique constraint names in MLflow tracking database (1292, smurching)
- [Tracking] Fix base64 encoding for basic auth in R tracking client (1126, freefrag)
- [Tracking] Correctly handle `file:` URIs for the `--backend-store-uri` option in `mlflow server` and `mlflow ui` CLI commands (1171, eedeleon, tomasatdatabricks)
- [Artifacts] Update artifact repository download methods to return absolute paths (1179, dbczumar)
- [Artifacts] Make FileStore respect the default artifact location (1332, dbczumar)
- [Artifacts] Fix `log_artifact` failures due to existing directory on FTP server (1327, kafendt)
- [Artifacts] Fix GCS artifact logging of subdirectories (1285, jason-huling)
- [Projects] Fix bug where the `SQLite` database file was not shared with the Docker container (1347, tomasatdatabricks; 1375, aarondav)
- [Java] Mark `sendPost` and `sendGet` as experimental (1186, aarondav)
- [Python/CLI] Mark `azureml.build_image` as experimental (1222, 1233, sueann)
- [Docs] Document public MLflow environment variables (1343, aarondav)
- [Docs] Document MLflow system tags for runs (1342, aarondav)
- [Docs] Autogenerate CLI documentation to include subcommands and descriptions (1231, sueann)
- [Docs] Update run selection description in `mlflow_get_run` in R documentation (1258, dbczumar)
- [Examples] Update examples to reflect API changes (1361, tomasatdatabricks; 1367, mparkhe)
Small bug fixes and doc updates (1359, 1350, 1331, 1301, 1270, 1271, 1180, 1144, 1135, 1131, 1358, 1369, 1368, 1387, aarondav; 1373, akarloff; 1287, 1344, 1309, stbof; 1312, hchiuzhuo; 1348, 1349, 1294, 1227, 1384, tomasatdatabricks; 1345, withsmilo; 1316, ancasarb; 1313, 1310, 1305, 1289, 1256, 1124, 1097, 1162, 1163, 1137, 1351, smurching; 1319, 1244, 1224, 1195, 1194, 1328, dbczumar; 1213, 1200, Kublai-Jing; 1304, 1320, andrewmchen; 1311, Zangr; 1306, 1293, 1147, mateiz; 1303, gliptak; 1261, 1192, eedeleon; 1273, 1259, kevinykuo; 1277, 1247, 1243, 1182, 1376, mparkhe; 1210, vgod-dbx; 1199, ashtuchkin; 1176, 1138, 1365, sueann; 1157, cclauss; 1156, clemens-db; 1152, pogil; 1146, srowen; 875, 1251, jimthompson5802)
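
As a rough illustration of how the millisecond metric timestamps, the `step` axis, and batched logging fit together, the sketch below assembles a request body in the shape the `runs/log-batch` endpoint is understood to accept. The field names (`run_id`, `metrics`, `params`, `tags`, and the per-metric `key`, `value`, `timestamp`, `step`) are assumptions to check against the REST API documentation. The helper `build_log_batch_body` is hypothetical, and nothing is sent over the network.

```python
import json
import time


def build_log_batch_body(run_id, metrics_by_step, params=None, tags=None):
    """Assemble a log-batch style request body (hypothetical helper).

    metrics_by_step maps a step number to a dict of metric name -> value.
    """
    now_ms = int(time.time() * 1000)  # timestamps in milliseconds
    metrics = [
        {"key": key, "value": float(value), "timestamp": now_ms, "step": step}
        for step, values in sorted(metrics_by_step.items())
        for key, value in sorted(values.items())
    ]
    return {
        "run_id": run_id,
        "metrics": metrics,
        # Param and tag values are converted to strings before sending.
        "params": [{"key": k, "value": str(v)} for k, v in (params or {}).items()],
        "tags": [{"key": k, "value": str(v)} for k, v in (tags or {}).items()],
    }


# "abc123" and the metric values are made up for this example.
body = build_log_batch_body(
    "abc123",
    {0: {"loss": 0.9}, 1: {"loss": 0.5}},
    params={"lr": 0.01},
)
print(json.dumps(body, indent=2))
```

In practice the Python fluent API (`mlflow.log_metrics`, `mlflow.log_params`, `mlflow.set_tags`) wraps this endpoint, so building the body by hand is mainly useful for understanding the request shape or calling the REST API from another client.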