* Support for version 4.0.0 of the D3M dataset schema has been added.
* The D3M core package now supports loading datasets directly from OpenML.
* When saving a `Dataset` object to the D3M dataset format, metadata is now preserved.
* NetworkX objects are no longer container types and can no longer be
passed as values between primitives.
* "Meta" files are no longer supported by the runtime. Instead, save a
pipeline run with the configuration you want, and use that pipeline
run to re-run with the same configuration.
## Enhancements
* Primitive family `REMOTE_SENSING` has been added.
[!310](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/310)
* Added support for version 4.0.0 of D3M dataset schema:
* There are no more `NODE` and `EDGE` references (used in graph datasets),
but only `NODE_ATTRIBUTE` and `EDGE_ATTRIBUTE`.
* `time_granularity` can now be present on a column.
* `forecasting_horizon` can now be present in a problem description.
* `task_type` and `task_subtype` have been merged into `task_keywords`.
As a consequence, Python `TaskType` and `TaskSubtype` were replaced
with `TaskKeyword`.
[401](https://gitlab.com/datadrivendiscovery/d3m/issues/401)
[!310](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/310)
**Backwards incompatible.**
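As an illustrative sketch of what the merge means for a problem description (the exact field names are assumed here, not quoted from the schema):

```python
# Illustrative only: field names are assumed for this sketch.
# Before (schema 3.x): separate task type and subtype fields.
old_problem = {"about": {"taskType": "classification", "taskSubType": "multiClass"}}

# After (schema 4.0.0): a single list of task keywords.
new_problem = {"about": {"taskKeywords": ["classification", "multiClass"]}}
```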
* Added an OpenML dataset loader. You can now pass a URL of an OpenML dataset
and it will be downloaded and converted to a `Dataset`-compatible object,
including many of the available meta-features. Combined with support
for saving datasets, this allows easy conversion between OpenML
datasets and D3M datasets, e.g., `python3 -m d3m dataset convert -i https://www.openml.org/d/61 -o out/datasetDoc.json`.
[252](https://gitlab.com/datadrivendiscovery/d3m/issues/252)
[!309](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/309)
* When saving and loading D3M datasets, metadata is now preserved.
[227](https://gitlab.com/datadrivendiscovery/d3m/issues/227)
[!265](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/265)
* Metadata can now be converted to a JSON compatible structure in a
reversible manner.
[373](https://gitlab.com/datadrivendiscovery/d3m/issues/373)
[!308](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/308)
* Pipeline run now records whether a pipeline was run as a standard pipeline
in the `run.is_standard_pipeline` field.
[396](https://gitlab.com/datadrivendiscovery/d3m/issues/396)
[!249](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/249)
* "meta" files have been replaced with support for rerunning pipeline runs.
Instead of configuring a "meta" file with configuration how to run a
pipeline, simply provide an example pipeline run which demonstrates how
the pipeline was run. Runtime does not have `--meta` argument anymore,
but has now `--input-run` argument instead.
[202](https://gitlab.com/datadrivendiscovery/d3m/issues/202)
[!249](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/249)
**Backwards incompatible.**
* Changed `LossFunctionMixin` to support multiple loss functions.
[386](https://gitlab.com/datadrivendiscovery/d3m/issues/386)
[!305](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/305)
**Backwards incompatible.**
* Pipeline equality and hashing functions now have an `only_control_hyperparams`
argument which can be set to use only control hyper-parameters when doing
comparisons.
[!289](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/289)
* Pipelines and other YAML files are now recognized with both `.yml` and
`.yaml` file extensions.
[375](https://gitlab.com/datadrivendiscovery/d3m/issues/375)
[!302](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/302)
* `F1Metric`, `F1MicroMetric`, and `F1MacroMetric` can now operate on
multiple target columns and average scores for all of them.
[400](https://gitlab.com/datadrivendiscovery/d3m/issues/400)
[!298](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/298)
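The averaging behavior can be sketched in plain Python (an illustration of the idea, not the d3m implementation; function names are hypothetical):

```python
def f1_binary(truth, preds, pos_label):
    # Standard binary F1: 2*TP / (2*TP + FP + FN).
    tp = sum(1 for t, p in zip(truth, preds) if t == p == pos_label)
    fp = sum(1 for t, p in zip(truth, preds) if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in zip(truth, preds) if t == pos_label and p != pos_label)
    return 2 * tp / (2 * tp + fp + fn) if (tp or fp or fn) else 0.0

def f1_averaged(truth_columns, pred_columns, pos_label):
    # Score each target column separately, then average the scores.
    scores = [f1_binary(t, p, pos_label) for t, p in zip(truth_columns, pred_columns)]
    return sum(scores) / len(scores)
```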
* Pipelines and pipeline runs can now be serialized with Arrow.
[381](https://gitlab.com/datadrivendiscovery/d3m/issues/381)
[!290](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/290)
* `describe` CLI commands now accept an `--output` argument to control where
their output is saved.
[!279](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/279)
## Bugfixes
* Exposed outputs are now stored even when an exception occurs.
[380](https://gitlab.com/datadrivendiscovery/d3m/issues/380)
[!304](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/304)
* Fixed `source.from` metadata in datasets and problem descriptions
and its validation for metalearning database.
[363](https://gitlab.com/datadrivendiscovery/d3m/issues/363)
[!303](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/303)
* Fixed pipeline run references when running the runtime through
evaluation command.
[395](https://gitlab.com/datadrivendiscovery/d3m/issues/395)
[!294](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/294)
* The core package scoring primitive has been updated to have a digest.
This allows the core package scoring pipeline to have one as well.
This change makes it required to install the core package
in editable mode (`pip3 install -e ...`) when installing from the
git repository.
[!280](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/280)
**Backwards incompatible.**
## Other
* A few top-level runtime functions had some of their arguments moved
to keyword-only arguments:
* `fit`: `problem_description`
* `score`: `scoring_pipeline`, `problem_description`, `metrics`, `predictions_random_seed`
* `prepare_data`: `data_pipeline`, `problem_description`, `data_params`
* `evaluate`: `data_pipeline`, `scoring_pipeline`, `problem_description`, `data_params`, `metrics`
[352](https://gitlab.com/datadrivendiscovery/d3m/issues/352)
[!301](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/301)
**Backwards incompatible.**
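The practical effect is that these arguments can no longer be passed positionally. A generic illustration of Python keyword-only parameters (not the actual d3m signatures):

```python
def fit(pipeline, inputs, *, problem_description=None):
    # Everything after the bare * must be passed by keyword.
    return pipeline, inputs, problem_description

# fit("pipeline", "inputs", "problem")  # raises TypeError: too many positional arguments
result = fit("pipeline", "inputs", problem_description="problem")
```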
* The `can_accept` method has been removed from primitive interfaces.
[334](https://gitlab.com/datadrivendiscovery/d3m/issues/334)
[!300](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/300)
**Backwards incompatible.**
* NetworkX objects are no longer container types and can no longer be
passed as values between primitives. The dataset loader no longer
converts a GML file to a NetworkX object but represents it
as a files collection resource. A primitive should then convert that
resource into a normalized edge-list graph representation.
[349](https://gitlab.com/datadrivendiscovery/d3m/issues/349)
[!299](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/299)
**Backwards incompatible.**
* The `JACCARD_SIMILARITY_SCORE` metric is now a binary metric and requires
a `pos_label` parameter.
[!299](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/299)
**Backwards incompatible.**
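As a plain-Python sketch of what a binary Jaccard similarity with a required `pos_label` computes (illustrative only, not the d3m code):

```python
def jaccard_binary(truth, preds, pos_label):
    # TP / (TP + FP + FN): agreement restricted to the positive class.
    tp = sum(1 for t, p in zip(truth, preds) if t == p == pos_label)
    fp = sum(1 for t, p in zip(truth, preds) if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in zip(truth, preds) if t == pos_label and p != pos_label)
    return tp / (tp + fp + fn) if (tp or fp or fn) else 0.0
```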
* Updated core dependencies. Some important packages are now at versions:
* `tensorflow`: 2.0.0
* `keras`: 2.3.1
* `torch`: 1.3.0.post2
* `theano`: 1.0.4
* `scikit-learn`: 0.21.3
* `numpy`: 1.17.3
* `pandas`: 0.25.2
* `networkx`: 2.4
* `pyarrow`: 0.15.1
* `scipy`: 1.3.1
[398](https://gitlab.com/datadrivendiscovery/d3m/issues/398)
[379](https://gitlab.com/datadrivendiscovery/d3m/issues/379)
[!299](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/299)
* Primitive family `DIMENSIONALITY_REDUCTION` has been added.
[!284](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/284)
* Added to `algorithm_types`:
* `POLYNOMIAL_REGRESSION`
* `IMAGENET`
* `RETINANET`
[!306](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/306)
* `--process-dependency-links` is no longer suggested to be used when
installing primitives.
* `sample_rate` metadata field inside `dimension` has been renamed to
`sampling_rate` to make it consistent across metadata. This field
should contain a sampling rate used for the described dimension,
when values in the dimension are sampled.
**Backwards incompatible.**
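A hypothetical metadata fragment (the surrounding structure is assumed for illustration) showing the renamed field:

```python
# Hypothetical column metadata for a sampled dimension (e.g., audio values);
# the field was previously named "sample_rate".
column_metadata = {
    "dimension": {
        "length": 16000,
        "sampling_rate": 8000.0,
    }
}
```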