Tamu-d3m

Latest version: v2022.5.23

2020.11.3

Enhancements

* Pipeline now has a new implementation of the `get_exposable_outputs` method
(previously deprecated), which returns all data references one can expose
from a pipeline (not just those which are produced during regular execution).
Similarly, pipeline steps now have a `get_exposable_data_references` method.
[!396](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/396)
* The `Result` class, which holds results from running a pipeline, now has a
`get_standard_pipeline_output` helper method to return the output value of
a standard pipeline.
[!396](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/396)
* The `return_values` parameter of `Runtime.fit` and `Runtime.produce`
now also accepts data references of outputs which would otherwise not be produced,
and forces those outputs to be produced. This allows the top-level `fit`
and `produce` functions to have a new `expose_outputs` parameter where
you can list all outputs you want exposed, even if they would otherwise not
be produced.
[294](https://gitlab.com/datadrivendiscovery/d3m/-/issues/294)
[!396](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/396)
[!408](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/408)
* Added support for version 4.1.1 of D3M dataset schema:
  * Added `https://metadata.datadrivendiscovery.org/types/LocationPolygon`,
    `https://metadata.datadrivendiscovery.org/types/BagKey`, and
    `https://metadata.datadrivendiscovery.org/types/Band` semantic types.
    [!406](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/406)
  * Added `MULTIPLE_INSTANCE_LEARNING` task keyword.
    [!406](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/406)
  * Added `spatial_reference_system` metadata field.
    [!406](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/406)
* Pipeline runs made using the `evaluate` runtime command can now all be
rerun.
[407](https://gitlab.com/datadrivendiscovery/d3m/-/issues/407)
[!389](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/389)
* All runtime CLI commands now accept optional data preparation pipelines.
[286](https://gitlab.com/datadrivendiscovery/d3m/-/issues/286)
[!385](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/385)
* Deprecated `DATA_PREPROCESSING` and removed `DATA_WRANGLING` primitive
families. Consider using `DATA_TRANSFORMATION`, `DATA_CLEANING`, or `FEATURE_EXTRACTION` instead.
[329](https://gitlab.com/datadrivendiscovery/d3m/-/issues/329)
[!399](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/399)
[!400](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/400)
**Backwards incompatible.**
* Removed PyArrow dependency. If you use PyArrow, you now have to manually register
serializers/deserializers.
[!393](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/393)
* Allow hiding false positive warnings for random sources, populated with a list
of known false positives.
[461](https://gitlab.com/datadrivendiscovery/d3m/-/issues/461)
[465](https://gitlab.com/datadrivendiscovery/d3m/-/issues/465)
[!387](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/387)
* Removed use of Pycurl.
[463](https://gitlab.com/datadrivendiscovery/d3m/-/issues/463)
[!384](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/384)
* Worker ID (used in pipeline runs to identify the machine on which the pipeline ran)
can now be provided using the environment variable `D3M_WORKER_ID`.
[278](https://gitlab.com/datadrivendiscovery/d3m/-/issues/278)
[!379](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/379)
* `RuntimeEnvironment` is now cached when not provided to the reference runtime.
[280](https://gitlab.com/datadrivendiscovery/d3m/-/issues/280)
[!380](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/380)
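
The `D3M_WORKER_ID` entry above can be pictured with a small sketch. This is not the core package's code: the function name `get_worker_id`, the namespace constant, and the fallback of deriving an ID from the machine's hardware address are all assumptions made for illustration.

```python
import os
import uuid

# Hypothetical namespace used only by this sketch.
SKETCH_NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000000")


def get_worker_id(environ=None):
    """Return a worker ID, preferring the D3M_WORKER_ID environment variable."""
    environ = os.environ if environ is None else environ
    worker_id = environ.get("D3M_WORKER_ID")
    if worker_id:
        return worker_id
    # Assumed fallback: an ID derived from this machine's hardware address.
    return str(uuid.uuid5(SKETCH_NAMESPACE, str(uuid.getnode())))
```

Passing a mapping instead of reading `os.environ` directly keeps the lookup easy to test without changing the process environment.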

Bugfixes

* Fixed typing information for bound parameters in `Bounded` hyper-parameter class to correctly support `None` value.
[!405](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/405)
* Resolved a performance issue of accuracy, f1 micro, f1 macro, hamming loss,
and all ROC AUC metrics.
[484](https://gitlab.com/datadrivendiscovery/d3m/-/issues/484)
[!395](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/395)
* Fixed saving of small datasets.
[494](https://gitlab.com/datadrivendiscovery/d3m/-/issues/494)
[!397](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/397)
* Fixed how hyper-parameters are prepared for primitives passed as a hyper-parameter.
[!391](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/391)

Other

* `return_values` parameter of `Runtime.fit` and `Runtime.produce`
has been renamed to `outputs_to_expose`. Old name has been deprecated.
[499](https://gitlab.com/datadrivendiscovery/d3m/-/issues/499)
[!407](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/407)
* `confidence_for` metadata field has been renamed to `score_for`.
[496](https://gitlab.com/datadrivendiscovery/d3m/-/issues/496)
[!404](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/404)
**Backwards incompatible.**
* Removed semantic type `https://metadata.datadrivendiscovery.org/types/Confidence`.
Use `https://metadata.datadrivendiscovery.org/types/Score` instead
(optionally with addition of `https://metadata.datadrivendiscovery.org/types/Likelihood`
or `https://metadata.datadrivendiscovery.org/types/LogLikelihood`).
[496](https://gitlab.com/datadrivendiscovery/d3m/-/issues/496)
[!404](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/404)
**Backwards incompatible.**
* Added new semantic types:
* `https://metadata.datadrivendiscovery.org/types/Likelihood`
* `https://metadata.datadrivendiscovery.org/types/LogLikelihood`

[496](https://gitlab.com/datadrivendiscovery/d3m/-/issues/496)
[!404](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/404)

* Added `LogLikelihood` and `Likelihood` semantic types to definitions.
[496](https://gitlab.com/datadrivendiscovery/d3m/-/issues/496)
[!404](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/404)
* Bumped core package Pandas dependency upper bound to 1.1.3.
[495](https://gitlab.com/datadrivendiscovery/d3m/-/issues/495)
[!402](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/402)
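
A rename such as `return_values` to `outputs_to_expose` is commonly handled by accepting both names for a while and warning on the old one. The sketch below shows that general pattern only; `resolve_outputs` is a hypothetical helper, not part of the `Runtime` API, and the default data reference is an illustrative value.

```python
import warnings


def resolve_outputs(*, outputs_to_expose=None, return_values=None):
    """Hypothetical helper: accept the new name and the deprecated old name."""
    if return_values is not None:
        if outputs_to_expose is not None:
            raise TypeError(
                "Pass only one of 'outputs_to_expose' and 'return_values'."
            )
        warnings.warn(
            "'return_values' is deprecated, use 'outputs_to_expose' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        outputs_to_expose = return_values
    if outputs_to_expose is None:
        # Illustrative default: expose only the first pipeline output.
        outputs_to_expose = ["outputs.0"]
    return list(outputs_to_expose)
```

Callers using the old keyword keep working but see a `DeprecationWarning`, which gives them time to migrate before the old name is removed.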

2020.5.18

* Support for version 4.1.0 of D3M dataset schema has been added.
* ROC-AUC and multi-label metrics are now supported. Mean reciprocal rank and hits at k metrics
have been added. One can now also register custom additional metrics.
* Reference runtime no longer keeps primitive instances in memory.

Enhancements

* Scoring primitive and pipeline now accept new hyper-parameter `all_labels`
which can be used to provide information about all labels possible in a target
column.
[431](https://gitlab.com/datadrivendiscovery/d3m/-/issues/431)
[!377](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/377)
* Added `all_distinct_values` metadata field which can contain all values (labels)
which can be in a column. This is meant to be used on target columns to help
implementing `ContinueFitMixin` in a primitive which might require knowledge
of all possible labels before starting fitting on a subset of data.
[447](https://gitlab.com/datadrivendiscovery/d3m/-/issues/447)
[!377](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/377)
* Reference runtime no longer keeps primitive instances in memory
but uses `get_params`/`set_params` to retain and reuse only the primitive's parameters.
This makes memory usage lower and allows additional resource releasing when primitive's
object is freed (e.g., releasing GPUs).
[313](https://gitlab.com/datadrivendiscovery/d3m/-/issues/313)
[!376](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/376)
**Could be backwards incompatible.**
* Added support for version 4.1.0 of D3M dataset schema:
* Added `MONTHS` to column's `time_granularity` metadata.
[!340](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/340)
* Added mean reciprocal rank and hits at k metrics.
[!361](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/361)
* Added `https://metadata.datadrivendiscovery.org/types/Rank` semantic type
and `rank_for` metadata field. `PerformanceMetric` classes now have a
`requires_rank` method.
[!372](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/372)
* Added `NESTED` task keyword.
[!372](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/372)
* Added `file_columns_count` metadata field and updated `file_columns` metadata field
with additional sub-fields. Also renamed sub-field `name` to `column_name` and added
`column_index` sub-fields to `file_columns` metadata.
[!372](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/372)
**Backwards incompatible.**
* Moved high-level primitive base classes for file readers and dataset splitting
from common primitives to d3m core package.
[!120](https://gitlab.com/datadrivendiscovery/common-primitives/-/merge_requests/120)
[!339](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/339)
* A warning is issued if a primitive uses a global random source
during pipeline execution. Such behavior can make pipeline
execution not reproducible.
[384](https://gitlab.com/datadrivendiscovery/d3m/-/issues/384)
[!365](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/365)
* CLI accepts `--logging-level` argument to configure which logging
messages are printed to the console.
[!360](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/360)
* Output to stdout/stderr during pipeline execution is no longer suppressed,
which makes it possible to debug pipeline execution using pdb.
At the same time, stdout/stderr is still logged to Python logging.
[270](https://gitlab.com/datadrivendiscovery/d3m/-/issues/270)
[!360](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/360)
* Redirect from stdout to Python logging now operates per line and
not per write operation, which makes logs more readable.
[168](https://gitlab.com/datadrivendiscovery/d3m/-/issues/168)
[!358](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/358)
* Made sure that multi-label metrics work correctly.
[370](https://gitlab.com/datadrivendiscovery/d3m/-/issues/370)
[!343](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/343)
* Implemented ROC AUC metrics. They require predictions to include
confidence for all possible labels.
[317](https://gitlab.com/datadrivendiscovery/d3m/-/issues/317)
[!318](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/318)
* Additional (non-standard) performance metrics can now be registered
using `PerformanceMetric.register_metric` class method.
[207](https://gitlab.com/datadrivendiscovery/d3m/-/issues/207)
[!348](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/348)
* All D3M enumerations can now be extended with additional values
through `register_value` class method. This allows one to add values
to existing standard values (which come from the metadata schema).
Internally, enumeration values are now represented as strings and not
integers anymore.
[438](https://gitlab.com/datadrivendiscovery/d3m/-/issues/438)
[!348](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/348)
**Could be backwards incompatible.**
* Added CLI to validate primitive descriptions for metalearning database
(`python3 -m d3m primitive validate`).
[!333](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/333)
* Raise an exception during dataset loading if `targets.csv` file does
not combine well with the dataset entry point.
[!330](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/330)
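
The per-line redirect from stdout to Python logging described above can be sketched as a small file-like object that buffers partial writes. This is an illustration of the idea, not the runtime's implementation; the class name `LineLoggingStream` is hypothetical.

```python
import logging


class LineLoggingStream:
    """File-like object that logs complete lines instead of individual writes."""

    def __init__(self, logger, level=logging.INFO):
        self._logger = logger
        self._level = level
        self._buffer = ""

    def write(self, text):
        self._buffer += text
        # Emit every complete line; keep the trailing partial line buffered.
        *lines, self._buffer = self._buffer.split("\n")
        for line in lines:
            self._logger.log(self._level, line)
        return len(text)

    def flush(self):
        # Emit whatever is left when the stream is flushed.
        if self._buffer:
            self._logger.log(self._level, self._buffer)
            self._buffer = ""
```

Such an object can be passed to `contextlib.redirect_stdout`. A single `print` call may issue several writes, so buffering until a newline keeps each log record aligned with one line of output.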

Bugfixes

* CLI now displays correct error messages for invalid arguments to subcommands.
[409](https://gitlab.com/datadrivendiscovery/d3m/-/issues/409)
[!368](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/368)
* Reference runtime does not call `fit` and `produce`
methods in a loop anymore. This mitigates an infinite loop for misbehaving primitives.
[!364](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/364)
* During pipeline execution all Python logging is now recorded in the
pipeline run, regardless of the logging level otherwise
configured during execution.
[!360](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/360)
* Default sampling code for hyper-parameters now makes sure to return
values in original types and not numpy ones.
[440](https://gitlab.com/datadrivendiscovery/d3m/-/issues/440)
[!352](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/352)
* We now ensure that file handles opened for CLI commands are flushed
so that data is not lost.
[436](https://gitlab.com/datadrivendiscovery/d3m/issues/436)
[!335](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/335)
* Fixed saving exposed produced outputs for `fit-score` CLI command when
scoring failed.
[!341](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/341)
* Made sure `time_granularity` metadata is saved when saving a D3M dataset.
[!340](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/340)
* Changed version of GitPython dependency to 3.1.0 to fix older versions
being broken because of its own unconstrained upper dependency.
[!336](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/336)
* Fixed how paths are constructed when exposing and saving produced values.
[!336](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/336)

Other

* Added guides to the documentation.
[!351](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/351)
[!374](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/374)
* Removed type annotations from docstrings. Python type annotations are now used instead when rendering documentation.
[239](https://gitlab.com/datadrivendiscovery/d3m/-/issues/239)
[!371](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/371)
* Renamed `blacklist` in `d3m.index.load_all` and `primitives_blacklist` in `d3m.metadata.pipeline.Resolver`
to `blocklist` and `primitives_blocklist`, respectively.
**Backwards incompatible.**
* Removed `https://metadata.datadrivendiscovery.org/types/GPUResourcesUseParameter`
semantic type. Added `can_use_gpus` primitive metadata field to signal that
the primitive can use GPUs if available.
[448](https://gitlab.com/datadrivendiscovery/d3m/-/issues/448)
[!369](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/369)
**Backwards incompatible.**
* Clarified that hyper-parameters using `https://metadata.datadrivendiscovery.org/types/CPUResourcesUseParameter`
should have 1 as default value.
[!369](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/369)
* Clarified that it is not necessary to call `fit` before calling
`continue_fit`.
* `index` CLI command has been renamed to `primitive` CLI command.
[437](https://gitlab.com/datadrivendiscovery/d3m/-/issues/437)
[!363](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/363)
* `numpy.matrix` has been removed as an allowed container type, as it
was deprecated by NumPy.
[230](https://gitlab.com/datadrivendiscovery/d3m/-/issues/230)
[!362](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/362)
**Backwards incompatible.**
* CLI now has a `--version` command which returns the version of the d3m
core package itself.
[378](https://gitlab.com/datadrivendiscovery/d3m/-/issues/378)
[!359](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/359)
* Upgraded schemas to JSON Schema draft 7, and upgraded Python `jsonschema`
dependency to version 3.
[392](https://gitlab.com/datadrivendiscovery/d3m/-/issues/392)
[!342](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/342)
* Added a Primitive Good Citizen Checklist to documentation, documenting
some best practices when writing a primitive.
[127](https://gitlab.com/datadrivendiscovery/d3m/-/issues/127)
[!347](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/347)
[!355](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/355)
* Updated upper bounds of core dependencies to latest available versions.
[!337](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/337)
* Added to `algorithm_types`:
  * `SAMPLE_SELECTION`
  * `SAMPLE_MERGING`
  * `MOMENTUM_CONTRAST`
  * `CAUSAL_ANALYSIS`

[!332](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/332)
[!357](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/357)
[!373](https://gitlab.com/datadrivendiscovery/d3m/-/merge_requests/373)

2020.1.9

Enhancements

* Support for D3M datasets with minimal metadata.
[429](https://gitlab.com/datadrivendiscovery/d3m/issues/429)
[!327](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/327)
* Pipeline runs (and in fact many other input documents) can now be used gzipped
directly in all CLI commands. The filename has to end with `.gz` for decompression to happen
automatically.
[420](https://gitlab.com/datadrivendiscovery/d3m/issues/420)
[!317](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/317)
* Made problem descriptions more readable again when converted to JSON.
[!316](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/316)
* Improved YAML handling to encourage faster C implementation.
[416](https://gitlab.com/datadrivendiscovery/d3m/issues/416)
[!313](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/313)
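
The `.gz` handling mentioned above follows a common pattern: choose the opener from the filename suffix. A minimal sketch, assuming a hypothetical helper named `open_maybe_gzipped` (the core package's own helper may look different):

```python
import gzip


def open_maybe_gzipped(path, mode="rt", encoding="utf8"):
    """Open a text file, decompressing on the fly when the name ends with .gz."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode, encoding=encoding)
    return open(path, mode, encoding=encoding)
```

Both branches return a text-mode file object, so downstream JSON or YAML parsing does not need to know whether the input was compressed.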

Bugfixes

* Fixed the error message shown when not all required CLI arguments are passed to the runtime.
[411](https://gitlab.com/datadrivendiscovery/d3m/issues/411)
[!319](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/319)
* Removed assumption that all successful pipeline run steps have method calls.
[422](https://gitlab.com/datadrivendiscovery/d3m/issues/422)
[!320](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/320)
* Fixed "Duplicate problem ID" warnings when multiple problem descriptions
have the same problem ID, but in fact they are the same problem description.
No warning is made in this case anymore.
[417](https://gitlab.com/datadrivendiscovery/d3m/issues/417)
[!321](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/321)
* Fixed the use of D3M container types in recent versions of Keras and TensorFlow.
[426](https://gitlab.com/datadrivendiscovery/d3m/issues/426)
[!322](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/322)
* Fixed `validate` CLI commands to work on YAML files.

Other

* Updated upper bounds of core dependencies to latest available versions.
[427](https://gitlab.com/datadrivendiscovery/d3m/issues/427)
[!325](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/325)
* Refactored default pipeline run parser implementation to make it
easier to provide alternative dataset and problem resolvers.
[!314](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/314)
* Moved out local test primitives into [`tests/data` git submodule](https://gitlab.com/datadrivendiscovery/tests-data).
Now all test primitives are in one place.
[254](https://gitlab.com/datadrivendiscovery/d3m/issues/254)
[!312](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/312)

2019.11.10

* Support for version 4.0.0 of D3M dataset schema has been added.
* D3M core package now supports loading datasets directly from OpenML.
* When saving a `Dataset` object to D3M dataset format, metadata is now preserved.
* NetworkX objects are no longer container types and can no longer
be passed as values between primitives.
* "Meta" files are not supported anymore by the runtime. Instead save a
pipeline run with configuration of the run you want, and use the pipeline
run to re-run using that configuration.

Enhancements

* Primitive family `REMOTE_SENSING` has been added.
[!310](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/310)
* Added support for version 4.0.0 of D3M dataset schema:
  * There are no more `NODE` and `EDGE` references (used in graph datasets),
    but only `NODE_ATTRIBUTE` and `EDGE_ATTRIBUTE`.
  * `time_granularity` can now be present on a column.
  * `forecasting_horizon` can now be present in a problem description.
  * `task_type` and `task_subtype` have been merged into `task_keywords`.
    As a consequence, Python `TaskType` and `TaskSubtype` were replaced
    with `TaskKeyword`.

[401](https://gitlab.com/datadrivendiscovery/d3m/issues/401)
[!310](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/310)
**Backwards incompatible.**

* Added OpenML dataset loader. Now you can pass a URL to an OpenML dataset
and it will be downloaded and converted to a `Dataset` compatible object,
including many of the available meta-features. Combined with support
for saving datasets, this now allows easy conversion between OpenML
datasets and D3M datasets, e.g., `python3 -m d3m dataset convert -i https://www.openml.org/d/61 -o out/datasetDoc.json`.
[252](https://gitlab.com/datadrivendiscovery/d3m/issues/252)
[!309](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/309)
* When saving and loading D3M datasets, metadata is now preserved.
[227](https://gitlab.com/datadrivendiscovery/d3m/issues/227)
[!265](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/265)
* Metadata can now be converted to a JSON compatible structure in a
reversible manner.
[373](https://gitlab.com/datadrivendiscovery/d3m/issues/373)
[!308](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/308)
* Pipeline run now records if a pipeline was run as a standard pipeline
under `run.is_standard_pipeline` field.
[396](https://gitlab.com/datadrivendiscovery/d3m/issues/396)
[!249](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/249)
* "meta" files have been replaced with support for rerunning pipeline runs.
Instead of writing a "meta" file which configures how to run a
pipeline, simply provide an example pipeline run which demonstrates how
the pipeline was run. Runtime no longer has the `--meta` argument,
but now has the `--input-run` argument instead.
[202](https://gitlab.com/datadrivendiscovery/d3m/issues/202)
[!249](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/249)
**Backwards incompatible.**
* Changed `LossFunctionMixin` to support multiple loss functions.
[386](https://gitlab.com/datadrivendiscovery/d3m/issues/386)
[!305](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/305)
**Backwards incompatible.**
* Pipeline equality and hashing functions now have `only_control_hyperparams`
argument which can be set to use only control hyper-parameters when doing
comparisons.
[!289](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/289)
* Pipelines and other YAML files are now recognized with both `.yml` and
`.yaml` file extensions.
[375](https://gitlab.com/datadrivendiscovery/d3m/issues/375)
[!302](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/302)
* `F1Metric`, `F1MicroMetric`, and `F1MacroMetric` can now operate on
multiple target columns and average scores for all of them.
[400](https://gitlab.com/datadrivendiscovery/d3m/issues/400)
[!298](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/298)
* Pipelines and pipeline runs can now be serialized with Arrow.
[381](https://gitlab.com/datadrivendiscovery/d3m/issues/381)
[!290](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/290)
* `describe` CLI commands now accept `--output` argument to control where
their output is saved to.
[!279](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/279)
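
To make the multi-column F1 averaging mentioned above concrete, here is a worked sketch in plain Python. It is not the code behind `F1Metric`: the function names, the use of a single `pos_label` for every column, and the unweighted mean are assumptions chosen for illustration.

```python
def binary_f1(truth, predictions, pos_label):
    """F1 = 2*TP / (2*TP + FP + FN) for one target column."""
    pairs = list(zip(truth, predictions))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mean_f1_over_columns(truth_columns, predicted_columns, pos_label):
    """Score every target column separately, then average the scores."""
    scores = [
        binary_f1(truth_columns[name], predicted_columns[name], pos_label)
        for name in truth_columns
    ]
    return sum(scores) / len(scores)
```

For example, with two target columns that each score 0.8 on their own, the averaged result is also 0.8.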

Bugfixes

* Made exposed outputs be stored even in the case of an exception.
[380](https://gitlab.com/datadrivendiscovery/d3m/issues/380)
[!304](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/304)
* Fixed `source.from` metadata in datasets and problem descriptions
and its validation for metalearning database.
[363](https://gitlab.com/datadrivendiscovery/d3m/issues/363)
[!303](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/303)
* Fixed pipeline run references when running the runtime through
evaluation command.
[395](https://gitlab.com/datadrivendiscovery/d3m/issues/395)
[!294](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/294)
* The core package scoring primitive has been updated to have a digest.
This allows the core package scoring pipeline to have one as well.
This change requires the core package to be installed
in editable mode (`pip3 install -e ...`) when being installed from the
git repository.
[!280](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/280)
**Backwards incompatible.**

Other

* A few top-level runtime functions had some of their arguments moved
to keyword-only arguments:
  * `fit`: `problem_description`
  * `score`: `scoring_pipeline`, `problem_description`, `metrics`, `predictions_random_seed`
  * `prepare_data`: `data_pipeline`, `problem_description`, `data_params`
  * `evaluate`: `data_pipeline`, `scoring_pipeline`, `problem_description`, `data_params`, `metrics`

[352](https://gitlab.com/datadrivendiscovery/d3m/issues/352)
[!301](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/301)
**Backwards incompatible.**

* `can_accept` method has been removed from primitive interfaces.
[334](https://gitlab.com/datadrivendiscovery/d3m/issues/334)
[!300](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/300)
**Backwards incompatible.**
* NetworkX objects are no longer container types and can no longer
be passed as values between primitives. Dataset loader now
does not convert a GML file to a NetworkX object but represents it
as a files collection resource. A primitive should then convert that
resource into a normalized edge-list graph representation.
[349](https://gitlab.com/datadrivendiscovery/d3m/issues/349)
[!299](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/299)
**Backwards incompatible.**
* `JACCARD_SIMILARITY_SCORE` metric is now a binary metric and requires
`pos_label` parameter.
[!299](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/299)
**Backwards incompatible.**
* Updated core dependencies. Some important packages are now at versions:
  * `tensorflow`: 2.0.0
  * `keras`: 2.3.1
  * `torch`: 1.3.0.post2
  * `theano`: 1.0.4
  * `scikit-learn`: 0.21.3
  * `numpy`: 1.17.3
  * `pandas`: 0.25.2
  * `networkx`: 2.4
  * `pyarrow`: 0.15.1
  * `scipy`: 1.3.1

[398](https://gitlab.com/datadrivendiscovery/d3m/issues/398)
[379](https://gitlab.com/datadrivendiscovery/d3m/issues/379)
[!299](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/299)

* Primitive family `DIMENSIONALITY_REDUCTION` has been added.
[!284](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/284)
* Added to `algorithm_types`:
  * `POLYNOMIAL_REGRESSION`
  * `IMAGENET`
  * `RETINANET`

[!306](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/306)

* `--process-dependency-link` is no longer suggested for use when
installing primitives.
* `sample_rate` metadata field inside `dimension` has been renamed to
`sampling_rate` to make it consistent across metadata. This field
should contain a sampling rate used for the described dimension,
when values in the dimension are sampled.
**Backwards incompatible.**
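
The move to keyword-only arguments for the top-level runtime functions listed earlier in this section means positional calls that used to work now fail. The sketch below shows the Python mechanism with a deliberately simplified, hypothetical `fit` signature; the real function takes more parameters.

```python
def fit(pipeline, inputs, *, problem_description=None):
    """Hypothetical signature: everything after `*` must be passed by keyword."""
    return {
        "pipeline": pipeline,
        "inputs": inputs,
        "problem_description": problem_description,
    }


# Works: the keyword-only argument is named explicitly.
result = fit("pipeline", ["dataset"], problem_description="problem")

# Fails: passing it positionally raises TypeError.
try:
    fit("pipeline", ["dataset"], "problem")
except TypeError as error:
    positional_error = error
```

Updating callers is mechanical: name each moved argument at the call site.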

2019.6.7

Enhancements

* Dataset loading has been optimized for the case when only one file
type exists in a file collection. Metadata is also simplified in this case.
[314](https://gitlab.com/datadrivendiscovery/d3m/issues/314)
[!277](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/277)
* Support defining unfitted primitives in the pipeline for passing them
to another primitive as a hyper-parameter. Unfitted primitives do not
have any input connected and runtime just creates a primitive instance
but does not fit or produce them. It then passes this primitive instance
to another primitive as a hyper-parameter value.
[!274](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/274)
* When saving datasets, we now use hard-linking of files when possible.
[368](https://gitlab.com/datadrivendiscovery/d3m/issues/368)
[!271](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/271)
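
Hard-linking "when possible" usually means trying a link first and copying if the filesystem refuses. The sketch below shows that pattern under stated assumptions: `link_or_copy` is a hypothetical name, and the dataset saver's actual fallback behavior is not documented here.

```python
import os
import shutil


def link_or_copy(source, destination):
    """Hard-link `source` to `destination`, copying when linking is not possible."""
    try:
        os.link(source, destination)
        return "linked"
    except OSError:
        # For example, a different filesystem or one without hard-link support.
        shutil.copy2(source, destination)
        return "copied"
```

A hard link avoids duplicating file contents on disk, which matters for datasets with large media collections.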

Bugfixes

* Specifying `-E` to the `d3m runtime` CLI now really exposes all outputs
of all steps and not just pipeline outputs.
[367](https://gitlab.com/datadrivendiscovery/d3m/issues/367)
[!270](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/270)
* Fixed minor issues when loading sklearn example datasets.
* Fixed PyPI metadata of the package.
[!267](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/267)
* When saving a D3M dataset, structural type information is now also used to set
column type.
[339](https://gitlab.com/datadrivendiscovery/d3m/issues/339)
[!255](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/255)
* When saving D3M dataset, update digest of saved dataset to digest of
what has been saved.
[340](https://gitlab.com/datadrivendiscovery/d3m/issues/340)
[!262](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/262)

Other

* Pipeline's `get_exposable_outputs` method has been renamed to `get_producing_outputs`.
[!270](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/270)
* Updating columns from DataFrame returned from `DataFrame.select_columns`
does not raise a warning anymore.
[!268](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/268)
* Added `scipy==1.2.1` as core dependency.
[!266](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/266)
* Added code style guide to the repository.
[!260](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/260)
* Added to `algorithm_types`:

* `ITERATIVE_LABELING`

[!276](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/276)

2019.5.8

* This release contains an implementation of `D3MDatasetSaver` so `Dataset` objects
can now be saved using their `save` method into D3M dataset format.
* Additional hyper-parameter classes have been defined and existing ones improved.
Probably the most useful addition is the `List` hyper-parameter which allows
repeated values with order of values (in contrast with `Set`).
* Graph representation has been standardized (a node list table and an
edge list table) and related semantic types have been added to mark source
and target columns for edges.
* Time-series representation has been standardized (a long format)
and related semantic types have been added to identify columns to index
time-series by.
* Feature construction primitives should mark newly constructed attributes
with the `https://metadata.datadrivendiscovery.org/types/ConstructedAttribute`
semantic type.
* There are now mixins available to define primitives which can be used to
describe neural networks as pipelines.
* There is now a single command line interface for the core package under
`python3 -m d3m`.

Enhancements

* Runtime now raises an exception if target columns from problem description
could not be found in provided input datasets.
[281](https://gitlab.com/datadrivendiscovery/d3m/issues/281)
[!155](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/155)
* Core package command line interfaces have been consolidated and revamped
and are now all available under a single `python3 -m d3m` command.
[338](https://gitlab.com/datadrivendiscovery/d3m/issues/338)
[!193](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/193)
[!233](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/233)
* Added `--expose-produced-outputs` argument to the runtime CLI to allow saving
produced outputs of all primitives from a pipeline's run to a directory.
Useful for debugging.
[206](https://gitlab.com/datadrivendiscovery/d3m/issues/206)
[!223](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/223)
* CSVLoader and SklearnExampleLoader dataset loaders now add
`d3mIndex` column if one does not exist already.
[266](https://gitlab.com/datadrivendiscovery/d3m/issues/266)
[!202](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/202)
* Added `--not-standard-pipeline` argument to `fit`, `produce`, and `fit-produce`
runtime CLI to allow running non-standard pipelines.
[312](https://gitlab.com/datadrivendiscovery/d3m/issues/312)
[!228](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/228)
* Sampling `Bounded` and base `Hyperparameter` hyper-parameters now issues
a warning that sampling of those hyper-parameters is ill-defined.
[!220](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/220)
* `Bounded` hyper-parameter with both bounds now samples from a uniform
distribution.
[!220](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/220)
* Added new hyper-parameter classes: `SortedSet`, `List`, and `SortedList`.
[236](https://gitlab.com/datadrivendiscovery/d3m/issues/236)
[292](https://gitlab.com/datadrivendiscovery/d3m/issues/292)
[!219](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/219)
* All bounded hyper-parameter classes now accept additional arguments to
control if bounds are inclusive or exclusive.
[199](https://gitlab.com/datadrivendiscovery/d3m/issues/199)
[!215](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/215)
* `Dataset` objects can now be saved to D3M dataset format by
calling `save` method on them.
[31](https://gitlab.com/datadrivendiscovery/d3m/issues/31)
[344](https://gitlab.com/datadrivendiscovery/d3m/issues/344)
[!96](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/96)
[!217](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/217)
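
The entries above on bounded hyper-parameters (uniform sampling when both bounds are set, and arguments controlling whether bounds are inclusive) can be illustrated with a minimal standalone sketch. This is not the d3m `Bounded` class: the class name `BoundedSketch`, its argument names, and the redraw strategy for excluded endpoints are all assumptions made for the example.

```python
import random


class BoundedSketch:
    """Illustrative stand-in for a numeric hyper-parameter with two bounds."""

    def __init__(self, lower, upper, lower_inclusive=True, upper_inclusive=True):
        if lower >= upper:
            raise ValueError("lower bound must be smaller than upper bound")
        self.lower = lower
        self.upper = upper
        self.lower_inclusive = lower_inclusive
        self.upper_inclusive = upper_inclusive

    def contains(self, value):
        """Return True if value lies within the bounds, honouring inclusivity."""
        above = value >= self.lower if self.lower_inclusive else value > self.lower
        below = value <= self.upper if self.upper_inclusive else value < self.upper
        return above and below

    def sample(self, rng):
        """Draw uniformly between the bounds, redrawing if an excluded endpoint is hit."""
        while True:
            value = rng.uniform(self.lower, self.upper)
            if self.contains(value):
                return value
```

A caller would pass its own random number generator, for example `BoundedSketch(0.0, 1.0, lower_inclusive=False).sample(random.Random(0))`, so that sampling stays reproducible.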

Bugfixes

* Fixed `NormalizeMutualInformationMetric` implementation.
[357](https://gitlab.com/datadrivendiscovery/d3m/issues/357)
[!257](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/257)
* JSON representation of `Union` hyper-parameter values and other
pickled hyper-parameter values has been changed to assure better
interoperability.
[359](https://gitlab.com/datadrivendiscovery/d3m/issues/359)
[!256](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/256)
**Backwards incompatible.**
* All d3m schemas are now fully valid according to JSON schema draft v4.
[79](https://gitlab.com/datadrivendiscovery/d3m/issues/79)
[!233](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/233)
* Fixed an error when saving a fitted pipeline to stdout.
[353](https://gitlab.com/datadrivendiscovery/d3m/issues/353)
[!250](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/250)
* Hyper-parameters cannot use `NaN` and infinity floating-point values
as their bounds. This ensures compatibility with JSON.
[324](https://gitlab.com/datadrivendiscovery/d3m/issues/324)
[!237](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/237)
**Backwards incompatible.**
* Pipelines are now exported to JSON in strict compliance with the
JSON specification.
[323](https://gitlab.com/datadrivendiscovery/d3m/issues/323)
[!238](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/238)
* Runtime execution no longer fails if predictions cannot be converted
to JSON for pipeline run. A warning is issued instead.
[347](https://gitlab.com/datadrivendiscovery/d3m/issues/347)
[!227](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/227)
* Better support for running reference runtime without exceptions on non-Linux
operating systems.
[246](https://gitlab.com/datadrivendiscovery/d3m/issues/246)
[!218](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/218)
* Strict checking of dataset, pipeline and primitive digests against those provided
in metadata is now correctly controlled using `--strict-digest`/`strict_digest`
arguments.
[346](https://gitlab.com/datadrivendiscovery/d3m/issues/346)
[!213](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/213)
* Fixed error propagation in `evaluate` runtime function when an error
happens during scoring.
[!210](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/210)
* Fixed accessing container DataFrame's `metadata` attribute when
DataFrame also contains a column with the name `metadata`.
[330](https://gitlab.com/datadrivendiscovery/d3m/issues/330)
[!201](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/201)
* Fixed `.meta` file resolving when `--datasets` runtime argument
is not an absolute path.
[!194](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/194)
* Fixed `get_relations_graph` resolving of column names (used in `Denormalize`
common primitive).
[!196](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/196)
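
Two of the fixes above concern strict JSON: `NaN` and infinity are not part of the JSON specification, although Python's standard `json` module emits them by default. The sketch below shows that behaviour using only the standard library. The helper names are hypothetical, and nothing here is taken from the d3m code base.

```python
import json


def to_strict_json(value):
    """Serialize value, raising ValueError on NaN or infinity."""
    # allow_nan=False makes the encoder reject values strict JSON cannot represent.
    return json.dumps(value, allow_nan=False)


def is_strict_json_serializable(value):
    """Return True if value contains no NaN or infinite floats."""
    try:
        to_strict_json(value)
    except ValueError:
        return False
    return True
```

With the default settings, `json.dumps(float("nan"))` returns the bare token `NaN`, which many non-Python JSON parsers reject. That is the kind of interoperability problem finite bounds avoid.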

Other

* Added validation functions for metalearning documents. This also
includes a CLI to validate them.
[220](https://gitlab.com/datadrivendiscovery/d3m/issues/220)
[!233](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/233)
* Pipeline run schema now requires scoring dataset inputs to be recorded
if a data preparation pipeline has not been used.
[!243](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/243)
**Backwards incompatible.**
* Core package now provides standard scoring primitive and scoring pipeline
which are used by runtime by default.
[307](https://gitlab.com/datadrivendiscovery/d3m/issues/307)
[!231](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/231)
* Pipeline run can now be generated also for a subset of non-standard
pipelines: those which have all inputs of `Dataset` type.
[!232](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/232)
* Pipeline run now also records a normalized score, if available.
[!230](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/230)
* Pipeline `context` field has been removed from schema and implementation.
[!229](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/229)
* Added `pure_primitive` field to primitive's metadata so that primitives
can mark themselves as not pure (by default all primitives are seen as pure).
[331](https://gitlab.com/datadrivendiscovery/d3m/issues/331)
[!226](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/226)
* `Metadata` methods `to_json_structure` and `to_simple_structure` have been
modified to no longer return the internal metadata representation but a
metadata representation equivalent to what you get from a `query` call.
To obtain the internal representation use `to_internal_json_structure`
and `to_internal_simple_structure`.
[!225](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/225)
**Backwards incompatible.**
* `NeuralNetworkModuleMixin` and `NeuralNetworkObjectMixin` have been
added to primitive interfaces to support representing neural networks
as pipelines.
[174](https://gitlab.com/datadrivendiscovery/d3m/issues/174)
[!87](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/87)
* `get_loss_function` has been renamed to `get_loss_metric` in
`LossFunctionMixin`.
[!87](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/87)
**Backwards incompatible.**
* `UniformInt`, `Uniform`, and `LogUniform` hyper-parameter classes now
subclass `Bounded` class.
[!216](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/216)
* Metrics no longer have default parameter values. Legacy parts of code
which assumed them have been cleaned up.
[!212](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/212)
* Added new semantic types:
* `https://metadata.datadrivendiscovery.org/types/EdgeSource`
* `https://metadata.datadrivendiscovery.org/types/DirectedEdgeSource`
* `https://metadata.datadrivendiscovery.org/types/UndirectedEdgeSource`
* `https://metadata.datadrivendiscovery.org/types/SimpleEdgeSource`
* `https://metadata.datadrivendiscovery.org/types/MultiEdgeSource`
* `https://metadata.datadrivendiscovery.org/types/EdgeTarget`
* `https://metadata.datadrivendiscovery.org/types/DirectedEdgeTarget`
* `https://metadata.datadrivendiscovery.org/types/UndirectedEdgeTarget`
* `https://metadata.datadrivendiscovery.org/types/SimpleEdgeTarget`
* `https://metadata.datadrivendiscovery.org/types/MultiEdgeTarget`
* `https://metadata.datadrivendiscovery.org/types/ConstructedAttribute`
* `https://metadata.datadrivendiscovery.org/types/SuggestedGroupingKey`
* `https://metadata.datadrivendiscovery.org/types/GroupingKey`

[134](https://gitlab.com/datadrivendiscovery/d3m/issues/134)
[348](https://gitlab.com/datadrivendiscovery/d3m/issues/348)
[!211](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/211)
[!214](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/214)

* Updated core dependencies. Some important packages are now at versions:
* `scikit-learn`: 0.20.3
* `pyarrow`: 0.13.0

[!206](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/206)

* Clarified in primitive interface documentation that if a primitive should have been
fitted before calling its produce method, but it has not been, the primitive should
raise a `PrimitiveNotFittedError` exception.
[!204](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/204)
* Added to `algorithm_types`:

* `EQUI_JOIN`
* `DATA_RETRIEVAL`
* `DATA_MAPPING`
* `MAP`
* `INFORMATION_THEORETIC_METAFEATURE_EXTRACTION`
* `LANDMARKING_METAFEATURE_EXTRACTION`
* `MODEL_BASED_METAFEATURE_EXTRACTION`
* `STATISTICAL_METAFEATURE_EXTRACTION`
* `VECTORIZATION`
* `BERT`

[!160](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/160)
[!186](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/186)
[!224](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/224)
[!247](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/247)

* Primitive family `METAFEATURE_EXTRACTION` has been renamed to `METALEARNING`.
[!160](https://gitlab.com/datadrivendiscovery/d3m/merge_requests/160)
**Backwards incompatible.**
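
The clarification above about raising `PrimitiveNotFittedError` can be sketched as follows. The exception class here is a local stand-in with the same name, and `MeanCenterSketch` with its simplified method signatures is hypothetical. The real d3m primitive interface differs, so this shows only the expected control flow.

```python
class PrimitiveNotFittedError(Exception):
    """Local stand-in for the exception named in the changelog entry."""


class MeanCenterSketch:
    """Hypothetical primitive which must be fitted before it can produce."""

    def __init__(self):
        self._training_inputs = []
        self._mean = None
        self._fitted = False

    def set_training_data(self, inputs):
        # New training data invalidates any previous fit.
        self._training_inputs = list(inputs)
        self._fitted = False

    def fit(self):
        self._mean = sum(self._training_inputs) / len(self._training_inputs)
        self._fitted = True

    def produce(self, inputs):
        # Fail loudly instead of producing output from an unfitted state.
        if not self._fitted:
            raise PrimitiveNotFittedError("Primitive has not been fitted.")
        return [value - self._mean for value in inputs]
```

Raising a dedicated exception lets a caller distinguish a missing `fit` call from other failures, rather than seeing an unrelated error such as a `TypeError` on `None`.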
