* Added `pipeline.json` JSON schema to this package, together with a `problem.json`
JSON schema describing the structure of a parsed problem description. There is also
a `d3m.metadata.pipeline` parser for pipelines in this schema and a Python object
to represent a pipeline.
[53](https://gitlab.com/datadrivendiscovery/d3m/issues/53)
* Updated README to make it explicit that for tabular data the first dimension
is always rows and the second always columns, even in the case of a DataFrame
container type.
[54](https://gitlab.com/datadrivendiscovery/d3m/issues/54)
* Made the `Dataset` container type return a Pandas `DataFrame` instead of a numpy
`ndarray`, and in general we suggest using a Pandas `DataFrame` as the default
container type.
**Backwards incompatible.**
[49](https://gitlab.com/datadrivendiscovery/d3m/issues/49)
* Added `UniformBool` hyper-parameter class.
* Renamed `FeaturizationPrimitiveBase` to `FeaturizationLearnerPrimitiveBase`.
**Backwards incompatible.**
* Defined `ClusteringTransformerPrimitiveBase` and renamed `ClusteringPrimitiveBase`
to `ClusteringLearnerPrimitiveBase`.
**Backwards incompatible.**
[20](https://gitlab.com/datadrivendiscovery/d3m/issues/20)
* Added `inputs_across_samples` decorator to mark which method arguments
are inputs which are computed across samples.
[19](https://gitlab.com/datadrivendiscovery/d3m/issues/19)
* Converted `SingletonOutputMixin` to a `singleton` decorator. This allows
each produce method to be marked separately as a singleton produce method.
**Backwards incompatible.**
[17](https://gitlab.com/datadrivendiscovery/d3m/issues/17)
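The marking pattern behind such a decorator can be sketched in plain Python. This is an illustrative stand-in, not the actual implementation from `d3m.primitive_interfaces`; the class and attribute names below are made up:

```python
# Illustrative stand-in for the `singleton` decorator (the real one lives in
# `d3m.primitive_interfaces.base`); it marks individual produce methods.
def singleton(produce_method):
    # Record the marking on the function itself so callers can detect
    # singleton produce methods (attribute name made up for this sketch).
    produce_method.__singleton__ = True
    return produce_method

class ExamplePrimitive:
    @singleton
    def produce_score(self, *, inputs):
        # A singleton produce method returns one value for all samples.
        return sum(inputs) / len(inputs)

    def produce(self, *, inputs):
        # A regular produce method returns one value per sample.
        return [x * 2 for x in inputs]
```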
* `can_accept` can also raise an exception with information about why it cannot accept.
[13](https://gitlab.com/datadrivendiscovery/d3m/issues/13)
* Added `Primitive` hyper-parameter to describe a primitive or primitives.
Additionally, improved docstring documentation on how to define hyper-parameters
which use primitives for their values and how such primitives-as-values should
be passed to primitives as their hyper-parameters.
[51](https://gitlab.com/datadrivendiscovery/d3m/issues/51)
* Hyper-parameter values can now be converted to and from JSON-compatible structure
using `values_to_json` and `values_from_json` methods. Non-primitive values
are pickled and stored as base64 strings.
[67](https://gitlab.com/datadrivendiscovery/d3m/issues/67)
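The pickling approach for non-primitive values can be illustrated with the standard library alone. The helper functions below are hypothetical stand-ins, not the actual `values_to_json`/`values_from_json` methods:

```python
import base64
import pickle

def value_to_json(value):
    # JSON-compatible primitive values pass through unchanged; other values
    # are pickled and stored as base64 strings, as described above.
    if isinstance(value, (bool, int, float, str, type(None))):
        return value
    pickled = base64.b64encode(pickle.dumps(value)).decode('ascii')
    return {'encoding': 'pickle', 'value': pickled}

def value_from_json(json_value):
    # Reverse the encoding: unpickle base64-wrapped values, pass others through.
    if isinstance(json_value, dict) and json_value.get('encoding') == 'pickle':
        return pickle.loads(base64.b64decode(json_value['value']))
    return json_value
```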
* Added `Choice` hyper-parameter which allows one to define a
combination of hyper-parameters which should exist together.
[28](https://gitlab.com/datadrivendiscovery/d3m/issues/28)
* Added `Set` hyper-parameter which samples another hyper-parameter multiple times.
[52](https://gitlab.com/datadrivendiscovery/d3m/issues/52)
* Added `https://metadata.datadrivendiscovery.org/types/MetafeatureParameter`
semantic type for hyper-parameters which control which meta-features are
computed by the primitive.
[41](https://gitlab.com/datadrivendiscovery/d3m/issues/41)
* Added `supported_media_types` primitive metadata to describe
which media types a primitive knows how to manipulate.
[68](https://gitlab.com/datadrivendiscovery/d3m/issues/68)
* Renamed metadata property `mime_types` to `media_types`.
**Backwards incompatible.**
* Made pyarrow dependency a package extra. You can depend on it using
`d3m[arrow]`.
[66](https://gitlab.com/datadrivendiscovery/d3m/issues/66)
* Added `multi_produce` method to primitive interface which allows primitives
to optimize calls to multiple produce methods they might have.
[21](https://gitlab.com/datadrivendiscovery/d3m/issues/21)
* Added `d3m.utils.redirect_to_logging` context manager which redirects
a primitive's stdout and stderr output to the primitive's logger.
[65](https://gitlab.com/datadrivendiscovery/d3m/issues/65)
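A minimal sketch of what such a context manager might look like, built from standard-library pieces; the real implementation in `d3m.utils` may differ substantially (e.g., in how it handles streaming output):

```python
import contextlib
import io
import logging

@contextlib.contextmanager
def redirect_to_logging(logger, level=logging.INFO):
    # Sketch of the idea: capture anything written to stdout/stderr while
    # the context is active and re-emit it through the given logger.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
        try:
            yield
        finally:
            for line in buffer.getvalue().splitlines():
                logger.log(level, line)
```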
* Primitives can now have a dependency on static files and directories.
One can use `FILE` and `TGZ` entries in a primitive's `installation`
metadata to ask the caller to provide paths to those files and/or
extracted directories through the new `volumes` constructor argument.
[18](https://gitlab.com/datadrivendiscovery/d3m/issues/18)
* Core dependencies have been upgraded: `numpy==1.14.2`, `networkx==2.1`.
* LUPI quality in D3M datasets is now parsed into
`https://metadata.datadrivendiscovery.org/types/SuggestedPrivilegedData`
semantic type for a column.
[61](https://gitlab.com/datadrivendiscovery/d3m/issues/61)
* Support for primitives using Docker containers has been put on hold.
We are keeping a way to pass information about running containers to a
primitive and defining dependent Docker images in metadata, but currently
it is not expected that any runtime running primitives will run
Docker containers for a primitive.
[18](https://gitlab.com/datadrivendiscovery/d3m/issues/18)
* Primitives do not have to define all constructor arguments anymore.
This allows them to ignore arguments they do not use, e.g.,
`docker_containers`.
On the other hand, when creating an instance of a primitive, one
now has to check which arguments the constructor accepts, which is
available in the primitive's metadata:
`primitive.metadata.query()['primitive_code'].get('instance_methods', {})['__init__']['arguments']`.
[63](https://gitlab.com/datadrivendiscovery/d3m/issues/63)
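As an illustration of the kind of check a caller might do, here is a hypothetical stdlib-only helper that uses `inspect` in place of the primitive metadata query to filter out unsupported constructor arguments:

```python
import inspect

def filter_constructor_arguments(cls, arguments):
    # Hypothetical helper: keep only the keyword arguments the constructor
    # actually accepts. A real caller would consult the primitive's
    # `instance_methods` metadata instead of `inspect`.
    accepted = set(inspect.signature(cls.__init__).parameters) - {'self'}
    return {name: value for name, value in arguments.items() if name in accepted}

class MinimalPrimitive:
    # A made-up primitive which only declares the arguments it uses.
    def __init__(self, *, hyperparams):
        self.hyperparams = hyperparams

kwargs = filter_constructor_arguments(MinimalPrimitive, {
    'hyperparams': {'alpha': 0.1},
    'docker_containers': {},  # ignored by this primitive
})
primitive = MinimalPrimitive(**kwargs)
```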
* Information about running primitive's Docker container has changed
from just its address to a `DockerContainer` tuple containing both
the address and a map of all exposed ports.
At the same time, support for Docker has been put on hold, so you
do not really have to change anything to upgrade, and can simply
remove the `docker_containers` argument from the primitive's constructor.
**Backwards incompatible.**
[14](https://gitlab.com/datadrivendiscovery/d3m/issues/14)
* Multiple exception classes have been defined in `d3m.exceptions`
module and are now in use. This allows easier and more precise
handling of exceptions.
[12](https://gitlab.com/datadrivendiscovery/d3m/issues/12)
* Fixed inheritance of `Hyperparams` class.
[44](https://gitlab.com/datadrivendiscovery/d3m/issues/44)
* Each primitive's class now automatically gets an instance of
[Python's logging](https://docs.python.org/3/library/logging.html)
logger stored into its ``logger`` class attribute. The instance is made
under the name of the primitive's ``python_path`` metadata value. Primitives
can use this logger to log information at various levels (debug, warning,
error) and even associate extra data with log record using the ``extra``
argument to the logger calls.
[10](https://gitlab.com/datadrivendiscovery/d3m/issues/10)
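For example, the convention can be reproduced with the standard `logging` module; the `python_path` value below is made up for illustration:

```python
import logging

# The logger is created under the primitive's `python_path` metadata
# value (this particular path is made up for the sketch).
python_path = 'd3m.primitives.example.random_forest'
logger = logging.getLogger(python_path)

# Primitives can log at various levels and attach extra data to log records:
logger.debug('Starting fit.')
logger.warning('Column has missing values.', extra={'column': 'age'})
```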
* Made sure container data types can be serialized with Arrow/Plasma
while retaining their metadata.
[29](https://gitlab.com/datadrivendiscovery/d3m/issues/29)
* `Scores` in `GradientCompositionalityMixin` replaced with `Gradients`.
`Scores` only makes sense in a probabilistic context.
* Renamed `TIMESERIES_CLASSIFICATION`, `TIMESERIES_FORECASTING`, and
`TIMESERIES_SEGMENTATION` primitives families to
`TIME_SERIES_CLASSIFICATION`, `TIME_SERIES_FORECASTING`, and
`TIME_SERIES_SEGMENTATION`, respectively, to match naming
pattern used elsewhere.
Similarly, renamed `UNIFORM_TIMESERIES_SEGMENTATION` algorithm type
to `UNIFORM_TIME_SERIES_SEGMENTATION`.
Compound words using hyphens are separated, but hyphens for prefixes
are not separated. So "Time-series" and "Root-mean-squared error"
become `TIME_SERIES` and `ROOT_MEAN_SQUARED_ERROR`
but "Non-overlapping" and "Multi-class" are `NONOVERLAPPING` and `MULTICLASS`.
**Backwards incompatible.**
* Updated performance metrics to include `PRECISION_AT_TOP_K` metric.
* Added support to problem description parsing for additional metric
parameters, and updated performance metric functions to use them.
[42](https://gitlab.com/datadrivendiscovery/d3m/issues/42)
* Merged `d3m_metadata`, `primitive_interfaces` and `d3m` repositories
into `d3m` repository. This requires the following changes of
imports in existing code:
* `d3m_metadata` to `d3m.metadata`
* `primitive_interfaces` to `d3m.primitive_interfaces`
* `d3m_metadata.container` to `d3m.container`
* `d3m_metadata.metadata` to `d3m.metadata.base`
* `d3m_metadata.metadata.utils` to `d3m.utils`
* `d3m_metadata.metadata.types` to `d3m.types`
**Backwards incompatible.**
[11](https://gitlab.com/datadrivendiscovery/d3m/issues/11)
* Fixed computation of sampled values for `LogUniform` hyper-parameter class.
[47](https://gitlab.com/datadrivendiscovery/d3m/issues/47)
* When copying or slicing container values, metadata is now copied over
instead of cleared. This makes it easier to propagate metadata.
This also means one should make sure to update the metadata in the
new container value to reflect changes to the value itself.
**Could be backwards incompatible.**
* `DataMetadata` now has a `set_for_value` method to make a copy of
metadata and set a new `for_value` value. You can use this when you
have made a new value and want to copy over metadata, but also want
the metadata to be associated with the new value. This is done by
default for container values.
* Metadata now includes a SHA256 digest for primitives and datasets.
It is computed automatically during loading. This should allow one to
track the exact versions of primitives and datasets used.
`d3m.container.dataset.get_d3m_dataset_digest` is a reference
implementation of computing digest for D3M datasets.
You can set `compute_digest` to `False` to disable this.
You can set `strict_digest` to `True` to raise an exception instead
of a warning if computed digest does not match one in metadata.
* Datasets can now be loaded in "lazy" mode: only metadata is loaded
when creating a `Dataset` object. You can use the `is_lazy` method to
check if a dataset is lazy and its data has not yet been loaded. You can
use `load_lazy` to load data for a lazy object, making it non-lazy.
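The lazy-loading pattern described above can be sketched as follows; this is a toy stand-in, not the actual `Dataset` class:

```python
class LazyDataset:
    # Toy sketch of the lazy-loading pattern described above; the real
    # `Dataset` class loads D3M dataset metadata and data, not a plain list.
    def __init__(self, loader, *, lazy=False):
        self._loader = loader
        self._data = None
        if not lazy:
            self.load_lazy()

    def is_lazy(self):
        # True while the data itself has not yet been loaded.
        return self._data is None

    def load_lazy(self):
        # Load the data for a lazy object, making it non-lazy.
        if self.is_lazy():
            self._data = self._loader()

    @property
    def data(self):
        return self._data

dataset = LazyDataset(lambda: [1, 2, 3], lazy=True)
was_lazy = dataset.is_lazy()   # only "metadata" exists so far
dataset.load_lazy()            # now the data is loaded
```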
* There is now a utility metaclass `d3m.metadata.utils.AbstractMetaclass`
which makes classes which use it automatically inherit docstrings
for methods from the parent. Primitive base class and some other D3M
classes are now using it.
* `d3m.metadata.base.CONTAINER_SCHEMA_VERSION` and
`d3m.metadata.base.DATA_SCHEMA_VERSION` were fixed to point to the
correct URI.
* Many `data_metafeatures` properties in metadata schema had type
`numeric` which does not exist in JSON schema. They were fixed to
`number`.
* Added to a list of known semantic types:
`https://metadata.datadrivendiscovery.org/types/Target`,
`https://metadata.datadrivendiscovery.org/types/PredictedTarget`,
`https://metadata.datadrivendiscovery.org/types/TrueTarget`,
`https://metadata.datadrivendiscovery.org/types/Score`,
`https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint`,
`https://metadata.datadrivendiscovery.org/types/SuggestedPrivilegedData`,
`https://metadata.datadrivendiscovery.org/types/PrivilegedData`.
* Added to `algorithm_types`: `ARRAY_CONCATENATION`, `ARRAY_SLICING`,
`ROBUST_PRINCIPAL_COMPONENT_ANALYSIS`, `SUBSPACE_CLUSTERING`,
`SPECTRAL_CLUSTERING`, `RELATIONAL_ALGEBRA`, `MULTICLASS_CLASSIFICATION`,
`MULTILABEL_CLASSIFICATION`, `OVERLAPPING_CLUSTERING`, `SOFT_CLUSTERING`,
`STRICT_PARTITIONING_CLUSTERING`, `STRICT_PARTITIONING_CLUSTERING_WITH_OUTLIERS`,
`UNIVARIATE_REGRESSION`, `NONOVERLAPPING_COMMUNITY_DETECTION`,
`OVERLAPPING_COMMUNITY_DETECTION`.