Datumaro

Latest version: v1.9.1

0.2.2

New features
- Video reading API
(<https://github.com/openvinotoolkit/datumaro/pull/521>)
- Python API documentation
(<https://github.com/openvinotoolkit/datumaro/pull/526>)
- Mapillary Vistas dataset format (Import-only)
(<https://github.com/openvinotoolkit/datumaro/pull/537>)
- Datumaro can now be installed on Windows on Python 3.9
(<https://github.com/openvinotoolkit/datumaro/pull/547>)
- Import for SYNTHIA dataset format
(<https://github.com/openvinotoolkit/datumaro/pull/532>)
- Support of `score` attribute in KITTI detection
(<https://github.com/openvinotoolkit/datumaro/pull/571>)
- Support for Accuracy Checker dataset meta files in formats
(<https://github.com/openvinotoolkit/datumaro/pull/553>,
<https://github.com/openvinotoolkit/datumaro/pull/569>,
<https://github.com/openvinotoolkit/datumaro/pull/575>)
- Import for VoTT dataset format
(<https://github.com/openvinotoolkit/datumaro/pull/573>)
- Image resizing transform
(<https://github.com/openvinotoolkit/datumaro/pull/581>)

Enhancements
- The following formats can now be detected unambiguously:
`ade20k2017`, `ade20k2020`, `camvid`, `coco`, `cvat`, `datumaro`,
`icdar_text_localization`, `icdar_text_segmentation`,
`icdar_word_recognition`, `imagenet_txt`, `kitti_raw`, `label_me`, `lfw`,
`mot_seq`, `open_images`, `vgg_face2`, `voc`, `widerface`, `yolo`
(<https://github.com/openvinotoolkit/datumaro/pull/531>,
<https://github.com/openvinotoolkit/datumaro/pull/536>,
<https://github.com/openvinotoolkit/datumaro/pull/550>,
<https://github.com/openvinotoolkit/datumaro/pull/557>,
<https://github.com/openvinotoolkit/datumaro/pull/558>)
- Allowed Pytest-native tests
(<https://github.com/openvinotoolkit/datumaro/pull/563>)
- Allowed export options in the `datum merge` command
(<https://github.com/openvinotoolkit/datumaro/pull/545>)

Deprecated
- Using `Image`, `ByteImage` from `datumaro.util.image` - these classes
are moved to `datumaro.components.media`
(<https://github.com/openvinotoolkit/datumaro/pull/538>)
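
For code that has to run against releases on both sides of this move, one possible approach is an import fallback. The sketch below is illustrative only: `resolve_first` and `load_image_class` are hypothetical helpers, not part of Datumaro, and the only assumption taken from the entry above is the two module paths.

```python
import importlib


def resolve_first(attr, module_names):
    """Return `attr` from the first importable module that defines it.

    Hypothetical helper for illustration; not part of Datumaro.
    """
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module is missing in this environment, try the next one
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of: {', '.join(module_names)}")


def load_image_class():
    # Prefer the new location; fall back to the deprecated one on older releases.
    return resolve_first(
        "Image",
        ["datumaro.components.media", "datumaro.util.image"],
    )
```

Trying the new location first means the deprecated module is only imported on releases that do not yet provide `datumaro.components.media`.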

Removed
- Equality comparison support between `datumaro.components.media.Image`
and `numpy.ndarray`
(<https://github.com/openvinotoolkit/datumaro/pull/568>)

Bug fixes
- Bug 560: import issue with MOT dataset when using seqinfo.ini file
(<https://github.com/openvinotoolkit/datumaro/pull/564>)
- Empty lines in VOC subset lists are not ignored
(<https://github.com/openvinotoolkit/datumaro/pull/587>)
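
To illustrate the VOC fix above: a subset list is read line by line, so a reader has to skip blank lines rather than treat them as entries. A minimal sketch of that behaviour follows, assuming one entry per line; `read_subset_list` is a hypothetical name, and this is not the code from the pull request.

```python
def read_subset_list(lines):
    """Return the non-empty entries of a subset list, in their original order."""
    entries = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip empty and whitespace-only lines
        entries.append(entry)
    return entries
```

The function accepts any iterable of strings, so an open text file can be passed directly.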

Security
- TBD

0.2.1

New features
- Import for CelebA dataset format.
(<https://github.com/openvinotoolkit/datumaro/pull/484>)

Enhancements
- File `people.txt` became optional in LFW
(<https://github.com/openvinotoolkit/datumaro/pull/509>)
- File `image_ids_and_rotation.csv` became optional in Open Images
(<https://github.com/openvinotoolkit/datumaro/pull/509>)
- Allowed underscores (`_`) in subset names in COCO
(<https://github.com/openvinotoolkit/datumaro/pull/509>)
- Allowed annotation files with arbitrary names in COCO
(<https://github.com/openvinotoolkit/datumaro/pull/509>)
- The `icdar_text_localization` format is no longer detected in every directory
(<https://github.com/openvinotoolkit/datumaro/pull/531>)
- Updated `pycocotools` version to 2.0.2
(<https://github.com/openvinotoolkit/datumaro/pull/534>)

Deprecated
- TBD

Removed
- TBD

Bug fixes
- Unhandled exception when a file is specified as the source for a COCO or
MOTS dataset
(<https://github.com/openvinotoolkit/datumaro/pull/530>)
- Exporting dataset without `color` attribute into the
`icdar_text_segmentation` format
(<https://github.com/openvinotoolkit/datumaro/pull/556>)

Security
- TBD

0.2

New features
- A new installation target: `pip install datumaro[default]`, which should
be used by default. The plain `datumaro` target is intended for library users.
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- Dataset and project versioning capabilities (Git-like)
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- "dataset revpath" concept in CLI, which allows passing a dataset path together
with the dataset format in `diff`, `merge`, `explain` and `info` CLI commands
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- `import`, `remove`, `commit`, `checkout`, `log`, `status`, `info` CLI commands
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- `Coco*Extractor` classes now have an option to preserve label IDs from the
original annotation file
(<https://github.com/openvinotoolkit/datumaro/pull/453>)
- `patch` CLI command to patch datasets
(<https://github.com/openvinotoolkit/datumaro/pull/401>)
- `ProjectLabels` transform to change dataset labels for merging etc.
(<https://github.com/openvinotoolkit/datumaro/pull/401>,
<https://github.com/openvinotoolkit/datumaro/pull/478>)
- Support for custom labels in the KITTI detection format
(<https://github.com/openvinotoolkit/datumaro/pull/481>)
- Type annotations and docs for Annotation classes
(<https://github.com/openvinotoolkit/datumaro/pull/493>)
- Options to control label loading behavior in `imagenet_txt` import
(<https://github.com/openvinotoolkit/datumaro/pull/434>,
<https://github.com/openvinotoolkit/datumaro/pull/489>)

Enhancements
- A project can contain and manage multiple datasets instead of a single one.
CLI operations can be applied to the whole project, or to separate datasets.
Datasets are modified in place by default
(<https://github.com/openvinotoolkit/datumaro/issues/328>)
- CLI help for builtin plugins doesn't require project
(<https://github.com/openvinotoolkit/datumaro/issues/328>)
- Annotation-related classes were moved into a new module,
`datumaro.components.annotation`
(<https://github.com/openvinotoolkit/datumaro/pull/439>)
- Rollback utilities replaced with Scope utilities
(<https://github.com/openvinotoolkit/datumaro/pull/444>)
- The `Project` class from `datumaro.components` is changed completely
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- `diff` and `ediff` are joined into a single `diff` CLI command
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- Projects use new file layout, incompatible with old projects.
An old project can be updated with `datum project migrate`
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- Inheriting `CliPlugin` is not required in plugin classes
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- `Importer`s do not create `Project`s anymore and just return a list of
extractor configurations
(<https://github.com/openvinotoolkit/datumaro/pull/238>)

Deprecated
- TBD

Removed
- `import`, `project merge` CLI commands
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- Support for project hierarchies. A project cannot be a source anymore
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- Project cannot have independent internal dataset anymore. All the project
data must be stored in the project data sources
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- `datumaro_project` format
(<https://github.com/openvinotoolkit/datumaro/pull/238>)
- Unused `path` field of `DatasetItem`
(<https://github.com/openvinotoolkit/datumaro/pull/455>)

Bug fixes
- Deprecation warning in `open_images_format.py`
(<https://github.com/openvinotoolkit/datumaro/pull/440>)
- `lazy_image` returning unrelated data sometimes
(<https://github.com/openvinotoolkit/datumaro/issues/409>)
- Invalid call to `pycocotools.mask.iou`
(<https://github.com/openvinotoolkit/datumaro/pull/450>)
- Importing of Open Images datasets without image data
(<https://github.com/openvinotoolkit/datumaro/pull/463>)
- Return value type in `Dataset.is_modified`
(<https://github.com/openvinotoolkit/datumaro/pull/401>)
- Remapping of secondary categories in `RemapLabels`
(<https://github.com/openvinotoolkit/datumaro/pull/401>)
- VOC dataset patching for classification and segmentation tasks
(<https://github.com/openvinotoolkit/datumaro/pull/478>)
- Exported mask label ids in KITTI segmentation
(<https://github.com/openvinotoolkit/datumaro/pull/481>)
- Missing `label` for `Points` read in the LFW format
(<https://github.com/openvinotoolkit/datumaro/pull/494>)

Security
- TBD

0.1.11

New features
- The Open Images format now supports bounding box
and segmentation mask annotations
(<https://github.com/openvinotoolkit/datumaro/pull/352>,
<https://github.com/openvinotoolkit/datumaro/pull/388>).
- Bounding box values decrement transform (<https://github.com/openvinotoolkit/datumaro/pull/366>)
- Improved error reporting in `Dataset` (<https://github.com/openvinotoolkit/datumaro/pull/386>)
- Support for ADE20K format (import only) (<https://github.com/openvinotoolkit/datumaro/pull/400>)
- Documentation website at <https://openvinotoolkit.github.io/datumaro> (<https://github.com/openvinotoolkit/datumaro/pull/420>)

Enhancements
- Datumaro no longer depends on scikit-image
(<https://github.com/openvinotoolkit/datumaro/pull/379>)
- `Dataset` remembers export options on saving / exporting for the first time (<https://github.com/openvinotoolkit/datumaro/pull/386>)

Deprecated
- TBD

Removed
- TBD

Bug fixes
- Application of `remap_labels` to dataset categories of different length (<https://github.com/openvinotoolkit/datumaro/issues/314>)
- Patching of datasets in formats (<https://github.com/openvinotoolkit/datumaro/issues/348>)
- Improved Cityscapes export performance (<https://github.com/openvinotoolkit/datumaro/pull/367>)
- Incorrect format of `*_labelIds.png` in Cityscapes export (<https://github.com/openvinotoolkit/datumaro/issues/325>, <https://github.com/openvinotoolkit/datumaro/issues/342>)
- Item id in ImageNet format (<https://github.com/openvinotoolkit/datumaro/pull/371>)
- Double quotes for ICDAR Word Recognition (<https://github.com/openvinotoolkit/datumaro/pull/375>)
- Wrong display of builtin formats in CLI (<https://github.com/openvinotoolkit/datumaro/issues/332>)
- Non utf-8 encoding of annotation files in Market-1501 export (<https://github.com/openvinotoolkit/datumaro/pull/392>)
- Import of ICDAR, PASCAL VOC and VGGFace2 images from subdirectories on Windows
(<https://github.com/openvinotoolkit/datumaro/pull/392>)
- Saving of images with Unicode paths on Windows (<https://github.com/openvinotoolkit/datumaro/pull/392>)
- Calling `ProjectDataset.transform()` with a string argument (<https://github.com/openvinotoolkit/datumaro/issues/402>)
- Attributes casting for CVAT format (<https://github.com/openvinotoolkit/datumaro/pull/403>)
- Loading of custom project plugins (<https://github.com/openvinotoolkit/datumaro/issues/404>)
- Reading and writing of the annotation file and saving of the subset name for the test subset
(<https://github.com/openvinotoolkit/datumaro/pull/447>)

Security
- Fixed unsafe unpickling in CIFAR import (<https://github.com/openvinotoolkit/datumaro/pull/362>)
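
For context, `pickle.load` can execute arbitrary code when given a crafted file, which is why loading pickled data from untrusted sources is dangerous. One common mitigation, described in the Python `pickle` documentation, is to restrict which globals the unpickler may resolve. The sketch below shows that general technique only; it is not the change made in the pull request, and the class name, function name and empty allow-list are assumptions made for illustration.

```python
import io
import pickle


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only explicitly allowed globals."""

    # Hypothetical allow-list of (module, name) pairs. While it is empty, only
    # plain containers, numbers, strings and bytes can be loaded.
    allowed_globals = frozenset()

    def find_class(self, module, name):
        if (module, name) in self.allowed_globals:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Deserialize `data` (bytes), rejecting any global that is not allowed."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Data that contains other object types, such as array objects, would need the corresponding entries added to the allow-list, and each entry should be reviewed before it is added.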

0.1.10

New features
- Support for import/export zip archives with images (<https://github.com/openvinotoolkit/datumaro/pull/273>)
- Subformat importers for VOC and COCO (<https://github.com/openvinotoolkit/datumaro/pull/281>)
- Support for KITTI dataset segmentation and detection format (<https://github.com/openvinotoolkit/datumaro/pull/282>)
- Updated YOLO format user manual (<https://github.com/openvinotoolkit/datumaro/pull/295>)
- `ItemTransform` class, which describes item-wise dataset `Transform`s (<https://github.com/openvinotoolkit/datumaro/pull/297>)
- `keep-empty` export parameter in VOC format (<https://github.com/openvinotoolkit/datumaro/pull/297>)
- A base class for dataset validation plugins (<https://github.com/openvinotoolkit/datumaro/pull/299>)
- Partial support for the Open Images format;
only images and image-level labels can be read/written
(<https://github.com/openvinotoolkit/datumaro/pull/291>,
<https://github.com/openvinotoolkit/datumaro/pull/315>).
- Support for Supervisely Point Cloud dataset format (<https://github.com/openvinotoolkit/datumaro/pull/245>, <https://github.com/openvinotoolkit/datumaro/pull/353>)
- Support for KITTI Raw / Velodyne Points dataset format (<https://github.com/openvinotoolkit/datumaro/pull/245>)
- Support for CIFAR-100 and documentation for CIFAR-10/100 (<https://github.com/openvinotoolkit/datumaro/pull/301>)

Enhancements
- Tensorflow AVX check is made optional in API and disabled by default (<https://github.com/openvinotoolkit/datumaro/pull/305>)
- Extensions for images in ImageNet_txt are now mandatory (<https://github.com/openvinotoolkit/datumaro/pull/302>)
- Several dependencies now have lower bounds (<https://github.com/openvinotoolkit/datumaro/pull/308>)

Deprecated
- TBD

Removed
- TBD

Bug fixes
- Incorrect image layout on saving and a problem with encoding on loading (<https://github.com/openvinotoolkit/datumaro/pull/284>)
- An error when XPath filter is applied to the dataset or its subset (<https://github.com/openvinotoolkit/datumaro/issues/259>)
- Tracking of `Dataset` changes done by transforms (<https://github.com/openvinotoolkit/datumaro/pull/297>)
- Improved CLI startup time in several cases (<https://github.com/openvinotoolkit/datumaro/pull/306>)

Security
- Known issue: loading CIFAR can result in arbitrary code execution (<https://github.com/openvinotoolkit/datumaro/issues/327>)

0.1.9

Not secure
New features
- Support for escaping in attribute values in LabelMe format (<https://github.com/openvinotoolkit/datumaro/issues/49>)
- Support for Segmentation Splitting (<https://github.com/openvinotoolkit/datumaro/pull/223>)
- Support for CIFAR-10/100 dataset format (<https://github.com/openvinotoolkit/datumaro/pull/225>, <https://github.com/openvinotoolkit/datumaro/pull/243>)
- Support for COCO panoptic and stuff format (<https://github.com/openvinotoolkit/datumaro/pull/210>)
- Documentation file and integration tests for Pascal VOC format (<https://github.com/openvinotoolkit/datumaro/pull/228>)
- Support for MNIST and MNIST in CSV dataset formats (<https://github.com/openvinotoolkit/datumaro/pull/234>)
- Documentation file for COCO format (<https://github.com/openvinotoolkit/datumaro/pull/241>)
- Documentation file and integration tests for YOLO format (<https://github.com/openvinotoolkit/datumaro/pull/246>)
- Support for Cityscapes dataset format (<https://github.com/openvinotoolkit/datumaro/pull/249>)
- Support for Validator configurable threshold (<https://github.com/openvinotoolkit/datumaro/pull/250>)

Enhancements
- LabelMe format saves dataset items with their relative paths by subsets
without changing names (<https://github.com/openvinotoolkit/datumaro/pull/200>)
- Allowed arbitrary subset count and names in classification and detection
splitters (<https://github.com/openvinotoolkit/datumaro/pull/207>)
- Annotation-less dataset elements now participate in subset splitting (<https://github.com/openvinotoolkit/datumaro/pull/211>)
- Classification task in LFW dataset format (<https://github.com/openvinotoolkit/datumaro/pull/222>)
- Testing is now performed with pytest instead of unittest (<https://github.com/openvinotoolkit/datumaro/pull/248>)

Deprecated
- TBD

Removed
- TBD

Bug fixes
- Added support for auto-merging (joining) of datasets without labels with
datasets that have labels (<https://github.com/openvinotoolkit/datumaro/pull/200>)
- Allowed explicit label removal in `remap_labels` transform (<https://github.com/openvinotoolkit/datumaro/pull/203>)
- Image extension in CVAT format export (<https://github.com/openvinotoolkit/datumaro/pull/214>)
- Added a label "face" for bounding boxes in Wider Face (<https://github.com/openvinotoolkit/datumaro/pull/215>)
- Allowed adding "difficult", "truncated", "occluded" attributes when
converting to Pascal VOC if these attributes are not present (<https://github.com/openvinotoolkit/datumaro/pull/216>)
- Empty lines in YOLO annotations are ignored (<https://github.com/openvinotoolkit/datumaro/pull/221>)
- Export in VOC format when no image info is available (<https://github.com/openvinotoolkit/datumaro/pull/239>)
- Fixed saving attribute in WiderFace extractor (<https://github.com/openvinotoolkit/datumaro/pull/251>)

Security
- TBD
