Wildlife-datasets

Latest version: v1.0.5

1.0.5

New functionality

- Six new datasets were added: [AmvrakikosTurtles](https://www.kaggle.com/datasets/wildlifedatasets/amvrakikosturtles), [ReunionTurtles](https://www.kaggle.com/datasets/wildlifedatasets/reunionturtles), [SouthernProvinceTurtles](https://www.kaggle.com/datasets/wildlifedatasets/southernprovinceturtles), [ZakynthosTurtles](https://www.kaggle.com/datasets/wildlifedatasets/zakynthosturtles) (sea turtles), [ELPephants](https://inf-cv.uni-jena.de/home/research/datasets/elpephants/) (elephants) and [Chicks4FreeID](https://huggingface.co/datasets/dariakern/Chicks4FreeID) (chickens).
- The `WildlifeDataset` (formerly `DatasetFactory`) class was restructured and multiple methods were added from [wildlife-tools](https://github.com/WildlifeDatasets/wildlife-tools). This brings the following benefits:
    - A simple way of adding datasets in other formats, such as this [example](https://github.com/WildlifeDatasets/wildlife-datasets/blob/v1.0.5/wildlife_datasets/datasets/chicks4free_id.py) for the parquet format.
    - Easier access to images via `dataset[0]`, or automatic loading of cropped images by instantiating `dataset(root, img_load='bbox')`.
    - Significantly improved readability.
- We added links to [original publications](https://wildlifedatasets.github.io/wildlife-datasets/datasets/) whenever we were able to find them.
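
The image access described in the 1.0.5 notes can be tried with a small helper. This is a minimal sketch rather than library code: `load_first_image` is a hypothetical name, and it relies only on the two calls mentioned in the notes (`dataset(root, img_load='bbox')` and `dataset[0]`). The dataset class and the root folder are supplied by the caller.

```python
def load_first_image(dataset_cls, root, img_load="bbox"):
    """Instantiate a dataset class and return its first item.

    Hypothetical helper, not part of wildlife-datasets. It assumes that
    `dataset_cls` accepts `root` and the `img_load` keyword and supports
    integer indexing, as described in the 1.0.5 notes.
    """
    dataset = dataset_cls(root, img_load=img_load)
    return dataset[0]


# Example call (assumes the library is installed and the data were already
# downloaded to the hypothetical folder "data/SeaTurtleID2022"):
#
#     from wildlife_datasets import datasets
#     image = load_first_image(datasets.SeaTurtleID2022, "data/SeaTurtleID2022")
```

Passing the class instead of hard-coding it keeps the helper usable with any dataset that provides bounding boxes.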

1.0.4

WildlifeReID-10k

- We released a new dataset, [WildlifeReID-10k](https://www.kaggle.com/datasets/wildlifedatasets/wildlifereid-10k), containing 214k images of 10k individuals.
- It is a collection of 30 existing re-identification datasets with permissible licenses.
- We incorporated default splits, retrained MegaDescriptor and included [baseline performance](https://www.kaggle.com/code/wildlifedatasets/wildlifereid-10k-overview).
- All scripts to [create and analyze WildlifeReID-10k](https://github.com/WildlifeDatasets/wildlife-datasets/tree/main/baselines) are publicly available.
- This dataset can now be accessed as `datasets.WildlifeReID10k`.

New functionality

- Added the newly released testing set for [BelugaID](https://lila.science/datasets/beluga-id-2022/). This dataset can now be accessed as `datasets.BelugaIDv2`.
- Updated links for several datasets hosted at [LILA BC](https://lila.science/datasets).
- To evaluate WildlifeReID-10k, we included new metrics `metrics.BAKS` and `metrics.BAUS`.
- To evaluate the difference between the random and similarity-aware splits, we added the method `resplit_by_features` to `splits.BalancedSplit`.
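
To give an intuition for the two new metrics, here is a plain-Python sketch. It assumes that BAKS and BAUS stand for balanced accuracy on known and on unknown samples: the first averages per-identity accuracy over identities seen in training, the second does the same over identities that appear only in the test set, where a prediction counts as correct if it equals a designated "new individual" label. The function names and signatures are hypothetical and need not match `metrics.BAKS` and `metrics.BAUS`.

```python
from collections import defaultdict


def _mean_per_identity(pairs):
    """Average the per-identity accuracies of (identity, is_correct) pairs."""
    per_identity = defaultdict(list)
    for identity, is_correct in pairs:
        per_identity[identity].append(is_correct)
    if not per_identity:
        return float("nan")
    return sum(sum(v) / len(v) for v in per_identity.values()) / len(per_identity)


def baks_sketch(y_true, y_pred, identity_test_only):
    """Balanced accuracy restricted to identities known from training."""
    unknown = set(identity_test_only)
    return _mean_per_identity(
        (t, p == t) for t, p in zip(y_true, y_pred) if t not in unknown
    )


def baus_sketch(y_true, y_pred, identity_test_only, new_class):
    """Balanced accuracy restricted to identities seen only in the test set.

    A prediction is counted as correct when it equals `new_class`.
    """
    unknown = set(identity_test_only)
    return _mean_per_identity(
        (t, p == new_class) for t, p in zip(y_true, y_pred) if t in unknown
    )


y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "new"]
known_score = baks_sketch(y_true, y_pred, identity_test_only=["c"])
unknown_score = baus_sketch(y_true, y_pred, identity_test_only=["c"], new_class="new")
```

Averaging per identity rather than per image prevents individuals with many images from dominating the score.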

1.0.3

New functionality

- Three new datasets were added: [CatIndividualImages](https://www.kaggle.com/datasets/timost1234/cat-individuals) (cats), [CowDataset](https://figshare.com/articles/dataset/data_set_zip/16879780) (cows) and [DogFaceNet](https://github.com/GuillaumeMougeot/DogFaceNet) (dogs).
- The license file is shown upon download.

Other changes

- Added orientation (left, right, ...) to [SeaTurtleID2022](https://www.kaggle.com/datasets/wildlifedatasets/seaturtleid2022).
- Fixed two segmentations in [NDD20](https://doi.org/10.25405/data.ncl.c.4982342) which may have caused problems during loading.
- Metadata, including licenses and citations, was updated or corrected in some cases.

1.0.2

New functionality

- Three new datasets were added: [MPDD](https://data.mendeley.com/datasets/v5j6m8dzhv/1) (dogs), [PolarBearVidID](https://zenodo.org/records/7564529) (polar bears) and [SeaStarReID2023](https://lila.science/sea-star-re-id-2023/) (sea stars).
- Labels were fixed for Cows2021 and FriesianCattle2015. These classes now appear as Cows2021v2 and FriesianCattle2015v2 to keep backward compatibility. A full list of changes is in the [documentation](https://wildlifedatasets.github.io/wildlife-datasets/preprocessing/).
- Visual overhaul including new logos.

Breaking functionality

- Splits were removed from the dataframes. They may be [added manually](https://wildlifedatasets.github.io/wildlife-datasets/default_splits/).

Other changes

- SeaTurtleID was updated to [SeaTurtleID2022](https://www.kaggle.com/datasets/wildlifedatasets/seaturtleid2022).
- File names of images must contain only ASCII characters. Some files in IPanda50 were renamed while keeping backward compatibility.
- Some functions were refactored and `plot_grid` was updated.
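
The ASCII requirement for image file names can be checked before a custom dataset is added. The following is a minimal sketch using only the standard library. Both helper names are hypothetical and are not part of wildlife-datasets.

```python
from pathlib import Path


def non_ascii_names(names):
    """Return the names that contain at least one non-ASCII character."""
    return [name for name in names if not name.isascii()]


def non_ascii_files(root):
    """Return relative paths of files under `root` with non-ASCII names."""
    root = Path(root)
    relative_paths = (
        str(path.relative_to(root)) for path in root.rglob("*") if path.is_file()
    )
    return non_ascii_names(relative_paths)


flagged = non_ascii_names(["panda_01.jpg", "pandä_02.jpg"])
```

An empty result from `non_ascii_files` means that no file under the folder needs renaming.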

1.0.0

Initial release

We are happy to introduce the first release of the wildlife-datasets library. Currently, the library can process 31 wildlife re-identification datasets. Its main feature is an interface for downloading and extracting datasets and converting them into a unified format. For full functionality, see the [documentation](https://wildlifedatasets.github.io/wildlife-datasets/).
