Datachain

Latest version: v0.8.3

0.2.16

What's Changed
* improve efficiency of examples by mattseddon in https://github.com/iterative/datachain/pull/214
* fix select then distinct chain by mattseddon in https://github.com/iterative/datachain/pull/213
* rename DataChain's create_empty to from_records by mattseddon in https://github.com/iterative/datachain/pull/215
* do not modify datachain max limit in show by mattseddon in https://github.com/iterative/datachain/pull/225
* Rename cleanup_temp_tables to cleanup_tables in warehouse and catalog by amritghimire in https://github.com/iterative/datachain/pull/218


**Full Changelog**: https://github.com/iterative/datachain/compare/0.2.15...0.2.16

0.2.15

What's Changed
* Arrow improvements by dberenbaum in https://github.com/iterative/datachain/pull/126
* prevent cryptic error messages when running llm claude examples by mattseddon in https://github.com/iterative/datachain/pull/194
* remove reference to missing notebook by mattseddon in https://github.com/iterative/datachain/pull/193
* Fix for nested lists of models in schema by ilongin in https://github.com/iterative/datachain/pull/195
* Support for `DataChain.batch_map()` by dberenbaum in https://github.com/iterative/datachain/pull/191
* renamed clip.py by dberenbaum in https://github.com/iterative/datachain/pull/201
* validation error handling improved by volkfox in https://github.com/iterative/datachain/pull/203
* modified Mistral prompt and changed to the DataModel by volkfox in https://github.com/iterative/datachain/pull/205
* make datachain show respect existing limit by mattseddon in https://github.com/iterative/datachain/pull/206
* JSON tutorial by volkfox in https://github.com/iterative/datachain/pull/207
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/iterative/datachain/pull/197
* Move 'create_pre_udf_table' function to warehouse module by dreadatour in https://github.com/iterative/datachain/pull/187

New Contributors
* pre-commit-ci made their first contribution in https://github.com/iterative/datachain/pull/197

**Full Changelog**: https://github.com/iterative/datachain/compare/0.2.14...0.2.15

0.2.14

What's Changed
* Fixing test warnings by dtulga in https://github.com/iterative/datachain/pull/158
* Update DataChain.subtract() to work without legacy file signals by rlamy in https://github.com/iterative/datachain/pull/157
* Optimizations: low-hanging fruits by dreadatour in https://github.com/iterative/datachain/pull/178
* from_values with array of arrays by dmpetrov in https://github.com/iterative/datachain/pull/183
* remove collect_one from example by mattseddon in https://github.com/iterative/datachain/pull/186
* fixing a missed import for codegen schemas by volkfox in https://github.com/iterative/datachain/pull/184
* avoid instantiating filesystem for path operations by skshetry in https://github.com/iterative/datachain/pull/176
* Remove get_possibly_stale_jobs from metastore by amritghimire in https://github.com/iterative/datachain/pull/189

New Contributors
* amritghimire made their first contribution in https://github.com/iterative/datachain/pull/189

**Full Changelog**: https://github.com/iterative/datachain/compare/0.2.13...0.2.14

0.2.13

What's Changed

* DataChain.from_storage: add last_modified and is_latest to the columns by skshetry in https://github.com/iterative/datachain/pull/165
* fix for using new column from `.mutate()` in `.order_by()` by ilongin in https://github.com/iterative/datachain/pull/171
* Renaming `File.write()` to `File.save()` by ilongin in https://github.com/iterative/datachain/pull/172
* storage: index as a dir if no glob by shcheklein in https://github.com/iterative/datachain/pull/108

Docs

* first shot at LLM eval tutorial by volkfox in https://github.com/iterative/datachain/pull/145
* Update README.rst by volkfox in https://github.com/iterative/datachain/pull/161
* docs: update README.rst by eltociear in https://github.com/iterative/datachain/pull/168

Maintenance

* skip mypy hook on pre-commit.ci by skshetry in https://github.com/iterative/datachain/pull/164
* ci: disable azure and gs remote tests on macOS by skshetry in https://github.com/iterative/datachain/pull/174
* ci: run s3 tests on Windows, be more careful while skipping by skshetry in https://github.com/iterative/datachain/pull/175
* fix test for ch wrt datetime precision by skshetry in https://github.com/iterative/datachain/pull/169
* Adding tests for exporting image files and `File.write()` by ilongin in https://github.com/iterative/datachain/pull/149

New Contributors
* eltociear made their first contribution in https://github.com/iterative/datachain/pull/168

**Full Changelog**: https://github.com/iterative/datachain/compare/0.2.12...0.2.13

0.2.12

What's Changed
* Python API to manage the dataset registry by dreadatour in https://github.com/iterative/datachain/pull/29
* cli: hide subcommands from the listing by skshetry in https://github.com/iterative/datachain/pull/79
* datachain: rename include_sys kwarg to sys by skshetry in https://github.com/iterative/datachain/pull/69
* Adding `DataChain.export_files(...)` by ilongin in https://github.com/iterative/datachain/pull/30
* Update cv tutorial: `fashion_product_images` by mnrozhkov in https://github.com/iterative/datachain/pull/62
* Add and clean up docstrings in datachain api by dberenbaum in https://github.com/iterative/datachain/pull/63
* docs: fix invalid python code inside docstrings by skshetry in https://github.com/iterative/datachain/pull/85
* Hide traceback for xfails in Studio test runs by rlamy in https://github.com/iterative/datachain/pull/87
* Rename UDF to UDFStep for clarity, and remove from root namespace by rlamy in https://github.com/iterative/datachain/pull/88
* Fix mutate() by dmpetrov in https://github.com/iterative/datachain/pull/78
* update pytest-servers to 0.5.5 by mattseddon in https://github.com/iterative/datachain/pull/94
* Remove vendored-code-specific folders by dtulga in https://github.com/iterative/datachain/pull/95
* Rename repository references to datachain by dtulga in https://github.com/iterative/datachain/pull/93
* do not overwrite version with None in DatasetQuery constructor by mattseddon in https://github.com/iterative/datachain/pull/92
* always include sys signals by skshetry in https://github.com/iterative/datachain/pull/81
* Add more UniqueId fields by rlamy in https://github.com/iterative/datachain/pull/90
* Added more general `SignalsSchema.get_signals()` method instead of `get_file_signals(...)` by ilongin in https://github.com/iterative/datachain/pull/86
* Added input params to `distinct()` by ilongin in https://github.com/iterative/datachain/pull/96
* Fix for `order_by` with sub signals by ilongin in https://github.com/iterative/datachain/pull/82
* Remove legacy signals in from_storage() by rlamy in https://github.com/iterative/datachain/pull/72
* Updates to examples by dberenbaum in https://github.com/iterative/datachain/pull/77
* More docs updates by dberenbaum in https://github.com/iterative/datachain/pull/100
* Add 'update' param to DataChain.from_storage method by dreadatour in https://github.com/iterative/datachain/pull/99
* Fix repository reference in Notebook by dtulga in https://github.com/iterative/datachain/pull/105
* fix(ux): remove reference to DatasetQuery by shcheklein in https://github.com/iterative/datachain/pull/104
* datachain: implement to_parquet by skshetry in https://github.com/iterative/datachain/pull/97
* File refactor by dberenbaum in https://github.com/iterative/datachain/pull/102
* fixing regressions from switching to ModelStore.add() by volkfox in https://github.com/iterative/datachain/pull/109
* add ModelStore to top level imports by dmpetrov in https://github.com/iterative/datachain/pull/112
* add truncate option to show and update default width of output by mattseddon in https://github.com/iterative/datachain/pull/116
* merge/join: exclude sys signals by skshetry in https://github.com/iterative/datachain/pull/120
* Added `descending` parameter to `DataChain.order_by(...)` by ilongin in https://github.com/iterative/datachain/pull/122
* remove get_value() from DataModel by dmpetrov in https://github.com/iterative/datachain/pull/119
* Add file modes for binary/text by dberenbaum in https://github.com/iterative/datachain/pull/107
* remove docstring from DataModel.__pydantic__init_subclass__ by skshetry in https://github.com/iterative/datachain/pull/123
* Examples cleanup by dberenbaum in https://github.com/iterative/datachain/pull/111
* rename ModelStore.add() to register() by dmpetrov in https://github.com/iterative/datachain/pull/113
* datachain: generalize data access functions into collect(), and collect_flatten by skshetry in https://github.com/iterative/datachain/pull/121
* Add nrows for partial parsing of csv/parquet by dberenbaum in https://github.com/iterative/datachain/pull/124
* Update index.md by volkfox in https://github.com/iterative/datachain/pull/128
* Picture for getting started by volkfox in https://github.com/iterative/datachain/pull/127
* moving pic to the right place by volkfox in https://github.com/iterative/datachain/pull/131
* cleanup signal refs in examples by dberenbaum in https://github.com/iterative/datachain/pull/129
* cleanup api reference index by dberenbaum in https://github.com/iterative/datachain/pull/130
* Fix for text and images files export by ilongin in https://github.com/iterative/datachain/pull/135
* update computer vision quick start example by mattseddon in https://github.com/iterative/datachain/pull/136
* update computer vision image example by mattseddon in https://github.com/iterative/datachain/pull/139
* Huggingface test updates and bug fix by dberenbaum in https://github.com/iterative/datachain/pull/140
* Readme update by dmpetrov in https://github.com/iterative/datachain/pull/133
* readme: fix link to image by dmpetrov in https://github.com/iterative/datachain/pull/143
* Update badge by skshetry in https://github.com/iterative/datachain/pull/144
* don't depend on datachain from PATH to exec processes by skshetry in https://github.com/iterative/datachain/pull/118
* dc: try to fix dataset_stats for DataChain.from_storage() generated dataset by skshetry in https://github.com/iterative/datachain/pull/151

New Contributors
* dreadatour made their first contribution in https://github.com/iterative/datachain/pull/29
* mnrozhkov made their first contribution in https://github.com/iterative/datachain/pull/62

**Full Changelog**: https://github.com/iterative/datachain/compare/0.2.11...0.2.12

0.2.11

What's Changed
* cleanup model store/registry by dberenbaum in https://github.com/iterative/datachain/pull/74
* slice nested signals by dberenbaum in https://github.com/iterative/datachain/pull/75
* To pandas - hierarchical multi header by dmpetrov in https://github.com/iterative/datachain/pull/22
* Use cloudpickle for parallel UDF processing by dtulga in https://github.com/iterative/datachain/pull/65


**Full Changelog**: https://github.com/iterative/datachain/compare/0.2.10...0.2.11
