Hudi

Latest version: v0.3.0

0.3.0

🚀 Features

- Define Hudi error types across hudi-core (124) by gohalo
- Support filter pushdown for datafusion (203) by jonathanc-n
- Add demo app and integration tests (226) by xushiyan
- Add `TimelineSelector` to support timeline loading (233) by xushiyan
- Add `hoodie.read.listing.parallelism` config (235) by xushiyan
- Support row filters for `FileGroupReader` (237) by xushiyan
- Implement incremental query for COW tables (236) by xushiyan
- Implement log file reader for parquet log block (244) by xushiyan
- Implement basic record merge semantics (249) by xushiyan
- Add APIs for MOR snapshot reads (247) by xushiyan
- Support time travel query for MOR tables (256) by xushiyan
- Support incremental read MOR tables (258) by xushiyan
- Support MOR read-optimized query (259) by xushiyan
- Support reading MOR with rollback (264) by xushiyan
- Align python table APIs with rust (267) by xushiyan
- Add APIs to support incremental query impl (272) by xushiyan

🐛 Bug Fixes

- Simplify partition filter format by taking tuple of strings (170) by kazdy
- Improve api to get file slices splits (185) by xushiyan
- Handle schema retrieval for datafusion api (187) by xushiyan
- Include commit_seqno for merge order (250) by xushiyan
- Format Hudi config enum should show the full config key (254) by Kunal-Singh-Dadhwal
- Derive record merge strategy based on table configs (260) by xushiyan
- Handle as-of timestamp for excluding file groups (268) by xushiyan
- Build up incremental file groups (273) by xushiyan
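
Entry 170 above describes partition filters as tuples of strings. As an illustration only, the sketch below assumes a three-element `(field, operator, value)` shape and compares values as strings. The `matches` helper and the sample partitions are hypothetical and are not part of the hudi API.

```python
import operator

# Hypothetical helper, not part of the hudi package: evaluates filters
# written as (field, operator, value) tuples of strings.
_OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(partition, filters):
    """Return True if the partition's values satisfy every filter."""
    for field, op, value in filters:
        if op not in _OPS:
            raise ValueError(f"unsupported operator: {op}")
        # Values are compared as strings, mirroring the all-strings tuple form.
        if not _OPS[op](partition[field], value):
            return False
    return True


# Made-up partition values, for illustration only.
partitions = [
    {"city": "san_francisco", "year": "2023"},
    {"city": "sao_paulo", "year": "2024"},
]
filters = [("city", "=", "san_francisco"), ("year", ">=", "2023")]
selected = [p for p in partitions if matches(p, filters)]
```

Keeping every element a string means a filter can be passed across a language boundary without type conversion, at the cost of lexicographic rather than numeric comparison.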

🚜 Refactor

- Reorganize custom error types (215) by xushiyan
- Add API stubs for performing incremental queries (220) by xushiyan
- Enhance `Filter` and related structs (221) by xushiyan
- Improve `TimelineSelector` API (234) by xushiyan
- Improve `BaseFile` APIs (239) by xushiyan
- Improve file system view's listing flow (251) by xushiyan
- Use static MetaField schema for incr query (252) by xushiyan
- Rename crate `hudi-tests` to `hudi-test` (262) by xushiyan
- Remove use of `Filter` from public APIs (266) by xushiyan

📚 Documentation

- Update README examples (194) by xushiyan
- Update release and dev guides (195) by xushiyan
- Add example to `hudi-datafusion` crate (202) by jonathanc-n
- Add `CREATE EXTERNAL TABLE` example in datafusion crate (213) by jonathanc-n
- Clarify issues in the dev guide (224) by xushiyan
- Add in-code docs for `FileGroup` (269) by xushiyan
- Update `README.md` to show table API examples (274) by xushiyan

🛠️ Build

- *(deps)* Bump codecov/codecov-action from 4 to 5 (184) by dependabot[bot]
- *(deps)* Upgrade datafusion and object store (182) by kazdy
- *(deps)* Upgrade datafusion to 42.2.0 (192) by xushiyan
- *(deps)* Upgrade Datafusion, Arrow, and Rust versions (197) by jonathanc-n
- *(deps)* Update pyo3 requirement from 0.22.2 to 0.22.4 (212) by jonathanc-n
- *(deps)* Clean up dependencies (240) by xushiyan
- *(dep)* Upgrade rustc, arrow, and tarpaulin setting (276) by xushiyan

⚙️ Miscellaneous Tasks

- Update release script and guide (200) by xushiyan
- Update changelog for 0.2.0 (201) by xushiyan
- Update pull request guidelines for contributors (204) by jonathanc-n
- Add more dev commands and update the project's short description (217) by xushiyan
- Update codecov threshold (222) by xushiyan
- Update codecov config (245) by xushiyan
- Update codecov-action to v5 (248) by K-dash
- *(ci)* Add rust dependency caching with rust-cache action (265) by K-dash
- Fix src verify script (279)
- Update release guide and issue templates (282)

New Contributors

* K-dash made their first contribution in 265
* Kunal-Singh-Dadhwal made their first contribution in 254
* jonathanc-n made their first contribution in 203

<!-- generated by git-cliff -->

0.2.0

🚀 Features

- Support loading hudi global configs (118) by zzhpro
- Add base file records' in-memory size to `FileStats` (140) by xushiyan
- Support partition prune api (119) by KnightChess
- Add partition filter arg in Python APIs (153) by xushiyan
- Add `HudiFileGroupReader` with consolidated APIs to read records (164) by xushiyan
- Add `TableBuilder` API for creating `Table` instances (163) by kazdy
- Implement datafusion `TableProviderFactory` (162) by kazdy

🐛 Bug Fixes

- Register object store with datafusion (107) by abyssnlp
- Handle validating table when `DropsPartitionFields` not present (142) by xushiyan
- Make partition loading more efficient (152) by xushiyan
- Simplify partition filter format by taking tuple of strings (170)
- Improve api to get file slices splits (185)
- Handle schema retrieval for datafusion api (187)

🚜 Refactor

- Extract common test code for creating table (117) by gohalo
- Improve APIs for handling options (161) by xushiyan
- Improve `TableBuilder` API for taking single option (171) by xushiyan
- Minor improvement to fix coverage report status (173) by xushiyan

📚 Documentation

- Update readme logo and example (65) by xushiyan
- Update in-code comments (132) by KnightChess
- Add hudi core API docs with examples (113) by KnightChess
- Add in-code docs to hudi-core APIs (166) by xushiyan
- Add python binding docstrings (169) by kazdy
- Add step-by-step release guide (66) by xushiyan

🎨 Styling

- Enforce Python code style (101) by muyihao

🛠️ Build

- Use exact versions for arrow and datafusion (105) by xushiyan
- Bump up datafusion to version 41, arrow to 52.2 (120) by yjshen
- *(deps)* Update zip-extract requirement from 0.1.3 to 0.2.1 (130) by dependabot[bot]
- *(deps)* Upgrade datafusion, pyarrow, pyo3, python versions (149) by kazdy
- *(deps)* Upgrade arrow dependencies (168) by kazdy

0.2.0rc.2

⚙️ Miscellaneous Tasks

- Improve release scripts (68) by xushiyan
- Add `CHANGELOG.md` with git-cliff config (69) by xushiyan
- Configure labeler for PRs from forked repos (83) by xushiyan
- Fix labeler config (85) by xushiyan
- Fix labeler config for dev-x (87) by xushiyan
- Merge python code coverage report with rust (67) by xushiyan
- Add pull request template (89) by xushiyan
- Enable dependabot (94) by xushiyan
- Add path ignore files for ci workflow (93) by abyssnlp
- Improve workflows for code checking and PR (110) by xushiyan
- Disable labeler due to permission and policy (115) by xushiyan
- *(ci)* Fix PR title linting to support change scope (138) by kazdy
- Add feature request template for GH issues (167) by kazdy

New Contributors

* KnightChess made their first contribution in 119
* gohalo made their first contribution in 117
* zzhpro made their first contribution in 118
* yjshen made their first contribution in 120
* abyssnlp made their first contribution in 107
* muyihao made their first contribution in 101

0.2.0rc.1

- *(deps)* Upgrade datafusion and object store (182)
- *(deps)* Upgrade datafusion to 42.2.0 (192)

0.1.0

🚀 Features

- Initial rust implementation to integrate with datafusion (1) by xushiyan
- Add python binding (21) by xushiyan
- Implement `HudiTable` as python API (23) by xushiyan
- Use `object_store` for common storage APIs (25) by xushiyan
- Implement Rust and Python APIs to read file slices (28) by xushiyan
- Add APIs for time-travel read (33) by xushiyan
- Implement datafusion API using ParquetExec (35) by xushiyan
- Add `HudiConfigs` for parsing and managing named configs (37) by xushiyan
- Add config validation when creating table (49) by xushiyan
- Add internal config to skip validation (51) by xushiyan
- Support time travel with read option (52) by xushiyan
- Support taking env vars for cloud storages (55) by xushiyan

🐛 Bug Fixes

- Handle replacecommit for loading file slices (53) by xushiyan

🚜 Refactor

- Use `anyhow` for generic errors (26) by xushiyan
- Use `object_store` API for Timeline (27) by xushiyan
- Make APIs async (31) by xushiyan
- Improve thread safety and error handling (32) by xushiyan
- Improve error handling in storage module (34) by xushiyan
- Adjust table APIs to skip passing options (56) by xushiyan

📚 Documentation

- Update readme, contributing guide, and issue template (57) by xushiyan
- Update CONTRIBUTING with minor changes (58) by codope

🎨 Styling

- Enforce rust code style (14) by xushiyan

🛠️ Build

- Clean up and trim down dependencies (54) by xushiyan
- Add info for rust and python artifacts (60) by xushiyan
- Add release workflow (63) by xushiyan

🧪 Testing

- Add tests crate and adopt testing tables (30) by xushiyan
- Add test cases for different table setup (36) by xushiyan

⚙️ Miscellaneous Tasks

- Setup ci for license file and headers (2) by xushiyan
- Fix failing check and test case (10) by xushiyan
- Fix asf notification (11) by xushiyan
- Add commit linting (12) by xushiyan
- Use cargo tarpaulin to generate code coverage (15) by xushiyan
- Remove codecov to keep ci green (17) by xushiyan
- Fix codecov setup (20) by xushiyan
- Configure codecov (50) by xushiyan
- Add scripts to streamline source release (64) by xushiyan

New Contributors

* codope made their first contribution in 58
* xushiyan made their first contribution in 1
