Daft

Latest version: v0.4.9

Page 2 of 14

0.4.3

What's Changed 🚀

- build(docs): Adds make docs phony target rchowell (3693)

✨ Features

- feat: Add a new dashboard UI to Daft raunakab (3738)
- feat(shuffles): Determination logic for pre shuffle merge colin-ho (3674)
- feat: Limit number of sources in merged scan task colin-ho (3695)
- feat: Expose parquet chunk size to swordfish reads colin-ho (3714)
- feat: add LiteralValue::Int8 and Int16 ugoa (3736)
- feat: with\_column(s)\_renamed expression for DataFrame jessie-young (3732)
- feat: Explain for swordfish colin-ho (3667)
- feat: Adds .describe() to DataFrame and DESCRIBE to SQL rchowell (3720)
- feat: Add column format option to iter rows colin-ho (3681)
- feat(core): make micropartition streamable over tables universalmind303 (3709)
- feat(iceberg): Adds support for read\_iceberg with metadata\_location to Daft-SQL rchowell (3701)
- feat: Overwrite partitions mode colin-ho (3687)
- feat(docs): Adds copy-to-clipboard to code samples rchowell (3702)
- feat(connect): sql universalmind303 (3696)
- feat: add binary string operations (length and concatenation) f4t4nt (3646)
- feat(sql): Adds url\_download and url\_upload to daft-sql rchowell (3690)
- feat(connect): distinct + sort universalmind303 (3677)
- feat(core): Implement null-safe equality operator f4t4nt (3663)
- feat(sql): Adds JsonScanBuilder to daft-scan and read\_json to daft-sql rchowell (3683)
- feat: support using S3Config.credentials\_provider for writes kevinzwang (3648)
- feat(sql): Adds FROM source check for string paths rchowell (3679)
- feat(connect): Rust ray exec universalmind303 (3666)
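
The null-safe equality operator listed above (3663) is easiest to see next to ordinary equality. The sketch below is plain Python, not Daft's implementation or API: `None` stands in for a null value and both function names are hypothetical. It assumes the usual SQL-style semantics, where ordinary equality yields null when either operand is null, while null-safe equality always yields a boolean.

```python
from typing import Optional


def sql_eq(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Ordinary equality: any null operand makes the result null (None)."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_eq(a: Optional[int], b: Optional[int]) -> bool:
    """Null-safe equality: two nulls are equal, null vs. a value is False."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


pairs = [(1, 1), (1, 2), (None, 1), (None, None)]
ordinary = [sql_eq(x, y) for x, y in pairs]
null_safe = [null_safe_eq(x, y) for x, y in pairs]
print(ordinary)
print(null_safe)
```

The difference matters mostly for join keys and filters, where a null result from ordinary equality drops the row.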

🐛 Bug Fixes

- fix: Set filter selectivity estimate lower bound colin-ho (3694)
- fix(join): joining on different types kevinzwang (3716)
- fix: to\_cnf and to\_dnf functions kevinzwang (3728)
- fix: pushdowns for unpivot universalmind303 (3724)
- fix(optimizer): Fix issues with join graph construction desmondcheongzx (3668)
- fix: Run filter null join key optimization once colin-ho (3657)

🚀 Performance

- perf: Track accumulated selectivity in logical plan to improve probe side decisions desmondcheongzx (3734)
- perf: simplify boolean expression rules kevinzwang (3731)
- perf(shuffles): Incrementally retrieve metadata in reduce colin-ho (3545)
- perf: Improve stats for join side determination colin-ho (3655)

♻️ Refactor

- refactor: remove eyre from daft-connect universalmind303 (3719)
- refactor(execution): NativeExecutor refactor universalmind303 (3689)
- refactor: logical op constructor+builder boundary kevinzwang (3684)
- refactor(connect): internal refactoring to make connect code more organized \& extensible universalmind303 (3680)

📖 Documentation

- docs: linked mkdocs \& api docs ccmao1130 (3703)
- docs: higher quality daft diagram for readme ccmao1130 (3697)
- docs: add daft launcher docs to docs v2 ccmao1130 (3678)

👷 CI

- ci: skip tests during publishing of release jaychia (3744)
- ci: Allow upstream git refs to be used for benchmarking desmondcheongzx (3730)
- ci: Remove daft tracing raunakab (3692)
- ci: Add new benchmarking cluster profile raunakab (3665)

🔧 Maintenance

- chore: Pin sql server version in docker compose colin-ho (3715)
- chore(connect): better error propagation \& handling universalmind303 (3675)
- chore(connect): consolidate multiple files in tests/connect universalmind303 (3676)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.2...v0.4.3

0.4.2

What's Changed 🚀

- build: Publish A Long Term Support CPU Release of Daft samster25 (3650)

✨ Features

- feat(connect): `printSchema` andrewgazelka (3617)
- feat: Allow building probe table for either side of anti semi joins colin-ho (3643)
- feat(optimizer): Add join reordering as an optimizer rule desmondcheongzx (3642)
- feat(swordfish): Memory manager colin-ho (3599)
- feat(scantask-2): Implement new module for splitting Parquet ScanTask jaychia (3628)
- feat(scantask-1): add a config flag for new scantask splitting algorithm jaychia (3615)
- feat: Support intersect all and except distinct/all in DataFrame API advancedxy (3537)
- feat: support new PyIceberg IO properties and custom IOConfig in write\_iceberg kevinzwang (3633)
- feat(expressions): Extend Expression.url.upload() to support row-specific URLs desmondcheongzx (3518)
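
The `intersect all` and `except distinct/all` entry above (3537) refers to multiset (bag) set operations. The following sketch shows the standard SQL semantics using `collections.Counter`. It is an illustration only: the function names are hypothetical, rows are modelled as plain integers, and it does not call or mirror Daft's DataFrame API.

```python
from collections import Counter


def intersect_all(left, right):
    """Keep each row min(count in left, count in right) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_all(left, right):
    """Keep each row max(count in left - count in right, 0) times."""
    return sorted((Counter(left) - Counter(right)).elements())


def except_distinct(left, right):
    """Keep each distinct left row that never appears on the right."""
    return sorted(set(left) - set(right))


left = [1, 1, 1, 2, 3]
right = [1, 1, 2, 2, 4]

print(intersect_all(left, right))
print(except_all(left, right))
print(except_distinct(left, right))
```

The `ALL` variants respect duplicate counts, while the `DISTINCT` variant deduplicates before comparing.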

🐛 Bug Fixes

- fix: special characters in GCS urls kevinzwang (3651)
- fix(swordfish): Track future poll times for explain analyze colin-ho (3511)

👷 CI

- ci: Improve visualization of tpcds + tpch benchmarking outputs raunakab (3654)

🔧 Maintenance

- chore: update PyO3 version to 0.23 kevinzwang (3647)
- chore: Fix parquet benchmark test colin-ho (3632)
- chore: Clean up join order iteration desmondcheongzx (3638)

⬆️ Dependencies

- build(deps-dev): bump moto[s3,server] from 5.0.21 to 5.0.26 dependabot (3640)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.1...v0.4.2

0.4.1

What's Changed 🚀

✨ Features

- feat(optimizer): Implement naive join ordering desmondcheongzx (3616)
- feat(connect): add more unresolved functions andrewgazelka (3618)
- feat(connect): `with_columns_renamed` andrewgazelka (3386)
- feat(connect): read/write → csv, write → json andrewgazelka (3361)

🐛 Bug Fixes

- fix: unity catalog import from write\_deltalake jaychia (3630)

🚀 Performance

- perf(optimizer): convert filter predicate to CNF to push through join kevinzwang (3623)
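
To make the CNF entry above (3623) concrete: a predicate in conjunctive normal form is an AND of OR-clauses, so each clause can be considered separately when deciding whether it can move below a join. The sketch below is a textbook conversion in plain Python, not the code from that PR. The tuple encoding (`("var", name)`, `("not", e)`, `("and", l, r)`, `("or", l, r)`) and all helper names are assumptions made for the example.

```python
def to_nnf(e):
    """Push NOT down to the variables using De Morgan's laws."""
    op = e[0]
    if op == "var":
        return e
    if op in ("and", "or"):
        return (op, to_nnf(e[1]), to_nnf(e[2]))
    inner = e[1]  # op == "not"
    if inner[0] == "var":
        return e
    if inner[0] == "not":  # double negation
        return to_nnf(inner[1])
    flipped = "or" if inner[0] == "and" else "and"
    return (flipped, to_nnf(("not", inner[1])), to_nnf(("not", inner[2])))


def distribute_or(l, r):
    """CNF of (l OR r), given that l and r are already in CNF."""
    if l[0] == "and":
        return ("and", distribute_or(l[1], r), distribute_or(l[2], r))
    if r[0] == "and":
        return ("and", distribute_or(l, r[1]), distribute_or(l, r[2]))
    return ("or", l, r)


def _cnf(e):
    if e[0] == "and":
        return ("and", _cnf(e[1]), _cnf(e[2]))
    if e[0] == "or":
        return distribute_or(_cnf(e[1]), _cnf(e[2]))
    return e  # a variable or a negated variable


def to_cnf(e):
    """Convert an expression to CNF: negation normal form, then distribute."""
    return _cnf(to_nnf(e))


def conjuncts(e):
    """Split a CNF expression into its top-level AND-ed clauses."""
    if e[0] == "and":
        return conjuncts(e[1]) + conjuncts(e[2])
    return [e]


a, b, c = ("var", "a"), ("var", "b"), ("var", "c")
cnf = to_cnf(("or", ("and", a, b), c))  # (a AND b) OR c
print(conjuncts(cnf))  # the clauses (a OR c) and (b OR c)
```

Distribution can grow the expression exponentially in the worst case, so a practical optimizer would likely bound or skip the rewrite for large predicates.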

📖 Documentation

- docs: daft documentation v2 ccmao1130 (3595)

✅ Tests

- test(connect): verify `show()` output andrewgazelka (3610)

👷 CI

- ci: Output results in a CSV format raunakab (3625)
- ci: Add build step to run-cluster raunakab (3606)

🔧 Maintenance

- chore: Build progress bar only on first update colin-ho (3626)
- chore: Fix csv benchmark test colin-ho (3631)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.0...v0.4.1

0.4.0

What's Changed 🚀

- build: Use uv for maturin builds instead raunakab (3540)

💥 Breaking Changes

- feat: Default native runner colin-ho (3608)
- chore!: upgrade Ray pins and pyarrow pins jaychia (3612)
- chore!: drop support for Python 3.8 kevinzwang (3592)
- chore!: remove pyarrow-based file reader kevinzwang (3587)

✨ Features

- feat: Default native runner colin-ho (3608)
- feat(swordfish): Progress Bar colin-ho (3571)
- feat(connect): df.show universalmind303 (3560)
- feat(connect): support `DdlParse` andrewgazelka (3580)
- feat(swordfish): Optimize grouped aggregations colin-ho (3534)
- feat(swordfish): Enable left/right joins to build probe table on either side colin-ho (3548)
- feat: Add DataType inference from Python types jaychia (3555)
- feat(shuffles): Locality aware pre shuffle merge colin-ho (3505)
- feat: Implement count-distinct for sql raunakab (3553)
- feat(connect): add drop support andrewgazelka (3345)
- feat: support for basic subquery execution kevinzwang (3536)
- feat(connect): add `df.filter` andrewgazelka (3346)
- feat: Make serialization code not unwrap and panic on failures raunakab (3546)
- feat: Unity Catalog writes using `daft.DataFrame.write_deltalake()` anilmenon14 (3522)
- feat(connect): add parquet support andrewgazelka (3360)
- feat: Add iterators to more types raunakab (3539)
- feat(optimizer): Add scaffolding to create join graphs from logical plans desmondcheongzx (3501)
- feat(tpcds-benchmarking): Add basic tpcds benchmarking for local testing raunakab (3509)
- feat(list): add fixed-size list support for value\_counts andrewgazelka (3521)
- feat(parquet): Limit parallel tasks in remote parquet reader colin-ho (3490)
- feat(parquet): Target parquet writes by size bytes instead of rows colin-ho (3457)
- feat: cross join kevinzwang (3437)
- [FEAT] connect: remove excessive warnings from spark connect universalmind303 (3499)
- [CHORE] connect, test: `df.withColumn` andrewgazelka (3359)
- [FEAT]: expr simplifier universalmind303 (3393)
- [FEAT] shuffle testing raunakab (3492)
- [FEAT]: add `coalesce` to dataframe and SQL universalmind303 (3482)
- [FEAT] add register-table helper to sql-catalog chuanlei-coding (2837)
- [FEAT] Respect resource request for projections in swordfish colin-ho (3460)
- [FEAT] Enable Actor Pool UDFs by default kevinzwang (3488)
- [FEAT] connect: add modulus operator and withColumns support andrewgazelka (3351)
- [FEAT] connect: createDataFrame andrewgazelka (3363)
- [FEAT] Support parquet RLE decoding for booleans desmondcheongzx (3477)
- [FEAT] Cap parallelism on local parquet reader colin-ho (3310)
- [FEAT] connect: add binary operators andrewgazelka (3350)
- [FEAT] connect: support basic column operations andrewgazelka (3362)
- [FEAT] extend `build-commit` workflow to support different compile-archs raunakab (3459)
- [FEAT] Add `count-distinct` aggregation raunakab (3455)
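
Of the entries above, `coalesce` (3482) has semantics that are simple to show in isolation: for each row, take the first non-null value among the inputs. The sketch below is plain Python with hypothetical helper names, with `None` standing in for null. It illustrates the conventional SQL meaning of `coalesce` and is not Daft's implementation.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


def coalesce_columns(*columns):
    """Apply coalesce row by row across equally long columns."""
    return [coalesce(*row) for row in zip(*columns)]


primary = [1, None, None]
fallback = [10, 20, None]
merged = coalesce_columns(primary, fallback)
print(merged)
```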

🐛 Bug Fixes

- fix(udf): udf call with empty table and batch size kevinzwang (3604)
- fix: use arrow's schema instead of spark's for local rel universalmind303 (3602)
- fix: guard concurrent extension datatype setting with a lock jaychia (3589)
- fix(parquet): Fix parquet reads of required fields nested within optional fields desmondcheongzx (3598)
- fix: boolean and/or expressions with null kevinzwang (3544)
- fix(run-cluster-workflow): Add null check when parsing metadata raunakab (3507)
- fix(tpcds): fix bugs in tpcds datagen script universalmind303 (3495)
- [BUG] Fix build commit workflow raunakab (3487)
- [BUG]: don't panic on count(distinct) universalmind303 (3481)
- [BUG] Block on parquet schema future in estimate\_size\_bytes colin-ho (3484)
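
The fix for boolean and/or expressions with null (3544) concerns three-valued logic. The changelog does not spell out the intended behavior, so the sketch below assumes standard SQL (Kleene) semantics: a null operand only makes the result null when the other operand does not already decide it. It is plain Python with hypothetical function names, using `None` for null.

```python
def and3(a, b):
    """Three-valued AND: False dominates, otherwise null propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """Three-valued OR: True dominates, otherwise null propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


values = [True, False, None]
and_table = {(a, b): and3(a, b) for a in values for b in values}
or_table = {(a, b): or3(a, b) for a in values for b in values}
print(and_table[(False, None)], and_table[(True, None)])
print(or_table[(True, None)], or_table[(False, None)])
```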

🚀 Performance

- perf: filter null join key optimization rule kevinzwang (3583)
- perf: lazily import pyiceberg and unity catalog if available jaychia (3565)

♻️ Refactor

- refactor: allow InMemory to take in non python based entries universalmind303 (3554)
- refactor: create a rust based `PartitionSet` universalmind303 (3515)
- refactor(swordfish): Generic broadcast state bridge colin-ho (3508)

📖 Documentation

- docs: update tpch benchmark link ccmao1130 (3542)
- docs: Enable Linting of docstrings samster25 (3506)
- [FEAT] Enable Actor Pool UDFs by default kevinzwang (3488)

✅ Tests

- test(connect): add more tests for `createDataFrame` andrewgazelka (3607)
- test: Add more size estimation tests from our s3 bucket jaychia (3514)

👷 CI

- ci: Always download logs jaychia (3588)
- ci: Add ability to array-ify args and run multiple jobs raunakab (3584)
- ci: Add "build" label type to accepted PR titles raunakab (3541)
- ci: add a tool to launch workloads on cluster jaychia (3516)
- ci(release-drafter): use conventional commit labels andrewgazelka (3503)

🔧 Maintenance

- chore!: upgrade Ray pins and pyarrow pins jaychia (3612)
- chore: add warning for native runner jaychia (3613)
- chore!: drop support for Python 3.8 kevinzwang (3592)
- chore!: remove pyarrow-based file reader kevinzwang (3587)
- chore: Fix ordering in sql tests + pin docker images in read\_sql tests colin-ho (3596)
- chore: move symbolic and boolean algebra code into new crate kevinzwang (3570)
- [CHORE] use conventional commits andrewgazelka (3493)
- [CHORE] connect, test: `df.withColumn` andrewgazelka (3359)
- [CHORE] Add tests for parquet size estimations jaychia (3405)
- [CHORE] Move all python wrapping logic to separate module raunakab (3458)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.3.15...v0.4.0

0.3.15

Changes

✨ New Features

- [FEAT] run cluster on commit raunakab (3461)
- [FEAT]: Support `.clip` function conradsoon (3136)
- [FEAT] Add cluster profiles raunakab (3426)
- [FEAT] add pyiceberg 0.8.0 support rongfengliang (3448)
- [FEAT] migrate schema inference → async, block at py boundary andrewgazelka (3432)
- [CHORE] connect: `df.schema` andrewgazelka (3353)
- [CHORE] connect test: `df.get_attr` andrewgazelka (3349)
- [FEAT] Get native execution enablement from DAFT\_RUNNER desmondcheongzx (3409)
- [FEAT] Add ability to download log files from ray-cluster raunakab (3406)
- [FEAT] Add ability to run arbitrary command on a set working directory raunakab (3404)
- [FEAT] Add steps to spin up, submit job, and spin down ray clusters raunakab (3403)
- [CHORE] connect: add tests for `df.take()` method andrewgazelka (3385)
- [FEAT] Create new run workflow raunakab (3402)
- [FEAT] Enable group by keys in aggregation expressions kevinzwang (3399)
- [FEAT] Build release python wheels and upload to AWS S3 raunakab (3398)
- [FEAT] connect: Add support for `select` andrewgazelka (3344)
- [FEAT] connect: add `df.limit` and `df.first` andrewgazelka (3309)
- [FEAT] connect: `to_daft_*` use ref instead of value andrewgazelka (3355)
- [FEAT] connect: add alias support andrewgazelka (3342)
- [FEAT] Filter predicates in SQL join kevinzwang (3371)
- [FEAT] connect: collect andrewgazelka (3326)

🚀 Performance Improvements

- [PERF] Improve hash table probe side decisions for Swordfish desmondcheongzx (3327)

👾 Bug Fixes

- [BUG] Fix extension type display jaychia (3456)
- [BUG] Remove enum imports from match statements raunakab (3436)
- [BUG] Explicitly set IO config in unity catalog load table colin-ho (3453)
- [BUG] Include storage options in lance write commit colin-ho (3451)
- [BUG] Replace semicolons in filenames with underscore raunakab (3430)
- [BUG] Terminate nodes instead of stopping them raunakab (3427)
- [BUG] Fix run-cluster passing in environment variables wrongly jaychia (3422)

📖 Documentation

- [FEAT]: Support `.clip` function conradsoon (3136)
- [DOCS] Shorten union of Literals desmondcheongzx (3449)
- [DOCS] Add missing list expression entries desmondcheongzx (3428)

🧰 Maintenance

- [CHORE] Add warning in PyRunner to switch to Native colin-ho (3472)
- [CHORE] Address comments on previous PR raunakab (3473)
- [CHORE] Write tpch parquet files one at a time colin-ho (3396)
- [CHORE] Remove CountMode and ResourceRequest from public API desmondcheongzx (3429)
- [CHORE] Add schemas for remaining local plan ops colin-ho (3446)
- [CHORE] Put empty table when building probe table colin-ho (3445)
- [CHORE] Explain block\_on function in common-runtime colin-ho (3442)
- [CHORE] connect: `df.schema` andrewgazelka (3353)
- [CHORE] Update execution config to turn on Ray tracing jaychia (3431)
- [CHORE] connect test: `df.get_attr` andrewgazelka (3349)
- [CHORE] Cleanup ExprResolver kevinzwang (3401)
- [CHORE] connect: add tests for `df.take()` method andrewgazelka (3385)
- [CHORE] Change IOConfig to be serialized into binary instead of JSON kevinzwang (3400)
- [CHORE] Pin PyIceberg version to \<0.8 kevinzwang (3391)
- [CHORE] Add TPC-H queries in SQL kevinzwang (3392)
- [CHORE] connect: Optimize plans in connect colin-ho (3378)
- [CHORE] delete empty file xyz andrewgazelka (3370)

⬆️ Dependencies

<details>
<summary>14 changes</summary>

- Bump orjson from 3.10.11 to 3.10.12 dependabot (3464)
- Bump grpcio from 1.67.0 to 1.68.1 dependabot (3465)
- Bump arrow-buffer from 51.0.0 to 53.3.0 dependabot (3467)
- Bump regex-syntax from 0.7.5 to 0.8.4 dependabot (3468)
- Bump memmap2 from 0.9.4 to 0.9.5 dependabot (3470)
- Bump image from 0.25.4 to 0.25.5 dependabot (3471)
- Bump bytes from 1.7.1 to 1.8.0 dependabot (3411)
- Bump astral-sh/setup-uv from 3 to 4 dependabot (3410)
- Bump serde\_json from 1.0.124 to 1.0.133 dependabot (3413)
- Bump sample-arrow2 from 0.1.0 to 0.17.2 dependabot (3414)
- Bump chrono-tz from 0.8.6 to 0.10.0 dependabot (3415)
- Bump azure-storage-blob from 12.17.0 to 12.24.0 dependabot (3416)
- Bump opencv-python from 4.8.1.78 to 4.10.0.84 dependabot (3417)
- Bump sqlalchemy from 2.0.25 to 2.0.36 dependabot (3418)
</details>

0.3.14

Changes

✨ New Features

- [FEAT]: sql HAVING universalmind303 (3364)
- [FEAT] consolidate Spark session fixture into conftest.py andrewgazelka (3341)
- [FEAT]: allow for implicit coercion between str \& date universalmind303 (3337)
- [FEAT] daft-connect range use python generator andrewgazelka (3308)
- [FEAT] Monotonically Increasing Id for Swordfish colin-ho (3180)
- [FEAT] Support for correlated subqueries in SQL (not yet executable) kevinzwang (3304)
- [FEAT]: SQL read\_csv itzhakstern (3255)
- [FEAT] Daft Catalog API jaychia (3036)
- [FEAT]: allow `is_in` to take in `Vec<Expr>` instead of `Expr` universalmind303 (3294)
- [FEAT] Lance writes for swordfish colin-ho (3299)
- [FEAT] Support for aggregation expressions that use multiple AggExprs kevinzwang (3296)
- [FEAT] SQL union/union all and sql intersect universalmind303 (3274)

👾 Bug Fixes

- [BUG] Implement deserialize for Python objects serialized as sequences kevinzwang (3339)
- [BUG]: tbl alias with join universalmind303 (3333)
- [BUG] Fixes regexp\_replace expression ConeyLiu (3306)
- [BUG] Fix ray wait in RayPartitionSet jaychia (3251)
- [BUG] Partially qualified joins `join a.x = y` and `join x = b.y` universalmind303 (3290)
- [BUG] Check env in benchmarking script colin-ho (3297)
- [BUG] Fix writes for empty dataframes if target directory does not exist colin-ho (3278)
- [BUG]: panic in sql subquery universalmind303 (3291)

📖 Documentation

- [DOCS] Fix typo in limit example colin-ho (3303)
- [DOCS] Update incomplete SQL doc pages willvo2004 (3298)

🧰 Maintenance

- [CHORE]: prepare for nulls first/last kernels universalmind303 (3301)
- [CHORE] Fix join alias test kevinzwang (3335)
- Bump bytemuck from 1.16.3 to 1.19.0 dependabot (3171)
- [CHORE]: move utf8 functions from daft-dsl to daft-functions ConeyLiu (3101)
- [CHORE] Swordfish refactors colin-ho (3256)
- [CHORE]: better subquery handling universalmind303 (3295)
- [CHORE] Possibility to create environment with system installed uv maruschin (3281)
- [CHORE]: defer Expr subquery error until eval universalmind303 (3272)
- [CHORE] Fix style in workflow file jaychia (3284)

⬆️ Dependencies

<details>
<summary>12 changes</summary>

- Bump bytemuck from 1.16.3 to 1.19.0 dependabot (3171)
- Bump psycopg2-binary from 2.9.9 to 2.9.10 dependabot (3174)
- Bump codecov/codecov-action from 3 to 5 dependabot (3318)
- Bump slackapi/slack-github-action from 1.27.0 to 2.0.0 dependabot (3317)
- Bump lxml from 5.1.0 to 5.3.0 dependabot (3172)
- Bump moto[s3,server] from 5.0.2 to 5.0.21 dependabot (3312)
- Bump async-stream from 0.3.5 to 0.3.6 dependabot (3313)
- Bump unicode-normalization from 0.1.23 to 0.1.24 dependabot (3314)
- Bump tikv-jemallocator from 0.5.4 to 0.6.0 dependabot (3316)
- Bump sysinfo from 0.30.13 to 0.32.0 dependabot (3168)
- Bump pretty\_assertions from 1.4.0 to 1.4.1 dependabot (3169)
- Bump lz4 from 1.26.0 to 1.28.0 dependabot (3170)
</details>
