Daft

Latest version: v0.4.8

0.4.8

What's Changed 🚀

✨ Features

- feat: syntactic sugar for Python list and struct gets kevinzwang (4027)
- feat: Add a memory-efficient iterator for Series desmondcheongzx (4006)
- feat(catalog): adds s3tables iceberg rest endpoint rchowell (4018)
- feat: adds gz as gzip alias for encode, decode methods rchowell (4020)
- feat: Functions: sign, signum, negative, negate petern48 (3941)
- feat(sql): namespace support with in-memory catalog rchowell (4013)
- feat(sql): adds show tables statement and documentation rchowell (4011)
- feat(catalog): adds native s3tables read and catalog apis rchowell (3929)
- feat: offset indices in sparse tensor itzhakstern (3725)
- feat: Flight shuffle colin-ho (3904)
- feat: daft.range function universalmind303 (3956)
- feat: cast using a string type universalmind303 (3951)

🐛 Bug Fixes

- fix: Fix join condition swaps when left/right sides swap desmondcheongzx (4028)
- fix: Fix boolean expression simplifier desmondcheongzx (4016)
- fix: Fix list sort with groupby desmondcheongzx (3990)
- fix: datetime deprecation universalmind303 (3987)
- fix: Fix incorrect numeric identity optimizations desmondcheongzx (3988)
- fix: tutorial code kevinzwang (3972)
- fix: allow decimal precision equal to scale rchowell (3973)
- fix: Add more retries to sql server connection in test colin-ho (3953)
- fix(ci): distributed tpch benchmark kevinzwang (3967)
- fix: depend on pylance instead of lancedb kevinzwang (3962)
- fix(ci): slack failure notification parameters kevinzwang (3952)
- fix: fix error when casting monotonically\_increasing\_id directly f4t4nt (3950)
- fix: Add target dialect when making subquery in read\_sql colin-ho (3948)
- fix: Count bytes read correctly for local WARC reads desmondcheongzx (3946)
- fix: Pass CommitProperties object custom metadata in deltalake tkauf15k (3914)
- fix: iceberg table name is a method rchowell (3949)

🚀 Performance

- perf: Enable join reordering colin-ho (4029)
- perf: Favor smaller relations on the left for join ordering desmondcheongzx (4003)
- perf: Refactor selectivity estimates colin-ho (4010)

📖 Documentation

- docs: update install instructions for daft-lts and nightly kevinzwang (4026)
- docs: Fix s3 tables docs colin-ho (4025)
- docs: change all mentions of getdaft -> daft jaychia (3986)
- docs: fix docs examples and add missing docs kevinzwang (3974)
- docs: initializes sql and data type documentation rchowell (3959)

👷 CI

- ci: update distributed tpch benchmark kevinzwang (3971)
- ci: fix typo in nightly workflow kevinzwang (3968)
- ci: distributed TPC-H benchmarks kevinzwang (3961)

🔧 Maintenance

- chore: Track imports on scarf colin-ho (4024)
- chore: Upgrade kanal to 0.1 colin-ho (4017)
- chore: create in-memory scans using rust arrow arrays rchowell (4005)
- chore(dashboard): update Next.js dependency to version 15.2.2 universalmind303 (3999)
- chore: add pr template ccmao1130 (3981)
- chore: dashboard build cleanup universalmind303 (3931)
- chore: fix slack link in readme kevinzwang (3966)
- chore: Favor OnceLock over lazy\_static for WARC column sizes desmondcheongzx (3939)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.7...v0.4.8

0.4.7

What's Changed 🚀

- build: build and publish daft package kevinzwang (3913)
- build: bump rust toolchain version kevinzwang (3910)

✨ Features

- feat: adds encode and decode for deflate, gzip, zlib rchowell (3907)
- feat(catalog): adds catalog ddl actions like create\_table and create\_namespace rchowell (3902)
- feat(sql): adds the 'use' sql session statement rchowell (3912)
- feat(catalog): adds append and overwrite to table apis rchowell (3889)
- feat(catalog): adds additional table sources for Catalog.from\_pydict rchowell (3901)
- feat: functions sinh, cosh, tanh petern48 (3903)
- feat: Functions log1p and expm1 petern48 (3887)
- feat: trig functions csc and sec petern48 (3884)
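As a point of reference for the deflate, gzip, and zlib codecs named above, here is a minimal round-trip sketch using only Python's standard library. It does not call Daft. The helper names `encode` and `decode` are hypothetical, and treating "deflate" as a raw DEFLATE stream (no zlib header) is an assumption made for illustration.

```python
import gzip
import zlib


# Hypothetical helpers for illustration only; these are not Daft APIs.
def encode(data: bytes, codec: str) -> bytes:
    """Compress `data` with one of the three codecs named in the notes."""
    if codec == "gzip":
        return gzip.compress(data)
    if codec == "zlib":
        return zlib.compress(data)
    if codec == "deflate":
        # Assumption: "deflate" means a raw DEFLATE stream, which zlib
        # produces when wbits is negative (no header or checksum).
        compressor = zlib.compressobj(wbits=-15)
        return compressor.compress(data) + compressor.flush()
    raise ValueError(f"unsupported codec: {codec}")


def decode(data: bytes, codec: str) -> bytes:
    """Invert `encode` for the same codec name."""
    if codec == "gzip":
        return gzip.decompress(data)
    if codec == "zlib":
        return zlib.decompress(data)
    if codec == "deflate":
        return zlib.decompress(data, wbits=-15)
    raise ValueError(f"unsupported codec: {codec}")


payload = b"hello daft " * 10
for name in ("gzip", "zlib", "deflate"):
    # Each codec should return the original bytes after a round trip.
    assert decode(encode(payload, name), name) == payload
```

The three formats share the same compression algorithm and differ only in framing, which is why a single `zlib` module can cover all of them in this sketch.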

🐛 Bug Fixes

- fix: nightly build and local tpch benchmark workflow kevinzwang (3898)
- fix: add retry to getting GCS client config kevinzwang (3930)
- fix: bun install in build-wheel.yml kevinzwang (3932)
- fix: allow resolving tables at catalog root rchowell (3928)
- fix: Don't use `_position_to_field_name` Fokko (3917)
- fix: write\_lance append mode when storage\_options required ascillitoe (3924)
- fix(dashboard): get dashboard working again universalmind303 (3918)
- fix: coalesce panics, supertype handling, and null handling bugs rchowell (3908)
- fix: small fix for pyspark+ray. universalmind303 (3899)
- fix: map.get on empty dataset universalmind303 (3892)
- fix: remove dashboard imports and dep samster25 (3888)

🚀 Performance

- perf: Reduce memory consumption for WARC reads and improve estimates desmondcheongzx (3935)

📖 Documentation

- docs: adds additional catalog and session documentation rchowell (3926)
- docs: add spark connect doc page universalmind303 (3919)
- docs: adds a usage doc for catalogs rchowell (3878)
- docs: Add documentation for functions module f4t4nt (3880)
- docs: remove cairo ccmao1130 (3900)

👷 CI

- ci: update all --release workflows universalmind303 (3915)
- ci: replace build-artifact-s3 with new workflow, add local tpch benches kevinzwang (3864)

🔧 Maintenance

- chore: use ref name instead of ref in tpch bench metadata kevinzwang (3937)
- chore: use stdlib importlib.metadata for python>3.9 kevinzwang (3916)
- chore: move dashboard in to main project universalmind303 (3909)
- chore: make dashboard assets part of build process. universalmind303 (3905)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.6...v0.4.7

0.4.6

What's Changed 🚀

✨ Features

- feat: Add WARC reader desmondcheongzx (3871)
- feat(functions): add monotonically\_increasing\_id expression function f4t4nt (3838)
- feat: union ops universalmind303 (3872)
- feat: Enable capturing and broadcasting logs when running on the `Native` runner raunakab (3875)
- feat(connect): joins universalmind303 (3849)

🐛 Bug Fixes

- fix: Add check for numpy in from\_pylist colin-ho (3881)
- fix: Fix ray data link colin-ho (3874)
- fix: arrow to Series for nested map array kevinzwang (3870)
- fix: Add metadata to subgraph options in python colin-ho (3869)
- fix: Update dashboard import raunakab (3865)

🚀 Performance

- perf: Clear task inputs upon dispatch colin-ho (3877)
- perf: Fix join cost estimates desmondcheongzx (3831)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.5...v0.4.6

0.4.5

What's Changed 🚀

💥 Breaking Changes

- refactor!: split column expression into unresolved and resolved types kevinzwang (3804)

✨ Features

- feat(connect): `daft.pyspark` module universalmind303 (3861)
- feat: Emit children of join before shuffle + add stats to explain analyze colin-ho (3852)
- feat: Stageify plan on shuffle boundaries colin-ho (3781)
- feat(sql): adds session sql for leveraging attached catalogs rchowell (3860)
- feat(catalog): Cutover deprecated APIs to use session, catalog, table abstractions [3/3] rchowell (3830)
- feat(connect): read csv/parquet/json options universalmind303 (3791)
- feat(sql): select from multiple joins kevinzwang (3842)
- feat(catalog): Integrate session and catalog actions alongside existing APIs [2/3] rchowell (3825)
- feat(catalog): Prepare existing catalog APIs for integration [1/3] rchowell (3820)
- feat(sql): supports schemas in read\_json, read\_csv, read\_parquet rchowell (3836)
- feat(sql): supports array of paths in read\_ table-value functions rchowell (3835)
- feat: Add a daft dashboard to display queries plans and stats raunakab (3790)

🐛 Bug Fixes

- fix: sql round without precision universalmind303 (3863)
- fix: pypi publish workflow kevinzwang (3862)
- fix: build wheel Github action inputs kevinzwang (3858)
- fix: protocol in iceberg writes colin-ho (3851)
- fix: LogicalPlan::get\_schema\_for\_alias should stop when it hits any alias kevinzwang (3848)
- fix: Reduce number of nodes in random join graph test desmondcheongzx (3839)
- fix: Add excludes to broken link checker colin-ho (3834)
- fix: Grab Daft config from environment variables for new contexts desmondcheongzx (3832)
- fix: create series of np.datetime64['D'] rchowell (3829)

🚀 Performance

- perf(optimizer): Infer additional join graph edges during join reordering desmondcheongzx (3807)

♻️ Refactor

- refactor!: split column expression into unresolved and resolved types kevinzwang (3804)

📖 Documentation

- docs: respect daft analytics env var ccmao1130 (3856)
- docs: Update configuration docs to show `set_runner_native` colin-ho (3833)

🔧 Maintenance

- chore: replace anaconda with S3 for nightly build publish kevinzwang (3857)
- chore: minor cleanup to table-value functions rchowell (3854)
- chore: remove accidental printlns universalmind303 (3845)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.4...v0.4.5

0.4.4

What's Changed 🚀

- build: update python-publish workflow ccmao1130 (3797)
- build(docs): fix docgen failed workflow ccmao1130 (3766)

✨ Features

- feat: Adds .summarize() to compute statistics rchowell (3810)
- feat(sql): SELECT without FROM rchowell (3814)
- feat: Simplify `is_in` expressions to an OR chain of equality checks colin-ho (3800)
- feat(session): Adds session class to python rchowell (3809)
- feat(session): Replaces direct usage of DaftCatalog with Session rchowell (3794)
- feat: Sequentially materialize left and right sides during hash join colin-ho (3735)
- feat(connect): add temporal functions universalmind303 (3799)
- feat: nulls first kernels universalmind303 (3789)
- feat(table): implement list\_unique and Set aggregation f4t4nt (3710)
- feat: add functions to daft-connect universalmind303 (3780)
- feat(catalog): Defines a session for connection state rchowell (3782)
- feat: implement bool\_and and bool\_or f4t4nt (3754)
- feat(catalog): Defines an identifier for use across catalogs rchowell (3763)
- feat(optimizer): Brute force join ordering desmondcheongzx (3688)
- feat(swordfish): Properly buffer unordered scan tasks colin-ho (3751)
- feat: better sql datatype support universalmind303 (3750)
- feat: Adds list constructor to Expression and SQL APIs rchowell (3737)
- feat: spark connect set operations universalmind303 (3739)
- feat: add spark explain universalmind303 (3741)
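The rewrite of membership tests into an OR chain of equality checks, listed among the features above, rests on a simple logical equivalence. The sketch below shows that equivalence in plain Python for non-null values. It is not Daft's optimizer code: the function names are hypothetical, and a real optimizer rewrites expression trees rather than evaluating values.

```python
from typing import Any, Sequence


# Hypothetical names for illustration; not part of Daft.
def is_in(value: Any, candidates: Sequence[Any]) -> bool:
    """Membership test, the form a user would typically write."""
    return value in candidates


def or_chain_of_eqs(value: Any, candidates: Sequence[Any]) -> bool:
    """The same test expressed as (value == c1) or (value == c2) or ..."""
    result = False
    for candidate in candidates:
        result = result or (value == candidate)
    return result


# The two forms agree for every probe value (nulls are out of scope here).
for probe in (1, 2, 5, "a"):
    assert is_in(probe, [1, 2, 3]) == or_chain_of_eqs(probe, [1, 2, 3])
```

An empty candidate list yields `False` in both forms, since an OR over zero terms is false.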

🐛 Bug Fixes

- fix: unity managed table reads pmogren (3806)
- fix: boolean casts to strings and null propagation rchowell (3770)
- fix: catalog table names universalmind303 (3760)

🚀 Performance

- perf(swordfish): Parallel expression evaluation colin-ho (3593)
- perf: Use parquet metadata from schema inference for accurate scan task statistics desmondcheongzx (3784)

♻️ Refactor

- refactor: rename `table` to `recordbatch` universalmind303 (3771)
- refactor: port DaftContext to rust side universalmind303 (3767)
- refactor: renames to\_struct to just struct rchowell (3755)

📖 Documentation

- docs: fix readthedocs build ccmao1130 (3824)
- docs: add scarf analytics ccmao1130 (3773)
- docs: Update distributed docs to add byoc mode, change name to daft cli jessie-young (3768)
- docs: update README.rst diagram ccmao1130 (3803)
- docs: update links in readme ccmao1130 (3779)
- docs: add footer and update broken links ccmao1130 (3764)

👷 CI

- ci: Allow TPCH benchmarks to use ARM cluster profile desmondcheongzx (3777)
- ci: Record info for TPCH benchmarks desmondcheongzx (3729)
- ci: send slack notification for broken links ccmao1130 (3742)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.3...v0.4.4

0.4.3

What's Changed 🚀

- build(docs): Adds make docs phony target rchowell (3693)

✨ Features

- feat: Add a new dashboard UI to Daft raunakab (3738)
- feat(shuffles): Determination logic for pre shuffle merge colin-ho (3674)
- feat: Limit number of sources in merged scan task colin-ho (3695)
- feat: Expose parquet chunk size to swordfish reads colin-ho (3714)
- feat: add LiteralValue::Int8 and Int16 ugoa (3736)
- feat: with\_column(s)\_renamed expression for DataFrame jessie-young (3732)
- feat: Explain for swordfish colin-ho (3667)
- feat: Adds .describe() to DataFrame and DESCRIBE to SQL rchowell (3720)
- feat: Add column format option to iter rows colin-ho (3681)
- feat(core): make micropartition streamable over tables universalmind303 (3709)
- feat(iceberg): Adds support for read\_iceberg with metadata\_location to Daft-SQL rchowell (3701)
- feat: Overwrite partitions mode colin-ho (3687)
- feat(docs): Adds copy-to-clipboard to code samples rchowell (3702)
- feat(connect): sql universalmind303 (3696)
- feat: add binary string operations (length and concatenation) f4t4nt (3646)
- feat(sql): Adds url\_download and url\_upload to daft-sql rchowell (3690)
- feat(connect): distinct + sort universalmind303 (3677)
- feat(core): Implement null-safe equality operator f4t4nt (3663)
- feat(sql): Adds JsonScanBuilder to daft-scan and read\_json to daft-sql rchowell (3683)
- feat: support using S3Config.credentials\_provider for writes kevinzwang (3648)
- feat(sql): Adds FROM source check for string paths rchowell (3679)
- feat(connect): Rust ray exec universalmind303 (3666)
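For readers unfamiliar with the null-safe equality operator mentioned above, the following sketch contrasts it with ordinary SQL-style equality, using `None` to stand in for null. It assumes the conventional semantics (as in SQL's `IS NOT DISTINCT FROM`); the function names are hypothetical and nothing here shows how Daft exposes the operator.

```python
from typing import Any, Optional


# Hypothetical names for illustration; not part of Daft.
def eq(a: Any, b: Any) -> Optional[bool]:
    """SQL-style equality: the result is null if either side is null."""
    if a is None or b is None:
        return None
    return a == b


def eq_null_safe(a: Any, b: Any) -> bool:
    """Null-safe equality: always returns a boolean, never null."""
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a == b


assert eq(1, None) is None             # ordinary equality propagates null
assert eq_null_safe(1, None) is False  # null-safe form gives a definite answer
assert eq_null_safe(None, None) is True
```

The null-safe form is mainly useful in join keys and filters where two nulls should be treated as matching.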

🐛 Bug Fixes

- fix: Set filter selectivity estimate lower bound colin-ho (3694)
- fix(join): joining on different types kevinzwang (3716)
- fix: to\_cnf and to\_dnf functions kevinzwang (3728)
- fix: pushdowns for unpivot universalmind303 (3724)
- fix(optimizer): Fix issues with join graph construction desmondcheongzx (3668)
- fix: Run filter null join key optimization once colin-ho (3657)

🚀 Performance

- perf: Track accumulated selectivity in logical plan to improve probe side decisions desmondcheongzx (3734)
- perf: simplify boolean expression rules kevinzwang (3731)
- perf(shuffles): Incrementally retrieve metadata in reduce colin-ho (3545)
- perf: Improve stats for join side determination colin-ho (3655)

♻️ Refactor

- refactor: remove eyre from daft-connect universalmind303 (3719)
- refactor(execution): NativeExecutor refactor universalmind303 (3689)
- refactor: logical op constructor+builder boundary kevinzwang (3684)
- refactor(connect): internal refactoring to make connect code more organized \& extensible universalmind303 (3680)

📖 Documentation

- docs: linked mkdocs \& api docs ccmao1130 (3703)
- docs: higher quality daft diagram for readme ccmao1130 (3697)
- docs: add daft launcher docs to docs v2 ccmao1130 (3678)

👷 CI

- ci: skip tests during publishing of release jaychia (3744)
- ci: Allow upstream git refs to be used for benchmarking desmondcheongzx (3730)
- ci: Remove daft tracing raunakab (3692)
- ci: Add new benchmarking cluster profile raunakab (3665)

🔧 Maintenance

- chore: Pin sql server version in docker compose colin-ho (3715)
- chore(connect): better error propagation \& handling universalmind303 (3675)
- chore(connect): consolidate multiple files in tests/connect universalmind303 (3676)

**Full Changelog**: https://github.com/Eventual-Inc/Daft/compare/v0.4.2...v0.4.3
