Modin 0.7 comes with the largest expansion of the API since the first release. Modin now supports over 83% of the pandas API, up from 71% in the last release. Several long-awaited features have been implemented, including I/O support on Dask for Parquet and other column stores, and `groupby` with a list of column names.
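As a minimal sketch of the `groupby` feature mentioned above, the snippet below groups by a list of column names and applies a reduction. The data and column names are made up for illustration. It imports plain pandas so that it runs anywhere; with Modin and an execution engine installed, the import can be swapped for `import modin.pandas as pd`.

```python
# Assumption: plain pandas is used so the sketch runs without Modin.
# With Modin installed, replace this with: import modin.pandas as pd
import pandas as pd

# Hypothetical sales data, invented for this example.
df = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west"],
        "product": ["a", "b", "a", "a"],
        "units": [1, 2, 3, 4],
    }
)

# Group by a list of column names, then reduce each group with sum().
totals = df.groupby(["region", "product"]).sum()

# The result is indexed by (region, product) pairs.
west_a_units = totals.loc[("west", "a"), "units"]
print(totals)
```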
Bugfixes + Pandas Concordance (🐛 + 🐼)
* Allow merging of named Series (879)
* Correctly merge `CategoricalDtype` dtypes (889)
* Fix issue where certain arguments were not defaulting to pandas (890)
* Send full path to workers on `read_csv` (899)
* Remove `__array_prepare__` from Series API (900)
* Add `Series.str.title` to API (901)
* Fix `df.squeeze` when `axis=0` on a 1x1 dataframe (902)
* Fix `skiprows` logic for `read_csv` (918)
* Default `read_sql()` to pandas when `chunksize` is given (920)
* Fix DeprecationWarning: invalid escape sequence \d (950)
* Fix `apply` when `args` is set (953)
* Fix inplace updates on partitions (962)
* Fix bug where certain encodings were throwing an error (980)
* Fix inplace operations without inplace keyword on empty dataframes (983)
* Support console with `__repr__` like pandas (984)
* Fix `count` when `numeric_only=False` (1002)
* Fix bug in `loc` where slice on columns only threw Exception (1024)
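A few of the fixes above concern calls whose expected results are defined by pandas. The sketch below shows what pandas itself returns for `squeeze` with `axis=0` on a 1x1 dataframe, `count` with `numeric_only=False`, and a `loc` slice on columns only. The frames are invented for illustration, and plain pandas is imported so the snippet runs without Modin; with Modin installed, `import modin.pandas as pd` should be a drop-in replacement.

```python
# Assumption: plain pandas is used so the sketch runs without Modin.
# With Modin installed, replace this with: import modin.pandas as pd
import pandas as pd

# squeeze(axis=0) on a 1x1 frame drops only the row axis,
# leaving a Series indexed by column name.
one_by_one = pd.DataFrame({"a": [1]})
row_squeezed = one_by_one.squeeze(axis=0)

# count(numeric_only=False) counts non-null values in every column,
# including non-numeric ones.
mixed = pd.DataFrame({"n": [1, None], "s": ["x", "y"]})
counts = mixed.count(numeric_only=False)

# loc with a slice on columns only: all rows, columns "a" through "b"
# (label slices include both endpoints).
wide = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
first_two_cols = wide.loc[:, "a":"b"]
```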
New Functionality ✨
* Add support for `duplicated()` and `drop_duplicates()` (892)
* Create SeriesGroupBy wrapper to default to pandas and return to Modin (908)
* Bring I/O support to Dask for everything supported (955) ⭐️
* Add support for grouping by multiple columns when doing a reduction (987) ⭐️
* Implement `DataFrame.at_time` and `Series.at_time` (991)
* Add implementation for `between_time` for `Series` and `DataFram… (992)
* Implement `combine` for Series and DataFrame (995)
* Add implementation for `combine_first` for `DataFrame` and `Seri… (996)
* Add implementation for `droplevel` for `Series` and `DataFrame` (1000)
* Implement `assign` for `DataFrame` (998)
* Add implementation for `first` for `Series` and `DataFrame` (1006)
* Add implementation for `last` for `DataFrame` and `Series` (1007)
* Add implementation for `swapaxes` for DataFrame and Series (1010)
* Add implementation for `tz_convert` for `Series` and `DataFrame` (1013)
* Implement `tz_localize` for `Series` and `DataFrame` (1014)
* Add implementation for `tshift` (1016)
* Add implementation: `swaplevel` for `Series` and `DataFrame` (1018)
* Add implementation: `reorder_levels` for DataFrame and Series (1022)
* Add implementation: `take` for Series and DataFrame (1020)
* Add implementation: `truncate` for Series and DataFrame (1026)
* Fix bug where Parsing error was thrown when text spanned multip… (1027)
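To give a feel for some of the newly covered methods, here is a short sketch using `at_time`, `between_time`, `assign`, and `combine_first`. The timestamps and column names are made up for illustration. The snippet imports plain pandas so that it runs without Modin; with Modin and an engine installed, `import modin.pandas as pd` is intended to behave the same way.

```python
# Assumption: plain pandas is used so the sketch runs without Modin.
# With Modin installed, replace this with: import modin.pandas as pd
import pandas as pd

# Hypothetical readings taken every six hours.
index = pd.to_datetime(
    [
        "2020-01-01 00:00",
        "2020-01-01 06:00",
        "2020-01-01 12:00",
        "2020-01-01 18:00",
        "2020-01-02 00:00",
        "2020-01-02 06:00",
    ]
)
df = pd.DataFrame({"x": range(6)}, index=index)

# Rows whose time of day is exactly midnight.
midnight = df.at_time("00:00")

# Rows whose time of day falls between 06:00 and 12:00 (both included).
morning = df.between_time("06:00", "12:00")

# assign returns a new frame with an extra derived column.
with_double = df.assign(double=lambda d: d["x"] * 2)

# combine_first fills missing values in one frame from another.
primary = pd.DataFrame({"v": [1.0, None]})
fallback = pd.DataFrame({"v": [10.0, 20.0]})
filled = primary.combine_first(fallback)
```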
Code Quality + Testing 💯
* Update pytest and clean up tests a bit (903)
* CI updates (924, 925, 926, 927, 928, 929, 930, 931, 933, 936, 937, 938)
* Fix Windows Remove file test error (943)
* Add test script for simple execution of all unit tests (948)
* Fix pyarrow.parquet import in tests (952)
* Support parameters with run-tests.sh (959)
* Fix environment variables for CI and Master test suite (964)
* Make test_dataframe.py more granular with test suite (966)
* Cache pip dependencies between builds to speed up process (967)
* Fix master CI workflow order and simplify workflow names (968)
* Update master build to allow coverage to be run (969)
* Optimize CI with minimal pip installs (971)
* Use versioneer for VCS-based versioning (1028)
Backend enhancements + Performance
* Improve performance of setting a column from an existing one (942)
* Increase `n_workers` for Dask from default to number of cores (965)
Documentation
* Update README to have a more accurate API coverage section (974)
* Move API Coverage section in README to a more appropriate place (975)
* Update README to add advanced usage (988)
Dependencies
* Support only Python 3+ in future releases (907)
* Change Coverage version to avoid sqlite3 errors (916)
* Remove top level import of `py` (935)
* Update Ray version to latest (941)
* Restructure import attempt to only try Ray if on a non-Windows machine (945)
* Set `pure=False` for Dask `Client.submit` and `hash=False` for `… (957)
Contributors this release
The following users contributed code to Modin since the last release.
ecoughlan (First time contributor) ⭐️
aeroaks (First time contributor) ⭐️
eavidan (Returning contributor)
devin-petersohn (Maintainer)
Thank you!