This release contains significant upgrades to the Developer API and to Modin's documentation,
along with codebase refactoring, performance enhancements, and multiple bugfixes.
Key Features and Updates
------------------------
* Stability and Bugfixes
* FIX-https://github.com/modin-project/modin/issues/4058: Allow pickling empty dataframes and series (https://github.com/modin-project/modin/pull/4095)
* FIX-https://github.com/modin-project/modin/issues/4136: Fix exercise_3.ipynb example notebook (https://github.com/modin-project/modin/pull/4137)
* FIX-https://github.com/modin-project/modin/issues/4105: Fix names of pandas options to avoid `OptionError` (https://github.com/modin-project/modin/pull/4109)
* FIX-https://github.com/modin-project/modin/issues/3417: Fix read_csv with skiprows and header parameters (https://github.com/modin-project/modin/pull/3419)
* FIX-https://github.com/modin-project/modin/issues/4142: Fix OmniSci enabling (https://github.com/modin-project/modin/pull/4146)
* FIX-https://github.com/modin-project/modin/issues/4162: Use `skipif` instead of `skip` for compatibility with pytest 7.0 (https://github.com/modin-project/modin/pull/4163)
* FIX-https://github.com/modin-project/modin/issues/4158: Do not print OmniSci logs to stdout by default (https://github.com/modin-project/modin/pull/4159)
* FIX-https://github.com/modin-project/modin/issues/4177: Support read_feather from pathlike objects (https://github.com/modin-project/modin/issues/4177)
* FIX-https://github.com/modin-project/modin/issues/4234: Upgrade pandas to 1.4.1 (https://github.com/modin-project/modin/pull/4235)
* FIX-https://github.com/modin-project/modin/issues/3368: support unsigned integers in OmniSci backend (https://github.com/modin-project/modin/pull/4256)
* FIX-https://github.com/modin-project/modin/issues/4057: Allow reading an empty parquet file (https://github.com/modin-project/modin/pull/4075)
* FIX-https://github.com/modin-project/modin/issues/3884: Fix read_excel() dropping empty rows (https://github.com/modin-project/modin/pull/4161)
* FIX-https://github.com/modin-project/modin/issues/4257: Fix Categorical() for scalar categories (https://github.com/modin-project/modin/pull/4258)
* FIX-https://github.com/modin-project/modin/issues/4300: Fix Modin Categorical column dtype categories (https://github.com/modin-project/modin/pull/4276)
* FIX-https://github.com/modin-project/modin/issues/4208: Fix lazy metadata update for `PandasDataFrame.from_labels` (https://github.com/modin-project/modin/pull/4209)
* FIX-https://github.com/modin-project/modin/issues/3981, FIX-https://github.com/modin-project/modin/issues/3801, FIX-https://github.com/modin-project/modin/issues/4149: Stop broadcasting scalars to set items (https://github.com/modin-project/modin/pull/4160)
* FIX-https://github.com/modin-project/modin/issues/4185: Fix rolling across column partitions (https://github.com/modin-project/modin/pull/4262)
* FIX-https://github.com/modin-project/modin/issues/4303: Fix the syntax error in reading from postgres (https://github.com/modin-project/modin/pull/4304)
* FIX-https://github.com/modin-project/modin/issues/4308: Add proper error handling in df.set_index (https://github.com/modin-project/modin/pull/4309)
* FIX-https://github.com/modin-project/modin/issues/4056: Allow an empty parse_date list in `read_csv_glob` (https://github.com/modin-project/modin/pull/4074)
* FIX-https://github.com/modin-project/modin/issues/4312: Fix constructing categorical frame with duplicate column names (https://github.com/modin-project/modin/pull/4313)
* FIX-https://github.com/modin-project/modin/issues/4314: Allow passing a series of dtypes to astype (https://github.com/modin-project/modin/pull/4318)
* FIX-https://github.com/modin-project/modin/issues/4310: Handle lists of lists of ints in read_csv_glob (https://github.com/modin-project/modin/pull/4319)
* Performance enhancements
* FIX-https://github.com/modin-project/modin/issues/4138, FIX-https://github.com/modin-project/modin/issues/4009: remove redundant sorting in the internal `.mask()` flow (https://github.com/modin-project/modin/pull/4140)
* FIX-https://github.com/modin-project/modin/issues/4183: Stop shallow copies from creating global shared state (https://github.com/modin-project/modin/pull/4184)
* Benchmarking enhancements
* FIX-https://github.com/modin-project/modin/issues/4221: add `wait` method for `PandasOnRayDataframeColumnPartition` class (https://github.com/modin-project/modin/pull/4231)
* Refactor Codebase
* REFACTOR-https://github.com/modin-project/modin/issues/3990: remove code duplication in `PandasDataframePartition` hierarchy (https://github.com/modin-project/modin/pull/3991)
* REFACTOR-https://github.com/modin-project/modin/issues/4229: remove unused `dask_client` global variable in `modin/pandas/__init__.py` (https://github.com/modin-project/modin/pull/4230)
* REFACTOR-https://github.com/modin-project/modin/issues/3997: remove code duplication for `broadcast_apply` method (https://github.com/modin-project/modin/pull/3996)
* REFACTOR-https://github.com/modin-project/modin/issues/3994: remove code duplication for `get_indices` function (https://github.com/modin-project/modin/pull/3995)
* REFACTOR-https://github.com/modin-project/modin/issues/4331: remove code duplication for `to_pandas`, `to_numpy` functions in `QueryCompiler` hierarchy (https://github.com/modin-project/modin/pull/4332)
* REFACTOR-https://github.com/modin-project/modin/issues/4213: Refactor `modin/examples/tutorial/` directory (https://github.com/modin-project/modin/pull/4214)
* REFACTOR-https://github.com/modin-project/modin/issues/4206: add assert check into `__init__` method of `PandasOnDaskDataframePartition` class (https://github.com/modin-project/modin/pull/4207)
* REFACTOR-https://github.com/modin-project/modin/issues/3900: add flake8-no-implicit-concat plugin and refactor flake8 error codes (https://github.com/modin-project/modin/pull/3901)
* REFACTOR-https://github.com/modin-project/modin/issues/4093: Refactor base to be smaller (https://github.com/modin-project/modin/pull/4220)
* REFACTOR-https://github.com/modin-project/modin/issues/4047: Rename `cluster` directory to `cloud` in examples (https://github.com/modin-project/modin/pull/4212)
* REFACTOR-https://github.com/modin-project/modin/issues/3853: interacting with Dask interface through `DaskWrapper` class (https://github.com/modin-project/modin/pull/3854)
* REFACTOR-https://github.com/modin-project/modin/issues/4322: Move is_reduce_fn outside of groupby_agg (https://github.com/modin-project/modin/pull/4323)
* Pandas API implementations and improvements
* FEAT-https://github.com/modin-project/modin/issues/3603: add experimental `read_custom_text` function that can read custom line-by-line text files (https://github.com/modin-project/modin/pull/3441)
* FEAT-https://github.com/modin-project/modin/issues/979: Enable reading from SQL server (https://github.com/modin-project/modin/pull/4279)
* Developer API enhancements
* FEAT-https://github.com/modin-project/modin/issues/4245: Define base interface for dataframe exchange protocol (https://github.com/modin-project/modin/pull/4246)
* FEAT-https://github.com/modin-project/modin/issues/4244: Implement dataframe exchange protocol for OmnisciOnNative execution (https://github.com/modin-project/modin/pull/4269)
* FEAT-https://github.com/modin-project/modin/issues/4144: Implement dataframe exchange protocol for pandas storage format (https://github.com/modin-project/modin/pull/4150)
* FEAT-https://github.com/modin-project/modin/issues/4342: Support `from_dataframe` for pandas storage format (https://github.com/modin-project/modin/pull/4343)
* Update testing suite
* TEST-https://github.com/modin-project/modin/issues/3628: Report coverage data for `test-internals` CI job (https://github.com/modin-project/modin/pull/4198)
* TEST-https://github.com/modin-project/modin/issues/3938: Test tutorial notebooks in CI (https://github.com/modin-project/modin/pull/4145)
* TEST-https://github.com/modin-project/modin/issues/4153: Fix condition of running lint-commit and set of CI triggers (https://github.com/modin-project/modin/pull/4156)
* TEST-https://github.com/modin-project/modin/issues/4201: Add read_parquet, explode, tail, and various arithmetic functions to asv_bench (https://github.com/modin-project/modin/pull/4203)
* Documentation improvements
* DOCS-https://github.com/modin-project/modin/issues/4077: Add release notes template to docs folder (https://github.com/modin-project/modin/pull/4078)
* DOCS-https://github.com/modin-project/modin/issues/4082: Add pdf/epub/htmlzip formats for doc builds (https://github.com/modin-project/modin/pull/4083)
* DOCS-https://github.com/modin-project/modin/issues/4168: Fix rendering the examples on troubleshooting page (https://github.com/modin-project/modin/pull/4169)
* DOCS-https://github.com/modin-project/modin/issues/4151: Add info in troubleshooting page related to Dask engine usage (https://github.com/modin-project/modin/pull/4152)
* DOCS-https://github.com/modin-project/modin/issues/4172: Refresh Intel Distribution of Modin paragraph (https://github.com/modin-project/modin/pull/4175)
* DOCS-https://github.com/modin-project/modin/issues/4173: Mention strict channel priority in conda install section (https://github.com/modin-project/modin/pull/4178)
* DOCS-https://github.com/modin-project/modin/issues/4176: Update OmniSci usage section (https://github.com/modin-project/modin/pull/4192)
* DOCS-https://github.com/modin-project/modin/issues/4027: Add GIF images and chart to Modin README demonstrating speedups (https://github.com/modin-project/modin/pull/4232)
* DOCS-https://github.com/modin-project/modin/issues/3954: Add Dask example notebooks (https://github.com/modin-project/modin/pull/4139)
* DOCS-https://github.com/modin-project/modin/issues/4272: Add bar chart comparisons to quick start guide (https://github.com/modin-project/modin/pull/4277)
* DOCS-https://github.com/modin-project/modin/issues/3953: Add docs and notebook examples on running Modin with OmniSci (https://github.com/modin-project/modin/pull/4001)
* DOCS-https://github.com/modin-project/modin/issues/4280: Change links in jupyter notebooks (https://github.com/modin-project/modin/pull/4281)
* DOCS-https://github.com/modin-project/modin/issues/4290: Add changes for OmniSci notebooks (https://github.com/modin-project/modin/pull/4291)
* DOCS-https://github.com/modin-project/modin/issues/4241: Update warnings and docs regarding defaulting to pandas (https://github.com/modin-project/modin/pull/4242)
* DOCS-https://github.com/modin-project/modin/issues/3099: Fix `BasePandasDataSet` docstrings warnings (https://github.com/modin-project/modin/pull/4333)
* DOCS-https://github.com/modin-project/modin/issues/4339: Reformat I/O functions docstrings (https://github.com/modin-project/modin/pull/4341)
* DOCS-https://github.com/modin-project/modin/issues/4336: Reformat general utilities docstrings (https://github.com/modin-project/modin/pull/4338)
* Dependencies
* FIX-https://github.com/modin-project/modin/issues/4113, FIX-https://github.com/modin-project/modin/issues/4116, FIX-https://github.com/modin-project/modin/issues/4115: Apply new `black` formatting, fix pydocstyle check and readthedocs build (https://github.com/modin-project/modin/pull/4114)
* TEST-https://github.com/modin-project/modin/issues/3227: Use codecov github action instead of bash form in GA workflows (https://github.com/modin-project/modin/pull/3226)
* FIX-https://github.com/modin-project/modin/issues/4115: Unpin `pip` in readthedocs deps list (https://github.com/modin-project/modin/pull/4170)
* TEST-https://github.com/modin-project/modin/issues/4217: Pin `Dask<2022.2.0` as a temporary fix of CI (https://github.com/modin-project/modin/pull/4218)
Contributors
------------
prutskov, amyskov, paulovn, anmyachev, YarShev, RehanSD, devin-petersohn,
dchigarev, Garra1980, mvashishtha, naren-ponder, jeffreykennethli, dorisjlee, Rubtsowa