Bugfixes 🐛
* eda: fix long name in missing heatmap (f6cc399e)
* connector: fix bug in url_path_params (c95a7ff1)
* eda: fix NA and int viz issue in plot_diff (ef36d5ac)
* eda: fix missing for SmallCard and DateTime type (201e487b)
* eda: fix create_report for dask csv (93e85673)
* clean: fix mixed-up date formats in one column (e2956956)
* eda: fixed uncaught dtype and long var names (24f0295e)
* eda: fix correlation of num columns with small distinct values (9959b78a)
* eda: fix issue with dataframe of one column (910bb71a)
* eda: add geopoint in type count (94cbca23)
* eda: fixed uncaught dtype exceptions (d301eb75)
* eda: fix str transform with small distinct as categorical (65e7f907)
* eda: fix na values display issue (1ce5775e)
* eda: keep na when preprocess df (17d82191)
* clean: fix returned df_clean in clean_dupl (180e6ad2)
* clean: escape apostrophes in code exported by clean_dupl (e6ea7e97)
* eda: fixed endless loop and UI issues (69779cd6)
* eda: fix insight error (9ad4e26b)
* eda: suppress warnings for missing and report (df2a1e70)
* eda: fix insights of plot_correlation (f0ca5f41)
* eda: suppress warnings of progress bar and dask (ca8da4e1)
* eda.create_report: fix constant column error (160844ad)
* docs: fix docs of clean_df (38dd4b2a)
* clean: remove unneeded replace in clean_dupl (51c02cdd)
* eda: fixed bugs that come with randomly generated datasets (53ecf76c)
* eda: fix bugs in log transformation (209d7d0c)
* eda: fixed and optimized css layouts (58e1b18f)
* clean: fix bug in validate_country (28068d46)
* eda: fix column name and index related issues (40a89b91)
* eda: variables can be none (325b0904)
* connector: path to new config repo (59603e5b)
* clean: lat_long regex no longer matches a date format (49d3d227)
* eda.distribution: highlight variable names (998b1762)
* eda: fix the error of numerical cell in object column (91c4f9df)
* eda.distribution: box plot with object dtype (a37e9f21)
* clean: add comma after street suffix or name (e7655db9)
* clean: cast values as str in validate funcs (8e1b459a)
Features ✨
* clean: tuple of input formats for clean_country() (6bc65513)
* clean: add clean_text function (55d3ae95)
* eda: change color of geo map (1dbcddbf)
* clean: add clean_currency function (deb55938)
* clean: add clean_df() function (b750284f)
* type: detect column as categorical for small unique values (4696e598)
* eda: add geo_plot function (bbe64ec2)
* eda: create_report UI improvement (c849b013)
* eda: added new function plot_diff (79523c30)
* connector: allow parameters to appear in url path (5adaf301)
* eda: value frequency table (bc37b794)
* eda: create_report UI improvement (72a0ca95)
* clean: add clean_duplication() function (98ff38d0)
* clean: support letters in clean_phone (25d163b3)
* eda: specify colors in plot(df), plot(df, x) (33fa36ea)
* connector: add functionality that lists supported websites (88187e18)
* clean: add clean_address function (e839ecd3)
* clean: add clean_headers function (40742a19)
* eda: parameter management and how-to guide (d2e8b10a)
* clean: add clean_date function (6aa6410e)
* create_report: add tabs for correlation and missing (6dc568b5)
Code Quality + Testing 💯
* eda: add test for geo point (943033a6)
* eda: add dataset test for report (0de5208b)
* eda: add test of random df (68239f03)
* clean: add tests for clean_duplication() (a4b9d32b)
* eda: add random data generator (e83f95b3)
* clean: add tests for clean_headers (0aca076e)
* eda: add test case of object column with numerical cell (57839841)
* clean: add tests for clean_date and validate_date (812dbb8d)
Performance 🚀
* eda: optimize df preprocess and performance of create_report (e7eb182f)
* clean: update documentation of clean_date (c540fcc7)
* clean: improve performance of clean_duplication (8fda37e8)
* eda: use approximate nunique (60300644)
* clean: improve the performance of clean_email() (176382bc)
* clean: improve performance of clean_date (854329ba)
Documentation 📃
* readme: update video, paper and titanic report for eda (1126dea8)
* eda: replace x, y, z with col1, col2, col3 (57f65b30)
* clean: add documentation for clean_text (65436b06)
* eda: add documentation for insights (1e4659be)
* clean: add documentation for clean_df() (4ecf0d71)
* eda: update user guide's datasets (2428f98e)
* eda: add documentation for geo plot (3558257c)
* clean: add user guide for clean_duplication (d834e857)
* clean: fix clean documentation (e3bed2ba)
* connector: revision (23085dd3)
* clean: add documentation for clean_date function (d445f36a)
* connector: add info docs (cb8cb5c5)
* connector: add config file section (f55226ea)
* connector: adding a process overview via DBLP section (5794d6c8)
* connector: remove stale rst files (433fdfe4)
* connector: convert pagination section from rst to ipynb (e4b9ba0c)
* connector: convert authorization section from rst to ipynb (d25af473)
* connector: change the pointer in index file from connector.rst to introduction.ipynb (218e41c6)
* connector: rewrite introduction and form doc structure (6a876937)
* connector: update API reference doc (9bed1694)
* clean: improve DataPrep.Clean ReadMe (a0bc96b0)
* eda: update legacy documentation for eda (8f948e05)
* clean: add documentation for clean_address (4061fca3)
* clean: add documentation for clean_headers (7a9d519c)
* clean: add links from user guide to api ref (182b5254)
* clean: Docstrings for phone and email (47f1e33d)
* datasets: add introduction for datasets (83d42cee)
* clean: add API reference (68182f6a)
* clean: add documentation for clean_ip function (9da3ed1e)
* connector: add query() section (c904d1fc)
* connector: add connect() section (bff842ed)
Contributors this release 🏆
The following users contributed code to DataPrep since the last release.
* andy \<insunshinelove.com\> (First time contributor) ⭐️
* AndyWangSFU \<zwa117sfu.ca\> (First time contributor) ⭐️
* atol \<alicelinlcgmail.com\>
* Brandon Lockhart \<brandon_lockhartsfu.ca\>
* dylanzxc \<zca92sfu.ca\>
* eutialia \<devebon.network\>
* Jinglin Peng \<jlpengcsgmail.com\>
* jinglinpeng \<jlpengcsgmail.com\>
* Lakshay-sethi \<58126894+Lakshay-sethiusers.noreply.github.com\> (First time contributor) ⭐️
* nzrymiak \<nzrymiaksfu.ca\>
* peiwangdb \<pennyiscomputinggmail.com\>
* peterirani \<peshotan_iranisfu.ca\> (First time contributor) ⭐️
* qidanrui \<qidanruigmail.com\> (First time contributor) ⭐️
* ryanwdale \<ryanwdalegmail.com\>
* waterpine \<songbianzju.edu.cn\>
* Weiyuan Wu \<youngwsfu.ca\>
* Yi Xie \<zjuxyeegmail.com\>
* yuzhenmao \<harrymao666gmail.com\>
* yuzhenmao \<57878927+yuzhenmaousers.noreply.github.com\>
* yxie66 \<zjuxyeegmail.com\>
* zhixuan_chi \<zca92sfu.ca\>
🎉🎉 Thank you! 🎉🎉