Breaking Changes
We have changed the execution engine for derived features to Spark SQL. This may break derived feature definitions for users who are not running the up-to-date sample notebooks. Specifically, they might see the following failure:
```
Preprocessed DataFrames are:
{'feature_user_age,feature_user_gift_card_balance,feature_user_has_valid_credit_card,feature_user_tax_rate': JavaObject id=o243}
Traceback (most recent call last):
  File "feathr_pyspark_driver.py", line 107, in <module>
    submit_spark_job(feature_names_funcs)
  File "feathr_pyspark_driver.py", line 85, in submit_spark_job
    py4j_feature_job.mainWithPreprocessedDataFrame(job_param_java_array, new_preprocessed_df_map)
  File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/py4j/java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
pyspark.sql.utils.AnalysisException: Undefined function: 'toBoolean'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 84
)
```
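The legacy functions `toBoolean` and `if_else` are not defined in Spark SQL, so transform expressions that use them must switch to the Spark SQL functions `boolean` and `if`. Users should change: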
```python
feature_user_purchasing_power = DerivedFeature(name="feature_user_purchasing_power",
                                               key=user_id,
                                               feature_type=FLOAT,
                                               input_features=[
                                                   feature_user_gift_card_balance, feature_user_has_valid_credit_card],
                                               transform="feature_user_gift_card_balance + if_else(toBoolean(feature_user_has_valid_credit_card), 100, 0)")
```
to
```python
feature_user_purchasing_power = DerivedFeature(name="feature_user_purchasing_power",
                                               key=user_id,
                                               feature_type=FLOAT,
                                               input_features=[
                                                   feature_user_gift_card_balance, feature_user_has_valid_credit_card],
                                               transform="feature_user_gift_card_balance + if(boolean(feature_user_has_valid_credit_card), 100, 0)")
```
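
If many derived features use the old expression functions, the rename above could be applied mechanically to each transform string. The following is a minimal sketch, not part of Feathr: `LEGACY_TO_SPARK_SQL` and `migrate_transform` are hypothetical names, and the mapping covers only the two renames shown in this note (`toBoolean` to `boolean`, `if_else` to `if`). Other legacy functions may need different handling, and the regex does not skip string literals inside an expression.

```python
import re

# Hypothetical mapping (not a Feathr API): only the two renames
# shown in this release note are covered.
LEGACY_TO_SPARK_SQL = {
    "toBoolean": "boolean",
    "if_else": "if",
}

# Match a legacy name only when it is a whole identifier followed by "(",
# so names such as "my_if_else(" are left untouched.
_LEGACY_CALL = re.compile(
    r"\b(" + "|".join(map(re.escape, LEGACY_TO_SPARK_SQL)) + r")\s*\("
)


def migrate_transform(expr: str) -> str:
    """Rewrite legacy function calls in a derived-feature transform string."""
    return _LEGACY_CALL.sub(lambda m: LEGACY_TO_SPARK_SQL[m.group(1)] + "(", expr)


old = ("feature_user_gift_card_balance + "
       "if_else(toBoolean(feature_user_has_valid_credit_card), 100, 0)")
print(migrate_transform(old))
```

The rewritten string can then be passed as the `transform` argument of `DerivedFeature`, as in the updated definition above. Review each rewritten expression before submitting a job, since this sketch only renames function calls and does not validate the resulting Spark SQL.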
What's Changed
* Fix a feature type bug by jaymo001 in https://github.com/feathr-ai/feathr/pull/701
* Fix wheel building problem in Windows by xiaoyongzhu in https://github.com/feathr-ai/feathr/pull/702
* Fix Purview+RBAC registry web app issue by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/700
* Remove hard coded resources in docs by enya-yx in https://github.com/feathr-ai/feathr/pull/696
* Add e2e test for purview registry and rbac registry by blrchen in https://github.com/feathr-ai/feathr/pull/689
* Update tests use runtime jar from maven for spark submission to cover Databricks by blrchen in https://github.com/feathr-ai/feathr/pull/706
* Enhance databricks submission error message by enya-yx in https://github.com/feathr-ai/feathr/pull/710
* Enhance purview registry error messages by blrchen in https://github.com/feathr-ai/feathr/pull/709
* [WIP] hot fix databricks es dependency issue by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/713
* Fix materialize to sql e2e test failure by blrchen in https://github.com/feathr-ai/feathr/pull/717
* Add Data Models in Feathr by hyingyang-linkedin in https://github.com/feathr-ai/feathr/pull/659
* Revert "Enhance purview registry error messages (709)" by blrchen in https://github.com/feathr-ai/feathr/pull/720
* Improve Avro GenericRecord and SpecificRecord based row-level extractor performance by jaymo001 in https://github.com/feathr-ai/feathr/pull/723
* Fix lookup feature missing issue when converting feature definition to HOCON files by jaymo001 in https://github.com/feathr-ai/feathr/pull/732
* Fix function string parsing by loomlike in https://github.com/feathr-ai/feathr/pull/725
* Apply a same credential within each sample [ Docs ] by enya-yx in https://github.com/feathr-ai/feathr/pull/718
* Enable incremental for HDFS sink by enya-yx in https://github.com/feathr-ai/feathr/pull/695
* 492 fix, fail only if different sources have same name by windoze in https://github.com/feathr-ai/feathr/pull/733
* Remove unused credentials and deprecated purview settings by enya-yx in https://github.com/feathr-ai/feathr/pull/708
* Revoke adb token submitted by mistaken by blrchen in https://github.com/feathr-ai/feathr/pull/730
* Fix synapse errors not print out issue by enya-yx in https://github.com/feathr-ai/feathr/pull/734
* Spark config passing bug fix for local spark submission by loomlike in https://github.com/feathr-ai/feathr/pull/729
* Fix direct purview client missing transformation by YihuiGuo in https://github.com/feathr-ai/feathr/pull/736
* Support SQL expression in derived feature transformation by jaymo001 in https://github.com/feathr-ai/feathr/pull/731
* Support SWA with groupBy to 1d tensor conversion by jaymo001 in https://github.com/feathr-ai/feathr/pull/748
* Rijai/armfix by jainr in https://github.com/feathr-ai/feathr/pull/742
* bump version to 0.8.2 by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/722
* Added latest deltalake version by ahlag in https://github.com/feathr-ai/feathr/pull/735
* Fix 474 Disable local mode by windoze in https://github.com/feathr-ai/feathr/pull/738
* Allow recreating entities for PurView registry by windoze in https://github.com/feathr-ai/feathr/pull/691
* Adding DevSkim linter to Github actions by jainr in https://github.com/feathr-ai/feathr/pull/657
* Fix icons in UI cannot auto scale (737) by Fendoe in https://github.com/feathr-ai/feathr/pull/744
* Expose 'timePartitionPattern' in Python API [ WIP ] by enya-yx in https://github.com/feathr-ai/feathr/pull/714
* Setting up component governance pipeline by jainr in https://github.com/feathr-ai/feathr/pull/655
* Add docs to explain on feature materialization behavior by xiaoyongzhu in https://github.com/feathr-ai/feathr/pull/688
* Fix protobuf version by enya-yx in https://github.com/feathr-ai/feathr/pull/711
* Add some notes based on on-call issues by enya-yx in https://github.com/feathr-ai/feathr/pull/753
* Refine spark runtime error message by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/755
* Serialization bug due to version incompatibility between azure-core and msrest by jainr in https://github.com/feathr-ai/feathr/pull/763
* Unify Python SDK Build Version and decouple Feathr Maven Version by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/746
* Replace hard code string in notebook and align with others by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/765
* Add flag to enable generation non-agg features by windoze in https://github.com/feathr-ai/feathr/pull/719
* roll back 0.8.2 version bump by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/771
* Refactor Product Recommendation sample notebook by jainr in https://github.com/feathr-ai/feathr/pull/743
* Update role-management page in UI (751) by Fendoe in https://github.com/feathr-ai/feathr/pull/764
* Create Feature less module in UI code and import alias by Fendoe in https://github.com/feathr-ai/feathr/pull/768
* Add extra dependencies to setup.py by loomlike in https://github.com/feathr-ai/feathr/pull/773
* Fix Windows compatibility issues by xiaoyongzhu in https://github.com/feathr-ai/feathr/pull/776
* UI: Replace logo icon by Fendoe in https://github.com/feathr-ai/feathr/pull/778
* Refine example notebooks by loomlike in https://github.com/feathr-ai/feathr/pull/756
* UI: Display version by Fendoe in https://github.com/feathr-ai/feathr/pull/779
* Add nightly Notification to PR Test GitHub Action by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/783
* Fix broken links for 743 by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/789
* Update notebook image links for github rendering by loomlike in https://github.com/feathr-ai/feathr/pull/787
* Revert 756 by blrchen in https://github.com/feathr-ai/feathr/pull/798
* remove unnecessary spark job from registry test by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/790
* Revert "Expose 'timePartitionPattern' in Python API [ WIP ]" by blrchen in https://github.com/feathr-ai/feathr/pull/799
* Update CONTRIBUTING.md with committers information by hangfei in https://github.com/feathr-ai/feathr/pull/793
* Fix test_azure_spark_maven_e2e ci test error by blrchen in https://github.com/feathr-ai/feathr/pull/800
* Add failure warning and run link to daily notification by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/802
* Minor documentation update to add info about maven automated workflow by jainr in https://github.com/feathr-ai/feathr/pull/795
* Fix doc dead links by blrchen in https://github.com/feathr-ai/feathr/pull/805
* Fix more dead links on docs by blrchen in https://github.com/feathr-ai/feathr/pull/807
* Improve UI experience and clean up ui code warnings by Fendoe in https://github.com/feathr-ai/feathr/pull/801
* Add release instructions for Release Candidate by blrchen in https://github.com/feathr-ai/feathr/pull/809
* Bump version to 0.9.0-rc1 by blrchen in https://github.com/feathr-ai/feathr/pull/810
* Fix bug in empty array dense tensor default value by bozhonghu in https://github.com/feathr-ai/feathr/pull/806
* Fix sql-based derived feature by jaymo001 in https://github.com/feathr-ai/feathr/pull/812
* Replacing webapp-deploy action with workflow-webhook action. by jainr in https://github.com/feathr-ai/feathr/pull/813
* Fix passthrough feature reference in sql-based derived feature by jaymo001 in https://github.com/feathr-ai/feathr/pull/815
* Revert databricks example notebook until fixing issues by loomlike in https://github.com/feathr-ai/feathr/pull/814
* Add retry logic for purview project-ids logic by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/821
* Bump version to 0.9.0-rc2 by blrchen in https://github.com/feathr-ai/feathr/pull/822
* Fix Not display management menu by Fendoe in https://github.com/feathr-ai/feathr/pull/826
* Update text and link by Fendoe in https://github.com/feathr-ai/feathr/pull/828
* fix sample issues due to derived feature engine change by xiaoyongzhu in https://github.com/feathr-ai/feathr/pull/829
* Add exception if materialize features defined on 'INPUT_CONTEXT' by enya-yx in https://github.com/feathr-ai/feathr/pull/785
* Fix only first Key will show even if multiple keys are added by Fendoe in https://github.com/feathr-ai/feathr/pull/837
* Move the version information to the bottom of the sidemenu. by Fendoe in https://github.com/feathr-ai/feathr/pull/832
* Fix key cannot read properties of undefined (reading 'map') by Fendoe in https://github.com/feathr-ai/feathr/pull/841
* Model by hyingyang-linkedin in https://github.com/feathr-ai/feathr/pull/769
* Bump loader-utils from 2.0.2 to 2.0.3 in /ui by dependabot in https://github.com/feathr-ai/feathr/pull/846
* Maven Package Version Configuration Fix by Yuqing-cat in https://github.com/feathr-ai/feathr/pull/845
* Copy/paste typo by windoze in https://github.com/feathr-ai/feathr/pull/849
* Update outdated docs (WASB_ to BLOB_) by loomlike in https://github.com/feathr-ai/feathr/pull/850
* Update registry nightly deploy CICD by blrchen in https://github.com/feathr-ai/feathr/pull/853
* Windoze/purview registry error log by windoze in https://github.com/feathr-ai/feathr/pull/851
* Fix duplicate action id in registry CICD by blrchen in https://github.com/feathr-ai/feathr/pull/854
* Improve Feathr Client initialization logs by blrchen in https://github.com/feathr-ai/feathr/pull/856
* Enhance error messages of synapse jobs by enya-yx in https://github.com/feathr-ai/feathr/pull/855
* Fix avro files read failure under timePartitionPattern paths by enya-yx in https://github.com/feathr-ai/feathr/pull/808
* Bump version to 0.9.0-rc3 by blrchen in https://github.com/feathr-ai/feathr/pull/860
* Enhance sample notebook by enya-yx in https://github.com/feathr-ai/feathr/pull/848
New Contributors
* hyingyang-linkedin made their first contribution in https://github.com/feathr-ai/feathr/pull/659
* loomlike made their first contribution in https://github.com/feathr-ai/feathr/pull/725
* Fendoe made their first contribution in https://github.com/feathr-ai/feathr/pull/744
* bozhonghu made their first contribution in https://github.com/feathr-ai/feathr/pull/806
**Full Changelog**: https://github.com/feathr-ai/feathr/compare/v0.8.0...v0.9.0