What's New
Vector Database and Embedding Support
You can use Featureform to define and orchestrate data pipelines that generate embeddings, and Featureform can write them into Redis for nearest neighbor lookup. This also lets users version, reuse, and manage embeddings declaratively.
Registering Redis for use as a Vector Store (the process is the same as registering it normally)
redis = ff.register_redis(
    name="redis",
    description="Example inference store",
    team="Featureform",
    host="0.0.0.0",
    port=6379,
)
A Pipeline to Generate Embeddings from Text
docs = spark.register_file(...)

@spark.df_transform(
    inputs=[docs],
)
def embed_docs(docs):
    docs["embedding"] = docs["text"].map(
        lambda txt: openai.Embedding.create(
            model="text-embedding-ada-002",
            input=txt,
        )["data"][0]["embedding"]
    )
    return docs
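For larger corpora, one request per row can be slow; the endpoint also accepts a list of inputs. A minimal batched variant of the same transformation (the embed_docs_batched name and the batch size of 100 are illustrative assumptions, not Featureform or OpenAI requirements):

import openai

@spark.df_transform(
    inputs=[docs],
)
def embed_docs_batched(docs):
    texts = docs["text"].tolist()
    embeddings = []
    for i in range(0, len(texts), 100):  # assumed batch size
        resp = openai.Embedding.create(
            model="text-embedding-ada-002",
            input=texts[i : i + 100],
        )
        # Sort by index so embeddings line up with their input texts.
        data = sorted(resp["data"], key=lambda d: d["index"])
        embeddings.extend(item["embedding"] for item in data)
    docs["embedding"] = embeddings
    return docs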
Defining and Versioning an Embedding
@ff.entity
class Article:
    embedding = ff.Embedding(embed_docs[["id", "embedding"]], dims=1536, vector_db=redis)

The same embedding pinned to an explicit variant:

@ff.entity
class Article:
    embedding = ff.Embedding(
        embed_docs[["id", "embedding"]],
        dims=1536,
        variant="test-variant",
        vector_db=redis,
    )
Performing a Nearest Neighbor Lookup
client.Nearest(Article.embedding, "id_123", 25)
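As a sketch of how this fits into serving code, a small helper around the same call. It assumes Nearest returns the ids of the closest entities; adapt to the actual return payload:

import featureform as ff

client = ff.Client(...)

def related_articles(article_id, k=25):
    # Look up the k entities whose stored embeddings are closest
    # to the embedding stored for article_id.
    return client.Nearest(Article.embedding, article_id, k)

print(related_articles("id_123"))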
Interact with Training Sets as Dataframes
You can already interact with sources as dataframes; this release adds the same functionality to training sets.
Interacting with a training set as a Pandas DataFrame
import featureform as ff
client = ff.Client(...)
df = client.training_set("fraud", "simple").dataframe()
print(df.head())
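Because the result is an ordinary Pandas DataFrame, it plugs straight into the rest of the Python ecosystem. A sketch with scikit-learn, assuming the label sits in the last column and the features are numeric (assumptions about this particular training set, not guarantees):

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = client.training_set("fraud", "simple").dataframe()

# Assumption: features first, label in the last column.
X, y = df.iloc[:, :-1], df.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = LogisticRegression().fit(X_train, y_train)
print(model.score(X_test, y_test))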
Enhanced Scheduling across Offline Stores
Featureform supports cron syntax for scheduling transformations. This release rebuilds that functionality to make it more stable and efficient, and adds more verbose error messages.
A transformation that runs every hour on Snowflake
@snowflake.sql_transform(schedule="0 * * * *")
def avg_transaction_price():
    return "SELECT user, AVG(price) FROM {{transaction}} GROUP BY user"
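The schedule takes any standard five-field cron expression; "0 * * * *" above fires at the top of every hour. A nightly variant (the transformation name and query here are illustrative):

@snowflake.sql_transform(schedule="0 0 * * *")  # every day at midnight
def daily_transaction_count():
    return "SELECT user, COUNT(*) AS txns FROM {{transaction}} GROUP BY user"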
Run Pandas Transformations on K8s with S3
Featureform schedules and runs your transformations for you. When you write a transformation in Pandas, Featureform spins up a Kubernetes job to run it. This isn't a replacement for distributed processing frameworks like Spark (which we also support), but it's a great option for teams already using Pandas in production.
Defining our Pandas on Kubernetes Provider
aws_creds = ff.AWSCredentials(
    aws_access_key_id="<aws_access_key_id>",
    aws_secret_access_key="<aws_secret_access_key>",
)

s3 = ff.register_s3(
    name="s3",
    credentials=aws_creds,
    bucket_path="<s3_bucket_path>",
    bucket_region="<s3_bucket_region>",
)

pandas_k8s = ff.register_k8s(
    name="k8s",
    description="Native featureform kubernetes compute",
    store=s3,
    team="featureform-team",
)
Registering a file in S3 and a Transformation on it
src = pandas_k8s.register_file(...)

@pandas_k8s.df_transform(inputs=[src])
def transform(src):
    return src.groupby("CustomerID")["TransactionAmount"].mean()
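Since the body is plain Pandas, the logic can be sanity-checked locally before Featureform runs it on Kubernetes. A self-contained check with made-up rows:

import pandas as pd

sample = pd.DataFrame({
    "CustomerID": ["c1", "c1", "c2"],
    "TransactionAmount": [10.0, 30.0, 5.0],
})

# Same logic as the transformation above: mean spend per customer.
print(sample.groupby("CustomerID")["TransactionAmount"].mean())
# CustomerID
# c1    20.0
# c2     5.0
# Name: TransactionAmount, dtype: float64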