### Behavior Changes

- Generic: Require Python >= 3.9.
- Data Connector: Update `to_torch_dataset` and `to_torch_datapipe` to add a dimension for scalar data.
  This allows for more seamless integration with PyTorch `DataLoader`, which creates batches by stacking
  the inputs within each batch.

  Examples:

  ```python
  ds = connector.to_torch_dataset(shuffle=False, batch_size=3)
  ```

  - Input: `"col1": [10, 11, 12]`
    - Previous batch: `array([10., 11., 12.])` with shape `(3,)`
    - New batch: `array([[10.], [11.], [12.]])` with shape `(3, 1)`
  - Input: `"col2": [[0, 100], [1, 110], [2, 200]]`
    - Previous batch: `array([[0, 100], [1, 110], [2, 200]])` with shape `(3, 2)`
    - New batch: no change
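
  The shape change can be checked directly; a minimal sketch, assuming `connector` is an existing
  `DataConnector` over a table containing the scalar column `"col1"`:

  ```python
  ds = connector.to_torch_dataset(shuffle=False, batch_size=3)

  # Each yielded batch maps column names to numpy arrays.
  batch = next(iter(ds))
  assert batch["col1"].shape == (3, 1)  # previously (3,)
  ```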
- Model Registry: External access integrations are optional when creating a model inference service in
  Snowflake >= 8.40.0.
- Model Registry: Deprecate the `build_external_access_integration` argument of
  `ModelVersion.create_service()` in favor of `build_external_access_integrations`.
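
  A minimal sketch of the renamed argument (service, compute pool, repo, and integration names are
  illustrative), assuming `mv` is an existing `ModelVersion`; on Snowflake >= 8.40.0 the integrations
  can be omitted entirely:

  ```python
  mv.create_service(
      service_name="MY_SERVICE",
      service_compute_pool="MY_COMPUTE_POOL",
      image_repo="MY_IMAGE_REPO",
      # Plural form replaces the deprecated build_external_access_integration.
      build_external_access_integrations=["MY_EAI"],
  )
  ```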

### Bug Fixes

- Registry: Updated the `log_model` API to accept both `signatures` and `sample_input_data` parameters.
- Feature Store: `ExampleHelper` now uses a fully qualified path for the table name. Changed weather features aggregation from 1d to 1h.
- Data Connector: Return a numpy array with the appropriate object type instead of a list for multi-dimensional
  data from `to_torch_dataset` and `to_torch_datapipe`.
- Model Explainability: Resolved an incompatibility between SHAP 0.42.1 and XGBoost 2.1.1 by upgrading to SHAP 0.46.0.

### New Features

- Registry: `ModelContext` now accepts a variable number of keyword arguments, so models and artifacts
  can be supplied and later retrieved by name. Example usage:

  ```python
  import json

  import pandas as pd

  from snowflake.ml.model import custom_model

  # `model1` is a previously trained model object.
  mc = custom_model.ModelContext(
      config='local_model_dir/config.json',
      m1=model1,
  )

  class ExamplePipelineModel(custom_model.CustomModel):
      def __init__(self, context: custom_model.ModelContext) -> None:
          super().__init__(context)
          # Keyword arguments passed to ModelContext are looked up by name.
          v = open(self.context['config']).read()
          self.bias = json.loads(v)['bias']

      @custom_model.inference_api
      def predict(self, input: pd.DataFrame) -> pd.DataFrame:
          model_output = self.context['m1'].predict(input)
          return pd.DataFrame({'output': model_output + self.bias})
  ```
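
  Once the context is built, the model can be exercised locally before logging; a brief sketch, with
  `input_df` as an assumed pandas DataFrame matching the model's input:

  ```python
  pipeline = ExamplePipelineModel(mc)
  predictions = pipeline.predict(input_df)
  ```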
- Model Development: Upgrade scikit-learn in the UDTF backend for the `log_loss` metric. As a result, the `eps` argument is now ignored.
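
  A hedged sketch of the metric call, assuming the distributed `log_loss` in
  `snowflake.ml.modeling.metrics` and a Snowpark DataFrame `df` (column names are illustrative):

  ```python
  from snowflake.ml.modeling.metrics import log_loss

  loss = log_loss(
      df=df,
      y_true_col_names="LABEL",
      y_pred_col_names="PREDICTED_PROBA",
      eps=1e-15,  # still accepted, but ignored after the scikit-learn upgrade
  )
  ```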
- Data Connector: Add the option of passing a `None` batch size to `to_torch_dataset` for better
  interoperability with PyTorch `DataLoader`.
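
  A minimal sketch of the intended pattern, assuming `connector` is an existing `DataConnector`: with
  `batch_size=None` the dataset yields individual rows and `DataLoader` performs the batching itself.

  ```python
  from torch.utils.data import DataLoader

  ds = connector.to_torch_dataset(batch_size=None, shuffle=True)
  loader = DataLoader(ds, batch_size=32)  # DataLoader stacks rows into batches
  ```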
- Model Registry: Support [pandas.CategoricalDtype](https://pandas.pydata.org/docs/reference/api/pandas.CategoricalDtype.html#pandas-categoricaldtype).
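
  For illustration, a plain pandas sketch of sample input data carrying a categorical column (names
  are illustrative):

  ```python
  import pandas as pd

  sample_df = pd.DataFrame({
      "color": pd.Series(
          ["red", "green", "red"],
          dtype=pd.CategoricalDtype(categories=["red", "green", "blue"]),
      ),
      "value": [1.0, 2.0, 3.0],
  })
  ```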
- Registry: It is now possible to pass `signatures` and `sample_input_data` at the same time to capture background
  data for explainability and data lineage.
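
  A hedged sketch of such a call, assuming `reg` is an existing `Registry`, `model` a trained
  estimator, `predict_signature` a prepared `ModelSignature`, and `train_df` a pandas DataFrame:

  ```python
  mv = reg.log_model(
      model,
      model_name="my_model",
      version_name="v1",
      signatures={"predict": predict_signature},  # explicit interface
      sample_input_data=train_df.head(100),  # background data for explainability and lineage
  )
  ```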