Huggingface-hub

Latest version: v0.23.4



0.15.1

InferenceClient

We introduce [`InferenceClient`](https://huggingface.co/docs/huggingface_hub/main/en/package_reference/inference_client#huggingface_hub.InferenceClient), a new client to run inference on the Hub. The objective is to:
- support both [InferenceAPI](https://huggingface.co/docs/api-inference/index) and [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index) services in a single client.
- offer a nice interface with:
  - 1 method per task (e.g. `summary = client.summarization("this is a long text")`)
  - 1 default model per task (i.e. easy to prototype)
  - explicit and documented parameters
  - convenient binary inputs (from url, path, file-like object,...)
- be flexible and support custom requests if needed

Check out the [Inference guide](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference) to get a complete overview.

```python
>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient()

>>> image = client.text_to_image("An astronaut riding a horse on the moon.")
>>> image.save("astronaut.png")

>>> client.image_classification("https://upload.wikimedia.org/wikipedia/commons/thumb/4/43/Cute_dog.jpg/320px-Cute_dog.jpg")
[{'score': 0.9779096841812134, 'label': 'Blenheim spaniel'}, ...]
```


The short-term goal is to add support for more tasks (here is the [current list](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference#supported-tasks)), especially text-generation, and to handle `asyncio` calls. The mid-term goal is to deprecate and replace `InferenceAPI`.

* Enhanced `InferenceClient` by Wauplin in 1474

Non-blocking uploads

It is now possible to run `HfApi` calls in the background! The goal is to make it easier to upload files periodically without blocking the main thread during training. This was previously possible when using `Repository` but is now available for HTTP-based methods like `upload_file`, `upload_folder` and `create_commit`. If `run_as_future=True` is passed:
- the job is queued in a background thread. Only 1 worker is spawned to ensure no race condition. The goal is NOT to speed up a process by parallelizing concurrent calls to the Hub.
- a [`Future`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future) object is returned to check the job status
- main thread is not interrupted, even if an exception occurs during the upload

In addition to this parameter, a [run_as_future(...)](https://huggingface.co/docs/huggingface_hub/main/en/package_reference/hf_api#huggingface_hub.HfApi.run_as_future) method is available to queue any other calls to the Hub. More details [in this guide](https://huggingface.co/docs/huggingface_hub/main/en/guides/upload#nonblocking-upload).

```py
>>> from huggingface_hub import HfApi

>>> api = HfApi()
>>> api.upload_file(...)  # takes Xs
# URL to upload file

>>> future = api.upload_file(..., run_as_future=True)  # instant
>>> future.result()  # wait until complete
# URL to upload file
```
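
The single-worker queue described above can be pictured with the standard library alone. The following is a minimal sketch, not the `huggingface_hub` implementation: `fake_upload` is a hypothetical stand-in for a slow Hub call, and the executor setup is an assumption about how such a queue could be built.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor


def fake_upload(name: str) -> str:
    """Hypothetical stand-in for a slow Hub call (not part of huggingface_hub)."""
    time.sleep(0.05)
    if name.endswith(".bad"):
        raise ValueError(f"cannot upload {name}")
    return f"uploaded {name}"


# A single worker runs queued jobs one at a time, in submission order,
# so two jobs never race against each other.
executor = ThreadPoolExecutor(max_workers=1)

ok: Future = executor.submit(fake_upload, "weights.bin")
failed: Future = executor.submit(fake_upload, "weights.bad")

# The main thread continues immediately; results are collected later.
result = ok.result()        # blocks until the first job completes
error = failed.exception()  # the error is stored on the Future, not raised here

executor.shutdown()
```

Because the exception is attached to the `Future`, the main thread only sees it when it asks for the result, which matches the "main thread is not interrupted" behaviour listed above.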


* Run `HfApi` methods in the background (`run_as_future`) by Wauplin in 1458
* fix docs for run_as_future by Wauplin (direct commit on main)

Breaking changes

Some (announced) breaking changes have been introduced:
- `list_models`, `list_datasets` and `list_spaces` return an iterable instead of a list (lazy-loading of paginated results)
- The parameter `cardData` in `list_datasets` has been removed in favor of the parameter `full`.

Both changes had a deprecation cycle for a few releases now.
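
In practice, the first change means that code relying on `len(...)` or indexing needs to wrap the result in `list(...)`. The sketch below illustrates lazy pagination in general terms. `PAGES`, `fetch_page` and `list_items` are hypothetical names used only for illustration and are not the library's internals.

```python
from itertools import islice
from typing import Iterator, List

# Hypothetical paginated backend: each call returns one page of results.
PAGES: List[List[str]] = [["model-a", "model-b"], ["model-c", "model-d"], ["model-e"]]
pages_fetched = 0


def fetch_page(index: int) -> List[str]:
    """Pretend to request one page from a server and count the request."""
    global pages_fetched
    pages_fetched += 1
    return PAGES[index]


def list_items() -> Iterator[str]:
    """Yield items page by page; a page is fetched only when it is needed."""
    for index in range(len(PAGES)):
        yield from fetch_page(index)


# Taking the first three items only touches the first two pages.
first_three = list(islice(list_items(), 3))
pages_after_slice = pages_fetched

# Materialize the iterable when a real list is required (len, indexing, ...).
everything = list(list_items())
```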

* Remove deprecated code + adapt tests by Wauplin in 1450

Bugfixes and small improvements

Token permission

New parameters in `login()` :
- `new_session` : skip login if new_session=False and user is already logged in
- `write_permission` : write permission is required (login fails otherwise)

Also added a new `HfApi().get_token_permission()` method that returns `"read"` or `"write"` (or `None` if not logged in).

* Add new_session, write_permission args by aliabid94 in 1476

List files with details

New parameter to get more details when listing files: `list_repo_files(..., expand=True)`.
API call is slower but `lastCommit` and `security` fields are returned as well.

* Add expand parameter to list_repo_files by Wauplin in 1451

Docs fixes

* Resolve broken link to 'filesystem' by tomaarsen in 1461
* Fix broken link in docs to hf_file_system guide by albertvillanova in 1469
* Remove hffs from docs by albertvillanova in 1468

Misc

* Fix consistency check when downloading a file by Wauplin in 1449
* Fix discussion URL on datasets and spaces by Wauplin in 1465
* FIX user agent not passed in snapshot_download by Wauplin in 1478
* Avoid `ImportError` when importing `WebhooksServer` and Gradio is not installed by mariosasko in 1482
* add utf8 encoding when opening files for windows by abidlabs in 1484
* Fix incorrect syntax in `_deprecation.py` warning message for `_deprecate_list_output()` by x11kjm in 1485
* Update _hf_folder.py by SimonKitSangChu in 1487
* fix pause_and_restart test by Wauplin (direct commit on main)
* Support image-to-image task in InferenceApi by Wauplin in 1489

0.14.1

Fixed an issue [reported in `diffusers`](https://github.com/huggingface/diffusers/issues/3213) impacting users downloading files from outside of the Hub. Expected download size now takes into account potential compression in the HTTP requests.

* Fix consistency check when downloading a file by Wauplin in https://github.com/huggingface/huggingface_hub/pull/1449


**Full Changelog**: https://github.com/huggingface/huggingface_hub/compare/v0.14.0...v0.14.1

0.14.0

HfFileSystem: interact with the Hub through the Filesystem API

We introduce [HfFileSystem](https://huggingface.co/docs/huggingface_hub/main/en/package_reference/hf_file_system#huggingface_hub.HfFileSystem), a pythonic filesystem interface compatible with [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/). Built on top of `HfApi`, it offers typical filesystem operations like `cp`, `mv`, `ls`, `du`, `glob`, `get_file` and `put_file`.

```py
>>> from huggingface_hub import HfFileSystem
>>> fs = HfFileSystem()

# List all files in a directory
>>> fs.ls("datasets/myself/my-dataset/data", detail=False)
['datasets/myself/my-dataset/data/train.csv', 'datasets/myself/my-dataset/data/test.csv']

>>> train_data = fs.read_text("datasets/myself/my-dataset/data/train.csv")
```


Its biggest advantage is to provide ready-to-use integrations with popular libraries like Pandas, DuckDB and Zarr.

```py
import pandas as pd

# Read a remote CSV file into a dataframe
df = pd.read_csv("hf://datasets/my-username/my-dataset-repo/train.csv")

# Write a dataframe to a remote CSV file
df.to_csv("hf://datasets/my-username/my-dataset-repo/test.csv")
```


For a more detailed overview, please have a look at [this guide](https://huggingface.co/docs/huggingface_hub/main/en/guides/hf_file_system).


* Transfer the `hffs` code to `hfh` by mariosasko in 1420
* Hffs misc improvements by mariosasko in 1433

Webhook Server

`WebhooksServer` lets you implement, debug and deploy webhook endpoints on the Hub without any overhead. Creating a new endpoint is as easy as decorating a Python function.

```python
# app.py
from huggingface_hub import webhook_endpoint, WebhookPayload

@webhook_endpoint
async def trigger_training(payload: WebhookPayload) -> None:
    if payload.repo.type == "dataset" and payload.event.action == "update":
        # Trigger a training job if a dataset is updated
        ...
```


For more details, check out this [twitter thread](https://twitter.com/Wauplin/status/1646893678500392960) or the [documentation guide](https://huggingface.co/docs/huggingface_hub/main/en/guides/webhooks_server).

Note that this feature is experimental which means the API/behavior might change without prior notice. A warning is displayed to the user when using it. As it is experimental, we would love to get feedback!

* [Feat] Webhook server by Wauplin in 1410

Some upload QOL improvements

Faster upload with `hf_transfer`

Integration with a Rust-based library to upload large files in chunks and concurrently. Expect a 3x speed-up if your bandwidth allows it!

* feat: add `hf_transfer` upload by McPatate in 1395

Upload in multiple commits

Uploading large folders at once might be annoying if any error happens while committing (e.g. a connection error occurs). It is now possible to upload a folder in multiple (smaller) commits. If a commit fails, you can re-run the script and resume the upload. Commits are pushed to a dedicated PR. Once completed, the PR is merged to the `main` branch resulting in a single commit in your git history.

```py
from huggingface_hub import upload_folder

upload_folder(
    folder_path="local/checkpoints",
    repo_id="username/my-dataset",
    repo_type="dataset",
    multi_commits=True,  # resumable multi-upload
    multi_commits_verbose=True,
)
```


Note that this feature is also experimental, meaning its behavior might be updated in the future.
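
The resume behaviour can be pictured as splitting the files into steps and remembering which steps were already pushed. The sketch below is a simplified illustration under that assumption, not the library's actual strategy: `chunked`, `upload_in_steps`, `flaky_push` and the in-memory `done` set are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Set


def chunked(items: Sequence[str], size: int) -> List[List[str]]:
    """Split a sequence into consecutive batches of at most `size` items."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def upload_in_steps(
    files: Sequence[str],
    size: int,
    done: Set[int],
    push: Callable[[List[str]], None],
) -> None:
    """Push each batch once; steps recorded in `done` are skipped on re-run."""
    for step, batch in enumerate(chunked(files, size)):
        if step in done:
            continue  # already committed in a previous run
        push(batch)   # may raise; earlier steps stay recorded in `done`
        done.add(step)


files = [f"ckpt-{i}.bin" for i in range(5)]
done: Set[int] = set()
pushed: List[List[str]] = []
attempts: Dict[str, int] = {"n": 0}


def flaky_push(batch: List[str]) -> None:
    """Hypothetical push that fails once, on its second call."""
    attempts["n"] += 1
    if attempts["n"] == 2:
        raise ConnectionError("simulated network failure")
    pushed.append(batch)


try:
    upload_in_steps(files, 2, done, flaky_push)  # first run fails at step 1
except ConnectionError:
    pass

upload_in_steps(files, 2, done, flaky_push)      # re-run resumes at step 1
```

In the real feature the progress lives in a dedicated PR on the Hub rather than in a local set, as described above.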

* New endpoint: `create_commits_on_pr` by Wauplin in 1375

Upload validation

More pre-validation is now done before committing files to the Hub. The `.git` folder (if any) is ignored in `upload_folder`, and invalid paths now fail early.
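
As a rough illustration of this kind of pre-validation, the sketch below filters a list of repo-relative paths before anything is sent. `validate_paths` is a hypothetical helper and its rules are assumptions; the checks performed by `huggingface_hub` may differ.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def validate_paths(paths: Iterable[str]) -> List[str]:
    """Hypothetical check: skip `.git/` entries, reject paths escaping the repo."""
    kept: List[str] = []
    for raw in paths:
        path = PurePosixPath(raw)
        parts = path.parts
        if not parts or path.is_absolute() or ".." in parts:
            # Fail early instead of discovering the problem mid-commit.
            raise ValueError(f"invalid path in repo: {raw!r}")
        if parts[0] == ".git":
            continue  # silently ignore the local git folder
        kept.append(str(path))
    return kept


accepted = validate_paths(["README.md", ".git/config", "data/train.csv"])
```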

* Fix `path_in_repo` validation when committing files by Wauplin in 1382
* Raise issue if trying to upload `.git/` folder + ignore `.git/` folder in `upload_folder` by Wauplin in 1408

Keep-alive connections between requests

Internal update to reuse the same HTTP session across `huggingface_hub`. The goal is to keep the connection open when doing multiple calls to the Hub which ultimately saves a lot of time. For instance, updating metadata in a README became 40% faster while listing all models from the Hub is 60% faster. This has no impact for atomic calls (e.g. 1 standalone GET call).

* Keep-alive connection between requests by Wauplin in 1394
* Accept backend_factory to configure Sessions by Wauplin in 1442

Custom sleep time for Spaces

It is now possible to programmatically set a custom sleep time on your upgraded Space. After X seconds of inactivity, your Space will go to sleep to save you some $$$.

```py
from huggingface_hub import set_space_sleep_time

# Put your Space to sleep after 1h of inactivity
set_space_sleep_time(repo_id=repo_id, sleep_time=3600)
```


* [Feat] Add `sleep_time` for Spaces by Wauplin in 1438

Breaking change

- `fsspec` has been added as a main dependency. It's a lightweight Python library required for `HfFileSystem`.

No other breaking change expected in this release.

Bugfixes & small improvements

File-related

A lot of effort has been invested in making `huggingface_hub`'s cache system more robust especially when working with symlinks on Windows. Hope everything's fixed by now.

* Fix relative symlinks in cache by Wauplin in 1390
* Hotfix - use relative symlinks whenever possible by Wauplin in 1399
* [hot-fix] Malicious repo can overwrite any file on disk by Wauplin in 1429
* Fix symlinks on different volumes on Windows by Wauplin in 1437
* [FIX] bug "Invalid cross-device link" error when using snapshot_download to local_dir with no symlink by thaiminhpv in 1439
* Raise after download if file size is not consistent by Wauplin in 1403

ETag-related

After a server-side configuration issue, `huggingface_hub` now handles the Hub's ETags more robustly, which should make it more future-proof.

* Update file_download.py by Wauplin in 1406
* 🧹 Use `HUGGINGFACE_HEADER_X_LINKED_ETAG` const by julien-c in 1405
* Normalize both possible variants of the Etag to remove potentially invalid path elements by dwforbes in 1428

Documentation-related

* Docs about how to hide progress bars by Wauplin in 1416
* [docs] Update docstring for repo_id in push_to_hub by tomaarsen in 1436

Misc
* Prepare for 0.14 by Wauplin in 1381
* Add force_download to snapshot_download by Wauplin in 1391
* Model card template: Move model usage instructions out of Bias section by NimaBoscarino in 1400
* typo by Wauplin (direct commit on main)
* Log as warning when waiting for ongoing commands by Wauplin in 1415
* Fix: notebook_login() does not update UI on Databricks by fwetdb in 1414
* Passing the headers to hf_transfer download. by Narsil in 1444

Internal stuff
* Fix CI by Wauplin in 1392
* PR should not fail if codecov is bad by Wauplin (direct commit on main)
* remove cov check in PR by Wauplin (direct commit on main)
* Fix restart space test by Wauplin (direct commit on main)
* fix move repo test by Wauplin (direct commit on main)

0.13.4

Security patch to fix a vulnerability in `huggingface_hub`. In some cases, downloading a file with `hf_hub_download` or `snapshot_download` could lead to overwriting any file on a Windows machine. With this fix, only files in the cache directory (or a user-defined directory) can be updated/overwritten.
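
The general idea of restricting writes to one directory can be sketched with a path-containment check. This is a generic illustration and not the actual patch: `is_within` is a hypothetical helper.

```python
import os


def is_within(base_dir: str, candidate: str) -> bool:
    """Return True if `candidate`, resolved against `base_dir`, stays inside it."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, candidate))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


safe = is_within("/tmp/cache", "models/config.json")
escaping = is_within("/tmp/cache", "../../etc/passwd")
absolute = is_within("/tmp/cache", "/etc/passwd")
```

A download routine using such a check would refuse to write when it returns `False`, so a file name supplied by a repo cannot point outside the chosen directory.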

- Malicious repo can overwrite any file on disk 1429 Wauplin

**Full Changelog**: https://github.com/huggingface/huggingface_hub/compare/v0.13.3...v0.13.4

0.13.3

Not secure
Patch to fix symlinks in the cache directory. Relative paths are used by default whenever possible. Absolute paths are used only on Windows when creating a symlink between 2 paths that are not on the same volume. This hot-fix reverts the logic to what it was in `huggingface_hub<=0.12` given the issues that have been reported after the `0.13.2` release (https://github.com/huggingface/huggingface_hub/issues/1398, https://github.com/huggingface/diffusers/issues/2729 and https://github.com/huggingface/transformers/pull/22228)
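
The rule "relative whenever possible, absolute across volumes" can be expressed with the standard library. The sketch below is only an illustration of that rule, not the code from the hotfix: `symlink_target` is a hypothetical helper, and the `pathmod` argument exists only so both POSIX and Windows path rules can be exercised on any machine.

```python
import ntpath
import posixpath


def symlink_target(src: str, dst: str, pathmod=posixpath) -> str:
    """Return the path to store in a symlink located at `dst` and pointing to `src`."""
    try:
        # Relative to the directory that will contain the symlink.
        return pathmod.relpath(src, start=pathmod.dirname(dst))
    except ValueError:
        # ntpath.relpath raises ValueError when the paths are on different drives.
        return pathmod.abspath(src)


posix_link = symlink_target("/cache/blobs/abc", "/cache/snapshots/rev/model.bin")
same_drive = symlink_target(
    "C:\\cache\\blobs\\abc", "C:\\cache\\snapshots\\rev\\model.bin", ntpath
)
other_drive = symlink_target("D:\\cache\\blobs\\abc", "C:\\models\\model.bin", ntpath)
```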

Hotfix - use relative symlinks whenever possible https://github.com/huggingface/huggingface_hub/pull/1399 Wauplin

**Full Changelog**: https://github.com/huggingface/huggingface_hub/compare/v0.13.2...v0.13.3

0.13.2

Not secure
Patch to fix symlinks in the cache directory. All symlinks are now absolute paths.

* Fix relative symlinks in cache 1390 Wauplin

**Full Changelog**: https://github.com/huggingface/huggingface_hub/compare/v0.13.1...v0.13.2
