# Transformers-js-py

Latest version: v0.19.10


## 3.3.2

### What's new?
* Add support for Helium and Glm in https://github.com/huggingface/transformers.js/pull/1156 (see the sketch after this list)
* Improve build process and fix usage with certain bundlers in https://github.com/huggingface/transformers.js/pull/1158
* Auto-detect wordpiece tokenizer when model.type is missing in https://github.com/huggingface/transformers.js/pull/1151
* Update Moonshine config values for transformers v4.48.0 in https://github.com/huggingface/transformers.js/pull/1155
* Support simultaneous tensor op execution in WASM in https://github.com/huggingface/transformers.js/pull/1162
* Update react tutorial sample code in https://github.com/huggingface/transformers.js/pull/1152
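
As an illustration, a minimal text-generation sketch for the newly supported architectures. The model id below is a hypothetical placeholder for any Helium or Glm checkpoint exported for Transformers.js:

```js
import { pipeline } from "@huggingface/transformers";

// Hypothetical ONNX export of a Helium checkpoint, for illustration only
const generator = await pipeline("text-generation", "onnx-community/helium-1-preview-2b-ONNX");

const messages = [{ role: "user", content: "Give me one fun fact about llamas." }];
const output = await generator(messages, { max_new_tokens: 64 });

// The last message in `generated_text` is the model's reply
console.log(output[0].generated_text.at(-1).content);
```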

**Full Changelog**: https://github.com/huggingface/transformers.js/compare/3.3.1...3.3.2

## 3.3.1

### What's new?
* hotfix: Copy missing ort-wasm-simd-threaded.jsep.mjs to dist folder (https://github.com/huggingface/transformers.js/pull/1150)

**Full Changelog**: https://github.com/huggingface/transformers.js/compare/3.3.0...3.3.1

## 3.3.0

🔥 Transformers.js v3.3 — StyleTTS 2 (Kokoro) for state-of-the-art text-to-speech, Grounding DINO for zero-shot object detection

- [🤖 New models: StyleTTS 2, Grounding DINO](#new-models)
  - [**StyleTTS 2**: High-quality speech synthesis](#style_text_to_speech_2)
  - [**Grounding DINO**: Zero-shot object detection](#grounding-dino)
- [🛠️ Other improvements](#other-improvements)
- [🤗 New contributors](#new-contributors)


<h2 id="new-models">🤖 New models: StyleTTS 2, Grounding DINO</h2>

<h3 id="style_text_to_speech_2">StyleTTS 2 for high-quality speech synthesis</h3>

See https://github.com/huggingface/transformers.js/pull/1148 for more information and [here](https://huggingface.co/models?other=style_text_to_speech_2&library=transformers.js) for the list of supported models.

First, install the `kokoro-js` library, which uses Transformers.js, from [NPM](https://npmjs.com/package/kokoro-js) using:
```bash
npm i kokoro-js
```


You can then generate speech as follows:

```js
import { KokoroTTS } from "kokoro-js";

const model_id = "onnx-community/Kokoro-82M-ONNX";
const tts = await KokoroTTS.from_pretrained(model_id, {
  dtype: "q8", // Options: "fp32", "fp16", "q8", "q4", "q4f16"
});

const text = "Life is like a box of chocolates. You never know what you're gonna get.";
const audio = await tts.generate(text, {
  // Use `tts.list_voices()` to list all available voices
  voice: "af_bella",
});
audio.save("audio.wav");
```


<h3 id="grounding-dino">Grounding DINO for zero-shot object detection</h3>

See https://github.com/huggingface/transformers.js/pull/1137 for more information and [here](https://huggingface.co/models?other=grounding-dino&library=transformers.js) for the list of supported models.

**Example:** Zero-shot object detection with `onnx-community/grounding-dino-tiny-ONNX` using the `pipeline` API.
```js
import { pipeline } from "@huggingface/transformers";

const detector = await pipeline("zero-shot-object-detection", "onnx-community/grounding-dino-tiny-ONNX");

const url = "http://images.cocodataset.org/val2017/000000039769.jpg";
const candidate_labels = ["a cat."];
const output = await detector(url, candidate_labels, {
  threshold: 0.3,
});
```



<details>

<summary>See example output</summary>


```js
[
  { score: 0.45316222310066223, label: "a cat", box: { xmin: 343, ymin: 23, xmax: 637, ymax: 372 } },
  { score: 0.36190420389175415, label: "a cat", box: { xmin: 12, ymin: 52, xmax: 317, ymax: 472 } },
]
```


</details>


<h2 id="other-improvements">🛠️ Other improvements</h2>

* Add the RawAudio class by Th3G33k in https://github.com/huggingface/transformers.js/pull/682
* Update React guide for v3 by sroussey in https://github.com/huggingface/transformers.js/pull/1128
* Add option to skip special tokens in TextStreamer by sroussey in https://github.com/huggingface/transformers.js/pull/1139 (see the sketch below)
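
As an illustration, a minimal sketch of streaming generated text without special tokens. The `TextStreamer` and `pipeline` APIs are as documented, but treat the exact option name (`skip_special_tokens`) and the model id as assumptions based on the PR:

```js
import { pipeline, TextStreamer } from "@huggingface/transformers";

const generator = await pipeline("text-generation", "onnx-community/Qwen2.5-0.5B-Instruct");

// `skip_prompt` hides the input prompt; the new option drops special
// tokens (e.g. end-of-text markers) from the streamed chunks
const streamer = new TextStreamer(generator.tokenizer, {
  skip_prompt: true,
  skip_special_tokens: true,
  callback_function: (text) => process.stdout.write(text),
});

await generator("Write a haiku about cats.", { max_new_tokens: 64, streamer });
```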


<h2 id="new-contributors">🤗 New contributors</h2>

* sroussey made their first contribution in https://github.com/huggingface/transformers.js/pull/1128


**Full Changelog**: https://github.com/huggingface/transformers.js/compare/3.2.4...3.3.0

## 3.2.4

### What's new?
* Add support for visualizing self-attention heatmaps in https://github.com/huggingface/transformers.js/pull/1117

<table>
<tr>
<td rowspan="2">
<img src="https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg" alt="Cats" width="200">
</td>
<td>
<img src="https://github.com/user-attachments/assets/928c3d97-2c67-4ddb-9e9c-2a06745a532f" alt="Attention Head 0" width="200">
</td>
<td>
<img src="https://github.com/user-attachments/assets/e7725424-10fd-4a47-8350-8f367d21657d" alt="Attention Head 1" width="200">
</td>
<td>
<img src="https://github.com/user-attachments/assets/81790060-f4bf-4e5c-8d35-a9246acb9a36" alt="Attention Head 2" width="200">
</td>
</tr>
<tr>
<td>
<img src="https://github.com/user-attachments/assets/ebe44550-8a40-4e17-84eb-75fe6fce5df5" alt="Attention Head 3" width="200">
</td>
<td>
<img src="https://github.com/user-attachments/assets/32439d8d-7798-40e2-a4aa-d0e109afe1b5" alt="Attention Head 4" width="200">
</td>
<td>
<img src="https://github.com/user-attachments/assets/2faff471-fba1-4456-8332-e66a4a05bc5d" alt="Attention Head 5" width="200">
</td>
</tr>
</table>


<details>

<summary>Example code</summary>

```js
import { AutoProcessor, AutoModelForImageClassification, interpolate_4d, RawImage } from "@huggingface/transformers";

// Load model and processor
const model_id = "onnx-community/dinov2-with-registers-small-with-attentions";
const model = await AutoModelForImageClassification.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);

// Load image from URL
const image = await RawImage.read("https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg");

// Pre-process image
const inputs = await processor(image);

// Perform inference
const { logits, attentions } = await model(inputs);

// Get the predicted class
const cls = logits[0].argmax().item();
const label = model.config.id2label[cls];
console.log(`Predicted class: ${label}`);

// Set config values
const patch_size = model.config.patch_size;
const [width, height] = inputs.pixel_values.dims.slice(-2);
const w_featmap = Math.floor(width / patch_size);
const h_featmap = Math.floor(height / patch_size);
const num_heads = model.config.num_attention_heads;
const num_cls_tokens = 1;
const num_register_tokens = model.config.num_register_tokens ?? 0;

// Visualize attention maps
const selected_attentions = attentions
  .at(-1) // we are only interested in the attention maps of the last layer
  .slice(0, null, 0, [num_cls_tokens + num_register_tokens, null])
  .view(num_heads, 1, w_featmap, h_featmap);

const upscaled = await interpolate_4d(selected_attentions, {
  size: [width, height],
  mode: "nearest",
});

for (let i = 0; i < num_heads; ++i) {
  const head_attentions = upscaled[i];
  const minval = head_attentions.min().item();
  const maxval = head_attentions.max().item();
  const image = RawImage.fromTensor(
    head_attentions
      .sub_(minval)
      .div_(maxval - minval)
      .mul_(255)
      .to("uint8"),
  );
  await image.save(`attn-head-${i}.png`);
}
```


</details>

* Add `min`, `max`, `argmin`, `argmax` tensor ops for `dim=null` (see the sketch after this list)
* Add support for nearest-neighbour interpolation in `interpolate_4d`
* Depth Estimation pipeline improvements (faster & returns resized depth map)
* TypeScript improvements by ocavue and shrirajh in https://github.com/huggingface/transformers.js/pull/1081 and https://github.com/huggingface/transformers.js/pull/1122
* Remove unused imports from tokenizers.js by pratapvardhan in https://github.com/huggingface/transformers.js/pull/1116
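
As an illustration, a minimal sketch of the `dim=null` reductions on a raw `Tensor`; the values are made up for the example:

```js
import { Tensor } from "@huggingface/transformers";

// A 2x3 tensor of example values
const t = new Tensor("float32", Float32Array.from([3, 1, 4, 1, 5, 9]), [2, 3]);

// With no dimension argument (dim = null), each op reduces over all elements
console.log(t.min().item());    // 1 (smallest element)
console.log(t.max().item());    // 9 (largest element)
console.log(t.argmin().item()); // 1 (flattened index of the first minimum)
console.log(t.argmax().item()); // 5 (flattened index of the maximum)
```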

### New Contributors
* shrirajh made their first contribution in https://github.com/huggingface/transformers.js/pull/1122
* pratapvardhan made their first contribution in https://github.com/huggingface/transformers.js/pull/1116

**Full Changelog**: https://github.com/huggingface/transformers.js/compare/3.2.3...3.2.4

## 3.2.3

### What's new?
* Fix setting of `model_file_name` for the image feature extraction pipeline in https://github.com/huggingface/transformers.js/pull/1114 (see the sketch after the example below). Thanks xitanggg for reporting the issue!
* Add support for dinov2 with registers in https://github.com/huggingface/transformers.js/pull/1110. Example usage:
```js
import { pipeline } from '@huggingface/transformers';

// Create image classification pipeline
const classifier = await pipeline('image-classification', 'onnx-community/dinov2-with-registers-small-imagenet1k-1-layer');

// Classify an image
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg';
const output = await classifier(url);
console.log(output);
// [
//   { label: 'tabby, tabby cat', score: 0.8135351538658142 },
//   { label: 'tiger cat', score: 0.08967583626508713 },
//   { label: 'Egyptian cat', score: 0.06800546497106552 },
//   { label: 'radiator', score: 0.003501888597384095 },
//   { label: 'quilt, comforter, comfort, puff', score: 0.003408448537811637 },
// ]
```



**Full Changelog**: https://github.com/huggingface/transformers.js/compare/3.2.2...3.2.3

## 3.2.2

### What's new?
* Fix `env.backends.onnx.wasm.proxy = true`: Clone tensor if using onnx wasm proxy in https://github.com/huggingface/transformers.js/pull/1108 (see the sketch below)
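
For reference, a minimal sketch of the setting this fix concerns: with the proxy enabled, WASM inference runs in a worker thread, and input tensors are now cloned before being handed off. The model id is reused from the 3.2.3 example above:

```js
import { env, pipeline } from "@huggingface/transformers";

// Run ONNX WASM inference in a worker thread (the setting fixed in #1108)
env.backends.onnx.wasm.proxy = true;

const classifier = await pipeline("image-classification", "onnx-community/dinov2-with-registers-small-imagenet1k-1-layer");
const output = await classifier("https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg");
console.log(output);
```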


**Full Changelog**: https://github.com/huggingface/transformers.js/compare/3.2.1...3.2.2
