Transformers-js-py

Latest version: v0.19.10



Examples for computing perplexity: https://github.com/xenova/transformers.js/issues/137#issuecomment-1595496161
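
As background for that thread: perplexity is just the exponential of the average negative log-likelihood of the tokens. A minimal sketch (a hypothetical helper, not a library API), assuming you have already gathered per-token log-probabilities from the model's logits:

```js
// Hypothetical helper: perplexity from per-token log-probabilities (natural log).
// The log-probs themselves come from the logits returned by the model,
// as shown in the linked issue.
function perplexity(tokenLogProbs) {
  const avgNegLogLikelihood =
    -tokenLogProbs.reduce((sum, lp) => sum + lp, 0) / tokenLogProbs.length;
  return Math.exp(avgNegLogLikelihood);
}

console.log(perplexity([-2.1, -0.5, -3.7])); // ~8.17
```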

More accurate quantization parameters for whisper models

We've updated the quantization parameters used for the pre-converted whisper models on the [hub](https://huggingface.co/models?library=transformers.js&other=whisper). You can test them out with [whisper web](https://huggingface.co/spaces/Xenova/whisper-web)! Thanks to jozefchutka for [reporting](https://github.com/xenova/transformers.js/issues/156) this issue.

![image](https://github.com/xenova/transformers.js/assets/26504141/d5ab1372-2589-46c7-8179-0cc289f663b0)
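
If you want to compare the updated quantized weights against full precision yourself, v2 lets you toggle quantization when constructing a pipeline (a sketch; the model id is illustrative, and `quantized: true` is the default):

```js
import { pipeline } from '@xenova/transformers';

// Default: the (newly re-)quantized weights
const quantized = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en');

// Full-precision weights, for output-quality comparisons
const unquantized = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en', {
  quantized: false,
});
```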


Misc bug fixes and improvements
* Do not use the spread operator to concatenate large arrays (https://github.com/xenova/transformers.js/pull/154), since spreading a huge array into a single call can exceed the engine's argument limit (see the sketch after this list)
* Set chunk timestamp to rounded time by PushpenderSaini0 (https://github.com/xenova/transformers.js/pull/160)
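
To illustrate the pitfall behind the first fix (a generic sketch, not the library's exact code): spreading a large array into `push()` passes every element as a separate function argument, which can overflow the engine's argument limit on big inputs. Appending element-by-element avoids this:

```js
// Risky: every element becomes a function argument, so very large arrays
// can throw "RangeError: Maximum call stack size exceeded".
// merged.push(...hugeChunk);

// Safe: append element-by-element (or use merged = merged.concat(chunk)).
function concatChunks(chunks) {
  const merged = [];
  for (const chunk of chunks) {
    for (const item of chunk) merged.push(item);
  }
  return merged;
}

console.log(concatChunks([[1, 2], [3, 4]])); // [1, 2, 3, 4]
```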



*Note:* For now, you need to choose the `output_attentions` revision (see above). In the future, we may merge these models into the main branch. Also, we currently do not have exports for the medium and large models, simply because I don't have enough RAM to do the export myself (>25GB needed) 😅 ... so, if you would like to use our [conversion script](https://huggingface.co/docs/transformers.js/custom_usage#convert-your-models-to-onnx) to do the conversion yourself, please make a PR on the hub with these new models (under a new `output_attentions` branch)!
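
For reference, selecting a revision looks like this (a minimal sketch; the model id is illustrative and assumes an `output_attentions` branch exists for it):

```js
import { pipeline } from '@xenova/transformers';

// Load a Whisper model from its `output_attentions` revision, which exports
// the cross-attentions needed for word-level timestamps.
const transcriber = await pipeline(
  'automatic-speech-recognition',
  'Xenova/whisper-tiny.en',
  { revision: 'output_attentions' },
);

// Request word-level timestamps when transcribing.
const output = await transcriber('audio.wav', { return_timestamps: 'word' });
console.log(output);
```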

From our testing, the JS implementation exactly matches the output produced by the Python implementation (when using the same model of course)! 🥳

![image](https://github.com/xenova/transformers.js/assets/26504141/5389443f-3d6a-4edd-99f4-8440120ad97d)

Python (left) vs. JavaScript (right)

<details>
<summary>surprise me</summary>
<br>

![image](https://github.com/xenova/transformers.js/assets/26504141/8ec87dcc-303d-461d-838c-adef920d446a)

</details>


I'm excited to see what you all build with this! Please tag me on [twitter](https://twitter.com/xenovacom) if you use it in your project - I'd love to see! I'm also planning on adding this as an option to [whisper-web](https://github.com/xenova/whisper-web), so stay tuned! 🚀

Misc bug fixes and improvements
* Fix loading of grayscale images in node.js (178)


3.4.0

🚀 Transformers.js v3.4 — Background Removal Pipeline, Ultravox, DAC, Mimi, SmolVLM2, LiteWhisper.

- [🖼️ Background Removal Pipeline](new-pipeline)
- [🤖 New models: Ultravox, DAC, Mimi, SmolVLM2, LiteWhisper](new-models)
- [🛠️ Other improvements](other-improvements)
- [🤗 New contributors](new-contributors)

<h2 id="new-pipeline">🖼️ New Background Removal Pipeline</h2>

Removing backgrounds from images is now as easy as:
```js
import { pipeline } from "@huggingface/transformers";

const segmenter = await pipeline("background-removal", "onnx-community/BEN2-ONNX");
const output = await segmenter("input.png");
output[0].save("output.png"); // (Optional) Save the image
```

You can find the full list of compatible models [here](https://huggingface.co/models?library=transformers.js&other=background-removal), which will continue to grow in the future! 🔥 For more information, check out https://github.com/huggingface/transformers.js/pull/1216.

<h2 id="new-models">🤖 New models</h2>

* Ultravox for audio-text-to-text generation (https://github.com/huggingface/transformers.js/pull/1207). See [here](https://huggingface.co/models?library=transformers.js&other=ultravox) for the list of supported models.

<details>

<summary>
See example usage
</summary>

```js
import { UltravoxProcessor, UltravoxModel, read_audio } from "@huggingface/transformers";

const processor = await UltravoxProcessor.from_pretrained(
  "onnx-community/ultravox-v0_5-llama-3_2-1b-ONNX",
);
const model = await UltravoxModel.from_pretrained(
  "onnx-community/ultravox-v0_5-llama-3_2-1b-ONNX",
  {
    dtype: {
      embed_tokens: "q8", // "fp32", "fp16", "q8"
      audio_encoder: "q4", // "fp32", "fp16", "q8", "q4", "q4f16"
      decoder_model_merged: "q4", // "q8", "q4", "q4f16"
    },
  },
);

const audio = await read_audio("http://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/mlk.wav", 16000);
const messages = [
  {
    role: "system",
    content: "You are a helpful assistant.",
  },
  { role: "user", content: "Transcribe this audio:<|audio|>" },
];
const text = processor.tokenizer.apply_chat_template(messages, {
  add_generation_prompt: true,
  tokenize: false,
});

const inputs = await processor(text, audio);
const generated_ids = await model.generate({
  ...inputs,
  max_new_tokens: 128,
});

const generated_texts = processor.batch_decode(
  generated_ids.slice(null, [inputs.input_ids.dims.at(-1), null]),
  { skip_special_tokens: true },
);
console.log(generated_texts[0]);
// "I can transcribe the audio for you. Here's the transcription:\n\n\"I have a dream that one day this nation will rise up and live out the true meaning of its creed.\"\n\n- Martin Luther King Jr.\n\nWould you like me to provide the transcription in a specific format (e.g., word-for-word, character-for-character, or a specific font)?"
```

</details>

* DAC and Mimi for audio tokenization/neural audio codecs (https://github.com/huggingface/transformers.js/pull/1215). See [here](https://huggingface.co/models?library=transformers.js&other=dac) for the list of supported DAC models and [here](https://huggingface.co/models?library=transformers.js&other=mimi) for the list of supported Mimi models.

<details>

<summary>
See example usage
</summary>

DAC:
```js
import { DacModel, AutoFeatureExtractor } from '@huggingface/transformers';

const model_id = "onnx-community/dac_16khz-ONNX";
const model = await DacModel.from_pretrained(model_id);
const feature_extractor = await AutoFeatureExtractor.from_pretrained(model_id);

const audio_sample = new Float32Array(12000);

// pre-process the inputs
const inputs = await feature_extractor(audio_sample);
{
  // explicitly encode then decode the audio inputs
  const encoder_outputs = await model.encode(inputs);
  const { audio_values } = await model.decode(encoder_outputs);
  console.log(audio_values);
}

{
  // or the equivalent with a forward pass
  const { audio_values } = await model(inputs);
  console.log(audio_values);
}
```


Mimi:
```js
import { MimiModel, AutoFeatureExtractor } from '@huggingface/transformers';

const model_id = "onnx-community/kyutai-mimi-ONNX";
const model = await MimiModel.from_pretrained(model_id);
const feature_extractor = await AutoFeatureExtractor.from_pretrained(model_id);

const audio_sample = new Float32Array(12000);

// pre-process the inputs
const inputs = await feature_extractor(audio_sample);
{
  // explicitly encode then decode the audio inputs
  const encoder_outputs = await model.encode(inputs);
  const { audio_values } = await model.decode(encoder_outputs);
  console.log(audio_values);
}

{
  // or the equivalent with a forward pass
  const { audio_values } = await model(inputs);
  console.log(audio_values);
}
```

</details>

* SmolVLM2, a lightweight multimodal model designed to analyze image and video content (https://github.com/huggingface/transformers.js/pull/1196). See [here](https://huggingface.co/models?library=onnx&other=smolvlm&sort=trending) for the list of supported models. Usage is identical to SmolVLM (see the sketch after this list).
* LiteWhisper for automatic speech recognition (https://github.com/huggingface/transformers.js/pull/1219). See [here](https://huggingface.co/models?library=transformers.js&other=lite-whisper&sort=trending) for the list of supported models. Usage is identical to Whisper.
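
For reference, here is a sketch of SmolVLM-style usage, which should carry over to SmolVLM2 (adapted from the SmolVLM examples; the model id, dtype settings, and input file are illustrative):

```js
import { AutoProcessor, AutoModelForVision2Seq, load_image } from "@huggingface/transformers";

// Illustrative model id; any Transformers.js-compatible SmolVLM(2) export should work.
const model_id = "HuggingFaceTB/SmolVLM-256M-Instruct";
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await AutoModelForVision2Seq.from_pretrained(model_id, {
  dtype: {
    embed_tokens: "fp16",
    vision_encoder: "q4",
    decoder_model_merged: "q4",
  },
});

// Prepare the image ("image.png" is a local file here) and a chat-formatted prompt
const image = await load_image("image.png");
const messages = [
  {
    role: "user",
    content: [
      { type: "image" },
      { type: "text", text: "Can you describe this image?" },
    ],
  },
];
const text = processor.apply_chat_template(messages, { add_generation_prompt: true });

// Generate, then decode only the newly generated tokens
const inputs = await processor(text, [image]);
const generated_ids = await model.generate({ ...inputs, max_new_tokens: 128 });
const generated_texts = processor.batch_decode(
  generated_ids.slice(null, [inputs.input_ids.dims.at(-1), null]),
  { skip_special_tokens: true },
);
console.log(generated_texts[0]);
```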


<h2 id="other-improvements">🛠️ Other improvements</h2>

* Add support for multi-chunk external data files in https://github.com/huggingface/transformers.js/pull/1212
* Fix package export by fs-eire in https://github.com/huggingface/transformers.js/pull/1161
* Add NFD normalizer in https://github.com/huggingface/transformers.js/pull/1211. Thanks to adewdev for reporting!
* Documentation improvements by viksit in https://github.com/huggingface/transformers.js/pull/1184
* Optimize conversion script in https://github.com/huggingface/transformers.js/pull/1204 and https://github.com/huggingface/transformers.js/pull/1218
* Use Float16Array instead of Uint16Array for kvcache when available in https://github.com/huggingface/transformers.js/pull/1208 (see the sketch after this list)
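
To illustrate the last item (a generic sketch of the feature detection involved, not the library's exact code): fp16 tensor data can use the native `Float16Array` where the runtime provides it, with `Uint16Array` holding the raw bit patterns as the fallback:

```js
// Prefer the native Float16Array (available in newer JS engines); otherwise
// fall back to Uint16Array, which stores the raw 16-bit fp16 patterns.
const Fp16Storage = typeof Float16Array !== "undefined" ? Float16Array : Uint16Array;

// e.g. allocating backing storage for a key/value-cache tensor
const kvCache = new Fp16Storage(4096);
console.log(Fp16Storage.name, kvCache.length); // "Float16Array 4096" on supporting runtimes
```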

<h2 id="new-contributors">🤗 New contributors</h2>

* axrati made their first contribution in https://github.com/huggingface/transformers.js/pull/602
* viksit made their first contribution in https://github.com/huggingface/transformers.js/pull/1184
* tangkunyin made their first contribution in https://github.com/huggingface/transformers.js/pull/1203

**Full Changelog**: https://github.com/huggingface/transformers.js/compare/3.3.3...3.4.0

3.3.3

What's new?

* Bump `onnxruntime-web` and `@huggingface/jinja` in https://github.com/huggingface/transformers.js/pull/1183.


**Full Changelog**: https://github.com/huggingface/transformers.js/compare/3.3.2...3.3.3
