## What's new?
Improved 🤗 Hub integration and model discoverability!
All Transformers.js-compatible models are now displayed with a super cool tag! To indicate your model is compatible with the library, simply add the "transformers.js" library tag in your README ([example](https://huggingface.co/Xenova/whisper-tiny/raw/main/README.md)).
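Concretely, the tag lives in the YAML metadata block at the top of your model card. A minimal sketch, assuming the Hub's `library_name` metadata field (as used in the linked example):

```yaml
---
library_name: "transformers.js"
---
```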
![image](https://github.com/xenova/transformers.js/assets/26504141/9cc65e22-cdb4-400f-a589-863695d61842)
This also means you can now **search** for and **filter** these models by task!
![image](https://github.com/xenova/transformers.js/assets/26504141/ceadc074-a331-48a9-a9ee-a54bfcb31264)
For example,
- https://huggingface.co/models?library=transformers.js lists all Transformers.js models
- https://huggingface.co/models?library=transformers.js&pipeline_tag=feature-extraction lists all models which can be used in the `feature-extraction` pipeline!
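The same filters can also be queried programmatically. A minimal sketch using the Hub's REST API (the `/api/models` endpoint and its query parameters are assumed from the Hub API, not part of this release; requires Node 18+ for built-in `fetch`):

```javascript
// List a few Transformers.js-compatible feature-extraction models via the Hub API.
const url = 'https://huggingface.co/api/models'
    + '?library=transformers.js&pipeline_tag=feature-extraction&limit=5';

const response = await fetch(url);
const models = await response.json();

// Each entry includes (among other fields) the model's repo id.
for (const model of models) {
    console.log(model.id);
}
```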
And lastly, clicking the "Use in Transformers.js" button will show some sample code for how to use the model!
![image](https://github.com/xenova/transformers.js/assets/26504141/512e7874-f7f2-41a9-af34-d486adadcb3c)
## Chroma 🤝 Transformers.js
You can now use any Transformers.js-compatible feature-extraction model to compute embeddings directly in Chroma! For example:
```js
const { ChromaClient, TransformersEmbeddingFunction } = require('chromadb');
const client = new ChromaClient();

// Create the embedder. In this case, I just use the defaults, but you can change the model,
// quantization, revision, or add a progress callback, if desired.
const embedder = new TransformersEmbeddingFunction({ /* Configuration goes here */ });

const main = async () => {
    // Empties and completely resets the database.
    await client.reset();

    // Create the collection
    const collection = await client.createCollection({ name: "my_collection", embeddingFunction: embedder });

    // Add some data to the collection
    await collection.add({
        ids: ["id1", "id2", "id3"],
        metadatas: [{ "source": "my_source" }, { "source": "my_source" }, { "source": "my_source" }],
        documents: ["I love walking my dog", "This is another document", "This is a legal document"],
    });

    // Query the collection
    const results = await collection.query({
        nResults: 2,
        queryTexts: ["This is a query document"],
    });
    console.log(results);
    // {
    //     ids: [ [ 'id2', 'id3' ] ],
    //     embeddings: null,
    //     documents: [ [ 'This is another document', 'This is a legal document' ] ],
    //     metadatas: [ [ [Object], [Object] ] ],
    //     distances: [ [ 1.0109775066375732, 1.0756263732910156 ] ]
    // }
};
main();
```
Other links:
- [List of compatible models](https://huggingface.co/models?library=transformers.js&pipeline_tag=feature-extraction)
- [PR](https://github.com/chroma-core/chroma/pull/664)
## Better alignment with the Python library for calling decoder-only models
You can now call decoder-only models loaded via `AutoModel.from_pretrained(...)`:
```js
import { AutoModel, AutoTokenizer } from '@xenova/transformers';

// Choose model to use
let model_id = "Xenova/gpt2";

// Load model and tokenizer
let tokenizer = await AutoTokenizer.from_pretrained(model_id);
let model = await AutoModel.from_pretrained(model_id);

// Tokenize text and call the model
let model_inputs = await tokenizer('Once upon a time');
let output = await model(model_inputs);
console.log(output);
// {
//     logits: Tensor {
//         dims: [ 1, 4, 50257 ],
//         type: 'float32',
//         data: Float32Array(201028) [