What's new?
🤯 14 new architectures
In this release, we've added a ton of new architectures: BLOOM, MPT, BeiT, CamemBERT, CodeLlama, GPT-NeoX, GPT-J, HerBERT, mBART, mBART-50, OPT, ResNet, WavLM, and XLM. This brings the total number of supported architectures up to 46! Here's some example code to help you get started:
- Text-generation with **MPT** ([models](https://huggingface.co/models?library=transformers.js&other=mpt)):
import { pipeline } from '@xenova/transformers';
const generator = await pipeline('text-generation', 'Xenova/ipt-350m', {
  quantized: false, // use the unquantized weights so the output matches the Python version
});
const output = await generator('La nostra azienda');
// { generated_text: "La nostra azienda è specializzata nella vendita di prodotti per l'igiene orale e per la salute." }
Other text-generation models: [**BLOOM**](https://huggingface.co/models?library=transformers.js&other=bloom), [**GPT-NeoX**](https://huggingface.co/models?library=transformers.js&other=gpt_neox&sort=trending), **CodeLlama**, [**GPT-J**](https://huggingface.co/models?library=transformers.js&other=gptj&sort=trending), [**OPT**](https://huggingface.co/models?library=transformers.js&other=opt&sort=trending).
- **CamemBERT** for masked language modelling, text classification, token classification, question answering, and feature extraction ([models](https://huggingface.co/models?library=transformers.js&other=camembert)). For example:
import { pipeline } from '@xenova/transformers';
let pipe = await pipeline('token-classification', 'Xenova/camembert-ner-with-dates');
let output = await pipe("Je m'appelle jean-baptiste et j'habite à montréal depuis fevr 2012");
// [
// { entity: 'I-PER', score: 0.9258053302764893, index: 5, word: 'jean' },
// { entity: 'I-PER', score: 0.9048717617988586, index: 6, word: '-' },
// { entity: 'I-PER', score: 0.9227054119110107, index: 7, word: 'ba' },
// { entity: 'I-PER', score: 0.9385354518890381, index: 8, word: 'pt' },
// { entity: 'I-PER', score: 0.9139659404754639, index: 9, word: 'iste' },
// { entity: 'I-LOC', score: 0.9877734780311584, index: 15, word: 'montré' },
// { entity: 'I-LOC', score: 0.9891639351844788, index: 16, word: 'al' },
// { entity: 'I-DATE', score: 0.9858269691467285, index: 18, word: 'fe' },
// { entity: 'I-DATE', score: 0.9780661463737488, index: 19, word: 'vr' },
// { entity: 'I-DATE', score: 0.980688214302063, index: 20, word: '2012' }
// ]
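The output above is per token, so a name like "jean-baptiste" comes back as several word pieces. As a rough sketch of one way to post-process it, the snippet below merges neighbouring tokens that share an entity label. `groupEntities` is a hypothetical helper written for this example (it is not part of Transformers.js), and the scores are shortened copies of the ones shown above.

```javascript
// Hypothetical helper (not a Transformers.js API): merge tokens that have the
// same entity label and consecutive `index` values into one group.
function groupEntities(tokens) {
  const groups = [];
  for (const token of tokens) {
    const last = groups[groups.length - 1];
    const isContinuation =
      last !== undefined &&
      last.entity === token.entity &&
      token.index === last.endIndex + 1;
    if (isContinuation) {
      last.words.push(token.word);
      last.scores.push(token.score);
      last.endIndex = token.index;
    } else {
      groups.push({
        entity: token.entity,
        words: [token.word],
        scores: [token.score],
        endIndex: token.index,
      });
    }
  }
  // Report the mean token score for each group.
  return groups.map(({ entity, words, scores }) => ({
    entity,
    words,
    score: scores.reduce((sum, s) => sum + s, 0) / scores.length,
  }));
}

// Token-level output copied from the example above (scores shortened).
const tokens = [
  { entity: 'I-PER', score: 0.9258, index: 5, word: 'jean' },
  { entity: 'I-PER', score: 0.9049, index: 6, word: '-' },
  { entity: 'I-PER', score: 0.9227, index: 7, word: 'ba' },
  { entity: 'I-PER', score: 0.9385, index: 8, word: 'pt' },
  { entity: 'I-PER', score: 0.914, index: 9, word: 'iste' },
  { entity: 'I-LOC', score: 0.9878, index: 15, word: 'montré' },
  { entity: 'I-LOC', score: 0.9892, index: 16, word: 'al' },
  { entity: 'I-DATE', score: 0.9858, index: 18, word: 'fe' },
  { entity: 'I-DATE', score: 0.9781, index: 19, word: 'vr' },
  { entity: 'I-DATE', score: 0.9807, index: 20, word: '2012' },
];

const grouped = groupEntities(tokens);
console.log(grouped.map((g) => `${g.entity}: ${g.words.join('')}`));
// [ 'I-PER: jean-baptiste', 'I-LOC: montréal', 'I-DATE: fevr2012' ]
```

Note that simply joining the pieces drops the space in "fevr 2012", since the token output shown here carries no whitespace information. The helper keeps the pieces in `words` so you can join them however suits your use case.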
- **WavLM** for feature-extraction ([models](https://huggingface.co/models?library=transformers.js&other=wavlm)). For example:
import { AutoProcessor, AutoModel, read_audio } from '@xenova/transformers';
// Read and preprocess audio
const processor = await AutoProcessor.from_pretrained('Xenova/wavlm-base');
const audio = await read_audio('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav', 16000);
const inputs = await processor(audio);
// Run model with inputs
const model = await AutoModel.from_pretrained('Xenova/wavlm-base');
const output = await model(inputs);
// {
// last_hidden_state: Tensor {
// dims: [ 1, 549, 768 ],
// type: 'float32',
// data: Float32Array(421632) [-0.349443256855011, -0.39341306686401367, 0.022836603224277496, ...],
// size: 421632
// }
// }
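The model returns one 768-dimensional vector per audio frame (549 frames here). If you want a single fixed-size embedding for the whole clip, one common option is to average over the frame axis. The snippet below is only a sketch of that idea: `meanPoolFrames` is a hypothetical helper (not a Transformers.js API), and it assumes a batch size of 1 and a flat, row-major `data` array like the one shown above.

```javascript
// Hypothetical helper: average a [1, frames, hidden] tensor over the frames
// axis, given its flat row-major data and its dims.
function meanPoolFrames(data, dims) {
  const [batch, frames, hidden] = dims;
  if (batch !== 1) {
    throw new Error('This sketch only handles a batch size of 1');
  }
  if (data.length !== frames * hidden) {
    throw new Error('data length does not match dims');
  }
  const pooled = new Float32Array(hidden);
  // Sum each hidden dimension across all frames...
  for (let t = 0; t < frames; ++t) {
    const offset = t * hidden;
    for (let h = 0; h < hidden; ++h) {
      pooled[h] += data[offset + h];
    }
  }
  // ...then divide by the number of frames.
  for (let h = 0; h < hidden; ++h) {
    pooled[h] /= frames;
  }
  return pooled;
}

// Toy stand-in for `output.last_hidden_state`: 2 frames, 3 hidden values.
const toy = { dims: [1, 2, 3], data: Float32Array.from([1, 2, 3, 3, 4, 5]) };
const pooledToy = meanPoolFrames(toy.data, toy.dims);
console.log(Array.from(pooledToy)); // [ 2, 3, 4 ]
```

With the real output above, `meanPoolFrames(output.last_hidden_state.data, output.last_hidden_state.dims)` would give a `Float32Array` of length 768.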
- **MBart** + **MBart50** for multilingual translation ([models](https://huggingface.co/models?library=transformers.js&other=mbart)). For example:
import { pipeline } from '@xenova/transformers';
let translator = await pipeline('translation', 'Xenova/mbart-large-50-many-to-many-mmt');
let output = await translator('संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है', {
  src_lang: 'hi_IN', // Hindi
  tgt_lang: 'fr_XX', // French
});
// [{ translation_text: 'Le chef des Nations affirme qu 'il n 'y a military solution in Syria.' }]
See [here](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt#languages-covered) for the full list of languages and their corresponding codes.
- **BeiT** for image classification ([models](https://huggingface.co/models?library=transformers.js&other=beit)):
import { pipeline } from '@xenova/transformers';
let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
let pipe = await pipeline('image-classification', 'Xenova/beit-base-patch16-224');
let output = await pipe(url);
// [{ label: 'tiger, Panthera tigris', score: 0.7168469429016113 }]
- **ResNet** for image classification ([models](https://huggingface.co/models?library=transformers.js&other=resnet)):
import { pipeline } from '@xenova/transformers';
let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
let pipe = await pipeline('image-classification', 'Xenova/resnet-50');
let output = await pipe(url);
// [{ label: 'tiger, Panthera tigris', score: 0.7576608061790466 }]
😍 Over 150 newly-converted models
To get started with these new architectures (and expand coverage for other models), we're releasing over [150 new models](https://huggingface.co/models?library=transformers.js&sort=modified) on the Hugging Face Hub! Check out the full list [here](https://huggingface.co/models?library=transformers.js).
🏋️ HUGE reduction in model sizes (up to -40%)
Thanks to a recent update of [🤗 Optimum](https://github.com/huggingface/optimum), we were able to remove duplicate weights across various models. In some cases, like `whisper-tiny`'s decoder, this resulted in a 40% reduction in size! Here are some improvements we saw:
- Whisper-tiny decoder: 50MB → 30MB (-40%)
- NLLB decoder: 732MB → 476MB (-35%)
- bloom: 819MB → 562MB (-31%)
- T5 decoder: 59MB → 42MB (-28%)
- distilbert-base: 91MB → 68MB (-25%)
- bart-base decoder: 207MB → 155MB (-25%)
- roberta-base: 165MB → 126MB (-24%)
- gpt2: 167MB → 127MB (-24%)
- bert-base: 134MB → 111MB (-17%)
- many more!
Play around with some of the smaller whisper models (for automatic speech recognition) [here](https://huggingface.co/spaces/Xenova/whisper-web)!
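If you want to sanity-check the percentages in the list above (or compute them for your own converted models), the arithmetic is just the size difference divided by the original size. A minimal sketch, where `percentReduction` is a hypothetical helper and the sizes are the rounded MB figures from the list:

```javascript
// Hypothetical helper: percentage saved when going from `beforeMB` to `afterMB`,
// rounded to the nearest whole percent.
function percentReduction(beforeMB, afterMB) {
  return Math.round(((beforeMB - afterMB) * 100) / beforeMB);
}

// A few rows from the list above: [name, size before, size after] in MB.
const sizes = [
  ['Whisper-tiny decoder', 50, 30],
  ['NLLB decoder', 732, 476],
  ['bloom', 819, 562],
];

for (const [name, before, after] of sizes) {
  console.log(`${name}: -${percentReduction(before, after)}%`);
}
// Whisper-tiny decoder: -40%
// NLLB decoder: -35%
// bloom: -31%
```

Since the listed sizes are rounded to whole megabytes, a value recomputed this way can be off by a point from the one in the list.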
- Transformers.js integration with LangChain JS ([docs](https://js.langchain.com/docs/modules/data_connection/text_embedding/integrations/transformers))
import { HuggingFaceTransformersEmbeddings } from "langchain/embeddings/hf_transformers";
const model = new HuggingFaceTransformersEmbeddings({
  modelName: "Xenova/all-MiniLM-L6-v2",
});
/* Embed queries */
const res = await model.embedQuery(
  "What would be a good company name for a company that makes colorful socks?"
);
console.log({ res });
/* Embed documents */
const documentRes = await model.embedDocuments(["Hello world", "Bye bye"]);
console.log({ documentRes });
- Refactored `PreTrainedModel` to require significantly less code when adding new models
- Typing improvements by kungfooman
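Going back to the LangChain example above: `embedQuery` gives you one embedding (an array of numbers) and `embedDocuments` gives you one per document, so a natural next step is to compare them. Here is a rough sketch using cosine similarity. `cosineSimilarity` is a hypothetical helper written for this example (neither library's API), and the vectors are toy values rather than real embeddings.

```javascript
// Hypothetical helper: cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; ++i) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat an all-zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Toy vectors standing in for real embeddings.
console.log(cosineSimilarity([3, 4], [3, 4])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

With the variables from the example, `documentRes.map((doc) => cosineSimilarity(res, doc))` would score each document against the query.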