## What's new?
Improved 🤗 Hub integration and model discoverability!
All Transformers.js-compatible models are now displayed with a super cool tag! To indicate your model is compatible with the library, simply add the "transformers.js" library tag in your README ([example](https://huggingface.co/Xenova/whisper-tiny/raw/main/README.md)).
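Concretely, the tag lives in the YAML metadata block at the top of your model card. A minimal sketch, assuming the Hub's `library_name` metadata field (as used in the linked example):

```yaml
---
library_name: "transformers.js"
---
```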
![image](https://github.com/xenova/transformers.js/assets/26504141/9cc65e22-cdb4-400f-a589-863695d61842)
This also means you can now **search** for and **filter** these models by task!
![image](https://github.com/xenova/transformers.js/assets/26504141/ceadc074-a331-48a9-a9ee-a54bfcb31264)
For example,
- https://huggingface.co/models?library=transformers.js lists all Transformers.js models
- https://huggingface.co/models?library=transformers.js&pipeline_tag=feature-extraction lists all models which can be used in the `feature-extraction` pipeline!
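The same filters can also be queried programmatically. A minimal sketch using the Hub's REST API (the `/api/models` endpoint and its query parameters are assumed from the Hub API, not part of this release; requires Node 18+ for built-in `fetch`):

```javascript
// List a few Transformers.js-compatible feature-extraction models via the Hub API.
const url = 'https://huggingface.co/api/models'
    + '?library=transformers.js&pipeline_tag=feature-extraction&limit=5';

const response = await fetch(url);
const models = await response.json();

// Each entry includes (among other fields) the model's repo id.
for (const model of models) {
    console.log(model.id);
}
```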
And lastly, clicking the "Use in Transformers.js" button will show some sample code for how to use the model!
![image](https://github.com/xenova/transformers.js/assets/26504141/512e7874-f7f2-41a9-af34-d486adadcb3c)
## Chroma 🤝 Transformers.js
You can now use any Transformers.js-compatible feature-extraction model to compute embeddings directly in Chroma! For example:
```js
const { ChromaClient, TransformersEmbeddingFunction } = require('chromadb');
const client = new ChromaClient();

// Create the embedder. In this case, I just use the defaults, but you can change the model,
// quantization, revision, or add a progress callback, if desired.
const embedder = new TransformersEmbeddingFunction({ /* Configuration goes here */ });

const main = async () => {
    // Empties and completely resets the database.
    await client.reset();

    // Create the collection
    const collection = await client.createCollection({ name: "my_collection", embeddingFunction: embedder });

    // Add some data to the collection
    await collection.add({
        ids: ["id1", "id2", "id3"],
        metadatas: [{ "source": "my_source" }, { "source": "my_source" }, { "source": "my_source" }],
        documents: ["I love walking my dog", "This is another document", "This is a legal document"],
    });

    // Query the collection
    const results = await collection.query({
        nResults: 2,
        queryTexts: ["This is a query document"],
    });
    console.log(results);
    // {
    //     ids: [ [ 'id2', 'id3' ] ],
    //     embeddings: null,
    //     documents: [ [ 'This is another document', 'This is a legal document' ] ],
    //     metadatas: [ [ [Object], [Object] ] ],
    //     distances: [ [ 1.0109775066375732, 1.0756263732910156 ] ]
    // }
};
main();
```
Other links:
- [List of compatible models](https://huggingface.co/models?library=transformers.js&pipeline_tag=feature-extraction)
- [PR](https://github.com/chroma-core/chroma/pull/664)
## Better alignment with the Python library for calling decoder-only models
You can now call decoder-only models loaded via `AutoModel.from_pretrained(...)`:
```js
import { AutoModel, AutoTokenizer } from '@xenova/transformers';

// Choose model to use
let model_id = "Xenova/gpt2";

// Load model and tokenizer
let tokenizer = await AutoTokenizer.from_pretrained(model_id);
let model = await AutoModel.from_pretrained(model_id);

// Tokenize text and call the model
let model_inputs = await tokenizer('Once upon a time');
let output = await model(model_inputs);
console.log(output);
// {
//     logits: Tensor {
//         dims: [ 1, 4, 50257 ],
//         type: 'float32',
//         data: Float32Array(201028) [