What's new?
* Add support for the `image-feature-extraction` pipeline in https://github.com/xenova/transformers.js/pull/650.
**Example:** Perform image feature extraction with `Xenova/vit-base-patch16-224-in21k`.
```javascript
import { pipeline } from '@xenova/transformers';

// Create image feature extraction pipeline
const image_feature_extractor = await pipeline('image-feature-extraction', 'Xenova/vit-base-patch16-224-in21k');

// Compute image features
const url = 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png';
const features = await image_feature_extractor(url);
// Tensor {
//   dims: [ 1, 197, 768 ],
//   type: 'float32',
//   data: Float32Array(151296) [ ... ],
//   size: 151296
// }
```
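The ViT output above contains one 768-dimensional vector per token (196 image patches plus the `[CLS]` token). If you need a single vector per image, one simple option, shown here as an illustrative sketch rather than a pipeline feature, is to mean-pool over the token dimension:

```js
// Illustrative sketch (not part of the pipeline API): mean-pool the per-token
// features from the example above into a single 768-dimensional vector.
const [, tokens, dim] = features.dims; // [1, 197, 768]
const pooled = new Float32Array(dim);
for (let t = 0; t < tokens; ++t) {
    for (let d = 0; d < dim; ++d) {
        pooled[d] += features.data[t * dim + d] / tokens;
    }
}
console.log(pooled.length); // 768
```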
**Example:** Compute image embeddings with `Xenova/clip-vit-base-patch32`.
```javascript
import { pipeline } from '@xenova/transformers';

// Create image feature extraction pipeline
const image_feature_extractor = await pipeline('image-feature-extraction', 'Xenova/clip-vit-base-patch32');

// Compute image embeddings
const url = 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png';
const features = await image_feature_extractor(url);
// Tensor {
//   dims: [ 1, 512 ],
//   type: 'float32',
//   data: Float32Array(512) [ ... ],
//   size: 512
// }
```
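Since these are fixed-size embeddings, they can be compared directly. The snippet below is a minimal sketch (not part of the PR) that computes the cosine similarity between the embeddings of two images; the second URL is simply borrowed from the APISR example further down.

```js
// Illustrative sketch: cosine similarity between two CLIP image embeddings,
// reusing the `image_feature_extractor` pipeline created above.
const embed = async (url) => (await image_feature_extractor(url)).data;

const a = await embed('https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png');
const b = await embed('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/anime.png');

let dot = 0, normA = 0, normB = 0;
for (let i = 0; i < a.length; ++i) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
}
console.log(dot / Math.sqrt(normA * normB)); // similarity in [-1, 1]
```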
* Fix channel format when padding non-square images for certain models in https://github.com/xenova/transformers.js/pull/655. This means you can now perform super-resolution for non-square images with [APISR](https://github.com/Kiteretsu77/APISR) models:
**Example:** Upscale an image with `Xenova/4x_APISR_GRL_GAN_generator-onnx`.
```js
import { pipeline } from '@xenova/transformers';

// Create image-to-image pipeline
const upscaler = await pipeline('image-to-image', 'Xenova/4x_APISR_GRL_GAN_generator-onnx', {
    quantized: false,
});

// Upscale an image
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/anime.png';
const output = await upscaler(url);
// RawImage {
//   data: Uint8Array(16588800) [ ... ],
//   width: 2560,
//   height: 1920,
//   channels: 3
// }

// (Optional) Save the upscaled image
output.save('upscaled.png');
```
<details>

<summary>See example output</summary>

Input image:
![image](https://github.com/xenova/transformers.js/assets/26504141/b5a0bed5-6348-4c71-8dd8-886a48f4d8fa)

Output image:
![image](https://github.com/xenova/transformers.js/assets/26504141/4d69e6d8-4c02-433c-970a-96bf48c41368)

</details>
* Update tokenizer `apply_chat_template` functionality in https://github.com/xenova/transformers.js/pull/647, adding support for the new [C4AI Command-R tokenizer](https://huggingface.co/CohereForAI/c4ai-command-r-v01) and its tool-use and RAG chat templates (see the examples below).
<details>

<summary>See example tool usage</summary>

```js
import { AutoTokenizer } from "@xenova/transformers";

const tokenizer = await AutoTokenizer.from_pretrained("Xenova/c4ai-command-r-v01-tokenizer");

// Define conversation input
const conversation = [
    { role: "user", content: "Whats the biggest penguin in the world?" }
];

// Define tools available for the model to use
const tools = [
    {
        name: "internet_search",
        description: "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        parameter_definitions: {
            query: {
                description: "Query to search the internet with",
                type: "str",
                required: true
            }
        }
    },
    {
        name: "directly_answer",
        description: "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        parameter_definitions: {}
    }
];

// Render the tool-use prompt as a string
const tool_use_prompt = tokenizer.apply_chat_template(
    conversation,
    {
        chat_template: "tool_use",
        tokenize: false,
        add_generation_prompt: true,
        tools,
    }
);
console.log(tool_use_prompt);
```
</details>
<details>

<summary>See example RAG usage</summary>

```js
import { AutoTokenizer } from "@xenova/transformers";

const tokenizer = await AutoTokenizer.from_pretrained("Xenova/c4ai-command-r-v01-tokenizer");

// Define conversation input
const conversation = [
    { role: "user", content: "Whats the biggest penguin in the world?" }
];

// Define documents to ground on
const documents = [
    { title: "Tall penguins", text: "Emperor penguins are the tallest growing up to 122 cm in height." },
    { title: "Penguin habitats", text: "Emperor penguins only live in Antarctica." }
];

// Render the RAG prompt as a string
const grounded_generation_prompt = tokenizer.apply_chat_template(
    conversation,
    {
        chat_template: "rag",
        tokenize: false,
        add_generation_prompt: true,
        documents,
        citation_mode: "accurate", // or "fast"
    }
);
console.log(grounded_generation_prompt);
```
</details>
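For completeness, the same tokenizer's default chat template (no tools or documents) can be rendered in the same way; a minimal sketch:

```js
import { AutoTokenizer } from "@xenova/transformers";

const tokenizer = await AutoTokenizer.from_pretrained("Xenova/c4ai-command-r-v01-tokenizer");

// Render the default chat template as a string
const prompt = tokenizer.apply_chat_template(
    [{ role: "user", content: "Whats the biggest penguin in the world?" }],
    { tokenize: false, add_generation_prompt: true },
);
console.log(prompt);
```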
* Add support for EfficientNet in https://github.com/xenova/transformers.js/pull/639.
**Example:** Classify images with `chriamue/bird-species-classifier`.
```js
import { pipeline } from '@xenova/transformers';

// Create image classification pipeline
const classifier = await pipeline('image-classification', 'chriamue/bird-species-classifier', {
    quantized: false,      // Quantized model doesn't work
    revision: 'refs/pr/1', // Needed until the model author merges the PR
});

// Classify an image
const url = 'https://upload.wikimedia.org/wikipedia/commons/7/73/Short_tailed_Albatross1.jpg';
const output = await classifier(url);
console.log(output);
// [{ label: 'ALBATROSS', score: 0.9999023079872131 }]
```
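As a quick follow-on sketch (assuming the image-classification pipeline's `topk` option), you can also request several candidate labels from the classifier created above:

```js
// Return the 5 highest-scoring labels instead of only the top prediction
const top5 = await classifier(url, { topk: 5 });
console.log(top5);
```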
**Full Changelog**: https://github.com/xenova/transformers.js/compare/2.16.0...2.16.1