What's Changed
- New Python API gen_processing_models to export ONNX data processing models from Hugging Face tokenizers such as LLaMA, CLIP, XLM-RoBERTa, Falcon, BERT, etc.
- New TrieTokenizer operator for RWKV-like LLM models, and other tokenizer operator enhancements.
- New operators for Azure EP compatibility: AzureAudioToText, AzureTextToText, AzureTritonInvoker for Python and NuGet packages.
- Processing operators have been migrated to the new [Lite Custom Op API](https://github.com/microsoft/onnxruntime/blob/gh-pages/docs/reference/operators/add-custom-op.md#define-and-register-a-custom-operator)
- New string strip operator
- Use the latest ORT headers instead of the minimum compatible headers
- Support offset mapping in most tokenizers, such as BERT, CLIP, and RoBERTa
- Remove the deprecated std::codecvt_utf8 from the codebase
- Documentation is uploaded to https://onnxruntime.ai/docs/extensions/
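
For readers unfamiliar with trie tokenization and offset mapping, here is a minimal pure-Python sketch of the general idea: greedy longest-match lookup over a vocabulary trie, returning each token's character span alongside its id. It is an illustration only and not the TrieTokenizer operator's implementation. The names `TrieNode`, `build_trie`, `tokenize`, and the toy `VOCAB` are hypothetical, and the sketch matches on characters, whereas a real tokenizer may work on bytes.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of the vocabulary trie; token_id is set where a token ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.token_id: Optional[int] = None


def build_trie(vocab: Dict[str, int]) -> TrieNode:
    """Insert every vocabulary entry into a character-level trie."""
    root = TrieNode()
    for token, token_id in vocab.items():
        node = root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.token_id = token_id
    return root


def tokenize(text: str, root: TrieNode) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Greedy longest-match tokenization.

    Returns the token ids and, for each token, its (start, end) character
    offsets in the input text (end is exclusive).
    """
    ids: List[int] = []
    offsets: List[Tuple[int, int]] = []
    pos = 0
    while pos < len(text):
        node = root
        best_id: Optional[int] = None
        best_end = pos
        # Walk the trie as far as the text allows, remembering the longest
        # prefix that is a complete vocabulary entry.
        for end in range(pos, len(text)):
            node = node.children.get(text[end])
            if node is None:
                break
            if node.token_id is not None:
                best_id, best_end = node.token_id, end + 1
        if best_id is None:
            raise ValueError(f"no vocabulary entry matches text at position {pos}")
        ids.append(best_id)
        offsets.append((pos, best_end))
        pos = best_end
    return ids, offsets


# Hypothetical toy vocabulary, for illustration only.
VOCAB = {"a": 0, "b": 1, "ab": 2, "abc": 3, "c": 4, " ": 5}
```

With this toy vocabulary, `tokenize("abcab a", build_trie(VOCAB))` returns the ids `[3, 2, 5, 0]` and the offsets `[(0, 3), (3, 5), (5, 6), (6, 7)]`, so each id can be traced back to the exact slice of the input it came from.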
Contributions
Contributors to ONNX Runtime Extensions include members across teams at Microsoft, along with our community members: aidanryan-msft RandySheriffH edgchen1 kunal-vaishnavi sayanshaw24 skottmckay snnn VishalX wenbingl wejoncy