[OVModel](https://github.com/huggingface/optimum-intel/blob/main/optimum/intel/openvino/modeling_base.py#L57) classes were integrated with the [🤗 Hub](https://hf.co/models), making it easy to export models to the OpenVINO IR, save and load the resulting models, and perform inference with them.
* Add OVModel classes enabling OpenVINO inference #21
Below is an example that downloads a DistilBERT model from the Hub, exports it to the OpenVINO IR and saves the resulting model:
```python
from optimum.intel.openvino import OVModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
save_directory = "distilbert_openvino"  # placeholder output directory

# Export the model to the OpenVINO IR on the fly with from_transformers=True
model = OVModelForSequenceClassification.from_pretrained(model_id, from_transformers=True)
model.save_pretrained(save_directory)
```
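The saved model can then be reloaded from the local directory and used for inference. Below is a minimal sketch, reusing `model_id` and `save_directory` from the example above (the input sentence is illustrative):

```python
from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForSequenceClassification

# Reload the exported OpenVINO model and the matching tokenizer
model = OVModelForSequenceClassification.from_pretrained(save_directory)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Run inference on a sample sentence
inputs = tokenizer("This movie was absolutely wonderful!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits)
```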
The currently supported model topologies are the following:
* `OVModelForSequenceClassification`
* `OVModelForTokenClassification`
* `OVModelForQuestionAnswering`
* `OVModelForFeatureExtraction`
* `OVModelForMaskedLM`
* `OVModelForImageClassification`
* `OVModelForSeq2SeqLM`
## Pipelines
Support for Transformers [pipelines](https://huggingface.co/docs/transformers/main/en/main_classes/pipelines#pipelines) was added, providing an easy way to run inference with OVModels.
```diff
-from transformers import AutoModelForSeq2SeqLM
+from optimum.intel.openvino import OVModelForSeq2SeqLM
 from transformers import AutoTokenizer, pipeline

 model_id = "Helsinki-NLP/opus-mt-en-fr"
-model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
+model = OVModelForSeq2SeqLM.from_pretrained(model_id, from_transformers=True)
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 pipe = pipeline("translation_en_to_fr", model=model, tokenizer=tokenizer)
 text = "He never went out without a book under his arm, and he often came back with two."
 outputs = pipe(text)
```
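As with Transformers pipelines, the call returns a list of dictionaries; for this translation task the result is available under the `translation_text` key:

```python
print(outputs[0]["translation_text"])
```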
By default, OVModels support dynamic shapes, accepting inputs of any shape (no constraint on the batch size or sequence length). To decrease latency, static shapes can be enabled by specifying the desired input shapes.
* Add OVModel static shapes #41
```python
# Fix the batch size to 1 and the sequence length to 20
model.reshape(1, 20)
```
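Once static shapes are set, the inputs fed to the model must match them; one way to guarantee this is to have the tokenizer pad and truncate to the fixed sequence length. A minimal sketch, assuming the sequence-classification model and `model_id` from the first example:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Pad and truncate every input to the static sequence length set above
inputs = tokenizer(
    "A sample sentence.",
    padding="max_length",
    truncation=True,
    max_length=20,
    return_tensors="pt",
)
outputs = model(**inputs)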
FP16 precision can also be enabled.
* Add OVModel fp16 support #45
```python
# Enable FP16 precision for the model
model.half()
```
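FP16 typically halves the model's memory footprint and can improve latency on hardware with native FP16 support; `half()` can also be combined with `reshape()` before running inference.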