- Support 53 new pretrained models including BART, mBART, BORT, DeBERTa, T5, mT5, BERTweet, MPNet, ConvBERT, SqueezeBERT, SpanBERT, PEGASUS, Longformer, LED, Blenderbot, etc. Among them, DeBERTa achieves higher correlation with human scores than RoBERTa (our default) on the WMT16 dataset. The correlations are presented in this [Google sheet](https://docs.google.com/spreadsheets/d/1RKOVpselB98Nnh_EOC4A2BYn8_201tmPODpNWu4w7xI/edit?usp=sharing).
- If you want the scores to correlate better with human scores, please consider using `--model_type microsoft/deberta-xlarge-mnli` or the faster `--model_type microsoft/deberta-large-mnli`.
- Add baseline files for DeBERTa models.
- Add example code to generate baseline files (please see the [details](get_rescale_baseline)).
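
The `--model_type` suggestion above also applies when calling the library from Python. The following is a minimal sketch, not the project's official example: the helper `deberta_bertscore` and the two constants are hypothetical names introduced here for illustration, and it assumes the `bert_score` package is installed and that the model weights can be downloaded on first use.

```python
# Model names come from the release note above; the helper is a hypothetical wrapper.
DEBERTA_XLARGE = "microsoft/deberta-xlarge-mnli"
DEBERTA_LARGE = "microsoft/deberta-large-mnli"  # the faster of the two


def deberta_bertscore(candidates, references, model_type=DEBERTA_LARGE, rescale=False):
    """Score candidates against references with a DeBERTa model.

    Returns three lists (precision, recall, F1), one value per sentence pair.
    """
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")

    # Imported lazily because loading bert_score pulls in torch and transformers.
    from bert_score import score

    precision, recall, f1 = score(
        candidates,
        references,
        model_type=model_type,
        lang="en",  # assumption: English text; lang is required when rescaling
        rescale_with_baseline=rescale,
    )
    return precision.tolist(), recall.tolist(), f1.tolist()
```

A call such as `deberta_bertscore(["the cat sat on the mat"], ["a cat was sitting on the mat"])` returns one precision, recall, and F1 value for the single pair. Passing `rescale=True` relies on the DeBERTa baseline files mentioned above.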