Added
- Added support for decoder models such as the GPT-series.
- Added a new Swedish sentiment classification dataset, SweReC, which is not
aspect-based, unlike the previous ABSAbank-Imm dataset. This dataset is a
three-way classification task into the classical `positive`, `neutral` and `negative`
classes, thereby establishing uniformity between the sentiment classification
datasets in the different languages. The dataset consists of reviews from both
se.trustpilot.com and reco.se, and was created by Kristoffer Svensson as part of
his Bachelor thesis "Sentiment Analysis With Convolutional Neural Networks:
Classifying sentiment in Swedish reviews".
- Added historic BERT models from `dbmdz` as part of the default multilingual list.
- Added the `--batch-size` argument, which can be used to manually select a batch size.
The value must be one of 1, 2, 4, 8, 16 or 32.
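
As a rough illustration of the `--batch-size` restriction above, the following sketch uses `argparse` with a fixed set of choices. The parser name, the `build_parser` helper and the `None` default are assumptions made for this example and are not taken from the ScandEval code base.

```python
import argparse

# Allowed values, as listed in the changelog entry above.
ALLOWED_BATCH_SIZES = (1, 2, 4, 8, 16, 32)


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser that only accepts the listed batch sizes."""
    parser = argparse.ArgumentParser(prog="batch-size-sketch")
    parser.add_argument(
        "--batch-size",
        type=int,
        choices=ALLOWED_BATCH_SIZES,
        default=None,  # Assumption: None means no batch size was chosen manually.
        help="Manually select a batch size.",
    )
    return parser


# Parse an explicit argument list rather than the real command line.
args = build_parser().parse_args(["--batch-size", "16"])
print(args.batch_size)
```

With this setup, passing a value outside the allowed set (for example `--batch-size 3`) makes `argparse` print a usage error and exit.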
Removed
- As SweReC is a drop-in replacement for ABSAbank-Imm, the latter has been removed from
the ScandEval benchmark.
Fixed
- Now deals with an issue with DeBERTaV2 models where `pooler_hidden_size` has been set
to a value different from `hidden_size` in the model configuration, which made it
impossible to do sequence classification with the model. The former is now forced to
be the same as the latter, fixing the issue.
- Now ensures that tokenizers, model configurations and metrics are cached to the
ScandEval cache, rather than the default Hugging Face cache.
- Previously, a model's context length was reduced to 512 if it was greater than 1,000,
since an unset context length results in a very large `model_max_length` value in the
tokenizer. This conflicted with Longformer-style models whose context length
_actually_ was greater than 1,000, so this upper bound has now been increased to
100,000.
- Now includes `sacremoses` as a dependency, as this is required by some tokenizers.
- Converted the `id` column in ScandiQA to a string, to avoid integer overflow errors
during preprocessing.
- If a `torch` operation does not have a deterministic implementation, a warning is now
issued instead of raising an error.
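
The context-length fix above can be sketched roughly as follows. This is a minimal illustration of the described heuristic only; the function and constant names are hypothetical and do not come from the ScandEval source.

```python
# Values above this bound are treated as "unset" (raised from 1,000 to 100,000).
UNSET_THRESHOLD = 100_000

# Context length used when the tokenizer does not report a real one.
FALLBACK_CONTEXT_LENGTH = 512


def resolve_context_length(model_max_length: int) -> int:
    """Return a usable context length from a tokenizer's `model_max_length`.

    Hypothetical helper: implausibly large values are assumed to mean that the
    context length was never set, so they fall back to 512. Anything at or
    below the threshold is kept as-is.
    """
    if model_max_length > UNSET_THRESHOLD:
        return FALLBACK_CONTEXT_LENGTH
    return model_max_length


# A long-context model keeps its real context length.
print(resolve_context_length(4096))

# An arbitrary very large value, standing in for an unset context length.
print(resolve_context_length(10**30))
```

With the earlier bound of 1,000, the first call would have been reduced to 512, which is the conflict with longformer-style models that the entry describes.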