Fixed
- When benchmarking a token classification dataset with a model whose tokenizer
  does not have a fast variant yet, an error was raised because the `word_ids`
  method of `BatchEncoding` objects only works with fast tokenizers. In that
  case the word IDs are now computed manually. This currently handles the
  WordPiece and SentencePiece subword prefixes (i.e., `##` and `▁`) and raises
  an error if the manual alignment of words and tokens fails; a sketch of the
  alignment follows below.
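
  A minimal sketch of such a manual alignment, assuming special tokens have
  already been stripped from `tokens`; the function name `manual_word_ids` and
  its exact error handling are illustrative, not the actual implementation:

  ```python
  from typing import List, Optional

  def manual_word_ids(tokens: List[str], words: List[str]) -> List[Optional[int]]:
      """Map each token to the index of the word it was split from."""
      # SentencePiece marks word *starts* with `▁`, whereas WordPiece marks
      # *continuations* with `##`; detect which scheme the tokenizer uses.
      is_sentencepiece = any(tok.startswith("▁") for tok in tokens)
      word_ids: List[Optional[int]] = []
      word_idx = -1
      for tok in tokens:
          if is_sentencepiece:
              starts_new_word = tok.startswith("▁")
          else:
              starts_new_word = not tok.startswith("##")
          if starts_new_word:
              word_idx += 1
          elif word_idx < 0:
              raise ValueError(f"Cannot align token {tok!r} to any word")
          word_ids.append(word_idx)
      # The alignment failed if we did not end up with exactly one word
      # index per original word.
      if word_idx + 1 != len(words):
          raise ValueError("Manual word/token alignment failed")
      return word_ids

  # Example: manual_word_ids(["New", "York", "##er"], ["New", "Yorker"])
  # returns [0, 1, 1].
  ```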
- Catch the CUDA error `CUDA error: CUBLAS_STATUS_ALLOC_FAILED`, which in this
  context indicates an out-of-memory (OOM) condition; see the sketch below.
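
  A hedged sketch of the idea, assuming a PyTorch setup; the function and
  argument names are hypothetical stand-ins for the benchmarked call:

  ```python
  import torch

  def forward_with_oom_check(model, batch):
      """Run a forward pass, surfacing cuBLAS allocation failures as OOM."""
      try:
          return model(**batch)
      except RuntimeError as err:
          # cuBLAS reports its own allocation failures instead of the usual
          # "CUDA out of memory" message, so match on the status string too.
          if "CUBLAS_STATUS_ALLOC_FAILED" in str(err):
              raise torch.cuda.OutOfMemoryError(str(err)) from err
          raise
  ```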