Added
- Document attributes are now supported (for both text and audio documents) and are added and accessed the same way as annotation attributes
- Brat Input and Output converters can now load and save UMLS CUIs stored in notes
- new from_dir()/from_file() helper methods added to TextDocument/AudioDocument
- new text classification, audio diarization and audio transcription metrics
Changed
- the Trainer now saves both the last checkpoint and the best checkpoint, instead of only the last checkpoint
- most operations loading models from HuggingFace can now receive an authentication token (useful to access private repositories)
- support for remapping entity labels in Seq2SeqEvaluator (useful when predicted and reference labels do not match exactly)
- easier initialization of PASpeakerDetector
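
To illustrate the label remapping entry above, here is a minimal plain-Python sketch of the idea, not medkit's actual API: the `remap_labels` helper, the `label_mapping` dict and the example labels are all hypothetical names chosen for illustration. It assumes that remapping means translating predicted labels into the reference label set before the two sequences are compared.

```python
from typing import Dict, List


def remap_labels(labels: List[str], mapping: Dict[str, str]) -> List[str]:
    """Replace each label found in `mapping`; leave other labels unchanged.

    Hypothetical helper for illustration only, not part of medkit.
    """
    return [mapping.get(label, label) for label in labels]


# Hypothetical label sets: the model and the reference corpus name
# the same entity types differently.
predicted = ["DISEASE", "DRUG", "O"]
reference = ["disorder", "chemical", "O"]
label_mapping = {"DISEASE": "disorder", "DRUG": "chemical"}

# After remapping, predicted and reference labels share one vocabulary
# and can be compared by any metric.
remapped = remap_labels(predicted, label_mapping)
matches = sum(p == r for p, r in zip(remapped, reference))
```

Labels absent from the mapping pass through untouched, so only the mismatched names need to be listed.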
Fixed
- medkit is now compatible with the latest EDS-NLP release (0.9)
- custom attributes (DateAttribute, UMLSNormAttribute) no longer have None as their value