- Bugfixes in postprocessing
- Inclusion of language information in the generated datasets
0.2.5
This release adds:
- A fixed caching mechanism that removes any possibility of caching conflicts
0.2.4
This release adds:
- Fix spacy multilingual models for extraction
0.2.3
This release adds:
- Use updated ai21 SDK
- Use multilingual spacy model for non-supported languages
- Bugfix: Correctly use each provider's API in AWS Bedrock
0.2.2
This release adds:
- Sentence rewriting extractor and packer to generate mixcase datasets. Unlike gap and masking, a set of sentences from each document is selected and the LLM has to rewrite them in its own words.
- Argument validation in extractors.
- Remove private methods from the documentation.
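
As a rough illustration of the sentence rewriting idea in this release, the sketch below selects a subset of sentences, rewrites them, and packs them back with per-sentence labels. It is not the library's implementation: `split_sentences`, `extract`, `pack`, and `fake_rewrite` are hypothetical names, the regex splitter stands in for a real sentence segmenter, and `fake_rewrite` stands in for an LLM call.

```python
import random
import re


def split_sentences(text):
    # Naive splitter on ., ! or ? followed by whitespace (assumption:
    # a real pipeline would use a proper sentence segmenter instead).
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def extract(text, fraction=0.5, seed=0):
    """Split the text and pick the indices of the sentences to rewrite."""
    sentences = split_sentences(text)
    if not sentences:
        return [], []
    k = min(len(sentences), max(1, round(len(sentences) * fraction)))
    rng = random.Random(seed)  # seeded so the selection is reproducible
    chosen = sorted(rng.sample(range(len(sentences)), k))
    return sentences, chosen


def pack(sentences, chosen, rewrite):
    """Replace the chosen sentences and label every sentence by origin."""
    chosen_set = set(chosen)
    packed = []
    for i, sentence in enumerate(sentences):
        if i in chosen_set:
            packed.append((rewrite(sentence), "generated"))
        else:
            packed.append((sentence, "human"))
    return packed


def fake_rewrite(sentence):
    # Hypothetical placeholder for the LLM rewriting step.
    return "[rewritten] " + sentence


if __name__ == "__main__":
    doc = "The sky is blue. Water is wet! Is fire hot? Grass is green."
    sentences, chosen = extract(doc, fraction=0.5, seed=0)
    for text, label in pack(sentences, chosen, fake_rewrite):
        print(label, "|", text)
```

The per-sentence labels are what make the result a mixed human/generated document; how the real packer stores boundaries is not specified here.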
0.2.1
This release adds:
- Documentation: https://textmachina.readthedocs.io/en/latest/
- Documentation-related extras for developers in the `setup.py`
- Fixes some names of functions that were incorrectly autocompleted.