Bugfix: Sentence boundaries are now correctly recognized when reading an XML file.
1.3.0
- SoMeWeTa now has XML support. To tag an XML file, use the option -x/--xml. It is assumed that each XML tag is on a separate line.
- The implementation of the beam search algorithm has been slightly improved.
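
To illustrate the input assumption for XML files (each XML tag on its own line, tokens on the remaining lines), here is a minimal Python sketch. It is not SoMeWeTa's actual implementation; the function name `split_xml_lines` and the simple "starts with `<` and ends with `>`" heuristic are assumptions made for illustration only.

```python
def split_xml_lines(lines):
    """Separate XML tag lines from token lines.

    Hypothetical helper, not part of SoMeWeTa. A line counts as an XML
    tag if it starts with "<" and ends with ">" (a simplistic heuristic).
    Returns the tokens and a list of (token_index, tag) pairs that record
    where each tag occurred relative to the tokens.
    """
    tokens = []
    tags = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip empty lines
        if stripped.startswith("<") and stripped.endswith(">"):
            tags.append((len(tokens), stripped))
        else:
            tokens.append(stripped)
    return tokens, tags


example = ['<text id="1">', "<s>", "Hallo", "Welt", "!", "</s>", "</text>"]
tokens, tags = split_xml_lines(example)
```

Keeping the tag positions makes it possible to re-insert the XML tags around the tagged tokens afterwards, which is one plausible reason for requiring tags on separate lines.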
1.2.0
It is now possible to use the option --ignore-tag to specify a tag that will not be learned during training and that will be ignored during evaluation. Use case: Partially annotated data that use a pseudo-tag for tokens without annotation.
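
The effect of ignoring a pseudo-tag during evaluation can be sketched as follows. This is an illustration under assumptions, not SoMeWeTa's evaluation code: the function `accuracy_ignoring`, the pseudo-tag `_`, and the example tags are hypothetical.

```python
def accuracy_ignoring(gold, predicted, ignore_tag=None):
    """Token-level accuracy that skips tokens whose gold tag is ignore_tag.

    Hypothetical helper, not part of SoMeWeTa.
    """
    pairs = [(g, p) for g, p in zip(gold, predicted) if g != ignore_tag]
    if not pairs:
        return 0.0  # nothing to evaluate
    correct = sum(1 for g, p in pairs if g == p)
    return correct / len(pairs)


# "_" is an assumed pseudo-tag marking tokens without annotation.
gold = ["ART", "NN", "_", "VVFIN", "_"]
pred = ["ART", "NN", "ADV", "VVINF", "NN"]

acc_ignored = accuracy_ignoring(gold, pred, ignore_tag="_")
acc_all = accuracy_ignoring(gold, pred)
```

Here only the three annotated tokens are scored when `_` is ignored, so unannotated tokens do not count as errors.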
1.1.2
Bugfix: Using the --parallel option no longer changes the order of the sentences.
1.1.1
This version fixes a bug that made it impossible to use the --parallel option when reading from STDIN.
1.1.0
- Bugfix: Removed trailing space from last tag in sentence.
- The new option --parallel makes it possible to use a pool of worker processes to speed up tagging.
- We also print a log message that indicates tagging speed.