This release improves sentence splitting for sentences ending in German closing quotation marks (“).
1.4.3
This bugfix release fixes a bug that occurred in 1.4.2 when using the option -e on some inputs containing control characters and other “nasty” characters.
1.4.2
Control characters and other “nasty” characters (soft hyphens and zero-width spaces) are removed from the input.
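The kind of cleanup described above can be illustrated with a minimal sketch. This is not SoMaJo's actual implementation: the function name strip_nasty_characters is hypothetical, and keeping tab, newline and carriage return is an assumption made here so that line structure survives.

```python
import unicodedata

# Assumption: these two code points represent the "nasty" characters
# named in the release note (soft hyphen and zero-width space).
SOFT_HYPHEN = "\u00ad"
ZERO_WIDTH_SPACE = "\u200b"


def strip_nasty_characters(text):
    """Hypothetical helper: drop control characters, soft hyphens and
    zero-width spaces, but keep tab, newline and carriage return."""
    kept = []
    for char in text:
        if char in (SOFT_HYPHEN, ZERO_WIDTH_SPACE):
            continue
        # Unicode general category "Cc" covers the C0 and C1 control characters.
        if unicodedata.category(char) == "Cc" and char not in "\t\n\r":
            continue
        kept.append(char)
    return "".join(kept)
```

Applied to a string such as "Don\u00adau\u200bdampf\x07schiff\n", this sketch removes the soft hyphen, the zero-width space and the bell character while leaving the trailing newline in place.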
1.4.1
Added support for Unicode emoticons and various other Unicode symbols.
1.4.0
SoMaJo can now perform sentence splitting (using the new option --split_sentences).
1.3.1
SoMaJo is now hosted on GitHub, and the changes in this version reflect that move.