Improvements
- changed the SEETM tokenizer's persisted file extension from `json` to `txt` due to a change in the tokenizer
Bugfixes
- fixed a bug in the `token-to-token mapper`: mappings are now sorted by descending token length, and only full-token matches are allowed. Before the fix, shorter patterns were replaced first, which discarded longer patterns, and partial tokens were wrongly mapped.
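
The mapper fix can be illustrated with a minimal sketch. This is not the SEETM implementation; the function name `apply_token_map`, the `token_map` dictionary, and the assumption that tokens are whitespace-delimited are all hypothetical and chosen only to show the two ideas (longest-first ordering and full-token matching):

```python
import re


def apply_token_map(text, token_map):
    """Replace whole tokens in `text` using `token_map` (hypothetical helper)."""
    if not token_map:
        return text
    # Sort keys by descending length so longer patterns are tried first.
    keys = sorted(token_map, key=len, reverse=True)
    # The lookarounds require whitespace (or a string boundary) on both
    # sides, so only full tokens match, never parts of a longer token.
    pattern = re.compile(
        r"(?<!\S)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\S)"
    )
    return pattern.sub(lambda m: token_map[m.group(0)], text)


# Illustrative mapping, not taken from the project.
token_map = {"new": "NEW", "new york": "NYC"}
print(apply_token_map("i love new york", token_map))
print(apply_token_map("renew newest new", token_map))
```

In this sketch, the longer key `new york` takes precedence over `new`, and `renew` and `newest` are left untouched because `new` is only part of those tokens.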