Added
`segments` now supports orthography profiles described by CSVW metadata.
Backward Incompatibilities
Orthography profiles and the input of `Tokenizer.__call__` is no longer Unicode normalized
by default. I.e. the user is responsible for making sure profiles and tokenization
input are normalized correspondingly. Alternatively, profile data can be normalized
by passing a `form` keyword argument when initializing a `Profile` instance. But
also in this case, tokenization input must be normalized by the user.
While this results in a more cumbersome API, it gives the user in full control, e.g.
to avoid incorrect segmentation when parts of decomposed graphemes are appended to
preseding grapheme clusters.