Tokenizer

Latest version: v3.4.5

3.3.3

* Better support for token-level errors

3.3.2

* Internal refactoring
* Fixes in paragraph handling

3.3.0

* Fixed bug where opening quotes following beginning-of-paragraph markers were incorrectly recognized and normalized.

3.2.0

* Numbers and amounts that consist exclusively of alphabetic words (*sjö hundruð*) are now returned as the original ``TOK.WORD`` tokens (*sjö* and *hundruð*), not coalesced into ``TOK.NUMBER``/``TOK.AMOUNT``/etc. tokens as before.
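The behaviour described for 3.2.0 can be illustrated with a toy stand-in. This is a minimal sketch, not the library's tokenizer: `Kind`, `classify` and `toy_tokenize` are hypothetical names invented for this example, and the classification rule is a deliberate simplification. It only shows the idea that numbers written in digits are tagged as numbers, while numbers spelled out in alphabetic words stay as individual word tokens.

```python
from enum import Enum


class Kind(Enum):
    # Hypothetical token kinds, standing in for TOK.WORD and TOK.NUMBER
    WORD = "WORD"
    NUMBER = "NUMBER"


def classify(piece):
    """Tag digit strings as NUMBER; everything else stays WORD."""
    # Ignore thousands/decimal separators when checking for digits
    normalized = piece.replace(".", "").replace(",", "")
    if normalized.isdigit():
        return Kind.NUMBER
    # Number words such as "sjö" or "hundruð" are not coalesced
    return Kind.WORD


def toy_tokenize(text):
    """Split on whitespace and tag each piece independently."""
    return [(classify(piece), piece) for piece in text.split()]


if __name__ == "__main__":
    for kind, txt in toy_tokenize("sjö hundruð eða 700"):
        print(kind.value, txt)
```

Under this sketch, *sjö* and *hundruð* each come back as separate word tokens, while `700` is tagged as a number.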

3.1.2

* Changed paragraph markers to be `[[` and `]]`, i.e. without spaces, for better accuracy in character offset calculations.
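One way to see why fixed-width markers help with offsets is the following minimal sketch in plain Python. It is an illustration under stated assumptions, not the library's implementation: the helper names `wrap_paragraphs`, `paragraph_starts` and `to_original_offset` are hypothetical, paragraphs are assumed to be separated by a single newline, and the text is assumed not to contain `[[` or `]]` itself.

```python
BEGIN, END = "[[", "]]"


def wrap_paragraphs(text):
    """Wrap each newline-separated paragraph in [[ and ]] markers."""
    return "".join(BEGIN + p + END for p in text.split("\n"))


def paragraph_starts(marked):
    """Return offsets in the marked text where paragraph content begins."""
    starts = []
    pos = marked.find(BEGIN)
    while pos != -1:
        starts.append(pos + len(BEGIN))
        pos = marked.find(BEGIN, pos + len(BEGIN))
    return starts


def to_original_offset(marked_offset, paragraph_index):
    """Map an offset inside paragraph i of the marked text to the original."""
    i = paragraph_index
    # Up to this point the marked text holds (i + 1) begin markers and
    # i end markers, while the original holds i newline characters.
    return marked_offset - (i + 1) * len(BEGIN) - i * len(END) + i


if __name__ == "__main__":
    text = "Fyrsta málsgrein.\nÖnnur málsgrein."
    marked = wrap_paragraphs(text)
    for i, start in enumerate(paragraph_starts(marked)):
        print(i, start, to_original_offset(start, i))
```

Because each marker is exactly two characters with no padding, the correction is a simple multiple of the marker length; no guess is needed about whether a space belongs to the marker or to the original text.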

3.1.1

* Minor fix; `Tok.from_token()` added
