Tokenicer

Latest version: v0.0.4

0.0.4

What's Changed

⚡ A Tokenicer instance now dynamically inherits the native `tokenizer.__class__` of the tokenizer that is passed in or loaded via the `Tokenicer.load()` API.
⚡ CI now tests tokenizers from `64` models
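
The dynamic inheritance described above can be illustrated with a minimal pure-Python sketch. It is not Tokenicer's actual implementation: `FakeTokenizer`, `WrapperMixin`, and `wrap` are hypothetical names, and the sketch only shows one common way to make a wrapper instance pass `isinstance` checks against the wrapped object's native class.

```python
class FakeTokenizer:
    """Hypothetical stand-in for a native tokenizer class."""

    def __init__(self, vocab):
        self.vocab = vocab

    def encode(self, text):
        # Map each whitespace-separated word to its id.
        return [self.vocab[word] for word in text.split()]


class WrapperMixin:
    """Hypothetical mixin carrying the wrapper's extra behaviour."""

    def is_wrapped(self):
        return True


def wrap(tokenizer):
    # Create a class at runtime that subclasses both the mixin and the
    # tokenizer's own class, then rebind the existing instance to it.
    native_cls = type(tokenizer)
    dynamic_cls = type(
        f"Wrapped{native_cls.__name__}",
        (WrapperMixin, native_cls),
        {},
    )
    tokenizer.__class__ = dynamic_cls
    return tokenizer


tok = wrap(FakeTokenizer({"hello": 1, "world": 2}))
print(isinstance(tok, FakeTokenizer))  # the native type is preserved
print(tok.encode("hello world"))       # native methods still work
```

Because the instance keeps its native type, code that checks `isinstance(tok, FakeTokenizer)` keeps working while the extra methods from the mixin are also available.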

* fix mpt pad token bug by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/24
* fix model_config bugs by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/25
* test code clean up by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/26
* Inherits PretrainedTokenizer by Qubitium in https://github.com/ModelCloud/Tokenicer/pull/28
* loop & test all models by CSY-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/30


**Full Changelog**: https://github.com/ModelCloud/Tokenicer/compare/v0.0.2...v0.0.4

0.0.3

What's Changed

A Tokenicer instance now dynamically inherits the native `tokenizer.__class__` of the tokenizer that is passed in or loaded via the `Tokenicer.load()` API.

* fix mpt pad token bug by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/24
* fix model_config bugs by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/25
* test code clean up by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/26
* Inherits PretrainedTokenizer by Qubitium in https://github.com/ModelCloud/Tokenicer/pull/28


**Full Changelog**: https://github.com/ModelCloud/Tokenicer/compare/v0.0.2...v0.0.3

0.0.2

What's Changed


⚡ Auto-fix models that do not set padding_token
⚡ Auto-fix models released with the wrong padding_token: many models incorrectly use eos_token as pad_token, which leads to subtle, hidden errors in post-training and inference whenever batching is used, which is almost always.
⚡ Compatible with all HF Transformers recognized tokenizers
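
To make the eos_token-as-pad_token problem concrete, here is a small sketch of one way it can surface during post-training. The token ids and the `mask_padding` helper are hypothetical, and the sketch does not use Tokenicer itself; it assumes the common convention of masking padded label positions with `-100` so the loss ignores them.

```python
IGNORE_INDEX = -100  # label value conventionally skipped by the loss


def mask_padding(input_ids, pad_id):
    """Replace every position equal to pad_id with IGNORE_INDEX."""
    return [IGNORE_INDEX if tok == pad_id else tok for tok in input_ids]


EOS_ID, PAD_ID = 2, 0        # hypothetical token ids
sequence = [11, 12, EOS_ID]  # a short sample that ends with EOS

# Case 1: the pad token reuses EOS. After padding to length 5 and masking
# by id, the real EOS is hidden too, so it is never a training target.
shared = mask_padding(sequence + [EOS_ID] * 2, pad_id=EOS_ID)

# Case 2: a dedicated pad token keeps the real EOS as a training target.
dedicated = mask_padding(sequence + [PAD_ID] * 2, pad_id=PAD_ID)

print(shared)     # [11, 12, -100, -100, -100]
print(dedicated)  # [11, 12, 2, -100, -100]
```

In the first case the model receives no signal to emit EOS, which is the kind of silent error that only appears once batching and padding are involved.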

* Auto fix pad token by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/5
* Forward to Tokenizer by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/6
* read requirements.txt in setup.py by CSY-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/7
* [CI] add tokenicer forward test by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/10
* add unit tests by CSY-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/11
* refactor by Qubitium in https://github.com/ModelCloud/Tokenicer/pull/8
* add deepseek_v3 map by CL-ModelCloud in https://github.com/ModelCloud/Tokenicer/pull/15

New Contributors
* CSY-ModelCloud made their first contribution in https://github.com/ModelCloud/Tokenicer/pull/1
* Qubitium made their first contribution in https://github.com/ModelCloud/Tokenicer/pull/3
* CL-ModelCloud made their first contribution in https://github.com/ModelCloud/Tokenicer/pull/5

**Full Changelog**: https://github.com/ModelCloud/Tokenicer/commits/v0.0.2
