Ai21-tokenizer

Latest version: v0.9.0

0.9.0

Feature

* feat: Jamba instruct tokenizer (#84) ([`88ff9af`](https://github.com/AI21Labs/ai21-tokenizer/commit/88ff9aff504caa8928d68bcde4430c24fbbc23f1))

0.8.2

Chore

* chore(release): v0.8.2 [skip ci] ([`1146741`](https://github.com/AI21Labs/ai21-tokenizer/commit/11467416dd263b824f6a8711983eb5588fb037dc))

Ci

* ci: add Python 3.12 to test matrix (#82)

* ci: add Python 3.12 to test matrix

* chore: use sentencepiece 0.2.0 or higher

* fix: update poetry.lock ([`8084117`](https://github.com/AI21Labs/ai21-tokenizer/commit/8084117c74813a99b79ecefd12888817470e1838))

Fix

* fix: docs (#83) ([`c26949a`](https://github.com/AI21Labs/ai21-tokenizer/commit/c26949a62d5e612a7ff8132c6e6896b263be7b28))

Unknown

* Update issue templates ([`86ea6e7`](https://github.com/AI21Labs/ai21-tokenizer/commit/86ea6e79a5670c0e8049ac587ed1b5f4b8790ae9))

0.8.1

Chore

* chore(release): v0.8.1 [skip ci] ([`fcacbf8`](https://github.com/AI21Labs/ai21-tokenizer/commit/fcacbf89a590e47d6ac3b8d385c9a6628a3ef4b2))

Fix

* fix: re-ordered parameters in ctor to avoid a breaking change (#79) ([`6c1b608`](https://github.com/AI21Labs/ai21-tokenizer/commit/6c1b6088c0914ffc77b53613047606c398e0557c))

0.8.0

Chore

* chore(release): v0.8.0 [skip ci] ([`c8b54df`](https://github.com/AI21Labs/ai21-tokenizer/commit/c8b54dff67c13587943f03198ec5a4e1dca7be88))

* chore(deps-dev): bump pytest from 7.2.1 to 7.4.4 (#75)

Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.2.1 to 7.4.4.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/7.2.1...7.4.4)

---
updated-dependencies:
- dependency-name: pytest
dependency-type: direct:development
update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: asafgardin <147075902+asafgardin@users.noreply.github.com> ([`081dda3`](https://github.com/AI21Labs/ai21-tokenizer/commit/081dda305ebc33af78ad433d511bdef3d63e1307))

Feature

* feat: Add start_of_line to decode (#77)

* feat: Add start_of_line param to decode

* test: added unittest with start_of_line=True and False ([`182a8d1`](https://github.com/AI21Labs/ai21-tokenizer/commit/182a8d10020862c233f7f67cddb965eee2398b98))
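
The entry above only states that `decode` gained a `start_of_line` parameter. As a rough illustration of why such a flag is useful with SentencePiece-style pieces, the sketch below assumes the flag controls whether the leading word-boundary space is dropped. `decode_pieces` is a hypothetical helper written for this note and is not part of ai21-tokenizer.

```python
# Toy illustration only: NOT the ai21-tokenizer implementation.
# Assumption: start_of_line=True means the text begins a line, so the
# leading space produced by the first word-boundary marker is dropped.

SPACE_MARKER = "\u2581"  # SentencePiece-style word-boundary marker


def decode_pieces(pieces, start_of_line=True):
    """Join subword pieces into text (hypothetical helper)."""
    text = "".join(pieces).replace(SPACE_MARKER, " ")
    if start_of_line and text.startswith(" "):
        # At the start of a line the leading space is an encoding artifact.
        text = text[1:]
    return text


pieces = [SPACE_MARKER + "Hello", SPACE_MARKER + "world"]
print(repr(decode_pieces(pieces)))                       # start of a line
print(repr(decode_pieces(pieces, start_of_line=False)))  # continuation chunk
```

Keeping the leading space matters when token chunks are decoded separately and concatenated afterwards, for example while streaming output.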

0.7.0

Chore

* chore(release): v0.7.0 [skip ci] ([`26f34b2`](https://github.com/AI21Labs/ai21-tokenizer/commit/26f34b290cdc6b5166872bd6b8af5ca53d736936))

Feature

* feat: Init tokenizer from filehandle (#76)

* feat: allow creating JurassicTokenizer from model file handle

* fix: Add default for model_path and model_file_handle

* feat: Add JurassicTokenizer.from_file_path classmethod

* fix: remove model_path=None in JurassicTokenizer.from_file_handle

* fix: rename _assert_exactly_one to _validate_init and make it not static

* refactor: semantics

* test: Added tests

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com> ([`dcb73a7`](https://github.com/AI21Labs/ai21-tokenizer/commit/dcb73a72348e576b06cd4a066e06141ceae37a44))
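
The commit above describes a constructor that accepts either a model path or an open file handle, validated by `_validate_init`, plus `from_file_path` and `from_file_handle` classmethods. A minimal sketch of that pattern follows. `ToyTokenizer` and its `model_bytes` attribute are hypothetical, and the real `JurassicTokenizer` signatures may differ (for example, they may take additional arguments).

```python
import io
from typing import BinaryIO, Optional


class ToyTokenizer:
    """Hypothetical stand-in showing the 'exactly one source' pattern."""

    def __init__(
        self,
        model_path: Optional[str] = None,
        model_file_handle: Optional[BinaryIO] = None,
    ):
        self._validate_init(model_path, model_file_handle)
        if model_path is not None:
            with open(model_path, "rb") as f:
                self.model_bytes = f.read()
        else:
            self.model_bytes = model_file_handle.read()

    def _validate_init(self, model_path, model_file_handle) -> None:
        # Reject both "neither given" and "both given".
        if (model_path is None) == (model_file_handle is None):
            raise ValueError(
                "Provide exactly one of model_path or model_file_handle"
            )

    @classmethod
    def from_file_path(cls, model_path: str) -> "ToyTokenizer":
        return cls(model_path=model_path)

    @classmethod
    def from_file_handle(cls, model_file_handle: BinaryIO) -> "ToyTokenizer":
        return cls(model_file_handle=model_file_handle)


# Usage: load from an in-memory handle, e.g. bytes fetched from object storage.
tok = ToyTokenizer.from_file_handle(io.BytesIO(b"fake-model-bytes"))
print(len(tok.model_bytes))
```

Named constructors keep call sites explicit about the source, while the shared validation keeps the plain constructor from being called with an ambiguous combination.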

0.6.0

Chore

* chore(release): v0.6.0 [skip ci] ([`7b8348d`](https://github.com/AI21Labs/ai21-tokenizer/commit/7b8348d303eb54c4a75ca1c58be5c08c35ec3de8))

* chore: add test case for encode with is_start=False (#74)

* chore: add test case for encode with is_start=False

* fix: split is_start=False to a different testcase ([`77c0a39`](https://github.com/AI21Labs/ai21-tokenizer/commit/77c0a39d1bcde81cc0166a512eb454dad6d3c569))

Feature

* feat: Add decode with offsets (#73)

* feat: Add decode_with_offsets() to JurassicTokenizer

* refactor: remove kwargs from decode_with_offsets since it's not used

* chore: Add unittest for decode and for offsets

* fix: test only decode_with_offsets

* fix: dummy for returned offsets in decode_with_offsets ([`a5a7bb4`](https://github.com/AI21Labs/ai21-tokenizer/commit/a5a7bb4b27fa4f74a0b1a1d6874599556a35c1c5))

* feat: Add the is_start parameter to JurassicTokenizer.encode() (#72)

* feat: Add the is_start parameter to JurassicTokenizer.encode()

* refactor: take 'is_start' from kwargs ([`296bda5`](https://github.com/AI21Labs/ai21-tokenizer/commit/296bda5578edd57ff58d6763b3ccd5b9ba709795))
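
The 0.6.0 notes mention `decode_with_offsets()` without describing its return value. As a hedged illustration of what per-token character offsets usually mean, the sketch below returns the decoded text together with a `(start, end)` span for each piece. `toy_decode_with_offsets` is a hypothetical function, and the span convention (end-exclusive character indices) is an assumption rather than the library's documented behavior.

```python
from typing import List, Tuple


def toy_decode_with_offsets(pieces: List[str]) -> Tuple[str, List[Tuple[int, int]]]:
    """Concatenate pieces and record each piece's character span.

    Hypothetical helper: spans are (start, end) with end exclusive.
    """
    text = ""
    offsets: List[Tuple[int, int]] = []
    for piece in pieces:
        start = len(text)
        text += piece
        offsets.append((start, len(text)))
    return text, offsets


text, offsets = toy_decode_with_offsets(["Hello", " wor", "ld"])
for start, end in offsets:
    # Each span slices the decoded text back to the piece that produced it.
    print((start, end), repr(text[start:end]))
```

Offsets of this kind let a caller map tokens back to the decoded string, for example to highlight the text covered by a particular token.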
