Bpetokenizer

Latest version: v1.2.1

The latest version of bpetokenizer with no known security vulnerabilities is 1.2.1. We recommend installing version 1.2.1.

The information on this page was curated by experts in our Cybersecurity Intelligence Team.

Latest release: v1.2.1 on June 6, 2024
License: MIT (MIT License)

Description

A Byte Pair Encoding (BPE) tokenizer that algorithmically follows the GPT tokenizer (tiktoken) and lets you train your own tokenizer. It handles special tokens and uses a customizable regex pattern for tokenization (the GPT-4 regex pattern is included). It supports `save` and `load` for tokenizers in the `json` and `file` formats. `bpetokenizer` also supports [pretrained](bpetokenizer/pretrained/) tokenizers.
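To make the description concrete, the following is a minimal sketch of byte-level BPE training, encoding, and decoding in plain Python. It is not the `bpetokenizer` API: the function names `train_bpe`, `merge`, `encode`, and `decode` are hypothetical, and the sketch assumes no regex pre-splitting and no special tokens, both of which the package is described as supporting.

```python
from __future__ import annotations

from collections import Counter


def merge(ids: list[int], pair: tuple[int, int], new_id: int) -> list[int]:
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text: str, num_merges: int) -> dict[tuple[int, int], int]:
    """Learn up to `num_merges` merges, starting from raw UTF-8 bytes (ids 0..255)."""
    ids = list(text.encode("utf-8"))
    merges: dict[tuple[int, int], int] = {}
    for step in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break  # fewer than two tokens left, nothing to merge
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        new_id = 256 + step
        ids = merge(ids, best, new_id)
        merges[best] = new_id
    return merges


def encode(text: str, merges: dict[tuple[int, int], int]) -> list[int]:
    """Apply the learned merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():  # dicts preserve insertion order
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids: list[int], merges: dict[tuple[int, int], int]) -> str:
    """Rebuild each token's byte string, then decode the concatenation as UTF-8."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


merges = train_bpe("aaabdaaabac", num_merges=3)
tokens = encode("aaabdaaabac", merges)
print(tokens, decode(tokens, merges))
```

A GPT-style tokenizer would typically split the text with a regex first and run the merges inside each chunk, so that merges never cross chunk boundaries; the sketch omits that step for brevity.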

Vulnerabilities

No known vulnerabilities found

Versions (9)

  • 1.2.1
  • 1.2.0
  • 1.0.32
  • 1.0.31
  • 1.0.4
  • 1.0.3
  • 1.0.2
  • 1.0.1
  • 1.0.0