Latest version: v0.1.8
utoken is a universal tokenizer (multilingual word segmenter) that divides text into words, punctuation, and special tokens such as numbers, URLs, XML tags, email addresses, and hashtags. It comes with a companion detokenizer.
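
To make the description concrete, here is a minimal sketch of this kind of token splitting, using only Python's standard `re` module. It is not utoken's implementation or API: the names `toy_tokenize` and `toy_detokenize` and the patterns themselves are hypothetical, chosen for illustration, and they cover only simple English-like input.

```python
import re

# Illustrative toy tokenizer; NOT utoken's actual implementation or API.
# Alternatives are tried left to right, so special tokens come before plain words.
TOKEN_PATTERN = re.compile(
    r"""
      https?://[^\s<>]+[^\s<>.,;:!?)]   # URLs, without trailing punctuation
    | [\w.+-]+@[\w-]+(?:\.[\w-]+)+      # email addresses
    | </?\w+[^<>]*>                     # XML tags
    | \#\w+                             # hashtags
    | \d+(?:[.,]\d+)*                   # numbers such as 3.50 or 1,000
    | \w+(?:'\w+)?                      # words, with an optional apostrophe part
    | [^\w\s]                           # any other single punctuation character
    """,
    re.VERBOSE,
)


def toy_tokenize(text):
    """Split text into words, punctuation, and special tokens (hypothetical helper)."""
    return TOKEN_PATTERN.findall(text)


def toy_detokenize(tokens):
    """Naive inverse: join with spaces, then reattach closing punctuation."""
    return re.sub(r"\s+([.,;:!?])", r"\1", " ".join(tokens))


sample = "Email bob@example.com or visit https://example.org/docs, price 3.50 #deal <b>now</b>!"
print(toy_tokenize(sample))
print(toy_detokenize(["Hello", ",", "world", "!"]))
```

A real multilingual tokenizer has to handle far more than this sketch does, such as scripts without whitespace between words, abbreviations, and language-specific contractions, which is why a dedicated tool is used in practice.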
No known vulnerabilities found