Code-tokenize

Latest version: v0.2.0

0.2.0

Major API redesign
**As of v0.2.0, code_tokenize relies mainly on the visitor pattern to traverse the AST.**

Changes

* `tokenize` now processes source code by parsing it into an AST and traversing that AST with a visitor
* Custom tokenizing visitors can be defined per language
* For Python, we corrected the tokenization process: indentation is now computed from the AST
* Code is extensively tested by parsing large libraries (Python and Java)
* More languages are more tightly integrated
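
The visitor-based design described above can be sketched in a few lines. This is a minimal, self-contained illustration and not code_tokenize's actual implementation: `Node`, `TokenizingVisitor`, `IndentAwareVisitor`, and the `<INDENT>`/`<DEDENT>` markers are hypothetical names chosen for this example. It shows a generic visitor that collects leaf tokens, plus a language-specific subclass that derives indentation from the tree structure rather than from whitespace.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A toy syntax-tree node (hypothetical); leaves carry one token's text."""
    type: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


class TokenizingVisitor:
    """Walks the tree and collects the text of every leaf as a token."""

    def __init__(self):
        self.tokens = []

    def visit(self, node):
        # Dispatch to visit_<node type> if it exists, else use the default.
        handler = getattr(self, "visit_" + node.type, self.generic_visit)
        handler(node)

    def generic_visit(self, node):
        if not node.children:
            self.tokens.append(node.text)
        for child in node.children:
            self.visit(child)


class IndentAwareVisitor(TokenizingVisitor):
    """Language-specific visitor: wraps every block in indentation markers."""

    def visit_block(self, node):
        self.tokens.append("<INDENT>")
        self.generic_visit(node)
        self.tokens.append("<DEDENT>")


# Toy tree for:  if x:
#                    pass
tree = Node("if_statement", children=[
    Node("keyword", "if"),
    Node("identifier", "x"),
    Node("punctuation", ":"),
    Node("block", children=[Node("keyword", "pass")]),
])

plain = TokenizingVisitor()
plain.visit(tree)

indented = IndentAwareVisitor()
indented.visit(tree)

print(plain.tokens)     # ['if', 'x', ':', 'pass']
print(indented.tokens)  # ['if', 'x', ':', '<INDENT>', 'pass', '<DEDENT>']
```

Because the indentation markers are emitted when the visitor enters and leaves a `block` node, they follow the tree's nesting and do not depend on how the source text happens to be laid out.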

0.1.0

First main release of code.tokenize and the first version to extend the functionality of the underlying AST parser.

Changes
* `tokenize` now parses source code with language-specific configuration
* For **Python**, we automatically detect indentation and add special tokens
* Code is now extensively tested by parsing large libraries (Python and Java)
* Updated the documentation to make usage easier
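
As a rough illustration of what indentation detection with special tokens means, the sketch below uses Python's standard-library `tokenize` module, not code_tokenize itself. The function name `tokens_with_layout` and the marker strings `<INDENT>`, `<DEDENT>`, and `<NEWLINE>` are assumptions made for this example; the library's own token names may differ.

```python
import io
import tokenize


def tokens_with_layout(source):
    """Return token strings, replacing layout tokens with explicit markers."""
    markers = {
        tokenize.INDENT: "<INDENT>",
        tokenize.DEDENT: "<DEDENT>",
        tokenize.NEWLINE: "<NEWLINE>",
    }
    # Tokens that carry no information for this illustration.
    skipped = {tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER}

    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in markers:
            result.append(markers[tok.type])
        elif tok.type not in skipped:
            result.append(tok.string)
    return result


print(tokens_with_layout("def f():\n    return 1\n"))
# ['def', 'f', '(', ')', ':', '<NEWLINE>', '<INDENT>', 'return', '1', '<NEWLINE>', '<DEDENT>']
```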

Minor features (still under test)
* AST-path-based detection of token types (variable usages, definitions, or function calls)
* Language-specific configuration for Java
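
One way to picture AST-path-based detection of token types is to classify each identifier by the ancestors that lead to it. The sketch below does this with Python's built-in `ast` module as a stand-in; it is not how code_tokenize implements the feature, and `classify_identifiers` and the labels `definition`, `usage`, and `call` are hypothetical. Treating every assignment target as a definition is a deliberate simplification.

```python
import ast


def classify_identifiers(source):
    """Label each identifier by inspecting the AST path that leads to it."""
    labels = []

    def walk(node, path):
        if isinstance(node, ast.Name):
            parent = path[-1] if path else None
            if isinstance(parent, ast.Call) and parent.func is node:
                kind = "call"        # the name is the callee of a call
            elif isinstance(node.ctx, ast.Store):
                kind = "definition"  # the name is being assigned to
            else:
                kind = "usage"       # the name is only being read
            labels.append((node.id, kind))
        for child in ast.iter_child_nodes(node):
            walk(child, path + [node])

    walk(ast.parse(source), [])
    return labels


print(classify_identifiers("total = add(x, 1)\nprint(total)\n"))
# [('total', 'definition'), ('add', 'call'), ('x', 'usage'), ('print', 'call'), ('total', 'usage')]
```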

0.0.1

The first version of code.tokenize.
This version introduces the following features:

- Introduction of the Token API
- AST-backed tokenization: the token interface enables easy access to the complete AST structure
- Fast AST parsing backend based on Tree-Sitter
- Full support of Tree-Sitter: all languages supported by Tree-Sitter can currently be tokenized
- Auto loading: the parser definition for each language is downloaded automatically

Minor features (still under test):
- Convention-based statement head identification (the starting token of a statement)
- Convention-based statement splitting
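
The release notes do not spell out which conventions are used, so the following is only a sketch under an assumed one: a statement ends at a logical line break, and its head is its first token. It relies on Python's standard-library `tokenize` module and not on code_tokenize; `split_statements` and `statement_heads` are hypothetical helper names.

```python
import io
import tokenize

# Layout tokens that do not belong to any statement in this illustration.
_SKIPPED = {
    tokenize.NL,
    tokenize.COMMENT,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def split_statements(source):
    """Group token strings into statements, one per logical line (assumed convention)."""
    statements, current = [], []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NEWLINE:
            if current:
                statements.append(current)
                current = []
        elif tok.type not in _SKIPPED:
            current.append(tok.string)
    if current:
        statements.append(current)
    return statements


def statement_heads(source):
    """Return the starting token of every statement."""
    return [statement[0] for statement in split_statements(source)]


example = "x = 1\nif x:\n    y = 2\n"
print(split_statements(example))  # [['x', '=', '1'], ['if', 'x', ':'], ['y', '=', '2']]
print(statement_heads(example))   # ['x', 'if', 'y']
```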
