Dialectid

Latest version: v0.0.5

0.0.5

DenseBoW can encode a text as a matrix in which the columns correspond to the tokens and each row holds the decision values of a Support Vector Machine trained to identify the token that row represents.
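
As an illustration only, the following is a toy reading of that description and not the dialectid implementation. The character-bigram features, the hand-set linear models standing in for trained Support Vector Machines, and the names `features`, `decision_value`, and `dense_encode` are all assumptions made for this sketch.

```python
def features(token):
    # Character bigrams with boundary markers (an assumed feature set).
    padded = f"~{token}~"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def decision_value(model, token):
    # Linear decision function: sum of matching feature weights plus a bias.
    weights, bias = model
    return sum(weights.get(f, 0.0) for f in features(token)) + bias


# Hypothetical hand-set models, one per target token. In practice these
# would be linear SVMs learned from data; the weights here are made up.
models = {
    "che": ({"~c": 1.0, "ch": 1.0, "he": 1.0, "e~": 1.0}, -3.0),
    "wey": ({"~w": 1.0, "we": 1.0, "ey": 1.0, "y~": 1.0}, -3.0),
}


def dense_encode(text, models):
    # One row per model, one column per token of the input text.
    tokens = text.lower().split()
    matrix = [[decision_value(m, tok) for tok in tokens]
              for m in models.values()]
    return tokens, matrix


tokens, matrix = dense_encode("hola che", models)
```

In this toy setup a positive entry means the model in that row recognizes the token in that column, and a negative entry means it does not.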

0.0.5b

DenseBoW can encode a text as a matrix in which the columns correspond to the tokens and each row holds the decision values of a Support Vector Machine trained to identify the token that row represents.

0.0.4

This version includes the subword model, which is the default in `DialectId`.

0.0.3

This version fixes a typo affecting the French-speaking countries.

0.0.2

This version includes the class `DialectId`, which identifies the dialect (country) of a text, given its language.

0.0.1

The first version of dialectid. It has a Bag of Words (BoW) model whose weights were estimated on about 4 million ($2^{22}$) tweets selected uniformly from the Spanish-speaking countries.
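
To make the idea of estimating BoW weights from a corpus concrete, here is a minimal sketch on a three-document toy corpus. The release note does not say which weighting scheme is used; inverse document frequency (IDF) with L2 normalization is an assumption here, and `tokenize`, `estimate_idf`, and `encode` are hypothetical names, not part of dialectid.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; a real model would use a richer tokenizer.
    return text.lower().split()


def estimate_idf(corpus):
    # Weight of a token: log(number of documents / documents containing it).
    n_docs = len(corpus)
    doc_freq = Counter()
    for text in corpus:
        doc_freq.update(set(tokenize(text)))
    return {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}


def encode(text, idf):
    # Sparse TF-IDF vector with unit L2 norm; unknown tokens are ignored.
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {tok: tf * idf[tok] for tok, tf in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm > 0 else {}


# Toy corpus standing in for the tweets mentioned above.
corpus = ["hola che boludo", "hola tio vale", "hola wey orale"]
idf = estimate_idf(corpus)
vector = encode("hola che", idf)

# The corpus size quoted in the note: 2^22 is roughly 4.2 million.
n_tweets = 2 ** 22
```

Under this scheme a token that appears in every document ("hola") receives weight zero, while a token seen in a single document ("che") carries all of the weight.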

data
This release exists only to provide a place to store the data used by dialectid.
