Text-classification-baseline

Latest version: v0.1.6

0.1.6

- fixed token frequency support (added in 85)
- fixed threshold selection for binary classification (added in 86)

0.1.5

- added pymorphy2 lemmatization (81)
- added token frequency support (85)
- added threshold selection for binary classification (86)
- added arbitrary save folder name (83)

---

pymorphy2 lemmatization ([config.yaml](https://github.com/dayyass/text-classification-baseline#config))
```yaml
# preprocessing
# (included in resulting model pipeline, so preserved for inference)
preprocessing:
  lemmatization: pymorphy2
```

token frequency support
- `text_clf.token_frequency.get_token_frequency(path_to_config)` - get token frequency of the **train dataset** according to the config file parameters
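
For intuition about what a token frequency table looks like, here is a minimal standard-library sketch. It is not the implementation behind `get_token_frequency`: the helper name `count_token_frequency`, the lowercase whitespace tokenizer, and the toy corpus are all assumptions made for illustration, whereas the library derives its tokens from the config file parameters.

```python
from collections import Counter


def count_token_frequency(texts):
    """Count how often each lowercased, whitespace-separated token occurs.

    Hypothetical helper for illustration only; not part of text_clf.
    """
    counter = Counter()
    for text in texts:
        counter.update(text.lower().split())
    return counter


# Hypothetical toy corpus standing in for the train dataset.
train_texts = ["the cat sat", "The dog sat down"]
token_frequency = count_token_frequency(train_texts)

# Print tokens from most to least frequent.
for token, count in token_frequency.most_common():
    print(token, count)
```

A table like this can help when deciding which rare or overly common tokens to filter out before training.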

threshold selection for binary classification
- `text_clf.pr_roc_curve.get_precision_recall_curve(path_to_model_folder)` - get *precision* and *recall* metrics for precision-recall curve
- `text_clf.pr_roc_curve.get_roc_curve(path_to_model_folder)` - get *false positive rate (fpr)* and *true positive rate (tpr)* metrics for ROC curve
- `text_clf.pr_roc_curve.plot_precision_recall_curve(precision, recall)` - plot *precision-recall curve*
- `text_clf.pr_roc_curve.plot_roc_curve(fpr, tpr)` - plot *ROC curve*
- `text_clf.pr_roc_curve.plot_precision_recall_f1_curves_for_thresholds(precision, recall, thresholds)` - plot *precision*, *recall*, *f1-score* curves for probability thresholds
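
The curves above are typically read to pick a probability threshold. One common criterion, shown here as a sketch rather than as the library's own method, is to choose the threshold that maximizes the F1-score. The helper `best_f1_threshold` and the sample values are hypothetical, and the code assumes `precision`, `recall`, and `thresholds` sequences are already available.

```python
def best_f1_threshold(precision, recall, thresholds):
    """Return the (threshold, f1) pair with the highest F1-score.

    Hypothetical helper for illustration only; not part of text_clf.
    precision and recall may carry one extra trailing point (as returned
    by scikit-learn's precision_recall_curve); zip ignores it.
    """
    best_threshold, best_f1 = None, -1.0
    for p, r, t in zip(precision, recall, thresholds):
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = t, f1
    return best_threshold, best_f1


# Hypothetical values, not produced by a real model.
precision = [0.5, 0.6, 0.8, 1.0, 1.0]
recall = [1.0, 0.9, 0.8, 0.4, 0.0]
thresholds = [0.2, 0.4, 0.6, 0.8]

threshold, f1 = best_f1_threshold(precision, recall, thresholds)
print(threshold, round(f1, 3))
```

Maximizing F1 weighs precision and recall equally; when one kind of error is more costly, a different point on the curve may be the better choice.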

arbitrary save folder name ([config.yaml](https://github.com/dayyass/text-classification-baseline#config))
```yaml
experiment_name: model
```

0.1.4

- fixed `load_20newsgroups.py` (65 71)
- added Makefile (71)
- added logging confusion matrix (72)
- replaced all "valid" occurrences with "test" (74)
- updated docstrings (77)
- changed Python interface: the `train` function now returns `model` and `target_names_mapping` (78)

0.1.3

- added hyper-parameters tuning (58)

0.1.2

- fixed bug with multiple logging (55)

0.1.1

- added logging (43)
- added unittests (49)
- added CI with linter, tests, codecov (46 49)
- added docker (48)
