Nerblackbox

Latest version: v1.0.0

1.0.0

Added
- Support for additional model architectures: RoBERTa, DeBERTa
- Documentation: reproduction of results

Changed
- Renamed class: Experiment -> Training
- Renamed training parameters: prune_ratio_train -> train_fraction (+ same for val & test)
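
For configurations written before 1.0.0, the parameter rename above could be applied mechanically. The following is a minimal sketch, not part of nerblackbox: the helper `migrate_parameters` is hypothetical, the names `val_fraction` and `test_fraction` are inferred from "+ same for val & test", and `max_epochs` in the usage example is only a placeholder key.

```python
from typing import Any, Dict

# Old name -> new name. Only the train entry is spelled out in the changelog;
# the val and test entries are assumptions based on "+ same for val & test".
RENAMED_PARAMETERS = {
    "prune_ratio_train": "train_fraction",
    "prune_ratio_val": "val_fraction",
    "prune_ratio_test": "test_fraction",
}


def migrate_parameters(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with pre-1.0.0 names replaced by 1.0.0 names."""
    migrated: Dict[str, Any] = {}
    for name, value in params.items():
        new_name = RENAMED_PARAMETERS.get(name, name)
        if new_name in migrated:
            # Both the old and the new spelling were given: refuse to guess.
            raise ValueError(f"parameter {new_name!r} specified twice")
        migrated[new_name] = value
    return migrated


# Hypothetical usage: unknown keys pass through unchanged.
old_params = {"prune_ratio_train": 0.5, "max_epochs": 3}
new_params = migrate_parameters(old_params)
```

The helper returns a new dictionary and leaves the input untouched, so it can be run over stored configurations without side effects.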

Fixed
- Controlled exception in case of multiple GPUs ([#6](https://github.com/flxst/nerblackbox/pull/6))
- Download data from LabelStudio ([#9](https://github.com/flxst/nerblackbox/pull/9), [#10](https://github.com/flxst/nerblackbox/pull/10))

0.0.15

Added
- Annotation tool integration (Doccano and LabelStudio)
- Demonstration notebooks

Changed
- Restructured docs
- Reduced CLI to two commands (nerblackbox mlflow & nerblackbox tensorboard)
- Dropped support for Python version 3.11
- Upgraded dependencies (fixing potential security vulnerabilities)

0.0.14

Not secure
Added
- Model: prediction on file
- Model: evaluation of any model on any dataset

Changed
- API: complete renewal using classes Store, Dataset, Experiment, Model
- Supported Python versions: 3.8 to 3.11
- Dataset: no shuffling by default

Fixed
- Model: base model with NER classification head can be loaded

0.0.13

Not secure
Added
- NerModelPredict: GPU batch inference
- TextEncoder class for custom data preprocessing
- HuggingFace datasets integration: enable subsets
- HuggingFace datasets: support for sucx_ner

Changed
- NerModelPredict: improved inference time and data post-processing
- API: load best model of experiment directly (instead of via ExperimentResults)
- Upgraded pytorch-lightning

0.0.12

Not secure
Added
- Adaptive fine-tuning
- Integration of HuggingFace Datasets
- Integration of raw (unpretokenized) data
- Integration of different annotation schemes and seamless conversion between them
- Option to specify experiments dynamically (instead of using a config file)
- Option to add special tokens
- New built-in dataset: Swe-NERC
- Use seeds for reproducibility
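
As a generic illustration of why seeds matter for reproducibility (for example when data is shuffled), here is a small sketch using only the Python standard library. It does not show how nerblackbox handles seeds internally; the function `seeded_shuffle` is hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def seeded_shuffle(samples: Sequence[T], seed: int) -> List[T]:
    """Return a shuffled copy of samples; the same seed gives the same order."""
    rng = random.Random(seed)  # private generator, global state is not touched
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return shuffled


# Two runs with the same seed produce identical orderings.
run_a = seeded_shuffle(range(10), seed=42)
run_b = seeded_shuffle(range(10), seed=42)
```

Using a dedicated `random.Random` instance keeps the example independent of any other code that draws from the global random state.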

Changed
- Validation only on single metric (e.g. loss) during training
- Shuffling of all datasets (train, val, test)
- Results: epochs start counting from 1 instead of 0
- Results: compute standard version of macro-average, plus number of contributing classes
- Results: add precision and recall

Fixed
- All models that are based on WordPiece tokenizer work
- Early stopping: use last model instead of stopped epoch model

0.0.11

Not secure
Added
- NerModelPredict: predict on token or entity level
- Evaluation entity level: compute metrics for single labels
- Evaluation token level: confusion matrix
- Evaluation token & entity level: number of predicted classes

Changed
- Evaluation token level: use plain annotation scheme
- Migrate to pytorch-lightning==1.3.7, seqeval==1.2.2, mlflow==1.8.0
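
To make the "plain annotation scheme" used for token-level evaluation concrete, the sketch below converts BIO tags to plain tags by dropping the `B-` and `I-` prefixes. This is an assumption about what "plain" means here, and `bio_to_plain` is a hypothetical helper rather than a nerblackbox function.

```python
from typing import List


def bio_to_plain(tags: List[str]) -> List[str]:
    """Strip the B-/I- prefix from each BIO tag; other tags such as O are kept."""
    plain = []
    for tag in tags:
        if tag.startswith(("B-", "I-")):
            plain.append(tag[2:])
        else:
            plain.append(tag)
    return plain


# Hypothetical example sequence.
plain_tags = bio_to_plain(["B-PER", "I-PER", "O", "B-ORG"])
```

With plain tags, each token is compared only on its entity class, which suits token-level metrics such as a confusion matrix.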
