Span-marker

Latest version: v1.5.0


1.5.0

Added

- Added support for BILOU tagging schemes.
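
For readers unfamiliar with this family of tagging schemes, the sketch below decodes BILOU-style tags (B = begin, I = inside, L = last, O = outside, U = unit-length entity) into labelled spans. It is a minimal illustration in plain Python, not SpanMarker's internal label normalization; the function name `bilou_to_spans` and the example tags are hypothetical.

```python
from typing import List, Tuple


def bilou_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Decode BILOU tags into (label, start, end) spans; `end` is exclusive.

    Illustrative only: malformed sequences (e.g. an "L" without a "B")
    are silently skipped rather than reported.
    """
    spans = []
    start = None  # index where the currently open entity began
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "U":
            # Single-token entity.
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":
            # Open a new multi-token entity.
            start = i
        elif prefix == "L" and start is not None:
            # Close the open entity, including this token.
            spans.append((label, start, i + 1))
            start = None
        elif prefix == "O":
            start = None
        # "I" tags simply continue the open entity.
    return spans


tags = ["B-PER", "L-PER", "O", "U-LOC", "O", "B-ORG", "I-ORG", "L-ORG"]
print(bilou_to_spans(tags))
```

Compared with plain IOB tags, the explicit "L" and "U" markers make entity boundaries unambiguous without looking at the following token.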

Changed

- Changed the error when an empty sentence is provided to the tokenizer.
- Using spaCy `nlp.pipe` now processes texts sentence-wise, just like for `nlp(...)`.

Fixed

- No longer override `language` metadata from the dataset if the language was also set manually via `SpanMarkerModelCardData`.
- No longer crash on `predict` with `ValueError: Failed to concatenate on axis=1 ...` if the first sentence in a list of sentences is just one word.

1.4.0

Added

- Added `SpanMarkerModel.generate_model_card()` method to get a model card string.
- Added `SpanMarkerModelCardData` that should be passed to `SpanMarkerModel.from_pretrained` with additional information like `language`, `license`, `model_name`, `model_id`, `encoder_name`, `encoder_id`, `dataset_name`, `dataset_id`, `dataset_revision`.
- Added `transformers` `pipeline` support, e.g. `pipeline(task="span-marker", model="tomaarsen/span-marker-mbert-base-multinerd")`.
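
The `pipeline` call quoted above could be wrapped roughly as follows. This is a sketch under assumptions: it assumes that importing `span_marker` is what makes the "span-marker" task available to `transformers`, and that each returned entity is a dictionary with `span`, `label` and `score` keys. The helper names `build_span_marker_pipeline` and `summarize_entities` are hypothetical.

```python
from typing import Any, Dict, List


def build_span_marker_pipeline(
    model_id: str = "tomaarsen/span-marker-mbert-base-multinerd",
):
    """Create a `transformers` pipeline for the "span-marker" task."""
    # Imported lazily so that `summarize_entities` below can be used
    # without these packages installed.
    import span_marker  # noqa: F401  (assumption: registers the task on import)
    from transformers import pipeline

    return pipeline(task="span-marker", model=model_id)


def summarize_entities(entities: List[Dict[str, Any]]) -> List[str]:
    """Format entity dicts as 'span -> label (score)' strings.

    Assumes each dict has "span", "label" and "score" keys.
    """
    return [f'{e["span"]} -> {e["label"]} ({e["score"]:.2f})' for e in entities]


# Usage (downloads the model on first call, so it is left commented out):
# pipe = build_span_marker_pipeline()
# entities = pipe("Amelia Earhart flew across the Atlantic to Paris.")
# print("\n".join(summarize_entities(entities)))
```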

Changed

- Heavily improved the automatically generated model card.
- Evaluating outside of training now returns per-label outputs instead of only "overall" F1, precision and recall.
- Warn if the used tokenizer distinguishes between punctuation directly attached to a word and punctuation separated from a word by a space. If so, inference with that model requires the punctuation to be split from the words.
- Improve label normalization speed.
- Allow calling `SpanMarkerModel.from_pretrained` with a pre-initialized `SpanMarkerConfig`.

Deprecated

- Deprecated Python 3.7.

Fixed

- Fixed tokenization mismatch between training and inference for XLM-RoBERTa models: allows for normal inference of those models.
- Resolve niche bug when `TrainingArguments` are not provided.

1.3.0

Added

- Added an `overwrite_entities` parameter to the spaCy pipeline component to allow for overwriting spaCy entities.
- Added `.pipe()` method to spaCy integration to allow for batched inference.

Changed

- Stop overwriting spaCy entities by default.
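
A rough sketch of how the `overwrite_entities` parameter and `.pipe()` might be used together in spaCy. It assumes the component is registered under the factory name `span_marker` and takes a `model` key in its config, and that the `en_core_web_sm` spaCy model is installed; the helper names are hypothetical.

```python
from typing import Any, Dict


def span_marker_component_config(
    model_id: str, overwrite_entities: bool = False
) -> Dict[str, Any]:
    """Build the config dict for the spaCy component.

    `overwrite_entities` defaults to False here, mirroring the 1.3.0
    change that stops overwriting spaCy entities by default.
    """
    return {"model": model_id, "overwrite_entities": overwrite_entities}


def build_nlp(model_id: str, overwrite_entities: bool = False):
    """Load a spaCy pipeline and append the SpanMarker component."""
    import spacy  # imported lazily; requires spaCy and span_marker installed

    nlp = spacy.load("en_core_web_sm")
    # Assumption: "span_marker" is the registered factory name.
    nlp.add_pipe(
        "span_marker",
        config=span_marker_component_config(model_id, overwrite_entities),
    )
    return nlp


# Usage (downloads models, so it is left commented out):
# nlp = build_nlp("tomaarsen/span-marker-mbert-base-multinerd", overwrite_entities=True)
# for doc in nlp.pipe(["First text.", "Second text."]):  # batched inference
#     print([(ent.text, ent.label_) for ent in doc.ents])
```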

1.2.5

Fixed

- Allow for immutable `TrainingArguments` from newer `transformers` release.

1.2.4

Fixed

- Resolved broken license information.

1.2.3

Fixed

- Fix crash in spaCy inference when using subsequent whitespace.
