Edsnlp

Latest version: v0.16.0

0.10.0

Added

- New unified `edsnlp.data` API (json, brat, spark, pandas) and LazyCollection object
to efficiently read / write data from / to different formats & sources.
- New unified processing API to select the execution backend via `data.set_processing(...)`
- The training scripts can now use data from multiple concatenated adapters
- Support quantized transformers (compatible with multiprocessing as well!)
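
As a rough illustration of how the data API and the processing API fit together, here is a minimal sketch. It assumes a pandas DataFrame with `note_id` and `note_text` columns (the layout read by the `"omop"` converter); the function name `annotate_notes`, the `eds.matcher` terms and the choice of pipes are made up for the example, and the exact call signatures should be checked against the EDS-NLP documentation for your version.

```python
def annotate_notes(df, backend="simple"):
    """Run a small rule-based pipeline over a DataFrame of notes.

    Assumption: `df` has `note_id` and `note_text` columns, which is the
    layout expected by the "omop" converter.
    """
    # Imported lazily so that this sketch can be loaded without edsnlp installed.
    import edsnlp

    # Build a pipeline with the new pipeline system.
    nlp = edsnlp.blank("eds")
    nlp.add_pipe("eds.sentences")
    # Hypothetical terminology, only there to produce some entities.
    nlp.add_pipe("eds.matcher", config={"terms": {"fever": ["fièvre"]}})

    # Read the notes lazily, attach the pipeline, then pick the backend.
    docs = edsnlp.data.from_pandas(df, converter="omop")
    docs = docs.map_pipeline(nlp)
    docs = docs.set_processing(backend=backend)

    # Writing the collection triggers the actual execution.
    return edsnlp.data.to_pandas(docs, converter="omop")
```

Passing `backend="multiprocessing"` should switch the execution backend without touching the rest of the code; on platforms that spawn worker processes, call the function from under an `if __name__ == "__main__":` guard.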

Changed

- `edsnlp.pipelines` has been renamed to `edsnlp.pipes`, but the old name is still available for backward compatibility
- Pipes (in `edsnlp/pipes`) are now lazily loaded, which should improve the loading time of the library.
- `to_disk` methods can now return a config to override the initial config of the pipeline (e.g., to load a transformer directly from the path storing its fine-tuned weights)
- The `eds.tokenizer` tokenizer has been added to entry points, making it accessible from the outside
- Deprecate old connectors (e.g. BratDataConnector) in favor of the new `edsnlp.data` API
- Deprecate old `pipe` wrapper in favor of the new processing API

Fixed

- Support for pydantic v2
- Support for Python 3.11 (not CI-tested yet)

0.10.0beta1

Major refactoring of EDS-NLP to allow training models and performing inference using PyTorch
as the deep-learning backend. Rather than a mere wrapper of PyTorch using spaCy, this is
a new framework to build hybrid multi-task models.

To achieve this, instead of patching spaCy's pipeline, a new pipeline was implemented in
a similar fashion to aphp/edspdf. The new pipeline tries to preserve the existing API,
especially for non-machine learning uses such as rule-based components. This means that
users can continue to use the library in the same way as before, while also having the option to train models using PyTorch. We still
use spaCy data structures such as Doc and Span to represent the texts and their annotations.

Otherwise, changes should be transparent for users who still want to use spaCy pipelines
with `nlp = spacy.blank('eds')`. To benefit from the new features, users should use
`nlp = edsnlp.blank('eds')` instead.

Added

- New pipeline system available via `edsnlp.blank('eds')` (instead of `spacy.blank('eds')`)
- Use the confit package to instantiate components
- Training script with PyTorch only (`tests/training/`) and tutorial
- New trainable embeddings: `eds.transformer`, `eds.text_cnn`, `eds.span_pooler`
embedding contextualizer pipes
- Re-implemented the trainable NER component and trainable Span qualifier with the new
system under `eds.ner_crf` and `eds.span_classifier`
- New efficient implementation for `eds.transformer` (to be used in place of
spacy-transformers)

Changed

- Pipe registering: `Language.factory` -> `edsnlp.registry.factory.register` via confit
- Components are now lazily loaded from their entry points (this required patching
`spacy.Language.__init__`), which avoids having to wrap every `import torch` statement
for pure rule-based use cases. Hence, torch is not a required dependency
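
The lazy-loading idea can be sketched with the standard library alone. This is not EDS-NLP's actual implementation, only an illustration of the general technique: each name maps to a `module:attribute` string, and the module is imported the first time the name is requested. The class `LazyRegistry` and the registered names are hypothetical.

```python
import importlib
from typing import Any, Dict


class LazyRegistry:
    """Resolve names to objects, importing their module only on first use."""

    def __init__(self, entry_points: Dict[str, str]) -> None:
        # Maps a public name to a "module:attribute" string.
        self._entry_points = dict(entry_points)
        self._loaded: Dict[str, Any] = {}

    def is_loaded(self, name: str) -> bool:
        return name in self._loaded

    def get(self, name: str) -> Any:
        if name not in self._loaded:
            module_name, _, attr = self._entry_points[name].partition(":")
            # The import cost is paid here, not when the registry is built.
            module = importlib.import_module(module_name)
            self._loaded[name] = getattr(module, attr)
        return self._loaded[name]


# Standard-library targets stand in for heavy, optional dependencies.
registry = LazyRegistry({"sqrt": "math:sqrt", "dumps": "json:dumps"})
root = registry.get("sqrt")(9.0)
```

With this layout, a component that depends on torch only triggers `import torch` when it is actually requested, so rule-based pipelines never pay for it.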

0.9.2

Changed

- Fix matchers to skip pipes with assigned extensions that are not required by the matcher during the initialization

0.9.1

Changed

- Improve negation patterns
- Absent disorders now set the negation to True when matched as `ABSENT`
- Default qualifier is now `None` instead of `False` (empty string)

Fixed

- `span_getter` is no longer incompatible with `on_ents_only`
- `ContextualMatcher` now supports empty matches (e.g. lookahead/lookbehind) in `assign` patterns

0.9.0

Added

- New `to_duration` method to convert an absolute date into a date relative to the note_datetime (or None)
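
The relationship between absolute and relative dates in this release can be illustrated with plain `datetime` arithmetic. The sketch below is not the library's code: the helper names `relative_to_note` and `absolute_from_note` are hypothetical, and they only mirror the rule stated in these notes, namely that a conversion needing the note date yields None when no note date is available.

```python
from datetime import datetime, timedelta
from typing import Optional


def relative_to_note(
    date: datetime, note_datetime: Optional[datetime]
) -> Optional[timedelta]:
    """Express an absolute date as an offset from the note date."""
    if note_datetime is None:
        return None  # nothing to be relative to
    return date - note_datetime


def absolute_from_note(
    offset: timedelta, note_datetime: Optional[datetime]
) -> Optional[datetime]:
    """Resolve a relative date against the note date."""
    if note_datetime is None:
        return None  # a relative date cannot be resolved on its own
    return note_datetime + offset


note = datetime(2023, 6, 15)
offset = relative_to_note(datetime(2023, 6, 12), note)  # three days before the note
resolved = absolute_from_note(offset, note)             # back to 2023-06-12
```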

Changed

- Input and output of components are now specified by `span_getter` and `span_setter` arguments.
- Breaking change: score / disorder / behavior entities now have a fixed label (passed as an argument), instead of being dynamically set from the component name. The following scores may have a different name than the current one in your pipelines:
* `eds.emergency.gemsa` → `emergency_gemsa`
* `eds.emergency.ccmu` → `emergency_ccmu`
* `eds.emergency.priority` → `emergency_priority`
* `eds.charlson` → `charlson`
* `eds.elston_ellis` → `elston_ellis`
* `eds.SOFA` → `sofa`
* `eds.adicap` → `adicap`
* `eds.measurements` → `size`, `weight`, ... instead of `eds.size`, `eds.weight`, ...
- `eds.dates` now separates dates from durations. Each entity has its own label:
* `spans["dates"]` → entities labelled as `date` with a `span._.date` parsed object
* `spans["durations"]` → entities labelled as `duration` with a `span._.duration` parsed object
- The "relative" / "absolute" / "duration" mode of the time entity is now stored in
the `mode` attribute of the `span._.date/duration`
- The "from" / "until" period bound, if any, is now stored in the `span._.date.bound` attribute
- `to_datetime` now only returns absolute dates: it converts relative dates into absolute ones if `doc._.note_datetime` is given, and returns None otherwise
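
For code that still filters entities on the old labels, a small lookup table may ease the migration. The sketch below simply transcribes the renames listed above; the names `RENAMED_LABELS` and `migrate_label` are hypothetical, and the mapping only covers the measurement labels explicitly mentioned in these notes.

```python
# Old label (derived from the component name) -> new fixed label.
RENAMED_LABELS = {
    "eds.emergency.gemsa": "emergency_gemsa",
    "eds.emergency.ccmu": "emergency_ccmu",
    "eds.emergency.priority": "emergency_priority",
    "eds.charlson": "charlson",
    "eds.elston_ellis": "elston_ellis",
    "eds.SOFA": "sofa",
    "eds.adicap": "adicap",
    # Measurements: only the two labels named in the notes are listed here.
    "eds.size": "size",
    "eds.weight": "weight",
}


def migrate_label(label: str) -> str:
    """Return the new label for an old one, or the label unchanged."""
    return RENAMED_LABELS.get(label, label)


old_labels = ["eds.charlson", "eds.SOFA", "date"]
new_labels = [migrate_label(label) for label in old_labels]
```

Labels that were not renamed, such as `date`, pass through untouched, so the helper can be applied to any existing label list.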

Fixed

- `export_to_brat` issue with spans of entities on multiple lines.

0.8.1

Fix release to allow installation from source
