Kazu

Latest version: v1.5.1


Page 1 of 3

1.5.1

Bugfixes

- Pinned scipy to <1.12.0 due to a breaking API change.

1.5.0

Features

- Added new cleanup action: DropMappingsByParserNameRankAction.
- Added new disambiguation strategy: PreferNearestEmbeddingToDefaultLabelDisambiguationStrategy.
- DefinedElsewhereInDocumentDisambiguationStrategy has changed slightly: it now returns only the mappings that were found elsewhere in the document, rather than the whole EquivalentIdSet that contained those ids.
- Added new disambiguation strategy: GildaTfIdfDisambiguationStrategy.
- OpenTargetsTargetOntologyParser now has a biotype filter parameter.

Deprecations and Removals

- Deprecated `GildaUtils.replace_dashes` in favour of `GildaUtils.split_on_dashes_or_space`, as the latter improves efficiency in Kazu.
`GildaUtils.replace_dashes` will continue to work until kazu 1.6, but using it will produce a `DeprecationWarning`.
Please [open a GitHub issue](https://github.com/AstraZeneca/KAZU/issues/new) if you wish this to remain.
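
The "old name keeps working but warns" pattern described above can be sketched in plain Python. This is a generic illustration only, not Kazu's implementation: `new_helper`, `old_helper` and `call_and_collect_warnings` are hypothetical names, and the splitting behaviour is an assumption made for the example.

```python
import warnings


def new_helper(text: str) -> list[str]:
    """Hypothetical replacement function: split on dashes or spaces."""
    return [part for part in text.replace("-", " ").split(" ") if part]


def old_helper(text: str) -> list[str]:
    """Hypothetical deprecated alias that warns, then forwards to new_helper."""
    warnings.warn(
        "old_helper is deprecated; use new_helper instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return new_helper(text)


def call_and_collect_warnings(text: str):
    """Call the deprecated alias and return its result plus warning categories."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed
        result = old_helper(text)
    return result, [w.category for w in caught]
```

Running a test suite with `python -W error::DeprecationWarning` turns such warnings into errors, which helps locate remaining uses of a deprecated name before it is removed.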

1.4.0

Features

- Added new curation_report.py to assist in upgrading ontologies between versions.
- New disambiguation strategy to prefer mappings that have a default label that matches an entity.
- The OpenTargetsDiseaseOntologyParser has been heavily reworked so that it uses the therapeutic_area concept to decide which records should be included. This has in turn yielded the subsets measurement, medical_procedure, biological_process and phenotype. The measurement configuration is currently disabled, as it requires heavy curation of the underlying strings. In addition, the OpenTargetsDiseaseOntologyParser now supports a custom ID grouping method, to make use of cross references.

Bugfixes

- MemoryEfficientStringMatchingStep now only produces a single entity per class where multiple curations exist with different cases.
- Previously, the `tested_dependencies.txt` file in the model packs included an editable install of kazu, which wasn't intended.
We now exclude kazu from that output.
- Sped up model pack builds for model packs using `ExplosionStringMatchingStep` by fixing a bug that caused the parsers to be populated twice in this case.
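
As a rough illustration of the first bugfix above (one entity per class when curations differ only by case), the idea can be sketched as a case-insensitive de-duplication. This is an assumption about the general technique, not the actual logic of MemoryEfficientStringMatchingStep; `one_entity_per_class` and the tuple layout are hypothetical.

```python
def one_entity_per_class(matches):
    """De-duplicate (matched_text, entity_class) pairs, ignoring case.

    Hypothetical helper: the first pair seen for each
    (case-folded text, entity class) key is kept, later ones are dropped.
    """
    seen = set()
    unique = []
    for text, entity_class in matches:
        key = (text.casefold(), entity_class)
        if key not in seen:
            seen.add(key)
            unique.append((text, entity_class))
    return unique
```

For example, `[("EGFR", "gene"), ("Egfr", "gene"), ("EGFR", "drug")]` would reduce to one `gene` entry and one `drug` entry.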

Deprecations and Removals

- Removed pytorch-lightning as a dependency. The signatures of SapbertStringSimilarityScorer and TransformersModelForTokenClassificationNerStep have changed.
- Renamed `create_phrasematchers_using_curations` method of `OntologyMatcher` to `create_phrasematchers`. The old name will continue to work until kazu 1.6, but using it will produce a `DeprecationWarning`.
- `MetadataDatabase.add_parser` now requires an `entity_class`.
This enables correct string normalisation in the `MappingStep` for the new disambiguation strategy.

1.3.2

Bugfixes

- Hits with scores of 0.0 are no longer returned by DictionaryIndex.
- Pinned the lightning-utilities dependency: a new version completely broke model inference, even though lightning itself was pinned (lightning didn't pin lightning-utilities appropriately in the version we're using).
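
The zero-score fix in the first bullet above amounts to filtering search results before they are returned. A minimal sketch, assuming scores are non-negative floats; `Hit` and `drop_zero_score_hits` are hypothetical names and do not reflect the real DictionaryIndex API.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical search hit: an ontology identifier and a similarity score."""

    idx: str
    score: float


def drop_zero_score_hits(hits: list[Hit]) -> list[Hit]:
    """Keep only hits whose score is strictly greater than zero."""
    return [hit for hit in hits if hit.score > 0.0]
```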

1.3.1

Features

- Added methods to dataclasses that allow them to be deserialised from JSON.
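
To show what JSON deserialisation of a dataclass can look like in general, here is a small round-trip sketch using only the standard library. `DemoEntity` and its `to_json`/`from_json` methods are hypothetical and are not Kazu's data classes or method names.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DemoEntity:
    """Hypothetical dataclass used only for this illustration."""

    match: str
    entity_class: str
    start: int
    end: int
    metadata: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialise the instance to a JSON string."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload: str) -> "DemoEntity":
        """Rebuild an instance from a JSON string produced by to_json."""
        return cls(**json.loads(payload))
```

This flat approach only works when every field is itself JSON-compatible; nested dataclasses would need their own conversion step.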

Deprecations and Removals

- Renamed `SpacyToKazuObjectMapper` to `KazuToSpacyObjectMapper`.
The old name will continue to work until kazu 1.6, but using it will produce a `DeprecationWarning`.
- `RulesBasedEntityClassDisambiguationFilterStep` no longer requires `parsers` or `other_entity_classes`.
It previously used these to construct the `entity_classes` argument of `KazuToSpacyObjectMapper.__init__`, but the required entity classes are now calculated from the class and mention rules passed to `RulesBasedEntityClassDisambiguationFilterStep.__init__`.

1.3.0

Features

- CurationProcessor no longer tries to handle curations with INHERIT_FROM_SOURCE_TERM behaviour, as this was causing confusion and conflicts. This is now the responsibility of the caller.
- Updated ontologies for October 2023.

Bugfixes

- Fixed a bug in MemoryEfficientStringMatchingStep where case-insensitive overlaps caused ontology info to be lost.
