Ontoma

Latest version: v1.1.2


Page 2 of 2

1.0.0

OnToma has been rewritten with a focus on simplicity and mapping reliability. As a new major version, this release introduces some breaking changes to the CLI and Python interfaces, as well as major updates to the processing logic. Most importantly, the mapping results can be expected to change a lot.

Please read these release notes carefully before you consider upgrading. Bug reports and feedback on this release are especially appreciated. Please direct them to data@opentargets.org.

Mapping approach changes
OnToma has two operation modes, which are now clearly separated based on input type. For **ontology** input (e.g. `OMIM:102900`), OnToma attempts the following steps to map to EFO:

1. Exact identifier match from EFO;
2. Match terms by cross-references (`hasDbXref`);
3. Mapping from the manual cross-reference database;
4. Request through OxO with a distance of 2.

For **string** input (e.g. `asthma`), the following steps are attempted:

5. Exact name match from EFO;
6. Exact synonym (`hasExactSynonym`);
7. Mapping from the manual string-to-ontology database;
8. High confidence mapping from ZOOMA with default parameters.
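The ordered steps above can be pictured as a simple fallback cascade. The sketch below is not OnToma's implementation: it assumes the steps are tried in order and that the first step producing any hits wins, and every name in it (`map_with_cascade`, `make_lookup_step`, the toy tables) is hypothetical. The ZOOMA step is left out because it needs a network call.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Toy lookup tables standing in for EFO names, EFO synonyms and the manual
# string-to-ontology database. These are illustrative, not real EFO content.
EFO_NAMES: Dict[str, List[str]] = {"asthma": ["EFO_0000270"]}
EFO_SYNONYMS: Dict[str, List[str]] = {"example synonym": ["EFO_0000270"]}
MANUAL_STRING: Dict[str, List[str]] = {}

Step = Callable[[str], List[str]]


def make_lookup_step(table: Dict[str, List[str]]) -> Step:
    """Build a step that returns every identifier stored for an exact key."""
    def step(query: str) -> List[str]:
        return list(table.get(query, []))
    return step


def map_with_cascade(
    query: str, steps: List[Tuple[str, Step]]
) -> Tuple[Optional[str], List[str]]:
    """Try each step in order and return the hits of the first one that matches.

    All hits from the winning step are returned, so a query can yield
    several identifiers when they match equally well.
    """
    for name, step in steps:
        hits = step(query)
        if hits:
            return name, hits
    return None, []


# Steps 5 to 7 for string input (hypothetical stand-ins).
STRING_STEPS: List[Tuple[str, Step]] = [
    ("exact name", make_lookup_step(EFO_NAMES)),
    ("exact synonym", make_lookup_step(EFO_SYNONYMS)),
    ("manual database", make_lookup_step(MANUAL_STRING)),
]

print(map_with_cascade("asthma", STRING_STEPS))
```

Keeping each step as a separate callable makes the order explicit and lets a later step run only when all earlier ones return nothing.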

Expected changes in the mapping results
All of the approaches listed in the previous section generate mappings which we consider to be of high quality, and they can be used in automated workflows straight out of OnToma. However, this is achieved at the cost of removing some low confidence approaches, such as fuzzy OLS lookup.

Our _preliminary_ benchmarks, comparing the previous OnToma version (v0.0.18) to this release (v1.0.0), demonstrated the following approximate pattern:
* Sensitivity (the percentage of valid input mappings which are discovered) dropped from 96% to 61%.
* At the same time, precision (the percentage of the mappings in OnToma output which are actually correct) rose from 75% to 97%.

Hence, after upgrading, a significant drop in the number of results is expected; however, the remaining results will be of significantly higher quality, which we believe is much more important in nearly all applications. We intend to work on increasing sensitivity in further releases.
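For readers who want to reproduce this kind of comparison on their own data, the two metrics can be computed as in the sketch below. The counts are invented purely for illustration and were picked to land near the percentages quoted above; they are not the benchmark data, and the function names are hypothetical.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of valid input mappings that were actually discovered."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Share of reported mappings that are actually correct."""
    return true_positives / (true_positives + false_positives)


# Invented example: 100 inputs have a valid mapping; the tool reports 63
# mappings, of which 61 are correct and 2 are wrong.
correct_found = 61
valid_missed = 100 - correct_found
wrong_found = 2

print(f"sensitivity = {sensitivity(correct_found, valid_missed):.2f}")
print(f"precision   = {precision(correct_found, wrong_found):.2f}")
```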

Other operation changes
The CLI and Python interfaces have been simplified. The `verbose` and `suggest` flags have been removed (they might be reimplemented in a more consistent way in future releases).

Importantly, where multiple EFO terms match equally well within a single processing step, OnToma will now return multiple hits per query. (Previously, only one hit was selected, in a mostly random fashion.)

Each OnToma result consists of multiple fields.
* In the Python API they are accessed as result object attributes: `OnToma().find_term('asthma').id_ot_schema` will contain `EFO_0000270`.
* In the CLI the list of fields to output can be configured via the `--columns` flag.

Manually curated mapping sources
Two central resources are currently being set up to store all manually curated [ontology to EFO (step 3)](https://github.com/opentargets/mappings/blob/master/manual_xref.tsv) and [string to EFO (step 7)](https://github.com/opentargets/mappings/blob/master/manual_string.tsv) mappings. External OnToma users are encouraged to contribute to these resources as well. (More information about that will come in future releases.)

Changes to ontology handling
A new module, `ontoma.ontology`, was implemented to facilitate conversion between different ways to represent ontology identifiers. For example, `ORDO_140162`, `ORPHA:140162`, `Orphanet:140162`, and `http://www.orpha.net/ORDO/Orphanet_140162` all represent the same term. The module implements an algorithm which converts all possible representations into a stable, normalised internal representation to make direct comparisons possible.
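To give a feel for what such normalisation involves, here is a minimal sketch restricted to the four Orphanet forms listed above. It is not the code in `ontoma.ontology`: the function name `normalise_orphanet`, the regular expression, and the choice of `Orphanet_<number>` as the target form are all assumptions made for this example.

```python
import re
from typing import Optional

# Matches an optional ORDO URI prefix, one of the known Orphanet prefixes,
# a "_" or ":" separator, and the numeric part of the identifier.
_ORPHANET_PATTERN = re.compile(
    r"^(?:http://www\.orpha\.net/ORDO/)?(?:ORDO|ORPHA|Orphanet)[_:](\d+)$"
)


def normalise_orphanet(identifier: str) -> Optional[str]:
    """Return 'Orphanet_<number>' for a recognised form, otherwise None."""
    match = _ORPHANET_PATTERN.match(identifier.strip())
    if match is None:
        return None
    return f"Orphanet_{match.group(1)}"


variants = [
    "ORDO_140162",
    "ORPHA:140162",
    "Orphanet:140162",
    "http://www.orpha.net/ORDO/Orphanet_140162",
]
for variant in variants:
    print(variant, "->", normalise_orphanet(variant))
```

Once every variant collapses to one string, two identifiers can be compared with a plain equality check.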

The output of OnToma always follows the format specified in the [Open Targets JSON schema](https://github.com/opentargets/json_schema/blob/309105a93910016568ee6673bdb84625acddd4eb/opentargets.json#L1148), for example, `Orphanet_140162`. This means that you can plug the output of OnToma directly into the evidence strings.

EFO OT slim is now loaded and parsed more consistently from the OWL file. There is a new option to cache this data to speed up OnToma initialisation in subsequent runs.

Additionally, you can now specify a particular EFO version to use. The version which is used by default in this release is pinned to v3.31.0.

Technical changes
The documentation has been migrated to ReadTheDocs and rewritten. RST build and configuration files have been updated and simplified.

Python 3.7+ is now required and consistently used throughout the code base. Installation has been simplified and now uses plain pip. The tests and CircleCI configuration have been updated to reflect all of the changes.
