Oger

Latest version: v1.5



1.1

- parameter additions:
  * multiple postfilters allowed
  * new parameter `field-names` interacts with ex-termlist parameter `extra-fields`
  * new output formats *bioc_json* and *pubanno_json*
- REST service:
  * improved API (more consistent)
  * fetch/upload requests accept input/output parameters (query params)
  * postfilters can be selected through query params

1.0

- parameter changes:
  * more intuitive input parameters: `iter-mode` distinguishes *document* and *collection*, while `pointer-type` is *glob* or *id*
  * parameter `elements` (positional in CLI) renamed to `pointer`
  * new parameters `ignore-load-errors` and `sentence-split`
  * BioC metadata specified using JSON
- OGER refactored to an importable Python package; can be installed with pip
- web interface: multiple annotators (can be sent from the BTH)
- termlist can be provided over HTTP/HTTPS/FTP
- more and parametrisable normalisation methods (choose stemmer/Unicode normalisation)
- BioC loader preserves \<infon\> elements at all levels
- various minor improvements and bugfixes
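
To illustrate why a selectable Unicode normalisation form matters for term matching, here is a minimal sketch using only Python's standard `unicodedata` module. The function `normalise_term` and its parameters are hypothetical names chosen for this example; they are not OGER's API, and OGER's own normalisation pipeline may work differently.

```python
import unicodedata


def normalise_term(term: str, form: str = "NFKC", lowercase: bool = True) -> str:
    """Return a normalised lookup key for a term (illustrative helper only).

    `form` is any form accepted by unicodedata.normalize:
    "NFC", "NFD", "NFKC" or "NFKD".
    """
    key = unicodedata.normalize(form, term)
    if lowercase:
        key = key.lower()
    return key


# The micro sign (U+00B5) and the Greek letter mu (U+03BC) look identical.
# NFKC maps the micro sign to mu, so both spellings share one key;
# NFC leaves the micro sign unchanged, so the keys stay different.
micro = "\u00b5g"
mu = "\u03bcg"
same_under_nfkc = normalise_term(micro) == normalise_term(mu)
same_under_nfc = normalise_term(micro, form="NFC") == normalise_term(mu, form="NFC")
```

With a compatibility form such as NFKC, visually identical spellings collapse to a single dictionary key, at the cost of losing distinctions that a canonical form such as NFC would preserve.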

0.8

- changed and extended the syntax of the REST API (the old one still works)
- added a browser interface for the REST API
- a few bugfixes

0.7

- added a test suite
- PMC now works without crashing the runtime
- new input sources: *becalmabstracts* and *becalmpatents* (requests to BeCalm API)
- new output formats: *becalm_tsv* and *becalm_json*
- various bugfixes
- many improvements of the server mode

0.6

- new output format: ODIN XML (iat2)
- options for printing attributes in Brat output
- new example filter: suppress submatches
- allow multiple entity recognisers (entails some changes to the option naming)
- optionally do locally-cached abbreviation detection
- new normalisation method: stemming (Lancaster)
- start/stop functionality for the RESTful server
- numerous bugfixes and minor optimisations
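
The idea behind a "suppress submatches" filter can be sketched as follows. This is a self-contained illustration, not OGER's actual example filter: the `Span` type and the function `suppress_submatches` are hypothetical, and the real filter operates on OGER's own document objects.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    """A matched entity; `end` is exclusive (hypothetical structure)."""
    start: int
    end: int
    text: str


def suppress_submatches(spans: List[Span]) -> List[Span]:
    """Drop every span that lies inside a strictly longer span."""
    kept = []
    for s in spans:
        contained = any(
            o.start <= s.start
            and s.end <= o.end
            and (o.end - o.start) > (s.end - s.start)
            for o in spans
        )
        if not contained:
            kept.append(s)
    return kept


matches = [
    Span(0, 21, "tumor necrosis factor"),
    Span(0, 5, "tumor"),      # nested inside the first span, so it is dropped
    Span(30, 34, "IL-6"),
]
filtered = suppress_submatches(matches)
```

Spans with identical offsets are both kept, since neither is strictly longer than the other; a real filter would need to decide how to handle such ties.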

0.5

- new input format: PubMed Central full-text through the efetch API
- Brat annotation enriched with the rest of the termlist fields, using the "Annotator notes" tag
- dropped PyBioC dependency; BioC XML is now directly parsed/written
- collection mode is now also available for the input format "pubmed" (efetch)
- new feature: postfilter hook (specify a Python function that modifies the annotated articles before writing)
- new feature: fallback format (e.g. try to use on-disk files first, but for those unavailable fall back to a PubMed request)
- more NLP methods registered: RegexTokenizer (for output filters), greektranslit (ER normalisation)
- and various minor extensions and bugfixes (see the commit messages)
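
The fallback behaviour described above (try one source first, then another for documents the first cannot supply) might look roughly like the sketch below. All names here (`Loader`, `load_from_disk`, `load_with_fallback`, the `.txt` file layout) are assumptions made for illustration; OGER configures its fallback through its own parameters, which are not shown here.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional

# A loader takes a document ID and returns its text, or None if unavailable.
Loader = Callable[[str], Optional[str]]


def load_from_disk(directory: Path) -> Loader:
    """Build a loader that reads <directory>/<doc_id>.txt if the file exists."""
    def load(doc_id: str) -> Optional[str]:
        path = directory / f"{doc_id}.txt"
        return path.read_text(encoding="utf-8") if path.is_file() else None
    return load


def load_with_fallback(doc_id: str, loaders: Iterable[Loader]) -> str:
    """Try each loader in order and return the first available text."""
    for load in loaders:
        text = load(doc_id)
        if text is not None:
            return text
    raise LookupError(f"no source could provide document {doc_id!r}")
```

A caller would pass the on-disk loader first and a remote loader (for instance, one wrapping an HTTP request) second, so that the network is only contacted for documents missing locally.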
