Deduce

Latest version: v3.0.3

3.0.3

Added

- a `cache_path` option to define the path for saving/loading the lookup structure cache; use this if your install directory is not writable
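
The idea behind a configurable cache location can be sketched as follows. This is a minimal illustration and not Deduce's actual implementation: the helper `load_or_build`, the file name `lookup_structs.pickle` and the `build_lookup` function are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_build(cache_path, build):
    """Return the object cached under cache_path, building and saving it if absent."""
    cache_file = Path(cache_path) / "lookup_structs.pickle"  # hypothetical file name
    if cache_file.exists():
        with cache_file.open("rb") as handle:
            return pickle.load(handle)
    structs = build()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as handle:
        pickle.dump(structs, handle)
    return structs


# A writable temporary directory stands in for a user-chosen cache location.
cache_dir = tempfile.mkdtemp()
calls = []


def build_lookup():
    """Hypothetical, expensive build step; records how often it runs."""
    calls.append(1)
    return {"prefix": {"dr", "prof"}}


first = load_or_build(cache_dir, build_lookup)   # builds and writes the cache
second = load_or_build(cache_dir, build_lookup)  # reads the cache, no rebuild
```

Pointing the cache at a directory the process can write to avoids failures when the package itself is installed in a read-only location.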

Removed

- the `config_file` keyword, now replaced by `config` which accepts both filenames and dicts
- old lookup list names, e.g. `prefixes` now replaced by `prefix`
- annotator types `custom`, `regexp`, `token_pattern`, `dd_token_pattern` and `annotation_context`, all replaced by setting class directly as `annotator_type`
- everything in `deduce.pattern`, patient patterns now replaced by `PatientNameAnnotator`

3.0.2

Changed
- a run of 4 or more spaces is now recognized as a token, which blocks annotations from spanning it
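
As a rough illustration of why a whitespace token can block an annotation, consider the sketch below. The tokenizer pattern and the adjacency check are assumptions made for this example, not Deduce's own tokenizer.

```python
import re

# Hypothetical tokenizer: words, runs of 4+ spaces, or single punctuation marks.
# Shorter runs of whitespace are skipped and produce no token.
TOKEN_PATTERN = re.compile(r"\w+| {4,}|[^\w\s]")


def tokenize(text):
    """Split text into tokens, keeping long runs of spaces as tokens."""
    return TOKEN_PATTERN.findall(text)


def adjacent_pairs(tokens):
    """Pairs of directly neighbouring tokens, as a token pattern would see them."""
    return list(zip(tokens, tokens[1:]))


close = tokenize("Jan Jansen")     # ['Jan', 'Jansen']
apart = tokenize("Jan    Jansen")  # ['Jan', '    ', 'Jansen']

# A pattern looking for a first name directly followed by a surname
# sees the pair in the first text, but not in the second.
pair_in_close = ("Jan", "Jansen") in adjacent_pairs(close)
pair_in_apart = ("Jan", "Jansen") in adjacent_pairs(apart)
```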

3.0.1

Fixed
- a bug with packaging `base_config.json`

3.0.0

Added
- speed optimizations, ~250%
- pseudo-annotating eponymous diseases (e.g. Creutzfeldt-Jakob)
- `PatientNameAnnotator`, which replaces `deduce.pattern`
- a structured way for loading and building lookup structures (lists and tries), including caching
- `pre_match_words` for some regexp annotators, speeding up the annotating
- option to present a user config as dict (using `config` keyword)
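
To give an impression of what a trie-based lookup structure does, here is a minimal sketch of a token trie with longest-match lookup. The class `TokenTrie` and its methods are hypothetical and written for this example only; the library's own structures may differ.

```python
class TokenTrie:
    """A minimal trie over token sequences (illustrative only)."""

    def __init__(self):
        self.children = {}
        self.is_end = False

    def add(self, tokens):
        """Insert one sequence of tokens, e.g. a multi-word place name."""
        node = self
        for token in tokens:
            node = node.children.setdefault(token, TokenTrie())
        node.is_end = True

    def longest_match(self, tokens, start=0):
        """Return the longest stored sequence that begins at tokens[start]."""
        node = self
        best = 0
        for length, token in enumerate(tokens[start:], start=1):
            node = node.children.get(token)
            if node is None:
                break
            if node.is_end:
                best = length
        return tokens[start:start + best]


trie = TokenTrie()
trie.add(["Den", "Haag"])
trie.add(["Den", "Helder"])
trie.add(["Utrecht"])

tokens = ["naar", "Den", "Haag", "gereisd"]
match = trie.longest_match(tokens, start=1)
no_match = trie.longest_match(tokens, start=0)
```

Compared with scanning a plain list, a trie lets multi-token entries be matched in a single pass from each start position.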

Changed
- speedup for `TokenPatternAnnotator`
- some internals of `ContextPatternAnnotator`
- initials now detected by lookup list, rather than pattern
- redactor open and close chars from `<` `>` to `[` `]`, as the previous chars caused issues in HTML (so deidentified text now shows `[PATIENT]`, `[LOCATIE]`, etc.)
- names of lookup structures to singular (`prefix`, rather than `prefixes`)
- `INSTELLING` tag to `ZIEKENHUIS` and `ZORGINSTELLING`
- refactored and simplified annotator loading, specifically the `annotator_type` config keyword now accepts references to classes (e.g. `deduce.annotator.TokenPatternAnnotator`)
- renamed `interfix_with_capital` annotator to `interfix_with_name`
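
The change of redactor characters listed above can be illustrated with a small sketch. The `redact` function and its `(start, end, tag)` annotation tuples are hypothetical and only mimic the visible effect; they are not the library's redactor.

```python
import html


def redact(text, annotations, open_char="[", close_char="]"):
    """Replace each (start, end, tag) span with the tag wrapped in open/close chars.

    Assumes the spans do not overlap.
    """
    result = text
    # Work from the end of the text so earlier offsets stay valid.
    for start, end, tag in sorted(annotations, reverse=True):
        result = result[:start] + f"{open_char}{tag}{close_char}" + result[end:]
    return result


text = "Jan woont in Utrecht"
annotations = [(0, 3, "PATIENT"), (13, 20, "LOCATIE")]

new_style = redact(text, annotations)
old_style = redact(text, annotations, open_char="<", close_char=">")

# Angle brackets must be escaped before the text can be shown in HTML,
# whereas square brackets pass through unchanged.
old_escaped = html.escape(old_style)
new_escaped = html.escape(new_style)
```

Unescaped, a browser would treat `<PATIENT>` as an unknown element and hide it, which is the kind of issue square brackets avoid.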

Deprecated
- the `config_file` keyword, now replaced by `config` which accepts both filenames and dicts
- old lookup list names, e.g. `prefixes` now replaced by `prefix`
- annotator types `custom`, `regexp`, `token_pattern`, `dd_token_pattern` and `annotation_context`, all replaced by setting the class directly as `annotator_type`
- everything in `deduce.pattern`, patient patterns now replaced by `PatientNameAnnotator`
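
Setting a class directly as `annotator_type` implies that a dotted reference is resolved to a class at load time. A minimal sketch of such a resolver follows; the helper `resolve_class` and the surrounding config dict are assumptions for illustration, and a standard-library class stands in for an annotator class so the snippet runs without Deduce installed.

```python
import importlib


def resolve_class(reference):
    """Turn a dotted reference such as 'package.module.ClassName' into the class."""
    module_name, _, class_name = reference.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Hypothetical annotator config entry; in a real config the reference would be
# something like "deduce.annotator.TokenPatternAnnotator".
annotator_config = {"annotator_type": "collections.OrderedDict"}

cls = resolve_class(annotator_config["annotator_type"])
instance = cls()
```

With this approach no fixed list of type names is needed, so custom classes can be referenced in the same way as built-in ones.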

Removed
- automated coverage reporting on coveralls.io
- options `lowercase_lookup`, `lowercase_neg_lookup` for token patterns
- `utils.any_in_text`

Fixed
- some small additions/removals for specific lookup lists
- smaller bugs related to overlapping matches

2.5.0

Added
- the `RegexpPseudoAnnotator` component for filtering regexp matches based on preceding/following words
- a `prefix_with_interfix` pattern for names, detecting e.g. `Dr. van Loon`
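
Filtering regexp matches on their neighbouring words could look roughly like the sketch below. The function `find_filtered` and its `pre_pseudo` / `post_pseudo` parameters are assumptions chosen for this example and do not reproduce the signature of `RegexpPseudoAnnotator`.

```python
import re


def find_filtered(pattern, text, pre_pseudo=(), post_pseudo=()):
    """Return regexp matches, skipping those next to a 'pseudo' context word."""
    kept = []
    for match in re.finditer(pattern, text):
        preceding = re.findall(r"\w+", text[:match.start()])
        following = re.findall(r"\w+", text[match.end():])
        if preceding and preceding[-1].lower() in pre_pseudo:
            continue
        if following and following[0].lower() in post_pseudo:
            continue
        kept.append(match.group())
    return kept


text = "patient is 64 jaar, sinds 10 weken klachten"
two_digits = r"\b\d{2}\b"

unfiltered = find_filtered(two_digits, text)
filtered = find_filtered(two_digits, text, post_pseudo={"weken"})
```

Here the number followed by `weken` is discarded, while the number followed by `jaar` is kept as a candidate.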

Changed
- the age detection component, with improved logic and pseudo patterns
- annotations are no longer counted adjacent when separated by a comma
- streets are prioritized over names when merging overlapping annotations
- removed some false positives for postal codes ending in `gr` or `ie`
- extended the postbus pattern for `xx.xxx` format (old notation)
- some smaller optimizations and exceptions for institution, hospital, placename, residence, medical term, first name, and last name lookup lists

Fixed
- a bug with `BsnAnnotator` with non-digit characters in regexp
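
For context, a BSN candidate is commonly validated with the eleven-test (elfproef) after non-digit characters have been removed. The sketch below shows that check in isolation; the function `is_valid_bsn` is hypothetical, and whether `BsnAnnotator` handles separators in exactly this way is an assumption.

```python
import re


def is_valid_bsn(candidate):
    """Check a BSN candidate with the eleven-test, ignoring non-digit characters."""
    digits = re.sub(r"\D", "", candidate)
    if len(digits) != 9:
        return False
    # Weights 9..2 for the first eight digits, -1 for the last one.
    weights = [9, 8, 7, 6, 5, 4, 3, 2, -1]
    total = sum(int(digit) * weight for digit, weight in zip(digits, weights))
    return total % 11 == 0


plain = is_valid_bsn("111222333")
with_separators = is_valid_bsn("1112.22.333")
wrong_check_digit = is_valid_bsn("111222334")
too_short = is_valid_bsn("12345678")
```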
