Added
- a component for deidentifying BSN numbers
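
For illustration, a BSN candidate is commonly validated with the "elfproef" (11-check). The sketch below shows that check only. Whether the new component applies exactly this test is an assumption, and the function name `passes_elfproef` is hypothetical, not part of deduce's API.

```python
import re

# Weights of the 11-check for a 9-digit BSN: 9 down to 2, then -1 for the last digit.
ELFPROEF_WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)


def passes_elfproef(candidate: str) -> bool:
    """Return True if `candidate` is 9 ASCII digits whose weighted sum is divisible by 11.

    Hypothetical helper for illustration; not taken from deduce itself.
    """
    if re.fullmatch(r"[0-9]{9}", candidate) is None:
        return False
    total = sum(w * int(d) for w, d in zip(ELFPROEF_WEIGHTS, candidate))
    return total % 11 == 0
```

A check like this is one way to separate plausible BSN numbers from other long digit sequences, which could then fall through to a generic identifier tag.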
Changed
- updated dependencies
- by default, deduce now recognizes and tags BSN numbers
- by default, deduce now recognizes all other 7+ digit numbers as identifiers
- improved regular expressions for e-mail address and url matching, with separate tags
- logic for detecting phone numbers (improvements for hyphens, whitespace, and false-positive identifiers)
- improved regular expression for age matching
- date detection logic:
  - now only recognizes combinations of day, month and year (day/month combinations caused many false positives)
  - detects year-month-day format in addition to day-month-year
- loading a custom config now only replaces the config options that are explicitly set, using defaults for those not included in the custom config
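
The config behaviour described in the last item can be sketched as a merge of a custom config over defaults. This is a minimal sketch, not deduce's actual loader: the function name `merge_config` and the option names in the example are hypothetical, and treating nested options recursively is an assumption.

```python
from copy import deepcopy


def merge_config(defaults: dict, custom: dict) -> dict:
    """Return a copy of `defaults` in which only keys present in `custom` are replaced.

    Nested dicts are merged recursively (an assumption); inputs are not modified.
    """
    merged = deepcopy(defaults)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical option names, used for illustration only.
defaults = {"open_char": "[", "close_char": "]", "annotators": {"bsn": True, "date": True}}
custom = {"open_char": "<", "annotators": {"date": False}}
merged = merge_config(defaults, custom)
```

Here `merged` keeps `close_char` and `annotators["bsn"]` from the defaults, while `open_char` and `annotators["date"]` come from the custom config.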
Deprecated
- backwards compatibility, which was temporarily added to transition from v1 to v2
Removed
- a separate patient identifier tag, now superseded by a generic tag
- detection of day/month combinations for dates, as this caused many false positives (e.g. lab values, numeric scores)
Fixed
- annotations can no longer be counted as adjacent when separated by newline or tab (and will thus not be merged)
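
The adjacency rule in this fix can be illustrated with a small check on the text between two annotations. This is a sketch under stated assumptions: the gap pattern (at most one punctuation mark followed by at most one space) and the function name `are_adjacent` are hypothetical and not taken from deduce.

```python
import re

# Assumed gap pattern: optional single punctuation mark, then an optional single space.
# Newlines and tabs are deliberately not part of the pattern.
GAP_PATTERN = re.compile(r"[.,\-]?[ ]?")


def are_adjacent(text: str, left_end: int, right_start: int) -> bool:
    """Return True if the text between two annotation spans allows merging them."""
    gap = text[left_end:right_start]
    return GAP_PATTERN.fullmatch(gap) is not None
```

With this rule, two annotations separated by a single space count as adjacent, while two annotations separated by a newline or a tab do not and are therefore left unmerged.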