------------------
- Upgraded syntax to Python 3.6 (mostly Format-Strings) using pyupgrade (PR2136 by Sebastian Wagner).
Core
- `intelmq.lib.upgrades`:
- Refactor the upgrade functions' global configuration handling, removing the old-style defaults configuration (PR2058 by Sebastian Wagner).
- Pass version history as parameter to upgrade functions (PR2058 by Sebastian Wagner).
- `intelmq.lib.message`:
- Fix and pre-compile the regular expression for harmonization key names and also check keys in the `extra.` namespace (PR2059 by Sebastian Wagner, fixes 1807).
- `intelmq.lib.bot.SQLBot` was replaced by an SQLMixin in `intelmq.lib.mixins.SQLMixin`. The Generic DB Lookup Expert bot and the SQLOutput bot were updated accordingly.
- Added support for MSSQL (PR2171 by Karl-Johan Karlsson).
- Added optional reconnect delay parameter (PR2171 by Karl-Johan Karlsson).
- Added an ExpertBot class - it should be used by all expert bots as a parent class
- Introduced a module for IntelMQ related datatypes `intelmq.lib.datatypes` which for now only contains an Enum listing the four bot types
- Added a `bottype` attribute to CollectorBot, ParserBot, ExpertBot, OutputBot
- Introduces a module for IntelMQ processmanagers. The processmanagers were up until now part of the `intelmqctl` script.
They now reside in `intelmq.lib.processmanager` which also contains an interface definition the processmanager implementations must adhere to.
Both the processmanagers and the `intelmqctl` script were cleaned up a bit.
The `LogLevel` and `ReturnType` Enums were added to `intelmq.lib.datatypes`.
- `intelmq.lib.bot`:
- Enhance behaviour if an unconfigured bot is started (PR2054 by Sebastian Wagner).
- Fix line recovery and message dumping of the `ParserBot` (PR2192 by Sebastian Wagner).
- Previously the dumped message was always the last message of a report if the report contained multiple lines leading to data-loss.
- Fix crashing at start in multithreaded bots (PR2236 by DigitalTrustCenter).
- Added `default_fields` parameter to `ParserBot` (PR2293 by Filip Pokorný)
- `intelmq.lib.pipeline`:
- Changed `BRPOPLPUSH` to `BLMOVE`, because `BRPOPLPUSH` has been marked as deprecated by redis in favor of `BLMOVE` (PR2149 and PR2240 by Sebastian Waldbauer and Sebastian Wagner, fixes 1827, 2233).
- `intelmq.lib.utils`:
- Added wrapper `resolve_dns` for querying DNS, with the support for recommended methods from `dnspython` package in versions 1 and 2.
- Moved line filtering inside `RewindableFileHandle` for easier handling and limiting number of temporary objects.
- `intelmq.lib.harmonization`:
- Fixed DateTime handling of naive time strings (previously assumed local timezone, now assumes UTC) (PR2279 by Filip Pokorný, fixes 2278)
- Removes `tzone` argument from `DateTime.from_timestamp` and `DateTime.from_epoch_millis`
- `DateTime.from_timestamp` now also allows string argument
- Removes `pytz` global dependency
- Removed support for Python 3.6, including removing conditional dependencies and updating syntax to use features from newest versions. (fixes [2272](https://github.com/certtools/intelmq/issues/2272))
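
The `intelmq.lib.harmonization` change for naive time strings listed above can be pictured with the standard library alone. This is a minimal sketch, not IntelMQ's implementation: the helper name `parse_assuming_utc` is hypothetical and only shows what "assume UTC" means for a string that carries no offset.

```python
from datetime import datetime, timezone


def parse_assuming_utc(value: str) -> datetime:
    """Parse an ISO 8601 string; treat a missing UTC offset as UTC (hypothetical helper)."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Naive input: attach UTC instead of interpreting it in the local timezone.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(parse_assuming_utc("2023-01-15 10:30:00").isoformat())        # 2023-01-15T10:30:00+00:00
print(parse_assuming_utc("2023-01-15T10:30:00+02:00").isoformat())  # 2023-01-15T08:30:00+00:00
```

Strings that already carry an offset are converted, not overridden, so only naive inputs are affected by the new assumption.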
Development
- Removed Python 3.6 from CI.
- Enabled tests against Python 3.11.
Bots
- Set the parent class of all bots to the correct bot class
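
The relationship between the bot parent classes and the `bottype` attribute described under Core might look roughly like the following. All classes here are stand-ins written for illustration; the Enum member names and values are assumptions, and the real classes live in `intelmq.lib.bot` and `intelmq.lib.datatypes`.

```python
from enum import Enum


class BotType(Enum):
    """Stand-in for the Enum listing the four bot types (member names are assumptions)."""
    COLLECTOR = "Collector"
    PARSER = "Parser"
    EXPERT = "Expert"
    OUTPUT = "Output"


class Bot:
    """Stand-in base class without a specific type."""
    bottype = None


class ExpertBot(Bot):
    """Stand-in parent class for expert bots."""
    bottype = BotType.EXPERT


class MyExpertBot(ExpertBot):
    """A hypothetical expert bot; it inherits its type from the parent class."""


print(MyExpertBot.bottype)
```

Setting the correct parent class is what lets generic code ask a bot for its type instead of guessing it from the module path.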
Collectors
- `intelmq.bots.collectors.mail._lib`:
- Add support for unverified SSL/STARTTLS connections (PR2055 by Sebastian Wagner).
- Fix exception handling for aborted IMAP connections (PR2187 by Sebastian Wagner).
- `intelmq.bots.collectors.blueliv`: Fix Blueliv collector requirements (PR2161 by Gethvi).
- `intelmq.bots.collectors.github_api._collector_github_api`: Added personal access token support (PR2145 by Sebastian Waldbauer, fixes 1549).
- `intelmq.bots.collectors.file.collector_file`: Added file lock support, no more race conditions (PR2147 by Sebastian Waldbauer, fixes 2128)
- `intelmq.bots.collectors.shadowserver.collector_reports_api.py`: Added file_format option to download reports in CSV format for better performance (PR2246 by elsif2)
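
The race condition that file locking addresses in the file collector can be illustrated with an advisory lock. This is a sketch under assumptions, not the collector's actual code: it uses `fcntl.flock` (Unix only), and the helper name `read_if_unlocked` is hypothetical.

```python
import fcntl
import tempfile


def read_if_unlocked(path):
    """Return the file's bytes, or None if another open handle holds a lock on it."""
    with open(path, "rb") as handle:
        try:
            # Non-blocking exclusive advisory lock; fails at once if already locked.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        try:
            return handle.read()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


with tempfile.NamedTemporaryFile() as writer:
    writer.write(b"192.0.2.1\n")
    writer.flush()
    fcntl.flock(writer, fcntl.LOCK_EX)    # a writer is still busy with the file
    print(read_if_unlocked(writer.name))  # None
    fcntl.flock(writer, fcntl.LOCK_UN)    # the writer is done
    print(read_if_unlocked(writer.name))  # b'192.0.2.1\n'
```

Advisory locks only help if the process writing the file takes the same kind of lock, so both sides need to cooperate.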
Parsers
- `intelmq.bots.parsers.alienvault.parser_otx`: Save CVE data in `extra.cve` instead of `extra.CVE` due to the field name restriction on lower-case characters (PR2059 by Sebastian Wagner).
- `intelmq.bots.parsers.anubisnetworks.parser`: Changed field name format from `extra.communication.http.x_forwarded_for_#1` to `extra.communication.http.x_forwarded_for_1` due to the field name restriction on alphanumeric characters (PR2059 by Sebastian Wagner).
- `intelmq.bots.parsers.dataplane.parser`:
- Add support for additional feeds (PR2102 by Mikk Margus Möll).
- DNS Recursion Desired
- DNS Recursion Desired ANY
- DNS Version
- Protocol 41
- SMTP Greet
- SMTP Data
- Telnet Login
- VNC/RFB Login
- Fix event object creation (PR2298 by DigitalTrustCenter).
- Removed `intelmq.bots.parsers.malc0de`: this bot was marked as deprecated and removed from feed due to offline status (PR2184 by Tamas Gutsohn, fixes 2178).
- `intelmq.bots.parsers.microsoft.parser_ctip`:
- New parameter `overwrite` (PR2112 by Sebastian Wagner, fixes 2022).
- Fix handling of field `Payload.domain` if it contains the same IP address as `Payload.serverIp` (PR2144 by Mikk Margus Möll and Sebastian Wagner).
- Handle Payload field with non-base64-encoded JSON content and numbered dictionaries (PR2193 by Sebastian Wagner)
- `intelmq.bots.parsers.shodan.parser` (PR2117 by Mikk Margus Möll):
- Instead of keeping track of `extra.ftp.<something>.parameters`, FTP parameters are collected together into `extra.ftp.features` as a list of said features, reducing field count.
- Shodan field `rsync.modules` is collected.
- Conversion functions can raise `NoValueException` with a string argument to signify that the conversion would not succeed, such as in the case of a single IP address being given in hostnames, which would then be passed into `source.reverse_dns` and fail to validate as a FQDN.
- Variable `_common_keys` is moved out of the class.
- `_dict_dict_to_obj_list` is introduced, for converting a string-to-dict mapping into a list of dicts with the previous key as an attribute of the dict; this can be useful for preventing issues where, when feeding the data into aggregating tools, you'd end up with many more fields than necessary, e.g. `vulns.CVE-2010-0001.cvss`, `CVE-2010-0002.cvss` etc.
- `_get_first` to get the first item from a list, with `NoValueException` raised on empty lists.
- `_get_first_hostname` to handle the first valid FQDN from a list of hostnames for hostnames in the Shodan banner, if there is one, and gives `NoValueException` otherwise.
- `ssl.cert.serial` and `ssl.dhparams.generator`, which may return both integers and strings, are converted to strings.
- Changes to method `apply_mapping`, such as reducing needless loop iterations, removing a big try-except, and adding the `NoValueException` handling described above.
- Stops falsy values (False, 0) besides None from being filtered out.
- `intelmq.bots.parsers.shadowserver._config`:
- Added support for `Accessible AMQP`, `Device Identification Report` (IPv4 and IPv6) (PR2134 by Mateo Durante).
- Added file name mapping for `SSL-POODLE-Vulnerable-Servers IPv6` (file name `scan6_ssl_poodle`) (PR2134 by Mateo Durante).
- Added `Malware-URL`, `Sandbox-Connection`, `Sandbox-DNS`, `Accessible-AMQP`, `Open-Anonymous-MQTT`, `Accessible-QUIC`, `Accessible-SSH`, `SYNful-Knock`, and `Special` (PR2227 by elsif2)
- Removed legacy reports `Amplification-DDoS-Victim`, `CAIDA-IP-Spoofer`, `Darknet`, `Drone`, `Drone-Brute-Force`, `IPv6-Sinkhole-HTTP-Drone`, `Microsoft-Sinkhole`, and `Sinkhole-HTTP-Drone` (PR2227 by elsif2).
- Users storing events in a database should be aware that field names and types have been updated (PR2227 by elsif2).
- Corrected "Accessible-AMQP" message_length type (int) and added "STUN" support (PR2235 by elsif2).
- Added amplification factor to UDP scan reports (PR2238 by elsif2).
- Added version and build_date to "Vulnerable-HTTP" report (PR2238 by elsif2).
- The following field types have been standardized across all Shadowserver reports (PR2246 by elsif2):
  - `destination.fqdn` (`validate_fqdn`)
  - `destination.url` (`convert_http_host_and_url`)
  - `extra.browser_trusted` (`convert_bool`)
  - `extra.duration` (`convert_int`)
  - `extra.end_time` (`convert_date_utc`)
  - `extra.freak_vulnerable` (`convert_bool`)
  - `extra.ok` (`convert_bool`)
  - `extra.password` (`validate_to_none`)
  - `extra.ssl_poodle` (`convert_bool`)
  - `extra.status` (`convert_int`)
  - `extra.uptime` (`convert_int`)
  - `extra.version` (`convert_to_none`)
  - `source.network` (`validate_network`)
- The following report field names have changed to better represent their values:
  - `scan_rsync`: `extra.password` renamed to `extra.has_password`
  - `scan_elasticsearch`: `status` renamed to `http_code`
- Added `Accessible-HTTP-proxy` and `Open-HTTP-proxy` (PR2246 by elsif2).
- Added http_agent to the `Honeypot-DDoS` report and added the `DDoS-Participant` report (PR2303 by elsif2)
- Added `Accessible-SLP`, `IPv6 Accessible-SLP`, `IPv6-DNS-Open-Resolvers`, and `IPv6-Open-LDAP-TCP` reports (PR2311 by elsif2)
- Standardized response_length to response_size in `Accessible-ICS` and `Open-MSSQL` (PR2311 by elsif2)
- `intelmq.bots.parsers.cymru.parser_cap_program`: The parser mapped the hostname into `source.fqdn` which is not allowed by the IntelMQ Data Format. Added a check (PR2215 by Sebastian Waldbauer, fixes 2169)
- `intelmq.bots.parsers.generic.parser_csv`:
- Use RewindableFileHandle to use the original current line for line recovery (PR2192 by Sebastian Wagner).
- Recovering CSV lines preserves the original line ending (PR2280 by Kamil Mankowski, fixes [1597](https://github.com/certtools/intelmq/issues/1597))
- `intelmq.bots.parsers.autoshun.parser`: Removed, as the feed is discontinued (PR2214 by Sebastian Waldbauer, fixes 2162).
- `intelmq.bots.parsers.openphish.parser_commercial`: Refactored complete code (PR2160 by Filip Pokorný).
- Fixes wrong mapping of `host` field to `source.fqdn` when the content was an IP address.
- Adds newly added fields in the feed.
- `intelmq.bots.parsers.phishtank.parser`: Refactored code (PR2270 by Filip Pokorný)
- Changes feed URL to JSON format (contains more information). The URL needs to be manually updated in the configuration!
- Adds fields from the JSON feed.
- `intelmq.bots.parsers.dshield.parser_domain`: Removed, as the feed is discontinued (PR2276 by Sebastian Waldbauer).
- `intelmq.bots.parsers.abusech.parser_ip`: Removed (PR2268 by Filip Pokorný).
- `intelmq.bots.parsers.abusech.parser_domain`: Removed (PR2268 by Filip Pokorný).
- `intelmq.bots.parsers.abusech.parser_feodotracker`: Added new parser bot (PR2268 by Filip Pokorný)
- Changes feed URL to JSON format (contains more information).
- Adds fields from the JSON feed.
- `intelmq.bots.parsers.generic.parser_csv`: Parameter `type` is deprecated, `default_fields` should be used. (PR2293 by Filip Pokorný)
- `intelmq.bots.parsers.generic.parser_csv`: Parameter `skip_header` now also accepts an integer as a fixed number of lines to skip. (PR2313 by Filip Pokorný)
- `intelmq.bots.parsers.taichung.parser`: Removed (PR2266 by Filip Pokorný)
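
The extended `skip_header` parameter of the generic CSV parser, which accepts either a boolean or a number of lines, could behave along these lines. This is an illustrative sketch only: the function name `split_header` is hypothetical and the bot's real implementation may differ.

```python
def split_header(text, skip_header):
    """Split `text` into (skipped header lines, remaining lines)."""
    # bool is a subclass of int: False -> 0, True -> 1, other integers unchanged.
    count = int(skip_header)
    lines = text.splitlines(keepends=True)
    return lines[:count], lines[count:]


report = "# generated 2023-01-01\nip,port\n192.0.2.1,80\n"
print(split_header(report, True))  # skips only the comment line
print(split_header(report, 2))     # skips the comment line and the column names
```

Keeping the skipped lines, as the first element of the returned tuple, is convenient when the original header has to be restored later.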
Experts
- `intelmq.bots.experts.domain_valid`: New bot for checking a domain's validity (PR1966 by Marius Karotkis).
- `intelmq.bots.experts.truncate_by_delimiter.expert`: Cut string if its length is higher than a maximum length (PR1967 by Marius Karotkis).
- `intelmq.bots.experts.remove_affix`: Remove prefix or postfix strings from a field (PR1965 by Marius Karotkis).
- `intelmq.bots.experts.asn_lookup.expert`: Fixes update-database script on the last few days of a month (PR2121 by Filip Pokorný, fixes 2088).
- `intelmq.bots.experts.threshold.expert`: Correctly use the standard parameter `redis_cache_ttl` instead of the previously used parameter `timeout` (PR2155 by Karl-Johan Karlsson).
- `intelmq.bots.experts.jinja2.expert`: Lift restriction on requirement jinja2 < 3 (PR2158 by Sebastian Wagner).
- `intelmq.bots.experts.asn_lookup.expert`, `intelmq.bots.experts.domain_suffix.expert`, `intelmq.bots.experts.maxmind_geoip.expert`, `intelmq.bots.experts.recordedfuture_iprisk.expert`, `intelmq.bots.experts.tor_nodes.expert`: New parameter `autoupdate_cached_database` to disable automatic updates (downloads) of cached databases (PR2180 by Sebastian Wagner).
- `intelmq.bots.experts.url.expert`: New bot for extracting additional information from `source.url` and/or `destination.url` (PR2315 by Filip Pokorný).
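
The kind of information a URL can yield, as the new URL expert extracts from `source.url` and `destination.url`, can be sketched with `urllib.parse`. The function name `describe_url` and the dictionary keys are assumptions made for this example; they are not the bot's actual field names.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Return a dict of components found in `url` (illustrative key names)."""
    parts = urlsplit(url)
    info = {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
    }
    # Drop components that are absent from this URL.
    return {key: value for key, value in info.items() if value not in (None, "")}


print(describe_url("https://example.com:8443/login?user=a"))
# {'scheme': 'https', 'host': 'example.com', 'port': 8443, 'path': '/login', 'query': 'user=a'}
```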
Outputs
- Removed `intelmq.bots.outputs.postgresql`: this bot was marked as deprecated in 2019 and announced to be removed in version 3 of IntelMQ (PR2045 by Birger Schacht).
- Added `intelmq.bots.outputs.rpz_file.output` to create RPZ files (PR1962 by Marius Karotkis).
- Added `intelmq.bots.outputs.bro_file.output` to create Bro intel formatted files (PR1963 by Marius Karotkis).
- `intelmq.bots.outputs.templated_smtp.output`:
- Add new function `from_json()` (which just calls `json.loads()` in the standard Python environment), meaning the Templated SMTP output bot can take strings containing JSON documents and do the formatting itself (PR2120 by Karl-Johan Karlsson).
- Lift restriction on requirement jinja2 < 3 (PR2158 by Sebastian Wagner).
- `intelmq.bots.outputs.sql`:
- For PostgreSQL, escape Nullbytes in text to prevent "unsupported Unicode escape sequence" issues (PR2223 by Sebastian Wagner, fixes 2203).
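
PostgreSQL text values cannot contain NUL characters, which is the background of the SQL output fix above. One way to neutralize them is sketched below; the helper name `escape_null_bytes` is hypothetical, and the bot itself may apply the escaping at a different stage, for example on the JSON-encoded form.

```python
def escape_null_bytes(value):
    """Replace NUL characters in strings with the six-character text `\\u0000`."""
    if isinstance(value, str):
        return value.replace("\x00", "\\u0000")
    return value  # numbers, None, etc. pass through untouched


row = ["example.com", "payload\x00tail", 443]
print([escape_null_bytes(item) for item in row])
```

The replacement keeps the position of the NUL visible in the stored text instead of silently dropping it.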
Documentation
- Feeds: Add documentation for newly supported dataplane feeds, see above (PR2102 by Mikk Margus Möll).
- Installation: Restructured the whole document to make it clearer and straight-forward (PR2113 by Sebastian Wagner).
- Add workaround for https://github.com/sphinx-doc/sphinx/issues/10701 (PR2225 by Sebastian Wagner, kudos yarikoptic, fixes 2224).
- Fix wrong operator for list-contains-value operation in sieve expert documentation (PR2256 by Filip Pokorný).
- Added documentation on `default_fields` parameter (PR2293 by Filip Pokorný).
- Updated documentation on `skip_header` parameter (PR2313 by Filip Pokorný).
- Viriback Unsafe Sites feed replaced with Viriback C2 Tracker. (PR2266 by Filip Pokorný)
- Netlab 360 Mirai Scanner feed removed as it is discontinued. (PR2266 by Filip Pokorný)
- Benkow Malware Panels Tracker feed changed parser configuration. (PR2266 by Filip Pokorný)
- Taichung feed removed as it is discontinued. (PR2266 by Filip Pokorný)
- Added new URL Expert bot. (PR2315 by Filip Pokorný)
Packaging
- Remove deleted `intelmq.bots.experts.sieve.validator` from executables in `setup.py` (PR2256 by Filip Pokorný).
- Run the geoip database cron-job twice a week (PR2285 by Filip Pokorný).
Tests
- Add GitHub Action to run regexploit on all Python, JSON and YAML files (PR2059 by Sebastian Wagner).
- `intelmq.lib.test`:
- Decorator `skip_ci` also detects `dpkg-buildpackage` environments by checking the environment variable `DEB_BUILD_ARCH` (PR2123 by Sebastian Wagner).
- Fix the regex to match everything after the Python version and process ID, and add tests for it (PR2216 by Sebastian Waldbauer and Sebastian Wagner, fixes 2185)
- Also test on Python 3.10 (PR2140 by Sebastian Wagner).
- Switch from nosetests to pytest, as the former does not support Python 3.10 (PR2140 by Sebastian Wagner).
- CodeQL Github Actions `exponential backtracking on strings` fixed. (PR2148 by Sebastian Waldbauer, fixes 2138)
- Reverse DNS expert tests: remove outdated failing test `test_invalid_ptr` (PR2208 by Sebastian Wagner, fixes 2206).
- Add test dependency `requests_mock` to the `development` extra requirements in `setup.py` (PR2210 by Sebastian Wagner).
- Threshold Expert tests: Use environment variable `INTELMQ_PIPELINE_HOST` as redis host, analogous to other tests (PR2209 by Sebastian Wagner, fixes 2207).
- Remove codecov action as it failed regularly (PR2237 by Sebastian Wagner, fixes 2229).
- `intelmq.lib.test.BotTestCase`: Adds `skip_checks` variable to not fail on non-empty messages from calling `check` function (PR2315 by Filip Pokorný).
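
A decorator like `skip_ci`, which detects `dpkg-buildpackage` environments through `DEB_BUILD_ARCH`, can be built on `unittest.skipIf`. The following is a sketch, not the code in `intelmq.lib.test`; in particular, the check of a `CI` variable is an assumption about how CI systems are detected.

```python
import os
import unittest


def skip_ci():
    """Skip a test on CI or inside a Debian package build (sketch)."""
    on_ci = os.getenv("CI") == "true"  # assumption: CI systems export CI=true
    in_deb_build = bool(os.getenv("DEB_BUILD_ARCH"))  # present in dpkg-buildpackage runs
    return unittest.skipIf(on_ci or in_deb_build, "Test disabled on CI.")


class NetworkTest(unittest.TestCase):
    @skip_ci()
    def test_live_lookup(self):
        self.assertTrue(True)  # placeholder for a test that needs network access
```

Because `skip_ci()` is evaluated when the class body runs, the environment has to be set before the test module is imported.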
Tools
- `intelmqctl`:
- Fix process manager initialization if run non-interactively, as intelmqdump does it (PR2189 by Sebastian Wagner, fixes 2188).
- `check`: handle `SyntaxError` in bot modules and report it without breaking execution (fixes 2177)
- Privilege drop before logfile creation (PR2277 by Sebastian Waldbauer, fixes 2176)
- `intelmqsetup`: Revised installation of manager by building the static files at setup, not build time, making it behave more sensibly. Requires intelmq-manager >= 3.1.0 (PR2198 by Sebastian Wagner, fixes 2197).
- `intelmqdump`: Respect global and per-bot custom settings of `logging_path` (fixes 1605).
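
Reporting a `SyntaxError` in a bot module without aborting the whole run, as `intelmqctl check` now does, follows a common pattern: catch the error per module and collect a message. The sketch below uses the built-in `compile` on in-memory sources; the function name `check_sources` and the module names are hypothetical, and the real command works on installed bot modules.

```python
def check_sources(sources):
    """Compile each module source; collect syntax errors instead of raising."""
    problems = []
    for name, code in sources.items():
        try:
            compile(code, name, "exec")
        except SyntaxError as exc:
            problems.append(f"{name}: {exc.msg} (line {exc.lineno})")
    return problems


modules = {
    "good_bot.py": "x = 1\n",
    "broken_bot.py": "def process(:\n    pass\n",
}
for problem in check_sources(modules):
    print(problem)
```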
Contrib
- logrotate: Move compress and ownership rules to the IntelMQ-blocks to prevent that they apply to other files (PR2111 by Sebastian Wagner, fixes 2110).
Known issues
This is a short list of the most important known issues. The full list can be retrieved from [GitHub](https://github.com/certtools/intelmq/labels/bug?page=2&q=is%3Aopen+label%3Abug).
- intelmq_psql_initdb does not work for SQLite (2202).
- intelmqsetup: should install a default state file (2175).
- Misp Expert - Crash if misp event already exist (2170).
- Turris greylist has been updated (2167).
- Spamhaus CERT parser uses wrong field (2165).
- Custom headers ignored in HTTPCollectorBot (2150).
- Missing commas in SQL query for separate Events table (2125).
- intelmqctl log: parsing syslog does not work (2097).
- Bash completion scripts depend on old JSON-based configuration files (2094).
- Bot configuration examples use JSON instead of YAML (2066).
- Bots started with IntelMQ-API/Manager stop when the webserver is restarted (952).
- Corrupt dump files when interrupted during writing (870).