--------
This is a major new release with important security and bug fixes, as well as
significant improvement in license detection.
Many thanks to all the contributors who made this possible, and in particular:
- Akanksha Garg akugarg
- Ayan Sinha Mahapatra AyanSinhaMahapatra
- Dennis Clark DennisClark
- François Granade farialima
- Hanna Modica hanna-modica
- Jelmer Vernooij jelmer
- Jono Yang JonoYang
- Konrad Weihmann priv-kweihmann
- Philippe Ombredanne pombredanne
- Pierre Tardy tardyp
- Sarita Singh itssingh
- Sebastian Thomas sebathomas
- Steven Esser majurg
- Till Jaeger LeChasseur
- Thomas Druez tdruez
Breaking API changes:
~~~~~~~~~~~~~~~~~~~~~
- The configure scripts for Linux, macOS and Windows have been entirely
refactored and should be considered as new. These are now only native scripts
(.bat on Windows and .sh on POSIX) and the Python script etc/configure.py
has been removed. On all OSes, set the PYTHON_EXECUTABLE environment variable
to point to an alternative, non-default Python executable.
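
As a minimal sketch of driving the native configure script from Python with an
alternative interpreter: only the PYTHON_EXECUTABLE variable comes from the
notes above. The script names ("./configure" and "configure.bat") and all the
helper function names are assumptions made for illustration.

```python
import os
import subprocess
import sys


def configure_command(platform=sys.platform):
    """Return the command for the native configure script on this platform."""
    # Assumed script names: "configure.bat" on Windows, "./configure" elsewhere.
    return ["configure.bat"] if platform.startswith("win") else ["./configure"]


def configure_environment(python_executable, base_env=None):
    """Return a copy of the environment with PYTHON_EXECUTABLE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHON_EXECUTABLE"] = python_executable
    return env


def run_configure(python_executable, cwd="."):
    """Run the configure script from the checkout directory `cwd`."""
    return subprocess.run(
        configure_command(),
        env=configure_environment(python_executable),
        cwd=cwd,
        check=True,
    )
```

Calling ``run_configure("/path/to/python3")`` from a checkout would then run
the script with that interpreter, assuming the script names above are right.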
Security updates:
~~~~~~~~~~~~~~~~~
- Update minimum versions and pinned version of thirdparty dependencies
to benefit from the latest improvements and security fixes. This includes in
particular these issues:
- pkg:pypi/pygments: (low severity, limited impact) CVE-2021-20270, CVE-2021-27291
- pkg:pypi/lxml: (low severity, likely no impact) CVE-2021-28957
- pkg:pypi/nltk: (low severity, likely no impact) CVE-2019-14751
- pkg:pypi/jinja2: (low severity, likely no impact) CVE-2020-28493, CVE-2019-10906
- pkg:pypi/pycryptodome: (high severity) CVE-2018-15560 (dropped since no
longer used by pdfminer)
Outputs:
~~~~~~~~
- The JSON output packages section has a new "extra_data" attribute, which is
a JSON object that can contain arbitrary data specific to a package
type.
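
The following is a minimal sketch of how a consumer might read the new
"extra_data" attribute from a JSON scan. It assumes a top-level "files" list
whose entries carry a "path" and a "packages" list. That layout, the sample
data and the helper name ``collect_extra_data`` are illustrative assumptions,
not taken from these notes.

```python
import json

# Hypothetical scan results for illustration only; the real layout may differ.
SAMPLE_SCAN = """
{
  "files": [
    {
      "path": "app/setup.py",
      "packages": [
        {
          "type": "pypi",
          "name": "app",
          "extra_data": {"python_requires": ">=3.6"}
        }
      ]
    },
    {
      "path": "app/README",
      "packages": []
    }
  ]
}
"""


def collect_extra_data(scan):
    """Return (path, package type, extra_data) for packages with extra data."""
    found = []
    for scanned_file in scan.get("files", []):
        for package in scanned_file.get("packages") or []:
            # Treat a missing or null "extra_data" as an empty mapping.
            extra_data = package.get("extra_data") or {}
            if extra_data:
                found.append(
                    (scanned_file.get("path"), package.get("type"), extra_data)
                )
    return found


if __name__ == "__main__":
    scan = json.loads(SAMPLE_SCAN)
    for path, package_type, extra_data in collect_extra_data(scan):
        print(path, package_type, extra_data)
```

Because the contents of "extra_data" are specific to each package type,
consumers should treat its keys as optional.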
License detection:
~~~~~~~~~~~~~~~~~~~
- The SPDX license list has been updated to 3.13.
- Add 42 new and update 45 existing licenses.
- Over 14,300 new and improved license detection rules have been added. A large
number of these (~13,400) exist to avoid false positive detections.
Copyright detection:
~~~~~~~~~~~~~~~~~~~~
- Improved speed and fixed some timeout issues. Fixed minor misc. bugs.
- Allow calling copyright detection from text lines to ease integration.
Package detection:
~~~~~~~~~~~~~~~~~~
- A new "extra_data" dictionary is now part of the "packages" data in the
returned JSON. This is used to store arbitrary type-specific data that
cannot fit in the Package data structure.
- The Debian copyright files license detection has been reworked and
significantly improved.
- The PyPI package detection and manifest parsing has been reworked and
significantly improved.
- The detection of Windows executables and DLLs metadata has been enabled.
These metadata are returned as packages.
Other:
~~~~~~~
- Most third-party libraries have been updated to their newer versions. Some
dependency constraints have been relaxed to help some usage as a library.
- The on-commit CI tests now validate that we can install from PyPI without
problems.
- Fix several installation issues.
- Add new function to detect copyrights from lines.