Webchanges

Latest version: v3.30.0

3.16
===================
2023-12-07

Added
-----
* The HTTP/2 network protocol (the same used by major browsers) is now used in ``url`` jobs. This allows the
monitoring of certain websites that block requests made with older protocols like HTTP/1.1. This is implemented by
using the ``HTTPX`` and ``h2`` HTTP client libraries instead of the ``requests`` one used previously.

Notes:

- Handling of data served by sites whose encoding is misconfigured is done slightly differently by ``HTTPX``, and if
you newly encounter instances where extended characters are rendered as ``�`` try adding ``encoding:
ISO-8859-1`` to that job.
- To revert to the use of the ``requests`` HTTP client library, use the new job sub-directive ``http_client:
requests`` (in individual jobs or in the configuration file for all ``url`` jobs) and install ``requests`` by
running ``pip install --upgrade webchanges[requests]``.
- If the system is misconfigured and the ``HTTPX`` HTTP client library is not found, an attempt to use the
``requests`` one will be made. This behaviour is transitional and will be removed in the future.
- HTTP/2 is theoretically faster than HTTP/1.1 and preliminary testing confirmed this.

* New ``pypdf`` filter to convert pdf to text **without having to separately install OS dependencies**. If you're
using ``pdf2text`` (and its OS dependencies), I suggest you switch to ``pypdf`` as it's much faster; however do note
that the ``raw`` and ``physical`` sub-directives are not supported. Install the required library by running ``pip
install --upgrade webchanges[pypdf]``.
* New ``absolute_links`` filter to convert relative links in HTML ``<a>`` tags to absolute ones. This filter is not
needed if you are already using the ``beautify`` or ``html2text`` filters (requested by `Paweł Szubert
<https://github.com/pawelpbm>`__ in `#62 <https://github.com/mborsetti/webchanges/issues/62>`__).
* New ``{jobs_files}`` substitution for the ``subject`` of the ``email`` reporter. This will be replaced by the
name of the jobs file(s), if different from the default ``jobs.yaml``, in parentheses, with any ``jobs-`` prefix
removed from the name. To use, replace the ``subject`` line for your reporter(s) in ``config.yaml`` with e.g.
``[webchanges] {count} changes{jobs_files}: {jobs}``.
* ``html`` reports now have a configurable ``title`` to set the HTML document title, defaulting to
``[webchanges] {count} changes{jobs_files}: {jobs}``.
* Added reference to a Docker implementation to the documentation (requested by `yubiuser
<https://github.com/yubiuser>`__ in `#64 <https://github.com/mborsetti/webchanges/issues/64>`__).
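
The ``{jobs_files}`` substitution can be pictured with a small Python sketch. This is not the **webchanges**
implementation: the helper name ``jobs_files_suffix``, the leading space, and the comma separator are assumptions
made purely for illustration.

```python
from pathlib import Path


def jobs_files_suffix(jobs_files):
    """Return non-default jobs file names in parentheses, or '' if there are none.

    Hypothetical helper: the exact spacing and separator used by webchanges
    are assumptions.
    """
    names = []
    for jobs_file in jobs_files:
        path = Path(jobs_file)
        if path.name == "jobs.yaml":  # the default file is not mentioned
            continue
        # Drop the extension and a leading 'jobs-' prefix, if present.
        names.append(path.stem.removeprefix("jobs-"))
    return f" ({', '.join(names)})" if names else ""


subject_template = "[webchanges] {count} changes{jobs_files}: {jobs}"
subject = subject_template.format(
    count=2,
    jobs_files=jobs_files_suffix(["jobs-work.yaml"]),
    jobs="Example A, Example B",
)
print(subject)
```

With the default ``jobs.yaml`` the substitution is empty, so the subject reads the same as it did without it.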

Changed
-------
* ``url`` jobs will use the ``HTTPX`` library instead of ``requests`` if it's installed since it uses the HTTP/2 network
protocol (when the ``h2`` library is also installed) as browsers do. To revert to the use of ``requests`` even if
``HTTPX`` is installed on the system, add ``http_client: requests`` to the relevant jobs or make it a default by
editing the configuration file to add the sub-directive ``http_client: requests`` for ``url`` jobs under
``job_defaults``.
* The ``beautify`` filter converts relative links to absolute ones; use the new ``absolute_links: false``
sub-directive to disable.
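
As a rough illustration of what converting relative links to absolute ones means, the sketch below resolves the
``href`` of each ``<a>`` tag against the page URL using the standard library. It is regex-based and handles only
double-quoted attributes, and it is not how **webchanges** implements the filter; the function name
``absolutize_links`` and the example URLs are hypothetical.

```python
import re
from urllib.parse import urljoin

# Matches the double-quoted href attribute of an <a> tag (simplified on purpose).
HREF_RE = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)


def absolutize_links(html, base_url):
    """Resolve every matched href against base_url; absolute URLs are unchanged."""
    return HREF_RE.sub(
        lambda m: m.group(1) + urljoin(base_url, m.group(2)) + m.group(3),
        html,
    )


html = '<p><a href="/docs/intro.html">Intro</a> <a href="https://example.org/x">X</a></p>'
result = absolutize_links(html, "https://example.com/news/index.html")
print(result)
```

Links that are already absolute pass through ``urljoin`` unchanged, which is why applying the conversion twice is
harmless.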

Internals
---------
* Removed transitional support for the ``beautifulsoup<4.11`` library (i.e. older than 7 April 2022) for the
``beautify`` filter.
* Removed dependency on the ``requests`` library and its own dependency on the ``urllib3`` library.
* Code cleanup, including removing support for Python 3.8.

3.15
===================
2023-10-25

Added
-----
* Support for Python 3.12.
* ``data_as_json`` job directive for ``url`` jobs to indicate that ``data`` entered as a dict should be
serialized as JSON instead of urlencoded and, if missing, the header ``Content-Type`` set to ``application/json``
instead of ``application/x-www-form-urlencoded``.
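
The difference that ``data_as_json`` makes can be sketched in plain Python. This is an illustration of the behavior
described above, not the **webchanges** code; the function name ``encode_body`` and its signature are hypothetical.

```python
import json
from urllib.parse import urlencode


def encode_body(data, headers=None, data_as_json=False):
    """Serialize a dict body and set a default Content-Type if none was given."""
    headers = dict(headers or {})
    if data_as_json:
        body = json.dumps(data)
        default_type = "application/json"
    else:
        body = urlencode(data)
        default_type = "application/x-www-form-urlencoded"
    # Only fill in Content-Type when it is missing (header names are case-insensitive).
    if not any(name.lower() == "content-type" for name in headers):
        headers["Content-Type"] = default_type
    return body, headers


form_body, form_headers = encode_body({"q": "a b"})
json_body, json_headers = encode_body({"q": "a b"}, data_as_json=True)
print(form_body, form_headers)
print(json_body, json_headers)
```

A ``Content-Type`` header supplied by the job is left untouched in both modes.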

Changed
-------
* Improved error handling and documentation on the need for an external install when using ``parser: html5lib`` with the
``bs4`` method of the ``html2text`` filter and added ``html5lib`` as an optional dependency keyword (thanks to
`101Dude <https://github.com/101Dude>`__'s report in `#59 <https://github.com/mborsetti/webchanges/issues/59>`__).

Removed
-------
* Support for Python 3.8. A reminder that older Python versions are supported for 3 years after being obsoleted by a
new major release (i.e. about 4 years since their original release).

Internals
---------
* Upgraded build environment to use the ``build`` frontend and ``pyproject.toml``, eliminating ``setup.py``.
* Migrated to ``pyproject.toml`` the configuration of all tools that support it.
* Increased the default ``timeout`` for ``url`` jobs with ``use_browser: true`` (i.e. using Playwright) to 120 seconds.

3.14
===================
2023-09-01

Added
-----
* When running in verbose (``-v``) mode, if a ``url`` job with ``use_browser: true`` fails with a Playwright error,
capture and save in the temporary folder a screenshot, a full page image, and the HTML contents of the page at the
moment of the error (see logs for filenames).

3.13
===================
2023-08-28

Added
-----
* Reports have a new ``separate`` configuration option to split reports into one-per-job.
* ``url`` jobs without ``use_browser`` have a new ``retries`` directive to specify the number of times to retry a
job that errors before giving up. Using ``retries: 1`` or higher will often solve the ``('Connection aborted.',
ConnectionResetError(104, 'Connection reset by peer'))`` error received from a misconfigured server at the first
connection.
* ``remove_duplicates`` filter has a new ``adjacent`` sub-directive to de-duplicate non-adjacent lines or items.
* ``css`` and ``xpath`` have a new ``sort`` subfilter to sort matched elements lexicographically.
* Command line arguments:

  * New ``--footnote`` to add a custom footnote to reports.
  * New ``--change-location`` to keep job history when the ``url`` or ``command`` changes.
  * ``--gc-database`` and ``--clean-database`` now have optional argument ``RETAIN-LIMIT`` to allow increasing
    the number of retained snapshots from the default of 1.
  * New ``--detailed-versions`` to display detailed version and system information, inclusive of the versions of
    dependencies and, in certain Linux distributions (e.g. Debian), of system libraries. It also reports available
    memory and disk space.
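
To make the distinction behind the ``adjacent`` sub-directive of ``remove_duplicates`` concrete, here is a minimal
Python sketch of adjacent versus non-adjacent de-duplication. It assumes that the adjacent mode drops only
consecutive repeats and that the non-adjacent mode keeps the first occurrence of each line; the function below is
illustrative and is not the filter's actual code.

```python
def remove_duplicates(lines, adjacent=True):
    """Drop repeated lines, either only consecutive ones or anywhere in the input."""
    result = []
    seen = set()
    for line in lines:
        if adjacent:
            # Skip a line only if it repeats the one just kept.
            if result and result[-1] == line:
                continue
        else:
            # Skip a line if it has appeared anywhere earlier.
            if line in seen:
                continue
            seen.add(line)
        result.append(line)
    return result


lines = ["a", "a", "b", "a"]
adjacent_only = remove_duplicates(lines)
anywhere = remove_duplicates(lines, adjacent=False)
print(adjacent_only, anywhere)
```

In both modes the order of the surviving lines is preserved.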

Changed
-------
* ``command`` jobs now have improved error reporting which includes the error text from the failed command.
* ``--rollback-database`` now confirms the date (in ISO-8601 format) to roll back the database to and, if
**webchanges** is being run in interactive mode, the user will be asked for positive confirmation before proceeding
with the irreversible deletion.

Internals
---------
* Added `bandit <https://github.com/PyCQA/bandit>`__ testing to improve the security of code.
* ``headers`` are now turned into strings before being passed to Playwright (addresses the error
``playwright._impl._api_types.Error: extraHTTPHeaders[13].value: expected string, got number``).
* Exclude tests from being recognized as a package during build (contributed by `Max
<https://github.com/aragon999>`__ in `#54 <https://github.com/mborsetti/webchanges/pull/54>`__).
* Refactored and cleaned up some tests.
* Initial testing with Python 3.12.0-rc1, but a reported bug in ``typing.TypeVar`` prevents the ``pyee`` dependency
of ``playwright`` from loading, causing a failure. Awaiting a fix in Python 3.12.0-rc2 to retry.

3.12
===================
2022-11-19

Added
-----
* Support for Python 3.11. Please note that the ``lxml`` dependency may fail to install on Windows due to
`this <https://bugs.launchpad.net/lxml/+bug/1977998>`__ bug and that therefore for now **webchanges** can only be
run in Python 3.10 on Windows. [Update: ``lxml`` wheels for Python 3.11 on Windows are available as of 2022-12-13].

Removed
-------
* Support for Python 3.7. As a reminder, older Python versions are supported for 3 years after being obsoleted by a new
major release; support for Python 3.8 will be removed on or about 5 October 2023.

Fixed
-----
* Job sorting for reports is now case-insensitive.
* Documentation on how to anonymously monitor GitHub releases (due to changes in GitHub) (contributed by `Luis Aranguren
<https://github.com/mercurytoxic>`__ `upstream <https://github.com/thp/urlwatch/issues/723>`__).
* Handling of ``method`` subfilter for filter ``html2text`` (reported by `kongomondo <https://github.com/kongomondo>`__
`upstream <https://github.com/thp/urlwatch/issues/588>`__).

Internals
---------
* Jobs base class now has a ``__is_browser__`` attribute, which can be used with custom hooks to identify jobs that run
a browser so they can be executed in the correct parallel processing queue.
* Fixed static typing to conform to the latest mypy checks.
* Extended type checking to testing scripts.

3.11
===================
2022-09-22

Notice
------
Support for Python 3.7 will be removed on or about 22 October 2022 as older Python versions are supported for 3
years after being obsoleted by a new major release.

Added
-----
* The new ``no_conditional_request`` directive for ``url`` jobs turns off conditional requests for those extremely rare
websites that don't handle them (e.g. Google Flights).
* Selecting the database engine and the maximum number of changed snapshots saved is now set through the configuration
file, and the command line arguments ``--database-engine`` and ``--max-snapshots`` are used to override such
settings. See documentation for more information. Suggested by `jprokos <https://github.com/jprokos>`__ in `#43
<https://github.com/mborsetti/webchanges/issues/43>`__.
* New configuration setting ``empty-diff`` within the ``display`` configuration for backwards compatibility only:
use the ``additions_only`` job directive instead to achieve the same result. Reported by
`bbeevvoo <https://github.com/bbeevvoo>`__ in `#47 <https://github.com/mborsetti/webchanges/issues/47>`__.
* Aliased the command line arguments ``--gc-cache`` with ``--gc-database``, ``--clean-cache`` with ``--clean-database``
and ``--rollback-cache`` with ``--rollback-database`` for clarity.
* The configuration file (e.g. ``conf.yaml``) can now contain keys starting with a ``_`` (underscore) for remarks (they
are ignored).
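
One way to picture how remark keys can be ignored is the Python sketch below, which drops every key starting with
``_`` from an already-loaded configuration. Whether **webchanges** does this at every nesting level is an assumption,
and the function name ``strip_remarks`` and the sample configuration are hypothetical.

```python
def strip_remarks(config):
    """Return a copy of config without keys that start with an underscore."""
    if isinstance(config, dict):
        return {
            key: strip_remarks(value)
            for key, value in config.items()
            if not (isinstance(key, str) and key.startswith("_"))
        }
    if isinstance(config, list):
        return [strip_remarks(item) for item in config]
    return config


loaded = {
    "_note": "remark at the top level",
    "display": {"_why": "kept for backwards compatibility", "empty-diff": False},
}
cleaned = strip_remarks(loaded)
print(cleaned)
```

The original dictionary is left unmodified, so the remarks remain available to anyone reading the file.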

Changed
-------
* Reports are now sorted alphabetically and therefore you can use the ``name`` directive to affect the order in which
your jobs are displayed in reports.
* Implemented measures for ``url`` jobs using ``use_browser: true`` to avoid being detected: **webchanges** now passes all
the headless Chrome detection tests `here
<https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html>`__.
Brought to attention by `amammad <https://github.com/amammad>`__ in `#45
<https://github.com/mborsetti/webchanges/issues/45>`__.
* Running ``webchanges --test`` (without specifying a JOB) will now check the hooks file (if any) for syntax errors in
addition to the config and jobs file. Error reporting has also been improved.
* No longer showing the text returned by the server when a 404 - Not Found HTTP status code is returned, for
all ``url`` jobs (previously only for jobs with ``use_browser: true``).

Fixed
-----
* Bug in command line arguments ``--config`` and ``--hooks``. Contributed by
`Klaus Sperner <https://github.com/klaus-tux>`__ in PR `#46 <https://github.com/mborsetti/webchanges/pull/46>`__.
* Job directive ``compared_versions`` now works as documented and testing has been added to the test suite. Reported by
`jprokos <https://github.com/jprokos>`__ in `#43 <https://github.com/mborsetti/webchanges/issues/43>`__.
* The output of command line argument ``--test-differ`` now takes into consideration ``compared_versions``.
* Markdown containing code in a link text now converts correctly in HTML reports.

Internals
---------
* The job ``kind`` of ``shell`` has been renamed ``command`` to better reflect what it does and the way it's described
in the documentation, but ``shell`` is still recognized for backward compatibility.
* Readthedocs build upgraded to Python 3.10.
