Web-monitoring-diff

Latest version: v0.1.6



0.1.6

Remove stray logging statements that should not have been included in v0.1.5. (194)

0.1.5

Treat `binary/octet-stream` as a generic media type, just like `application/octet-stream`, when trying to determine if content is not HTML. Even though `binary/octet-stream` is not a registered IANA media type, it turns out some AWS SDKs use it when uploading files to S3, so it’s not uncommon. (190)
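A minimal sketch of the idea, assuming a simple set-membership check. The names `GENERIC_MEDIA_TYPES` and `is_generic_media_type` are hypothetical and are not taken from the package's source:

```python
# Hypothetical names: an illustration of the idea, not web-monitoring-diff's code.
GENERIC_MEDIA_TYPES = frozenset({
    "application/octet-stream",
    "binary/octet-stream",  # not IANA-registered, but sent by some AWS SDKs
})


def is_generic_media_type(media_type: str) -> bool:
    """Return True if the media type says nothing about the actual content."""
    return media_type.strip().lower() in GENERIC_MEDIA_TYPES
```

A generic type carries no evidence either way, so a caller following this approach would not use it to rule out HTML.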

0.1.4

This is a minor release that updates some of the lower-level parsing and diffing tools this package relies on:

- Updates the diff-match-patch implementation we rely on for simple text diffs to [fast_diff_match_patch v2.x](https://pypi.org/project/fast-diff-match-patch/). (#126)

- Fixes misconfigured dependency requirements for html5-parser. This should have no user impact, since there are no releases (yet) in the version range we were accidentally allowing for. (126)

- Supports lxml v5.x. (163)

0.1.3

This release fixes some minor issues around content-type checking for HTML-related diffs (`html_diff_render` and `links_diff`). Both fixes lean towards making content-type checking more lenient; our goal is to stop wasted diffing effort early *when we know it's not HTML,* not to diff only things that are definitely HTML:

- Ignore invalid `Content-Type` headers. These happen fairly frequently in the wild — especially on HTML pages — and we now ignore them instead of treating them as implying the content is not HTML. ([76](https://github.com/edgi-govdata-archiving/web-monitoring-diff/pulls/76))

- Ignore the `application/x-download` content type. This content-type isn't really about the content, but is frequently used to make a browser download a file rather than display it inline. It no longer affects parsing or diffing. ([105](https://github.com/edgi-govdata-archiving/web-monitoring-diff/pulls/105))
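The lenient approach described in the two items above could look roughly like the following. This is a sketch under stated assumptions, not the package's implementation: every name here is hypothetical, the validity check is a simplified `type/subtype` pattern, and the set of media types counted as HTML is an assumption.

```python
import re

# Hypothetical names throughout: a sketch, not the package's implementation.
# Simplified check that a value looks like "type/subtype".
MEDIA_TYPE_PATTERN = re.compile(r"^[\w!#$&^.+-]+/[\w!#$&^.+-]+$")
# Types that describe how to deliver content, not what the content is.
IGNORED_MEDIA_TYPES = frozenset({"application/x-download"})
# Assumption: which media types count as HTML.
HTML_MEDIA_TYPES = frozenset({"text/html", "application/xhtml+xml"})


def is_known_not_html(content_type_header):
    """Return True only when the header clearly identifies non-HTML content."""
    if not content_type_header:
        return False  # no header, so nothing is known
    # Drop parameters such as "; charset=utf-8".
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if not MEDIA_TYPE_PATTERN.match(media_type):
        return False  # invalid header: ignore it rather than assume non-HTML
    if media_type in IGNORED_MEDIA_TYPES:
        return False  # says nothing about the content itself
    return media_type not in HTML_MEDIA_TYPES
```

The function only answers "do we know this is not HTML?", so every uncertain case returns `False` and the diff is allowed to proceed.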

This release also adds some nice sidebar links for documentation, the changelog, issues, and source code to PyPI. ([107](https://github.com/edgi-govdata-archiving/web-monitoring-diff/pulls/107))

0.1.2

- The server uses a pool of child processes to run diffs. If the pool breaks while running a diff, it will be re-created once, and, if it fails again, the server will now crash with an exit code of `10`. (An external process manager like Supervisor, Kubernetes, etc. can then decide how to handle the situation.) Previously, the diff would fail at this point, but the server would try to re-create the process pool again the next time a diff was requested. You can opt in to the old behavior by setting the `RESTART_BROKEN_DIFFER` environment variable to `true`. (49)

- The diff server now requires Sentry 1.x for error tracking.
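The restart policy described above can be sketched with the standard library's `concurrent.futures`. This is an illustration only, not the server's code: the `DiffRunner` class, its parameters, and the `BROKEN_POOL_EXIT_CODE` constant are hypothetical names, and the exact parsing of `RESTART_BROKEN_DIFFER` is an assumption.

```python
import os
import sys
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool

BROKEN_POOL_EXIT_CODE = 10  # the exit code named in the release note


class DiffRunner:
    """Hypothetical helper: runs work in a pool, re-creating a broken pool once."""

    def __init__(self, pool_factory=ProcessPoolExecutor, restart_broken=None):
        if restart_broken is None:
            # Assumption: only the value "true" (any case) enables the old behavior.
            value = os.environ.get("RESTART_BROKEN_DIFFER", "")
            restart_broken = value.lower() == "true"
        self.pool_factory = pool_factory
        self.restart_broken = restart_broken
        self.pool = pool_factory()

    def run(self, func, *args):
        for attempt in range(2):
            try:
                return self.pool.submit(func, *args).result()
            except BrokenProcessPool:
                if attempt == 0:
                    # First failure: re-create the pool and retry the diff.
                    self.pool = self.pool_factory()
                    continue
                if self.restart_broken:
                    # Old behavior: this diff fails, but later requests get a
                    # fresh pool (created eagerly here for simplicity).
                    self.pool = self.pool_factory()
                    raise
                # New behavior: exit so an external process manager can react.
                sys.exit(BROKEN_POOL_EXIT_CODE)
```

Passing the pool factory in as a parameter keeps the policy separate from process management, which makes the retry logic easy to exercise with a stand-in pool.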

0.1.2rc1

The server uses a pool of child processes to run diffs. If the pool breaks while running a diff, it will be re-created once, and, if it fails again, the server will now crash with an exit code of `10`. (An external process manager like Supervisor, Kubernetes, etc. can then decide how to handle the situation.) Previously, the diff would fail at this point, but the server would try to re-create the process pool again the next time a diff was requested. You can opt in to the old behavior by setting the `RESTART_BROKEN_DIFFER` environment variable to `true`. (49)
