Web-monitoring-diff

Latest version: v0.1.4

0.1.4

This is a minor release that updates some of the lower-level parsing and diffing tools this package relies on:

- Updates the diff-match-patch implementation we rely on for simple text diffs to [fast_diff_match_patch v2.x](https://pypi.org/project/fast-diff-match-patch/). (#126)

- Fix misconfigured dependency requirements for html5-parser. This should have no user impact, since there are no releases (yet) in the version range we were accidentally allowing for. (126)

- Support lxml v5.x. (163)

0.1.3

This release fixes some minor issues around content-type checking for HTML-related diffs (`html_diff_render` and `links_diff`). Both fixes make content-type checking more lenient; our goal is to stop wasted diffing effort early *when we know it's not HTML,* not to diff only things that are definitely HTML:

- Ignore invalid `Content-Type` headers. These happen fairly frequently in the wild — especially on HTML pages — and we now ignore them instead of treating them as implying the content is not HTML. ([76](https://github.com/edgi-govdata-archiving/web-monitoring-diff/pulls/76))

- Ignore the `application/x-download` content type. This content-type isn't really about the content, but is frequently used to make a browser download a file rather than display it inline. It no longer affects parsing or diffing. ([105](https://github.com/edgi-govdata-archiving/web-monitoring-diff/pulls/105))

This release also adds some nice sidebar links for documentation, the changelog, issues, and source code to PyPI. ([107](https://github.com/edgi-govdata-archiving/web-monitoring-diff/pulls/107))
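
The content-type rules in this release can be illustrated with a minimal sketch. This is not the package's actual implementation: the function name `is_definitely_not_html`, the constants, and the validity pattern are all hypothetical, chosen only to show the "bail out only when we know it's not HTML" idea.

```python
import re

# Hypothetical names for illustration; not web-monitoring-diff's real code.
# Types that say nothing about the content itself, so they are ignored.
IGNORED_TYPES = {"application/x-download"}
# Assumed set of media types treated as HTML in this sketch.
HTML_TYPES = {"text/html", "application/xhtml+xml"}
# Loose check for a syntactically valid `type/subtype` media type.
MEDIA_TYPE_PATTERN = re.compile(r"^[\w.+-]+/[\w.+-]+$")


def is_definitely_not_html(content_type_header):
    """Return True only when the header clearly names a non-HTML type."""
    if not content_type_header:
        return False  # No information, so do not block the diff.
    # Drop parameters such as `; charset=utf-8` before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if not MEDIA_TYPE_PATTERN.match(media_type):
        return False  # Invalid header: ignore it rather than assume non-HTML.
    if media_type in IGNORED_TYPES:
        return False  # Download hint only; tells us nothing about the body.
    return media_type not in HTML_TYPES
```

With this shape, a diff would be skipped early for a header like `application/pdf`, but would still proceed for a missing header, a malformed one, or `application/x-download`.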

0.1.2

- The server uses a pool of child processes to run diffs. If the pool breaks while running a diff, it will be re-created once, and, if it fails again, the server will now crash with an exit code of `10`. (An external process manager like Supervisor, Kubernetes, etc. can then decide how to handle the situation.) Previously, the diff would fail at this point, but the server would try to re-create the process pool again the next time a diff was requested. You can opt in to the old behavior by setting the `RESTART_BROKEN_DIFFER` environment variable to `true`. (49)

- The diff server now requires Sentry 1.x for error tracking.
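
The process-pool behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's real code: the `DiffRunner` class, its parameters, and the way `RESTART_BROKEN_DIFFER` is parsed are hypothetical. Only the exit code `10` and the environment variable name come from the notes above.

```python
import os
import sys
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool

BROKEN_POOL_EXIT_CODE = 10  # Exit code named in the release notes.


class DiffRunner:
    """Hypothetical wrapper that runs diffs in a pool of child processes."""

    def __init__(self, pool_factory=ProcessPoolExecutor, exit_func=sys.exit,
                 restart_broken=None):
        if restart_broken is None:
            # Assumed parsing: only the literal value "true" opts in.
            value = os.environ.get("RESTART_BROKEN_DIFFER", "")
            restart_broken = value.lower() == "true"
        self._pool_factory = pool_factory
        self._exit = exit_func
        self._restart_broken = restart_broken
        self._pool = pool_factory()

    def run_diff(self, func, *args):
        for attempt in range(2):
            try:
                return self._pool.submit(func, *args).result()
            except BrokenProcessPool:
                if attempt == 0:
                    # First failure: re-create the pool once and retry.
                    self._pool = self._pool_factory()
                    continue
                if self._restart_broken:
                    # Old behavior: this diff fails, but a fresh pool is
                    # ready for the next request.
                    self._pool = self._pool_factory()
                    raise
                # New behavior: crash and let the process manager decide.
                self._exit(BROKEN_POOL_EXIT_CODE)
```

Taking `pool_factory` and `exit_func` as parameters is a design choice of this sketch so the recovery logic can be exercised without spawning real child processes.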

0.1.2rc1

The server uses a pool of child processes to run diffs. If the pool breaks while running a diff, it will be re-created once, and, if it fails again, the server will now crash with an exit code of `10`. (An external process manager like Supervisor, Kubernetes, etc. can then decide how to handle the situation.) Previously, the diff would fail at this point, but the server would try to re-create the process pool again the next time a diff was requested. You can opt in to the old behavior by setting the `RESTART_BROKEN_DIFFER` environment variable to `true`. (49)

0.1.0

This project used to be a part of [web-monitoring-processing](https://github.com/edgi-govdata-archiving/web-monitoring-processing/), which contains a wide variety of libraries, scripts, and other tools for working with data across all the various parts of EDGI’s Web Monitoring project. The goal of this initial release is to create a new, more focused package containing the diff-related tools so they can be more easily used by others.

This release is more-or-less the same code that was a part of `web-monitoring-processing`, although the public API has been rearranged very slightly to make sense in this new, stand-alone context.
