Cdxsummary

Latest version: v0.1.1b5

0.1.1b5

Changes in release version `0.1.1b5` include:

* Soft-wrapping rows of wide tables
* Dockerfile layer cache improvement

0.1.1b4

Changes in release version `0.1.1b4` include:

* Summarize year and month distribution
* Rename `media` JSON key to `mimestatus` for consistency
* Remove explicit port `80` from original URI in samples
* Add a CDX record parser with stricter validation
* Compact table padding and enable column wrap
* Add CDX API summary support
* Print progress and errors
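As a rough illustration of the year and month distribution item above, the sketch below counts captures per year and month. It assumes space-separated CDX lines whose second field is a 14-digit timestamp (`YYYYMMDDhhmmss`). The function name `year_month_distribution` and the sample records are hypothetical and are not taken from the cdxsummary code base.

```python
from collections import Counter


def year_month_distribution(cdx_lines):
    """Count captures per (year, month) from space-separated CDX lines."""
    counts = Counter()
    for line in cdx_lines:
        fields = line.split()
        # Assumption: the second field is a 14-digit timestamp (YYYYMMDDhhmmss).
        if len(fields) < 2 or len(fields[1]) != 14 or not fields[1].isdigit():
            continue  # skip header lines and malformed records
        timestamp = fields[1]
        counts[(timestamp[:4], timestamp[4:6])] += 1
    return counts


# Made-up sample records, including a CDX header line that is skipped.
sample = [
    " CDX N b a m s k r M S V g",
    "com,example)/ 20200101000000 http://example.com/ text/html 200 ABC 123",
    "com,example)/a 20200115000000 http://example.com/a text/html 200 DEF 456",
    "com,example)/b 20210301000000 http://example.com/b text/html 404 GHI 789",
]
dist = year_month_distribution(sample)
```

The actual tool may bucket or validate timestamps differently; this only shows the general idea of grouping captures by the leading digits of the timestamp.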

0.1.1b3

Changes in release version `0.1.1b3` include:

* Add custom user-agent header to HTTP requests in the form of `cdxsummary/{version}`
* Add path and query segment report
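The user-agent format mentioned above could be set roughly as follows with the standard library. This is a minimal sketch, not the tool's actual HTTP code: the `VERSION` constant, the `build_request` helper, and the example URL are all assumptions made for illustration.

```python
import urllib.request

VERSION = "0.1.1b3"  # hypothetical constant; the real tool derives its own version


def build_request(url, version=VERSION):
    """Build a request that carries a `cdxsummary/{version}` User-Agent header."""
    return urllib.request.Request(
        url, headers={"User-Agent": f"cdxsummary/{version}"}
    )


# No network access happens here; the request object is only constructed.
req = build_request("https://example.com/sample.cdx")
```

A descriptive user-agent of this kind lets server operators identify the client and its version in their logs.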

0.1.1b2

Changes in release version `0.1.1b2` include:

* Remove hyperlink highlight/underline in sample report
* Add features list and sample output in README

0.1.1b1

Initial public release version `0.1.1b1` with the following features:

* Summarize local CDX files or remote ones over HTTP
* Handle `gz` and `bz2` compression seamlessly
* Handle CDX data input to STDIN from pipe
* Support Internet Archive Petabox web item summarization
* Seamless authorization to Internet Archive via the `ia` CLI tool
* Human-friendly summary by default, but support summarized or detailed JSON reports
* Self-aware, as the input can be a previously generated JSON report in place of CDX data
* Summary includes:
  * An overview of numbers of captures, consecutive unique URIs, unique hosts, accumulated WARC records size, and the first and last datetimes
  * A grid of media types and status codes and their respective capture counts
  * Top-N (configurable) hosts and their capture counts
  * A random sample of N (configurable) memento URIs for `200 OK` HTML pages
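One common way to handle `gz` and `bz2` input without relying on file extensions is to inspect the leading magic bytes. The sketch below shows that technique on in-memory data. It is an assumption about how such handling can work, not a description of how cdxsummary implements it, and `decompress_if_needed` is a hypothetical helper name.

```python
import bz2
import gzip


def decompress_if_needed(raw: bytes) -> bytes:
    """Return decompressed bytes for gzip or bzip2 input, else the input as-is."""
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(raw)
    if raw[:3] == b"BZh":  # bzip2 magic number
        return bz2.decompress(raw)
    return raw


# Made-up CDX record used to exercise all three paths.
record = b"com,example)/ 20200101000000 http://example.com/ text/html 200 ABC 123\n"
plain = decompress_if_needed(record)
from_gz = decompress_if_needed(gzip.compress(record))
from_bz2 = decompress_if_needed(bz2.compress(record))
```

A real tool processing large CDX files would more likely stream the data (for example through `gzip.open` or `bz2.open`) rather than load it fully into memory.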
