Metadata-parser

Latest version: v0.12.2


0.9.10

* slight internal reordering of the TLD extract support

0.9.9

* inspecting `requests` errors for an attached response and using it if possible
* the parser will now try to validate URLs if the `tldextract` library is present.
this feature can be disabled with a global toggle:

import metadata_parser
metadata_parser.USE_TLDEXTRACT = False

0.9.7

updated the following functions to test for RFC-valid characters in the URL string.
some websites, even BIG PROFESSIONAL ONES, will put HTML in here.
idiots? amateurs? lazy? doesn't matter; they were our problem. well, not anymore.
* get_url_canonical
* get_url_opengraph
* get_metadata_link
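
a minimal sketch, not taken from the project's documentation, of how these getters are typically called; the URL and the "image" field are illustrative assumptions:

import metadata_parser

page = metadata_parser.MetadataParser(url="https://example.com/article")  # illustrative URL
print(page.get_url_canonical())         # canonical URL, now checked for RFC-valid characters
print(page.get_url_opengraph())         # og:url, subject to the same check
print(page.get_metadata_link("image"))  # an illustrative metadata field resolved to a checked link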

0.9.6

this is being held for an update to the `requests` library
* made the following arguments to `MetadataParser.fetch_url()` default to None, which will then fall back to the class setting. they are all passed through to `requests.get` (a sketch of this follows the list)
** `ssl_verify`
** `allow_redirects`
** `requests_timeout`
* removed `force_parse` kwarg from `MetadataParser.parser`
* added `metadata_parser.RedirectDetected` class. if `allow_redirects` is False, a detected redirect will raise this.
* added `metadata_parser.NotParsableRedirect` class. if `allow_redirects` is False, a detected redirect missing a Location header will raise this.
* added `requests_session` argument to `MetadataParser`
* starting to use httpbin for some tests
* detecting JSON documents
* extended `NotParsable` exceptions with the `MetadataParser` instance as `metadataParser`
* added `only_parse_http_ok`, which defaults to True (legacy). passing False will allow non-200 responses to be parsed.
* shuffled `fetch_url` logic around. it will now process more data before a potential error.
* working on support for custom request sessions that can better handle redirects (requires patch or future version of requests)
* caching the peername onto the response object as `_mp_peername` [ _m(etadata)p(arser)_peername ]. this will allow it to be calculated in a redirect session hook. (see tests/sessions.py)
* added `defer_fetch` argument to `MetadataParser.__init__`, default `False`. If `True`, this will replace the instance's `deferred_fetch` method with one that actually fetches the url. this strategy allows the `page` to be defined and the response history to be caught. under this situation, a 301 redirecting to a 500 can be observed; in previous versions only the 500 would be caught. (a sketch of this follows the list)
* starting to encapsulate everything into a "parsed result" class
* fixed opengraph minimum check
* added `MetadataParser.is_redirect_unique`
* added `DummyResponse.history`
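
as referenced in the list above, a minimal sketch of the pass-through request arguments and the new redirect exceptions; it assumes an illustrative URL that answers with a redirect:

import metadata_parser

try:
    page = metadata_parser.MetadataParser(
        url="https://example.com/old-path",  # illustrative URL
        allow_redirects=False,               # passed through to requests.get
        ssl_verify=True,                     # passed through to requests.get
        requests_timeout=5,                  # passed through to requests.get
    )
except metadata_parser.RedirectDetected:
    # allow_redirects is False and the server answered with a redirect
    print("redirect detected; not following it")
except metadata_parser.NotParsableRedirect:
    # the redirect response had no Location header
    print("redirect without a Location header")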
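
also referenced above, a minimal sketch of the `defer_fetch` strategy; the URL is an illustrative assumption of a 301 that leads to a 500:

import metadata_parser

page = metadata_parser.MetadataParser(
    url="https://example.com/moved",  # illustrative URL: a 301 that redirects to a 500
    defer_fetch=True,                 # do not fetch during __init__
)
try:
    # `page` already exists, so anything gathered before a failure is not lost
    page.deferred_fetch()
except metadata_parser.NotParsable:
    # the 500 still raises, but the 301 in the response history can now be observed
    pass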

0.9.5

* if a document fails to load into BeautifulSoup, the BeautifulSoup error is now caught and `NotParsable` is raised

0.9.4

* created `MetadataParser.get_url_canonical`
* created `MetadataParser.get_url_opengraph`
* `MetadataParser.get_discrete_url` now calls `get_url_canonical` and `get_url_opengraph`
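
a minimal sketch of the relationship described above; the URL is illustrative:

import metadata_parser

page = metadata_parser.MetadataParser(url="https://example.com/post")  # illustrative URL
canonical = page.get_url_canonical()
og_url = page.get_url_opengraph()
# get_discrete_url() now consults both of the helpers above when picking a URL
print(canonical, og_url, page.get_discrete_url())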
