* requests_toolbelt is now required
** this is to solve PR16 / Issue21
** the toolbelt and built-in versions of get_encodings_from_content required different workarounds
* the output of urlparse is now cached onto the parser instance.
** perhaps this will be a global cache in the future
* MetadataParser now accepts `cached_urlparser`
** default: True
** options:
*** True: use an instance of UrlParserCacheable(maxitems=30)
*** INT: use an instance of UrlParserCacheable(maxitems=cached_urlparser)
*** None/False/0: use native urlparse
*** other truthy values: use as a custom urlparse
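A minimal sketch of how the `cached_urlparser` option values could map onto a urlparse callable. This is not the package's implementation: `SketchUrlParserCache` and `resolve_urlparser` are hypothetical names, and the least-recently-used eviction policy is an assumption, since the entry only states that `maxitems` bounds the cache.

```python
from collections import OrderedDict
from urllib.parse import urlparse


class SketchUrlParserCache:
    """Hypothetical stand-in for a cacheable urlparse; not the package's class."""

    def __init__(self, maxitems=30):
        self.maxitems = maxitems
        self.cache = OrderedDict()

    def urlparse(self, url):
        if url in self.cache:
            self.cache.move_to_end(url)  # mark as most recently used
            return self.cache[url]
        parsed = urlparse(url)
        self.cache[url] = parsed
        if len(self.cache) > self.maxitems:
            # Assumption: evict the least recently used entry when full.
            self.cache.popitem(last=False)
        return parsed


def resolve_urlparser(cached_urlparser=True):
    """Map the documented option values onto a urlparse callable."""
    if not cached_urlparser:  # None, False, 0
        return urlparse
    if cached_urlparser is True:  # checked before int, because bool is an int
        return SketchUrlParserCache(maxitems=30).urlparse
    if isinstance(cached_urlparser, int):
        return SketchUrlParserCache(maxitems=cached_urlparser).urlparse
    return cached_urlparser  # any other truthy value is used as a custom urlparse
```

The `is True` check comes before the `int` check because `True` is itself an `int` in Python and would otherwise be read as `maxitems=1`.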
* addressing issue 17 (https://github.com/jvanasco/metadata_parser/issues/17) where `get_link_` logic does not handle schemeless urls.
** `MetadataParser.get_metadata_link` will now try to upgrade schemeless links (e.g. urls that start with "//")
** `MetadataParser.get_metadata_link` will now check values against `FIELDS_REQUIRE_HTTPS` in certain situations to see if the value is valid for http
** `MetadataParser.schemeless_fields_upgradeable` is a tuple of the fields which can be upgraded. This defaults to a package definition, but can be changed on a per-parser basis.
The defaults are:
'image',
'og:image', 'og:image:url', 'og:audio', 'og:video',
'og:image:secure_url', 'og:audio:secure_url', 'og:video:secure_url',
** `MetadataParser.schemeless_fields_disallow` is a tuple of the fields which can not be upgraded. This defaults to a package definition, but can be changed on a per-parser basis.
The defaults are:
'canonical',
'og:url',
** `MetadataParser.get_url_scheme()` is a new method to expose the scheme of the active url
** `MetadataParser.upgrade_schemeless_url()` is a new method to upgrade schemeless links
It accepts two arguments: `url` and `field` (optional).
If present, the field is checked against the package tuple `FIELDS_REQUIRE_HTTPS` to see if the value is valid for http. That tuple is:
'og:image:secure_url',
'og:audio:secure_url',
'og:video:secure_url',
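
A rough sketch of the upgrade logic described above, under stated assumptions. `upgrade_schemeless_url_sketch` and its `page_url` argument are hypothetical and are not the package's API; the real method lives on the parser and takes only `url` and `field`. The sketch also assumes that when a field requires https and the page scheme is http, the value is left untouched; the entry does not say what the package does in that case.

```python
from urllib.parse import urlparse

# Values listed in the changelog entry above.
FIELDS_REQUIRE_HTTPS = (
    "og:image:secure_url",
    "og:audio:secure_url",
    "og:video:secure_url",
)


def upgrade_schemeless_url_sketch(url, page_url, field=None):
    """Hypothetical helper: prefix a schemeless url with the scheme of page_url."""
    if not url.startswith("//"):
        return url  # absolute and relative urls are left alone
    scheme = urlparse(page_url).scheme
    if not scheme:
        return url  # no scheme available to upgrade with
    if field in FIELDS_REQUIRE_HTTPS and scheme != "https":
        # Assumption: an http scheme is not valid for a secure_url field,
        # so the value is returned unchanged rather than upgraded to http.
        return url
    return "%s:%s" % (scheme, url)
```

For example, `upgrade_schemeless_url_sketch("//cdn.example.com/a.png", "https://example.com/")` yields `"https://cdn.example.com/a.png"`, while the same value for `og:image:secure_url` on an http page is returned as-is.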