PyPI: Scrapy

CVE-2021-41125

Safety vulnerability ID: 42057

This vulnerability was reviewed by experts

The information on this page was manually curated by our Cybersecurity Intelligence Team.

Created at Oct 06, 2021 Updated at May 14, 2024

Advisory

Scrapy versions 1.8.1 and 2.5.1 include a fix for CVE-2021-41125: If you use "HttpAuthMiddleware" (i.e. the "http_user" and "http_pass" spider attributes) for HTTP authentication, every request exposes your credentials to the request target. This includes requests generated by Scrapy components, such as "robots.txt" requests sent by Scrapy when the "ROBOTSTXT_OBEY" setting is set to "True", and requests reached through redirects. You are advised to upgrade and use the new "http_auth_domain" spider attribute to control which domains are allowed to receive the configured HTTP authentication credentials. If you cannot upgrade to a secure version, set your HTTP authentication credentials on a per-request basis instead of defining them globally through "HttpAuthMiddleware"; for example, use the "w3lib.http.basic_auth_header" function to convert your credentials into a value that you can assign to the "Authorization" header of each request.
https://github.com/scrapy/scrapy/security/advisories/GHSA-jwqp-28gf-p498
http://doc.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.httpauth
https://github.com/scrapy/scrapy/commit/b01d69a1bf48060daec8f751368622352d8b85a6
https://w3lib.readthedocs.io/en/latest/w3lib.html#w3lib.http.basic_auth_header
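
The effect of restricting credentials to one domain can be illustrated with a small standard-library sketch. It is not Scrapy's implementation: "should_send_credentials" is a hypothetical helper, the host names are placeholders, and it assumes that subdomains of the allowed domain are also meant to receive the credentials.

```python
from urllib.parse import urlparse


def should_send_credentials(url: str, auth_domain: str) -> bool:
    """Return True if `url` targets `auth_domain` or one of its subdomains.

    Hypothetical helper for illustration only; not part of Scrapy.
    """
    host = (urlparse(url).hostname or "").lower()
    domain = auth_domain.lower()
    # Require an exact match or a dot-separated suffix match, so that
    # "evilintranet.example.com" does not pass for "intranet.example.com".
    return host == domain or host.endswith("." + domain)


# Example value standing in for the "http_auth_domain" spider attribute.
http_auth_domain = "intranet.example.com"

for url in (
    "https://intranet.example.com/reports",     # the site being crawled
    "https://intranet.example.com/robots.txt",  # component-generated request
    "https://other.example.org/landing",        # e.g. a redirect target
):
    print(url, should_send_credentials(url, http_auth_domain))
```

With a check of this kind in place, only the first two URLs would carry the "Authorization" header; the request to the unrelated host would be sent without it.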

Affected package

scrapy

Latest version: 2.11.2

A high-level Web Crawling and Web Scraping framework

Affected versions

Fixed versions

1.8.1, 2.5.1

Vulnerability changelog

Scrapy is a high-level web crawling and scraping framework for Python. If you use `HttpAuthMiddleware` (i.e. the `http_user` and `http_pass` spider attributes) for HTTP authentication, every request exposes your credentials to the request target. This includes requests generated by Scrapy components, such as `robots.txt` requests sent by Scrapy when the `ROBOTSTXT_OBEY` setting is set to `True`, and requests reached through redirects. Upgrade to Scrapy 2.5.1 and use the new `http_auth_domain` spider attribute to control which domains are allowed to receive the configured HTTP authentication credentials. If you are using Scrapy 1.8 or a lower version, and upgrading to Scrapy 2.5.1 is not an option, you may upgrade to Scrapy 1.8.1 instead. If you cannot upgrade, set your HTTP authentication credentials on a per-request basis instead of defining them globally using `HttpAuthMiddleware`; for example, use the `w3lib.http.basic_auth_header` function to convert your credentials into a value that you can assign to the `Authorization` header of your request. See CVE-2021-41125.


CONFIRM: https://github.com/scrapy/scrapy/security/advisories/GHSA-jwqp-28gf-p498
MISC: http://doc.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.httpauth
MISC: https://github.com/scrapy/scrapy/commit/b01d69a1bf48060daec8f751368622352d8b85a6
MISC: https://w3lib.readthedocs.io/en/latest/w3lib.html#w3lib.http.basic_auth_header
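
For the per-request workaround, the sketch below shows roughly what an HTTP Basic `Authorization` value looks like, built with the standard library only. `basic_auth_value` is a hypothetical helper, not the `w3lib.http.basic_auth_header` function itself (which serves the same purpose), and the ISO-8859-1 default encoding is an assumption made for this example.

```python
import base64


def basic_auth_value(username: str, password: str,
                     encoding: str = "ISO-8859-1") -> bytes:
    """Build an HTTP Basic `Authorization` header value.

    Hypothetical stand-in for illustration; in a Scrapy project the
    advisory suggests `w3lib.http.basic_auth_header` for this step.
    """
    credentials = f"{username}:{password}".encode(encoding)
    return b"Basic " + base64.b64encode(credentials)


# Attach the header only to the requests that actually need it,
# instead of setting http_user / http_pass on the spider.
headers = {"Authorization": basic_auth_value("someuser", "somepass")}

# In a spider callback this dict would be passed per request, e.g.:
#   yield scrapy.Request(url, headers=headers, callback=self.parse)
print(headers["Authorization"])
```

Because the header is set explicitly on each request, requests that Scrapy generates on its own (such as the `robots.txt` fetch) do not inherit it.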

Severity Details

CVSS Base Score

MEDIUM 6.5

CVSS v3 Details

MEDIUM 6.5
Attack Vector (AV): NETWORK
Attack Complexity (AC): LOW
Privileges Required (PR): LOW
User Interaction (UI): NONE
Scope (S): UNCHANGED
Confidentiality Impact (C): HIGH
Integrity Impact (I): NONE
Availability Impact (A): NONE

CVSS v2 Details

MEDIUM 4.0
Access Vector (AV): NETWORK
Access Complexity (AC): LOW
Authentication (Au): SINGLE
Confidentiality Impact (C): PARTIAL
Integrity Impact (I): NONE
Availability Impact (A): NONE