New features
Location scraper
Ability to scrape Instagram Location pages.
Sample usage
python
from instascrape import Location
url = "https://www.instagram.com/explore/locations/212988663/new-york-new-york/"
new_york = Location(url)
new_york.scrape()
print(f"{new_york.amount_of_posts:,} people have been to New York")
>>> 61,202,403 people have been to New York
Optional header for requests
All scraper objects now accept an optional browser header in their `scrape` method. The syntax is exactly the same as the headers `dict` you would pass to `requests.get`.
The default header is
python
headers={"User-Agent": "user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Mobile Safari/537.36 Edg/87.0.664.57"}
Sample usage is
python
from instascrape import Profile
headers={"User-Agent": "user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Mobile Safari/537.36 Edg/87.0.664.57"}
google = Profile("google")
google.scrape(headers=headers)
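The fallback between the default header and a caller-supplied one can be pictured roughly as follows. This is a minimal sketch, not the library's actual code: `DEFAULT_HEADERS` and `resolve_headers` are hypothetical names used only for illustration, and it assumes that headers passed by the caller replace the default entirely rather than being merged with it.

```python
# Hypothetical module-level default, copied from the default header shown above.
DEFAULT_HEADERS = {
    "User-Agent": "user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 "
    "Mobile Safari/537.36 Edg/87.0.664.57"
}


def resolve_headers(headers=None):
    """Return the headers to send with a request (hypothetical helper).

    Falls back to a copy of DEFAULT_HEADERS when the caller passes nothing;
    otherwise returns a copy of the caller's dict unchanged.
    """
    if headers is None:
        # Copy so callers cannot mutate the shared default by accident.
        return dict(DEFAULT_HEADERS)
    return dict(headers)


# No argument: the default browser header is used.
default_choice = resolve_headers()

# Custom dict: it is used as-is, in the same shape `requests.get` expects.
custom_choice = resolve_headers({"User-Agent": "my-agent"})
```

Returning a copy in both branches keeps the shared default from being modified by later code that edits the returned dict.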
Fixes
It appears Instagram tightened restrictions overnight: all GET requests from the library were being returned 429 HTTP response status codes (Too Many Requests). Until now, `instascrape` neither passed browser headers nor supported passing them. The new default header and the option to pass in your own seem to have restored library functionality for now. Keep an eye out for more robust session handling and better cookie support in later updates.
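Until sturdier session handling lands, callers who still see 429 responses might wrap their own scraping calls in a simple retry loop. The sketch below is not part of `instascrape`: `scrape_with_backoff` is a hypothetical helper, and it assumes you supply a zero-argument callable that performs one request and returns the HTTP status code.

```python
import time


def scrape_with_backoff(scrape, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry a request while it keeps returning 429 (hypothetical helper).

    `scrape` is any zero-argument callable that returns an HTTP status code.
    The wait doubles after each 429: base_delay, 2 * base_delay, and so on.
    Returns the last status code seen.
    """
    status = None
    for attempt in range(max_attempts):
        status = scrape()
        if status != 429:
            # Anything other than "Too Many Requests" is handed back as-is.
            return status
        if attempt < max_attempts - 1:
            # Wait before the next try; no wait after the final attempt.
            sleep(base_delay * 2 ** attempt)
    return status
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting, for example by passing `sleep=delays.append` with a plain list.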