Checklists-scrapers

Latest version: v0.2.3

Safety actively analyzes 681866 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 1 of 3

0.2.3

-------------

* Added request to WorldBirds parser to select 'English' as the language
to display pages in. The account preferences do have a option to select
language but this is not apparently used.

* Fixed the extraction of checklist attributes, location, protocol, etc. to
use the labels from the table where the values are displayed rather than
relying on fixed row indices since the size of the table changes (more
than it used to?).

0.2.2

-------------

* Fixed error when parsing lists of observers from the checklists web
page on eBird.

0.2.1

-------------

* Locked down version numbers for packages.

0.2

-----------

* Cleaned up organisation of docs source files.

* Removed CHECKLISTS_ prefix from settings and package constants.

0.1.1

-------------

* Updated the ebird scraper so the breakdown of a count by age and sex
extracted from the checklist web page has an identifier with the form
DET<id> where <id> is a two-digit integer created from the index of
the cell in the table.

* Checklists are now saved in JSON format with an indent of 4 to make them
easier to read (and debug).

0.1

-----------

* Renamed the constant used to identify the version number of the downloaded
checklists from CHECKLIST_FILE_FORMAT_VERSION to CHECKLISTS_DOWNLOAD_FORMAT.
Similarly renamed CHECKLIST_FILE_LANGUAGE to CHECKLISTS_DOWNLOAD_LANGUAGE.
* Renamed the project and the main module to checklists_scrapers. The
project is part of a group designed to work with the django-checklists
database and the association is clearer with the change in name.
* Changed the licence from GPLv3 to BSD so it is compatible with the other
projects in the checklists "group".
* Changes the prefix used on the environment variables from "CHECKLISTING"
to "CHECKLISTS" to minimise the number that must be set when used the
scrapers with the django-checklists app.
* Added scripts to run the site tests for each scraper. This makes it
easier to pass in the parameters for the site to be scraped rather than
defining them in the settings.
* Configuring the scrapers is now done through environment variables rather
than a local settings file. This makes it much easier to install and run
the scrapers, particularly from the most common use-case of calling them
from cron or a similar scheduler.
* Removed the scrapy project file since the settings file will typically be
set using an environment variable.
* Fixed the version of scrapy to use, to 0.16.4 for now, since there are
(distracting), deprecated calls when using the latest version (0.20.2).
* Updated the logging so records found only on the web page for an eBird
checklist are only reported as warnings when the logging level is set to
DEBUG. This avoids any confusion in production since the warnings are only
used to debug issues with the eBird API.
* The status report is now written to the directory where checklists are
downloaded when the logging level is set to 'DEBUG'.
* Fixed bugs caused by unexpected missing fields when downloading checklists
from eBird.

Page 1 of 3

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.