Shub-workflow

Latest version: v1.14.13.4


1.14
--------------------------------

Backward incompatibility issues:

- Moved filesystem utils (S3 and GCS utils) from the deliver folder into the utils folder

Minor version changes:

- 1.14.0 generator crawl manager: acquire all jobs if the crawl manager doesn't have a flow id.
Added `get_canonical_spidername()` and `get_project_running_spiders()` helper methods on the base script.
Added the `fshelper` base script attribute for ready access to this helper tool.
- 1.14.1 Added a method for getting live, real-time settings from ScrapyCloud
- 1.14.2 Added a BaseMonitor class that can monitor aggregated stats across all jobs in a workflow.
- 1.14.3 Mixin providing Sentry alert capabilities to the monitor.
- 1.14.4 Monitor ratios
- 1.14.5 Finished job metadata hook
- 1.14.6 Allow using a project entry keyword from scrapinghub.yml as an alternative to the numeric project id when passing the --project-id command line option.
- 1.14.7 Mixin providing Slack alert capabilities to the monitor.
- 1.14.8 Allow loading settings from ScrapyCloud when running a script in a local environment.
- 1.14.9 Added stats aggregation capabilities to crawlmanager
- 1.14.10 AlertSender class that combines Slack and Sentry alerts (see the sketch after this list).
- 1.14.11 Created SlackSender class for easier reuse of Slack messaging
- 1.14.12 SlackSender: allow sending attachments
- 1.14.13 Extended the monitor to generate reports
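
The monitor and alerting entries above (1.14.2 through 1.14.13) follow a common pattern: aggregate stats across the jobs in a workflow and forward threshold violations to one or more alert channels. The sketch below illustrates that pattern in a standalone way; the SlackSender and AlertSender names mirror the changelog entries, but the signatures and internals are assumptions, not shub-workflow's actual API.

```python
# Illustrative sketch only: shows the aggregated-stats + combined Slack/Sentry
# alerting pattern described in 1.14.2-1.14.10. Class names and method
# signatures here are assumptions, not shub-workflow's actual API.
import sentry_sdk
from slack_sdk import WebClient


class SlackSender:
    """Minimal Slack messenger (hypothetical stand-in for the changelog's SlackSender)."""

    def __init__(self, token: str, channel: str):
        self.client = WebClient(token=token)
        self.channel = channel

    def send(self, text: str) -> None:
        self.client.chat_postMessage(channel=self.channel, text=text)


class AlertSender:
    """Sends the same alert to both Slack and Sentry, in the spirit of 1.14.10."""

    def __init__(self, slack: SlackSender, sentry_dsn: str):
        self.slack = slack
        sentry_sdk.init(dsn=sentry_dsn)

    def alert(self, message: str) -> None:
        self.slack.send(message)
        sentry_sdk.capture_message(message, level="error")


def check_workflow_stats(aggregated_stats: dict, sender: AlertSender) -> None:
    # Example monitor rule: alert when the aggregated error ratio across all
    # jobs in the workflow exceeds 5% (a ratio check, as in 1.14.4).
    errors = aggregated_stats.get("log_count/ERROR", 0)
    items = aggregated_stats.get("item_scraped_count", 0) or 1
    if errors / items > 0.05:
        sender.alert(f"Workflow error ratio too high: {errors}/{items}")
```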

1.13
---------------------------------------

Backward incompatibility issues:

- Some changes to the delivery script interface

Minor version changes:

- 1.13.0 new configuration watchdog script
- 1.13.1 generator crawlmanager: added a method for determining retry parameters when a job is retried from the default `bad_outcome_hook()` method (see the sketch after this list)
- 1.13.2 generator crawlmanager: additional features for handling multiple spiders
- 1.13.3 configurable retry logging via environment variables
- 1.13.4 base script: some handlers for scheduling methods
- 1.13.5 additions in GCS utils
- 1.13.6 additions in GCS utils
- 1.13.7 base script: print stats on close
- 1.13.8 avoid multiple warnings from the `kumo_settings()` function; additions in GCS utils
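
Entries 1.13.1 and 1.13.3 describe retry decisions made in a hook and retry logging configured through environment variables. Below is a standalone sketch of that shape; the hook name comes from the changelog, but its signature, the environment variable names, and the returned retry parameters are assumptions made for illustration.

```python
# Illustrative sketch of an environment-driven retry decision (1.13.1/1.13.3).
# The hook name comes from the changelog; everything else is an assumption,
# not shub-workflow's actual interface.
import logging
import os
from typing import Optional

LOGGER = logging.getLogger(__name__)
# Configurable retry logging via an environment variable (variable name is
# hypothetical, used only for this example).
LOGGER.setLevel(os.environ.get("RETRY_LOG_LEVEL", "INFO"))

MAX_RETRIES = int(os.environ.get("MAX_RETRIES", "3"))


def bad_outcome_hook(spider: str, outcome: str, retries_so_far: int) -> Optional[dict]:
    """Return retry parameters for a failed job, or None to give up."""
    if retries_so_far >= MAX_RETRIES:
        LOGGER.warning("Not retrying %s (outcome=%s, retries=%d)", spider, outcome, retries_so_far)
        return None
    LOGGER.info("Retrying %s after outcome %s", spider, outcome)
    # Hypothetical retry parameters: lower the priority and tag the retried job.
    return {"priority": 1, "tags": [f"retry_{retries_so_far + 1}"]}
```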

1.12
---------------------------------------

Backward incompatibility issues:

- make installation of S3 and GCS dependencies optional (with shub-workflow[with-s3-tools] and shub-workflow[with-gcs-tools])

Minor version changes:

- 1.12.1 more typing additions and improvements, and related refactoring
- 1.12.2 generator crawlmanager: some methods for conditioning scheduling of new spider jobs
- 1.12.3 reimplementation of the `upload_file` S3 util using boto3, for improved performance (see the sketch after this list)
- 1.12.4 improvements in the job cloner class
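
For 1.12.3, the changelog says the S3 `upload_file` util was reimplemented on top of boto3. A minimal sketch of what a boto3-based upload looks like (an illustration of boto3 usage, not the library's actual helper):

```python
# Minimal sketch of a boto3-based upload, along the lines of 1.12.3. This
# illustrates boto3 usage only; it is not shub-workflow's actual helper.
import boto3


def upload_file(local_path: str, bucket: str, key: str) -> None:
    """Upload a local file to s3://<bucket>/<key> using boto3's managed transfer."""
    s3 = boto3.client("s3")
    # upload_file handles multipart uploads and retries for large files.
    s3.upload_file(local_path, bucket, key)


# upload_file("items.jl.gz", "my-bucket", "deliveries/items.jl.gz")
```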

1.11
-----------------------------------

Backward incompatibility issues:

- Permanently removed the old legacy delivery class
- Dupefilter classes moved from the delivery folder into utils


Minor version changes:

- 1.11.1 replaced the `bloom_filter` dependency with `bloom_filter2`
- 1.11.2 typing improvements
- 1.11.3 method for removing job tags
- 1.11.4 method for reading the log level from kumo and using it in the script `main()` function.
- 1.11.5 added methods for working with Google Cloud Storage, with the same interface as the existing ones for AWS S3 (see the sketch after this list)
- 1.11.6 generator crawlmanager: method for computing max jobs per spider
- 1.11.7 GCS additions
- 1.11.8 typing hint updates
- 1.11.9 added support for python 3.11
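
For 1.11.5, the GCS helpers are described as mirroring the existing S3 interface. A minimal sketch, assuming the google-cloud-storage client and an S3-like `upload_file(local_path, bucket, key)` signature (the signature is an assumption, not the library's documented API):

```python
# Illustrative sketch of a GCS upload helper with an S3-like interface, in the
# spirit of 1.11.5. Not shub-workflow's actual implementation.
from google.cloud import storage


def upload_file(local_path: str, bucket_name: str, blob_name: str) -> None:
    """Upload a local file to gs://<bucket_name>/<blob_name>."""
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    bucket.blob(blob_name).upload_from_filename(local_path)


# upload_file("items.jl.gz", "my-bucket", "deliveries/items.jl.gz")
```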

1.10
--------------------------------------

Backward incompatibility issues:

- Backward incompatibilities may come from the massive introduction of typing hints and fixes to type consistency, especially when you override some methods. Many of the incompatibilities you may find can be spotted in advance by running mypy on your project and using typing abundantly in the classes that use shub-workflow; see the sketch below.
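
A generic example of the kind of mismatch mypy reports once a base class gains type hints; the class and method below are made up for illustration and are not shub-workflow classes:

```python
# Running `mypy` on this file reports that the override is incompatible with
# the supertype, which is the kind of issue the 1.10 typing work surfaces.
from typing import List


class Base:
    def get_job_tags(self, jobkey: str) -> List[str]:
        return []


class Mine(Base):
    # mypy: Argument 1 of "get_job_tags" is incompatible with supertype "Base"
    def get_job_tags(self, jobkey: int) -> List[str]:
        return []
```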

Minor version changes:

- 1.10.1 continuation of the massive adoption of typing hints
- 1.10.2 continuation of the massive adoption of typing hints
- 1.10.3 new BaseLoopScript class
- 1.10.4 `script_args()` context manager
- 1.10.5 some refactoring and improvements in the new delivery script.
- 1.10.6 crawl manager new async schedule mode
- 1.10.7 crawl manager extension of async scheduling
- 1.10.8 max running time for all scripts (see the sketch after this list)
- 1.10.9 performance improvements of resume feature
- 1.10.10 new method for async tagging of jobs
- 1.10.11 async tagging on delivery script
- 1.10.12 introduction of script stats
- 1.10.13 crawlmanager stats, delivery stats, AWS email utils.
- 1.10.14 graph manager ability to resume
- 1.10.15 graph manager `bad_outcome_hook()`
- 1.10.16 max running time feature on delivery script
- 1.10.17 spider loader object in base script
- 1.10.18 default implementation of `bad_outcome_hook()` on generator crawl manager
- 1.10.19 some refactoring of base script.
- 1.10.20 base script: resolve project id before parsing options
- 1.10.21 minor improvement in base script spider loader object
- 1.10.22 crawlmanager `finished_ok_hook()` method
- 1.10.23 generator crawlmanager that handles multiple spiders in the same process (not a good experiment; will probably be deprecated in the future)
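
Entries 1.10.3, 1.10.8 and 1.10.12 revolve around a looping script with a bounded running time and script stats. A standalone sketch of that pattern follows; the class and attribute names are illustrative and are not shub-workflow's BaseLoopScript.

```python
# Illustrative loop-with-max-running-time pattern (1.10.3, 1.10.8, 1.10.12).
# Names and structure are assumptions, not shub-workflow's actual classes.
import time
from typing import Dict


class LoopScript:
    loop_wait_seconds = 60

    def __init__(self, max_running_time: float):
        self.max_running_time = max_running_time
        self.start_time = time.time()
        self.stats: Dict[str, int] = {}  # script stats, in the spirit of 1.10.12

    def workflow_loop(self) -> bool:
        """Do one unit of work; return False when there is nothing left to do."""
        self.stats["loops"] = self.stats.get("loops", 0) + 1
        return False

    def run(self) -> None:
        while self.workflow_loop():
            if time.time() - self.start_time > self.max_running_time:
                break  # stop gracefully so a later run can resume the work
            time.sleep(self.loop_wait_seconds)
        print(self.stats)
```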

1.9
------------------------------------

Backward incompatibility issues:

- Python versions older than 3.8 are no longer supported
- The name attribute is now required for every subclass of WorkFlowManager (either as a hardcoded attribute or passed as a required command line argument); see the sketch after this list
- The Delivery script has been refactored. The old code is deprecated. It will be removed in future versions.
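
A minimal sketch of the hardcoded-name option, assuming WorkFlowManager is importable from shub_workflow.base and exposes a workflow_loop() hook (both assumptions based on the changelog, not on documented usage):

```python
# Sketch only: the import path and the workflow_loop() hook are assumptions;
# the changelog only states that a name attribute is now required on every
# WorkFlowManager subclass (or must be passed as a command line argument).
from shub_workflow.base import WorkFlowManager  # assumed import path


class ArticlesWorkflowManager(WorkFlowManager):
    name = "articles-workflow"  # required since 1.9

    def workflow_loop(self) -> bool:
        # One iteration of the manager's loop; return False to finish.
        return False
```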

Minor version changes:

- 1.9.0 crawlmanager `bad_outcome_hook()`
- 1.9.1 crawlmanager ability to resume
- 1.9.2 implicit crawlmanager resumability via flow id (previously a command line option was required)
- 1.9.3 performance improvements
- 1.9.4 name attribute logic fixes
- 1.9.5 delivery script refactor, old delivery code deprecated.
- 1.9.6 various fixes and improvements in the base script; added `get_jobs_with_tags()` method.
- 1.9.7 refactors and a new S3 helper method
