Advertools

Latest version: v0.16.4


0.15.0

-------------------

* Added
- Enable supplying request headers in ``sitemap_to_df``, contributed by `joejoinerr <https://github.com/joejoinerr>`_
- New function ``crawlytics.compare`` for comparing two crawls.
- New function ``crawlytics.running_crawls`` for getting data on currently running crawl jobs (\*NIX only for now).
- New parameter ``date_format`` to ``logs_to_df`` for custom date formats.

* Changed
- Removed the deprecated ``relatedSite`` parameter from ``serp_goog``.
- Update the emoji regex and related functionality to Emoji v15.1.

* Fixed
- Use int64 instead of int for YouTube count columns, contributed by `DanielP77 <https://github.com/DanielP77>`_
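
The ``date_format`` entry above concerns parsing log timestamps that do not follow the default layout. The sketch below is not advertools code. It only illustrates, using the standard library, what a strftime-style format string for a common-log-style timestamp could look like. The sample timestamp and the format string are assumptions made for illustration, as is the idea that ``date_format`` accepts strftime-style directives.

```python
from datetime import datetime

# Hypothetical timestamp in the style used by many web server access logs.
raw_timestamp = "13/Jul/2024:08:15:42 +0000"

# Assumed strftime-style format string matching the timestamp above.
date_format = "%d/%b/%Y:%H:%M:%S %z"

# Parse the string into a timezone-aware datetime object.
parsed = datetime.strptime(raw_timestamp, date_format)
print(parsed.isoformat())
```

A format string of this kind is what one would presumably pass as ``date_format`` when the timestamps in a log file use a non-default layout.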

0.14.4

-------------------

* Fixed
- Use ``pd.NA`` instead of ``np.nan`` for empty values in ``url_to_df``.

0.14.3

-------------------

* Changed
- Use a different XPath expression for ``body_text`` while crawling.

0.14.2

-------------------

* Changed
- Allow ``sitemap_to_df`` to work on offline sitemaps.

0.14.1

-------------------

* Fixed
- Preserve the order of supplied URLs in the output of ``url_to_df``.

0.14.0

-------------------

* Added
- New module ``crawlytics`` for analyzing crawl DataFrames. It includes analysis
functions (``images``, ``redirects``, and ``links``) and helpers for handling
large files (``jl_to_parquet``, ``jl_subset``, ``parquet_columns``).
- New ``encoding`` option for ``logs_to_df``.
- Option to save the output of ``url_to_df`` to a parquet file.

* Changed
- Existing log output and error files no longer need to be deleted beforehand.
The function now overwrites them if they exist.
- Autothrottling is enabled by default in ``crawl_headers`` to reduce the chance of being blocked.

* Fixed
- Always get absolute path for img src while crawling.
- Handle NA src attributes when extracting images.
- Replace ``fillna(method="ffill")`` with ``ffill()`` in ``url_to_df``.
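
The last fix above replaces one pandas forward-fill spelling with another. As a minimal sketch of the replacement call, the example below forward-fills missing values with ``ffill()``. The column name and values are hypothetical and are not taken from the output of ``url_to_df``.

```python
import pandas as pd

# Hypothetical column with gaps, standing in for any column that needs forward-filling.
df = pd.DataFrame({"dir_1": ["blog", None, "shop", None]})

# ffill() propagates the last valid value forward over missing entries.
filled = df["dir_1"].ffill()
print(filled.tolist())
```

Each missing entry takes the most recent non-missing value above it, which is the same behaviour the older ``fillna(method="ffill")`` spelling provided.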
