Advertools

Latest version: v0.16.1

Page 6 of 8

0.8.0

------------------

* Added
- New module `youtube` connecting to all GET requests in the API
- `extract_numbers` new function
- `emoji_search` new function
- `emoji_df` new variable containing all emoji as a DataFrame

* Changed
- Emoji database updated to v13.0
- `serp_goog` with expanded `pagemap` and metadata

* Fixed
- `serp_goog` errors, some parameters not appearing in result df
- `extract_numbers` issue when providing a dash as a separator in the middle

0.7.3

------------------

* Added
- New function `extract_exclamations`, very similar to `extract_questions`
- New function `extract_urls`, also counts top domains and top TLDs
- New keys to `extract_emoji`: `top_emoji_categories` & `top_emoji_sub_categories`
- Groups and sub-groups to `emoji db`

0.7.2

------------------

* Changed
- Emoji regex updated
- Simpler extraction of Spanish `questions`

0.7.1

------------------

* Fixed
- Missing `__init__` imports.

0.7.0

------------------

* Added
- New `extract_` functions:

* Generic `extract`, used by all others, which takes an arbitrary regex to extract text.
* `extract_questions` to get question mark statistics, as well as the text of the questions asked.
* `extract_currency` shows text that has currency symbols in it, as well as the surrounding text.
* `extract_intense_words` gets statistics about, and extracts, words with any character repeated three or more times, indicating an intense feeling (positive or negative).

- New function `word_tokenize`:

* Used by `word_frequency` to get tokens of 1-, 2-, or 3-word phrases (or more).
* Splits a list of texts into tokens of a specified number of words each.

- New stop-words from the `spaCy` package:

**current:** Arabic, Azerbaijani, Danish, Dutch, English, Finnish,
French, German, Greek, Hungarian, Italian, Kazakh, Nepali, Norwegian,
Portuguese, Romanian, Russian, Spanish, Swedish, Turkish.

**new:** Bengali, Catalan, Chinese, Croatian, Hebrew, Hindi, Indonesian,
Irish, Japanese, Persian, Polish, Sinhala, Tagalog, Tamil, Tatar, Telugu,
Thai, Ukrainian, Urdu, Vietnamese
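
The phrase splitting described for `word_tokenize` above can be sketched in plain Python. This is only an illustration and not the library's implementation: the helper name `tokenize_phrases`, the whitespace-based splitting, the lowercasing, and the choice of overlapping phrases are all assumptions made for the example.

```python
def tokenize_phrases(text_list, phrase_len=2):
    """Split each text into overlapping phrases of `phrase_len` words.

    Hypothetical helper for illustration only; the real `word_tokenize`
    may handle case and punctuation differently.
    """
    result = []
    for text in text_list:
        # Assumption: lowercase and split on whitespace.
        words = text.lower().split()
        # Slide a window of `phrase_len` words across the text.
        phrases = [
            " ".join(words[i:i + phrase_len])
            for i in range(len(words) - phrase_len + 1)
        ]
        result.append(phrases)
    return result


print(tokenize_phrases(["buy blue shoes online"], phrase_len=2))
# [['buy blue', 'blue shoes', 'shoes online']]
```

A text with fewer words than `phrase_len` yields an empty list in this sketch.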

* Changed
- `word_frequency` takes new parameters:
* `regex` defaults to words, but can be changed to anything, for example '\S+' to split on whitespace and keep punctuation.

* `sep` is no longer used as an option; the `regex` parameter above can be used instead.

* `num_list` is now optional, and defaults to counts of 1 each if not provided. Useful for counting `abs_freq` only, when no numeric data is available.

* `phrase_len` is the number of words in each split token. Defaults to 1 and can be set to 2 or higher. This helps in analyzing phrases as opposed to words.

- Parameters supplied to `serp_goog` appear at the beginning of the result df
- `serp_youtube` now contains `nextPageToken` to make paginating requests easier
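
To make the `word_frequency` parameters above concrete, here is a minimal sketch of how `regex`, `num_list`, and `phrase_len` could interact. It is not the library's code: the function name `count_phrases`, the default pattern `\w+`, the lowercasing, and the returned pair of counters are assumptions for illustration. Only the names `abs_freq`, `regex`, `num_list`, and `phrase_len` come from the notes above; `wtd_freq` is a label chosen here for the weighted count.

```python
import re
from collections import Counter


def count_phrases(text_list, num_list=None, phrase_len=1, regex=r"\w+"):
    """Count phrases across texts, plain and weighted by `num_list`.

    Hypothetical helper for illustration only.
    Returns a pair of Counters: (abs_freq, wtd_freq).
    """
    # If no numbers are supplied, every text counts as 1.
    if num_list is None:
        num_list = [1] * len(text_list)
    abs_freq, wtd_freq = Counter(), Counter()
    for text, num in zip(text_list, num_list):
        # `regex` decides what counts as a word.
        words = re.findall(regex, text.lower())
        for i in range(len(words) - phrase_len + 1):
            phrase = " ".join(words[i:i + phrase_len])
            abs_freq[phrase] += 1    # number of occurrences
            wtd_freq[phrase] += num  # occurrences weighted by the number
    return abs_freq, wtd_freq


texts = ["red shoes!", "red hat"]
abs_freq, wtd_freq = count_phrases(texts, num_list=[100, 10])
print(abs_freq["red"], wtd_freq["red"])  # 2 110

# A pattern such as r"\S+" keeps punctuation attached to words.
abs_freq, _ = count_phrases(texts, regex=r"\S+")
print(abs_freq["shoes!"])  # 1
```

When `num_list` is omitted, the weighted counts in this sketch equal the plain counts, which matches the note that the parameter is useful for counting `abs_freq` only.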

0.6.0

------------------

* New function
- `extract_words` to extract an arbitrary set of words
* Minor updates
- `ad_from_string` slots argument reflects new text ad lengths
- `hashtag` regex improved
