Wrangles

Latest version: v1.12.0

Page 3 of 5

1.6.1

- Fix a bug with the PyPI deployment.

1.6.0

Highlights

Extract.AI
Added extract.ai. Use ChatGPT to extract data using plain-language descriptions.
```yml
wrangles:
  - extract.ai:
      api_key: ${OPENAI_API_KEY}
      output:
        length:
          description: >-
            Any lengths found in the data
            such as cm, m, ft, etc.
        type:
          description: >-
            The type of item in the data
            such as spanner, cellphone, etc.
```


Python
Added a python wrangle, which executes simple Python commands inline within a recipe. For more complex Python, use custom functions. The command is evaluated once per row, and row values are referenced by column name.
```yml
wrangles:
  - python:
      command: [x.upper() for x in my_column]
      output: result
```
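
To make "evaluated once per row" and "referenced by the column name" concrete, here is a minimal sketch in plain Python. It is not how Wrangles implements the wrangle; `run_python_command` and the sample rows are hypothetical, and `eval` is used purely for illustration.

```python
def run_python_command(rows, command):
    """Evaluate `command` once per row, exposing each column as a variable.

    Hypothetical helper for illustration only; not part of Wrangles.
    """
    results = []
    for row in rows:
        # Each column name becomes a variable the expression can reference.
        # Builtins are hidden to keep the sketch small; this is NOT a sandbox.
        namespace = {"__builtins__": {}, **row}
        results.append(eval(command, namespace))
    return results


rows = [
    {"my_column": ["a", "b"]},
    {"my_column": ["c"]},
]
result = run_python_command(rows, "[x.upper() for x in my_column]")
# result == [["A", "B"], ["C"]]
```

In this sketch each cell of `my_column` is assumed to hold a list, so the comprehension upper-cases every item in that row's list.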


Features
- Allow using custom functions for recipes called using a model ID.
- Allow the console command (wrangles.recipe) to use all recipe features, such as calling a recipe by model ID or URL.
- Added an optional timeout parameter for recipes to set a time limit in seconds. If omitted, the time is unlimited.
- Added a clear method to the memory connector to clear all saved data.
- Improved converting fractions to decimals to handle split fractions such as 1-1/2.
- Added create.embeddings to generate embeddings for text.
- Allow parameterizing SQL queries and connecting to sandbox environments with the salesforce connector.
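
As an illustration of the split-fraction case mentioned in the list above, the sketch below converts values such as 1-1/2 to decimals using only the standard library. The function name `fractions_to_decimals` and the regular expression are assumptions made for this example and may differ from what Wrangles actually does.

```python
import re
from fractions import Fraction

# Optional whole number joined by "-" or whitespace, then numerator/denominator.
# Matches "1-1/2", "1 1/2" and "3/4".
_FRACTION = re.compile(r"(?:(\d+)[-\s])?(\d+)/(\d+)")


def fractions_to_decimals(text):
    """Replace simple and split fractions in `text` with decimal values."""

    def _convert(match):
        whole, numerator, denominator = match.groups()
        if int(denominator) == 0:
            return match.group(0)  # leave invalid fractions untouched
        value = Fraction(int(whole or 0)) + Fraction(int(numerator), int(denominator))
        return str(float(value))

    return _FRACTION.sub(_convert, text)


converted = fractions_to_decimals("1-1/2 inch pipe")
# converted == "1.5 inch pipe"
```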

Bugs
- Prevent remove_words from automatically capitalizing its output by default.

Misc
- Added an additional test job within the built container.
- Pinned the numpy version within the container to 1.24.3 due to an issue with the optimized build in newer versions.

1.5.0

Highlights
Matrix Connector
Added the matrix connector. This allows running multiple writes in parallel based on variables, e.g. once for each category in a list.
```yml
# Create files for a list of categories
# a file for Tools.xlsx, PPE.xlsx and Electrical.xlsx will be created
write:
  - matrix:
      variables:
        category: ["Tools", "PPE", "Electrical"]
      write:
        - file:
            name: ${category}.xlsx
            where: category = ?
            where_params:
              - ${category}
```
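
The idea behind the matrix connector, one run per variable value with `${...}` placeholders filled in, can be sketched in plain Python. This is not the Wrangles implementation: `expand_matrix` and `run_matrix` are hypothetical names, and taking the product of several variables is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from string import Template


def expand_matrix(variables):
    """Return one dict of variable values per combination."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


def run_matrix(variables, template, action):
    """Fill `template` for each combination and run `action` in parallel."""
    combos = expand_matrix(variables)
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in the same order as `combos`.
        return list(
            pool.map(
                lambda combo: action(Template(template).substitute(combo), combo),
                combos,
            )
        )


written = run_matrix(
    {"category": ["Tools", "PPE", "Electrical"]},
    "${category}.xlsx",
    lambda name, combo: name,  # stand-in for actually writing a file
)
# written == ["Tools.xlsx", "PPE.xlsx", "Electrical.xlsx"]
```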


select.element
This supports python syntax for selecting from lists and dicts. e.g. column_name["key"][0].
```yml
wrangles:
  - select.element:
      input: input_column["key"][0]
      output: output_column
```
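
For readers who want to see what an expression such as `input_column["key"][0]` resolves to, the following sketch walks the accessors one by one over a row. The parser here is a simplified assumption (string keys and integer indexes only) and `select_element` is a hypothetical helper, not the library's code.

```python
import re

# One accessor: ["key"], ['key'] or [index].
_ACCESSOR = re.compile(r"""\[\s*(?:"([^"]*)"|'([^']*)'|(-?\d+))\s*\]""")


def select_element(row, expression):
    """Resolve an expression like column["key"][0] against a row dict."""
    name_end = expression.find("[")
    if name_end == -1:
        return row[expression]  # plain column name, nothing to select

    value = row[expression[:name_end]]
    for double_quoted, single_quoted, index in _ACCESSOR.findall(expression[name_end:]):
        key = int(index) if index else (double_quoted or single_quoted)
        value = value[key]
    return value


row = {"input_column": {"key": ["first", "second"]}}
selected = select_element(row, 'input_column["key"][0]')
# selected == "first"
```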


select.group_by
Easily aggregate and group data.
```yml
wrangles:
  - select.group_by:
      by:
        - category1
        - category2
      sum: sum_me
      min: min_me
      max: max_me
```
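
The aggregation in the recipe above can be pictured with a small standard-library sketch. It assumes one column per aggregation and is not the Wrangles implementation; `group_by` and the sample rows are hypothetical.

```python
from collections import defaultdict

_AGGREGATORS = {"sum": sum, "min": min, "max": max}


def group_by(rows, by, **aggregations):
    """Group `rows` by the `by` columns and aggregate the named columns.

    `aggregations` maps an aggregation name to a column, e.g. sum="sum_me".
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[column] for column in by)].append(row)

    output = []
    for key, members in groups.items():
        result = dict(zip(by, key))
        for func_name, column in aggregations.items():
            result[column] = _AGGREGATORS[func_name]([m[column] for m in members])
        output.append(result)
    return output


rows = [
    {"category1": "A", "category2": "x", "sum_me": 1, "min_me": 5, "max_me": 2},
    {"category1": "A", "category2": "x", "sum_me": 2, "min_me": 3, "max_me": 9},
    {"category1": "B", "category2": "y", "sum_me": 4, "min_me": 7, "max_me": 1},
]
grouped = group_by(
    rows, by=["category1", "category2"], sum="sum_me", min="min_me", max="max_me"
)
```

Here `grouped` holds one row per distinct (category1, category2) pair, in order of first appearance.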


Features

- Added convert to and from YAML.
- Allow setting a default value for split.dictionary.
- Added the memory connector - save data in memory for communication between successive wrangles/recipes.
- Support recipe-in-recipe syntax, in addition to externally referenced recipes such as those from files.
- Enable running a recipe from a model ID.

Bugs / Misc
- Fixed an issue with the schema for convert.to_json.
- Set default ensure_ascii = False for convert.to_json to prevent issues with double encoding.
- Fixed a bug where a where clause could not use a column that is not included in the write.
- Recipes read from a file now assume UTF-8 encoding for all operating systems.
- Display the name of the wrangle in error messages for wrangles.
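
The effect of the `ensure_ascii` default mentioned above can be seen directly with Python's standard `json` module. Only `json.dumps` is shown; how convert.to_json passes the option through is not covered here, and the sample record is made up.

```python
import json

record = {"name": "Café"}

# Python's default (ensure_ascii=True) escapes non-ASCII characters.
escaped = json.dumps(record)
# escaped == '{"name": "Caf\\u00e9"}'

# With ensure_ascii=False the characters are written as-is.
readable = json.dumps(record, ensure_ascii=False)
# readable == '{"name": "Café"}'
```

Both strings decode back to the same data with `json.loads`; the difference is only in how the text is written out.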

1.4.1

- Update train connector to allow dynamic columns.
- Provide a clear error message if the user does not return a dataframe from a read or wrangle custom function.
- Remove problematic huggingface test that fails due to rate limiting.
- Provide a clear error message if a user tries to access a model they don't have permission for.
- Add pip install test.
- Remove Python 3.7 tests as this version is now end of life.
- Add macOS tests.

1.4.0

Highlights
Where
Most wrangles now support using where to filter the rows the wrangle is applied to.
```yml
wrangles:
  - classify:
      input: my column
      where: status = 'to_classify'
```
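
Conceptually, where restricts a wrangle to the matching rows and leaves the others alone. The sketch below shows that idea with a plain Python predicate instead of a SQL-style condition. `apply_where` is a hypothetical helper, and filling unmatched rows with `None` is an assumption of this example rather than documented behaviour.

```python
def apply_where(rows, predicate, func, input_column, output_column):
    """Apply `func` to `input_column` only for rows where `predicate` is true."""
    result = []
    for row in rows:
        new_row = dict(row)  # avoid mutating the caller's data
        if predicate(row):
            new_row[output_column] = func(row[input_column])
        else:
            # Assumption: unmatched rows keep an existing value, else None.
            new_row.setdefault(output_column, None)
        result.append(new_row)
    return result


rows = [
    {"my column": "hammer", "status": "to_classify"},
    {"my column": "drill", "status": "done"},
]
classified = apply_where(
    rows,
    predicate=lambda row: row["status"] == "to_classify",
    func=str.upper,  # stand-in for a real classification step
    input_column="my column",
    output_column="class",
)
```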


Features
- Allow attachments for notifications that support them (e.g. emails)
- Allow column names with spaces to be used in custom functions by replacing the spaces with underscores.
- Allow using json options such as indent for to_json.
- Added a run function to the postgres connector.
- Allow parameterizing various SQL queries for security, including internal recipe SQL such as where for write.
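
To show why parameterized queries matter for security, here is a small example using Python's built-in `sqlite3` module, which uses the same `?` placeholder style. The table, data and `select_by_category` function are invented for this example and are unrelated to any Wrangles connector.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Hammer", "Tools"), ("Gloves", "PPE"), ("Wrench", "Tools")],
)


def select_by_category(connection, category):
    """Return product names for one category using a bound parameter."""
    # The ? placeholder sends the value separately from the SQL text,
    # so quotes or SQL keywords inside `category` are treated as data.
    cursor = connection.execute(
        "SELECT name FROM products WHERE category = ? ORDER BY name",
        (category,),
    )
    return [name for (name,) in cursor]


tools = select_by_category(conn, "Tools")
# tools == ["Hammer", "Wrench"]

# An injection attempt is just an unmatched category value.
injected = select_by_category(conn, "x' OR '1'='1")
# injected == []
```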

Bugs / Misc
- Use max_level = 0 for split.dictionary.
- Improved error messages when reading a wrangle's contents fails.
- Improved the error messages when providing the wrong model ID.
- Various schema fixes and improvements.
- Fixed an import related to an issue with pymssql.
- Fix replace failing when using numbers.

1.3.1

- Bugfix: Row-based custom functions did not work correctly when mixing named parameters and kwargs.
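
One way to picture the situation this fix addresses is a dispatcher that matches row values to a function's named parameters and passes the rest through `**kwargs`. The sketch below is an assumption about the general technique, using the standard `inspect` module. `call_with_row` is hypothetical and is not taken from the Wrangles source.

```python
import inspect


def call_with_row(func, row):
    """Call `func` with values from `row` matched to its parameters by name."""
    params = inspect.signature(func).parameters
    accepts_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_kwargs:
        # Named parameters are filled by name; the rest land in **kwargs.
        return func(**row)
    # Without **kwargs, pass only the columns the function asks for.
    return func(**{k: v for k, v in row.items() if k in params})


def mixed(a, **kwargs):
    return a, sorted(kwargs)


def named_only(a):
    return a


mixed_result = call_with_row(mixed, {"a": 1, "b": 2, "c": 3})
# mixed_result == (1, ["b", "c"])
named_result = call_with_row(named_only, {"a": 1, "b": 2})
# named_result == 1
```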
