Wrangles

Latest version: v1.10.2

1.10.2

- Don't sort keys by default when converting to YAML.
- Ensure *create.embeddings* retries work even in the case of full network errors.
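
The retry fix above concerns failures where the request never reaches the server. As a rough illustration only (this is not the Wrangles implementation; `call_with_retries`, its parameters and the delay values are hypothetical), a retry loop that also covers raised network exceptions might look like this:

```python
import time


def call_with_retries(func, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on network-level errors.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except (ConnectionError, TimeoutError):
            # A full network failure raises instead of returning a response,
            # so it has to be caught here for the retry to happen at all.
            if attempt == max_attempts:
                raise
            # The delay doubles on each attempt: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` argument is injectable only so the backoff schedule can be checked without waiting.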

1.10.1

- Allow *split.list* to also work with JSON arrays.
- Allow *select.list_element* to also work with JSON arrays.
- Allow *select.dictionary_element* to also work with JSON objects.
- Allow *accordion* to also work with JSON arrays.
- Bump docker/login-action version to v3.

1.10.0

Accordion
Added accordion. This allows applying a series of wrangles to the elements of a list individually.
```yml
# ["a","b","c"] -> ["A","B","C"]
wrangles:
  - accordion:
      input: list_column
      output: modified_lists
      wrangles:
        - convert.case:
            input: list_column
            output: modified_lists
            case: upper
```
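
To make the idea concrete, here is a minimal plain-Python sketch of the same pattern: a series of steps applied to each element of a list in turn. The `accordion` function below is a hypothetical stand-in and says nothing about how Wrangles implements it internally.

```python
def accordion(values, steps):
    """Apply each function in steps, in order, to every element of values."""
    result = []
    for item in values:
        for step in steps:
            item = step(item)
        result.append(item)
    return result


print(accordion(["a", "b", "c"], [str.upper]))  # ['A', 'B', 'C']
```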


Other Changes
- Added an *order_by* parameter for read and write connectors. It uses SQL syntax, e.g. Col1, Col2 DESC.
  ```yml
  write:
    - file:
        name: file.csv
        order_by: Col1, Col2 DESC
  ```
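
For readers unfamiliar with the SQL ordering syntax, the following sketch shows what a clause such as `Col1, Col2 DESC` means when applied to rows held as dictionaries. `sort_rows` is a hypothetical helper, handles only simple unquoted column names, and is not how the connectors are implemented.

```python
def sort_rows(rows, order_by):
    """Sort a list of dicts by a SQL-style clause such as 'Col1, Col2 DESC'."""
    keys = []
    for part in order_by.split(","):
        tokens = part.split()
        column = tokens[0]
        # Ascending is the default; only an explicit DESC reverses the order.
        descending = len(tokens) > 1 and tokens[1].upper() == "DESC"
        keys.append((column, descending))

    result = list(rows)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last yields the usual multi-column ordering.
    for column, descending in reversed(keys):
        result.sort(key=lambda row: row[column], reverse=descending)
    return result
```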

- *split.text*
  - Improvements to output slicing: it can use a step, is more tolerant of different syntax, and can be used when outputting to columns.
  - No longer requires output. If output is omitted, the input column will be overwritten, in line with other wrangles.
    ```yml
    - split.text:
        input: column
        element: ':3'
    ```
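
The slice strings follow Python-style `start:stop:step` notation. As a sketch under that assumption (the `parse_slice` helper is hypothetical and far less tolerant of syntax variations than the release note describes), a string such as `':3'` can be turned into a slice and applied to the split result:

```python
def parse_slice(text):
    """Turn a string such as ':3', '1:' or '::2' into a slice object."""
    parts = text.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"Not a slice: {text!r}")
    # An empty part means "no bound", which slice() represents as None.
    return slice(*(int(part) if part.strip() else None for part in parts))


elements = "a,b,c,d".split(",")
print(elements[parse_slice(":3")])  # ['a', 'b', 'c']
```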

- *select.element*
  - Allow slicing lists or strings.
    ```yml
    - select.element:
        input: column[1:3]
    ```
  - The default behaviour is now to raise an error if a default isn't set.
- *split.dictionary*
  - Use output to choose only specific keys from the dictionaries either by name, using a wildcard or with regex.
    ```yml
    - split.dictionary:
        input: col1
        output: Out*
    ```
  - Allow output to use the syntax key: output_column_name to rename the resulting columns.
  - Add ability to rename columns dynamically using a wildcard (*) or regex.
    ```yml
    - split.dictionary:
        input: col1
        output:
          - "*": "*_SUFFIX"
    ```
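
One way to picture the wildcard selection and renaming is the sketch below. It assumes that a `*` in the output name stands for the whole matched key; that reading, and the `split_dictionary` helper itself, are assumptions for illustration rather than the library's documented behaviour. Regex patterns are not covered.

```python
from fnmatch import fnmatchcase


def split_dictionary(record, output):
    """Map keys of record to column names using {pattern: template} pairs.

    Keys that match no pattern are dropped. A '*' in the template is
    replaced by the matched key (an assumption made for this sketch).
    """
    columns = {}
    for key, value in record.items():
        for pattern, template in output.items():
            if fnmatchcase(key, pattern):
                columns[template.replace("*", key)] = value
                break  # the first matching pattern wins
    return columns


print(split_dictionary({"a": 1, "b": 2}, {"*": "*_SUFFIX"}))
# {'a_SUFFIX': 1, 'b_SUFFIX': 2}
```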

- *select.dictionary_element*
  - Allow specifying multiple elements. If a list of elements is provided, the output will be a dictionary rather than a scalar value.
    ```yml
    # {'key1': 'value1', 'key2': 'value2', ...} -> {'key1': 'value1', 'key2': 'value2'}
    - select.dictionary_element:
        input: column
        element:
          - key1
          - key2
    ```
  - Allow element selection to be dynamic with wildcards or regex.
    ```yml
    # {'key1': 'value1', 'key2': 'value2', ...} -> {'key1': 'value1', 'key2': 'value2'}
    - select.dictionary_element:
        input: column
        element:
          - key*
    ```
  - Allow renaming output keys.
    ```yml
    # {'key1': 'value1', 'key2': 'value2', ...} -> {'key1': 'value1', 'renamed_key2': 'value2'}
    - select.dictionary_element:
        input: column
        element:
          - key1
          - key2: renamed_key2
    ```
  - Allow using a dict for default to set the default for different keys.
    ```yml
    # {'key1': 'value1'} -> {'key1': 'value1', 'key2': 'A', 'key3': 'B'}
    - select.dictionary_element:
        input: column
        element:
          - key1
          - key2
          - key3
        default:
          key2: A
          key3: B
    ```
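
The four behaviours above (multiple elements, wildcards, renaming, and per-key defaults) combine as in the sketch below. `select_elements` is a hypothetical helper written for illustration. Raising an error for a missing key without a default is an assumption, and regex patterns are not covered.

```python
from fnmatch import fnmatchcase


def select_elements(record, elements, default=None):
    """Pick keys from record and return them as a new dictionary.

    elements: list of key names, wildcard patterns, or {key: new_name} dicts.
    default:  optional {key: value} used for keys missing from record.
    """
    default = default or {}
    selected = {}
    for element in elements:
        # A single-entry dict means "select this key and rename it".
        if isinstance(element, dict):
            pattern, new_name = next(iter(element.items()))
        else:
            pattern, new_name = element, None

        matches = [key for key in record if fnmatchcase(key, pattern)]
        if matches:
            for key in matches:
                selected[new_name or key] = record[key]
        elif pattern in default:
            selected[new_name or pattern] = default[pattern]
        else:
            raise KeyError(f"{pattern!r} not found and no default set")
    return selected
```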

- *select.left*/*select.right*
  - Enable integer lengths to work even if set as a string, i.e. '1' behaves as 1.
  - Allow negative values to remove characters from the left/right respectively.
- *create.embeddings*
  - Give a clear error message if the API Key is missing or invalid.
  - Set the default model to 'text-embedding-3-small'.
- SFTP connector
  - Reuse the connection when transferring multiple files and ensure the connection is closed properly.
  - Add the filename to the error message if the file is not found when attempting to read.
- HTTP connector
  - Added a write function to the connector.
  - Added an option to do a pre-request for OAuth authentication.
  - Added an orient parameter to define the JSON body structure.
  - Pass through kwargs to the request.
- Enable *extract.custom* to work with AI variants.
- Ensure *similarity* outputs a Python float.
- Add bcc parameter for *notification.email*.
- Improve the logic for where: filter the original dataframe using the index, to reduce subtle data issues caused by executing the SQL query.
- Bugfix: Don't try to make a directory when writing a file to memory using the file connector.
- Provide clearer and more concise error messages for custom functions.
- Minor tidying of warnings from string escaping.
- Remove use of inplace due to upcoming pandas behaviour changes.
- Allow batching logic to deal with pandas tight dataframe dict format.
- Preparatory edits for releasing lookup wrangles. Not yet widely available.
- Added a devcontainer config for codespaces.
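
The *select.left*/*select.right* changes in this release are easiest to see with plain string slicing. The sketch below assumes that a negative length for *select.left* drops that many characters from the left, and likewise from the right for *select.right*. The function names are stand-ins, not the library's API.

```python
def select_left(text, length):
    """Keep the first `length` characters; a negative length instead
    removes that many characters from the left."""
    length = int(length)  # a string such as '1' behaves as 1
    if length < 0:
        return text[-length:]
    return text[:length]


def select_right(text, length):
    """Keep the last `length` characters; a negative length instead
    removes that many characters from the right."""
    length = int(length)
    if length < 0:
        return text[:length]
    # Guard against length == 0, because text[-0:] would return everything.
    return text[-length:] if length else ""


print(select_left("abcdef", "2"))  # ab
print(select_right("abcdef", -2))  # abcd
```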

1.9.0

- Enable _extract_raw_ option for extract.custom.
- Make OpenAI tests more tolerant of variations in model results.
- Add run function to SFTP connector.
- Add inclusive option to split.text to toggle whether to include the split character in the output or not.
- Refactor and optimize split.text function.
- Validate input and output lengths in pandas.copy.
- Added automated recipe schema generation and tests.
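
The inclusive option for split.text can be pictured with the standard library's `re.split`. This sketch shows one possible reading, in which the split character is returned as its own element; whether Wrangles instead attaches it to a neighbouring element is not stated in these notes. The `split_text` function and its parameters are stand-ins.

```python
import re


def split_text(text, char=",", inclusive=False):
    """Split text on char, optionally keeping the split character."""
    pattern = re.escape(char)
    if inclusive:
        # A capturing group makes re.split return the delimiters as well.
        pattern = f"({pattern})"
    return re.split(pattern, text)


print(split_text("a,b,c"))                  # ['a', 'b', 'c']
print(split_text("a,b,c", inclusive=True))  # ['a', ',', 'b', ',', 'c']
```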

1.8.1

- Bugfix: Fix issues with SQL read/write caused by possibly incompatible sqlalchemy/pandas versions.
- Enable support and tests for Python 3.12.

1.8.0

- _extract.ai_
  - Pass through any additional unspecified parameters to the backend API.
  - Assume outputs of type array without a child type defined should be strings.
  - Added the ability to include header level messages.
  - Improved error handling. Clearer errors for invalid schema or API keys.
  - Allow the URL to be overridden.
- _create.embeddings_
  - Pass through any additional unspecified parameters to the backend API.
  - Allow the URL to be overridden.
  - Bugfix: Exponential backoff for retries was not working.
  - Added model parameter to schema.
- _merge.concatenate_
  - Fix errors caused by non-string values.
  - Add skip_empty parameter to skip empty values.
- Allow wrangles to be used to _rename_ columns.
- Added by parameter for _create.index_. This will create a sequential index grouped by the defined columns.
- Added a default parameter for _convert.from_json_ and _convert.from_yaml_ in the case of an empty or erroneous input.
- Added a handler for more data types for _convert.to_json_ including datetimes, numpy arrays and numpy floats/ints.
- Added _select.sample_. The rows can be specified as a whole number or a decimal: 5 = 5 rows, 0.5 = 50% of rows.
- Make start or length optional for _select.substring_.
- Added the ability to reference custom functions to set the value of a variable.
- Added a _sort_ wrangle.
- Bugfix: Fixed the schema definition for the wrangle _recipe_.
- Added the name of the recipe to the log entry.
- Add the ability to log info, warning or error messages with the _log_ wrangle.
- Pinned pytest to v7.4.4 due to breaking changes in 8.0.0.
- Allow the value for _create.column_ to be a list or other complex object.
- Added use_spellcheck parameter to _extract.custom_.
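
The whole-number versus decimal convention for _select.sample_ above can be sketched as follows. `sample_rows` is a hypothetical helper. Rounding the fraction to the nearest row and capping the count at the number of available rows are assumptions, not documented behaviour.

```python
import random


def sample_rows(rows, n, seed=None):
    """Return a random sample of rows.

    A whole number n >= 1 is a row count; 0 < n < 1 is a fraction of the rows.
    """
    if 0 < n < 1:
        count = round(len(rows) * n)  # assumption: round to the nearest row
    elif n >= 1 and n == int(n):
        count = min(int(n), len(rows))  # assumption: cap at the rows available
    else:
        raise ValueError(f"Invalid sample size: {n!r}")
    return random.Random(seed).sample(rows, count)


data = list(range(10))
print(len(sample_rows(data, 5)))    # 5
print(len(sample_rows(data, 0.5)))  # 5
```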
