Accordion
Added the *accordion* wrangle, which applies a series of wrangles to each element of a list individually.
```yml
# ["a","b","c"] -> ["A","B","C"]
wrangles:
  - accordion:
      input: list_column
      output: modified_lists
      wrangles:
        - convert.case:
            input: list_column
            output: modified_lists
            case: upper
```
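As a rough mental model only, and not the library's implementation, the accordion behaves like applying a function to every element of each list and collecting the results back into a list. The helper name `accordion_like` is hypothetical and exists only for this sketch.

```python
from typing import Callable, List


def accordion_like(rows: List[list], func: Callable) -> List[list]:
    """Apply func to every element of each list, preserving the list structure."""
    return [[func(item) for item in row] for row in rows]


rows = [["a", "b", "c"], ["d"]]
result = accordion_like(rows, str.upper)
print(result)  # [['A', 'B', 'C'], ['D']]
```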
Other Changes
- Added an *order_by* parameter for read and write connectors. It uses SQL syntax, e.g. Col1, Col2 DESC
  ```yml
  write:
    - file:
        name: file.csv
        order_by: Col1, Col2 DESC
  ```
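To make the ordering semantics concrete, here is a minimal sketch of how an SQL-style order_by string could be interpreted. It is an illustration in plain Python, not how the connectors actually implement it; `parse_order_by` and `sort_rows` are hypothetical names, and quoted column names are not handled.

```python
from typing import Dict, List, Tuple


def parse_order_by(order_by: str) -> List[Tuple[str, bool]]:
    """Parse 'Col1, Col2 DESC' into [(column, descending), ...]."""
    keys = []
    for part in order_by.split(","):
        tokens = part.split()
        if not tokens:
            continue
        direction = tokens[-1].upper()
        if len(tokens) > 1 and direction in ("ASC", "DESC"):
            keys.append((" ".join(tokens[:-1]), direction == "DESC"))
        else:
            # No explicit direction means ascending, as in SQL
            keys.append((" ".join(tokens), False))
    return keys


def sort_rows(rows: List[Dict], order_by: str) -> List[Dict]:
    """Return a new list of rows sorted according to the order_by string."""
    result = list(rows)
    # Sort by the last key first; Python's sort is stable,
    # so earlier keys end up taking priority.
    for column, descending in reversed(parse_order_by(order_by)):
        result.sort(key=lambda row: row[column], reverse=descending)
    return result


rows = [
    {"Col1": 2, "Col2": "a"},
    {"Col1": 1, "Col2": "a"},
    {"Col1": 1, "Col2": "b"},
]
ordered = sort_rows(rows, "Col1, Col2 DESC")
```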
- *split.text*
  - Improved output slicing: it supports a step, is more tolerant of different syntax, and can be used when outputting to columns.
  - No longer requires output. If output is omitted, the input column will be overwritten, in line with other wrangles.
    ```yml
    - split.text:
        input: column
        element: ':3'
    ```
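The slice strings follow Python's own slice notation. As a sketch of the idea, and assuming nothing about the library's actual parser, a string such as ':3' can be turned into a slice object like this (`parse_slice` is a hypothetical helper):

```python
def parse_slice(text: str) -> slice:
    """Convert a string such as ':3', '1:' or '::2' into a slice object."""
    # Tolerate optional surrounding brackets, e.g. '[1:3]'
    parts = text.strip().strip("[]").split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"Not a valid slice: {text!r}")
    # Empty positions become None; a missing step is padded with None
    values = [int(p) if p.strip() else None for p in parts] + [None]
    start, stop, step = values[:3]
    return slice(start, stop, step)


pieces = "a,b,c,d".split(",")
first_three = pieces[parse_slice(":3")]
every_other = pieces[parse_slice("::2")]
```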
- *select.element*
  - Allow slicing lists or strings.
    ```yml
    - select.element:
        input: column[1:3]
    ```
  - The default behaviour is now to raise an error if a default isn't set.
- *split.dictionary*
  - Use output to choose only specific keys from the dictionaries either by name, using a wildcard or with regex.
    ```yml
    - split.dictionary:
        input: col1
        output: Out*
    ```
  - Allow output to use the syntax key: output_column_name to rename the resulting columns.
  - Add ability to rename columns dynamically using a wildcard (*) or regex.
    ```yml
    - split.dictionary:
        input: col1
        output:
          - "*": "*_SUFFIX"
    ```
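One way to picture the wildcard rename is that each * in the key pattern captures part of the key, and each * in the new name is filled with the captured text. The sketch below shows that idea in plain Python; it is an assumption about the behaviour rather than the library's code, and `rename_with_wildcard` is a hypothetical name.

```python
import re


def rename_with_wildcard(data: dict, pattern: str, template: str) -> dict:
    """Keep keys matching a '*' wildcard pattern and rename them via the template."""
    # Build a regex with one capture group per '*' in the pattern
    regex = re.compile(
        "^" + "(.*)".join(re.escape(p) for p in pattern.split("*")) + "$"
    )
    template_parts = template.split("*")
    result = {}
    for key, value in data.items():
        match = regex.match(key)
        if not match:
            continue
        groups = match.groups()
        # Interleave the literal template parts with the captured text
        new_key = template_parts[0]
        for i, part in enumerate(template_parts[1:]):
            new_key += (groups[i] if i < len(groups) else "") + part
        result[new_key] = value
    return result


renamed = rename_with_wildcard({"a": 1, "b": 2}, "*", "*_SUFFIX")
```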
- *select.dictionary_element*
  - Allow specifying multiple elements. If a list of elements is provided, the output will be a dictionary rather than a scalar value.
    ```yml
    # {'key1': 'value1', 'key2': 'value2', ...} -> {'key1': 'value1', 'key2': 'value2'}
    - select.dictionary_element:
        input: column
        element:
          - key1
          - key2
    ```
  - Allow element selection to be dynamic with wildcards or regex.
    ```yml
    # {'key1': 'value1', 'key2': 'value2', ...} -> {'key1': 'value1', 'key2': 'value2'}
    - select.dictionary_element:
        input: column
        element:
          - key*
    ```
  - Allow renaming output keys.
    ```yml
    # {'key1': 'value1', 'key2': 'value2', ...} -> {'key1': 'value1', 'renamed_key2': 'value2'}
    - select.dictionary_element:
        input: column
        element:
          - key1
          - key2: renamed_key2
    ```
  - Allow using a dict for default to set the default for different keys.
    ```yml
    # {'key1': 'value1'} -> {'key1': 'value1', 'key2': 'A', 'key3': 'B'}
    - select.dictionary_element:
        input: column
        element:
          - key1
          - key2
          - key3
        default:
          key2: A
          key3: B
    ```
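The four behaviours above (multiple elements, wildcards, renaming and per-key defaults) can be summarised in one small plain-Python sketch. This is an illustrative model under stated assumptions, not the wrangle's implementation: `select_elements` is a hypothetical name, wildcards are matched with `fnmatchcase`, regex is not covered, and keys that are missing with no default are simply skipped here.

```python
from fnmatch import fnmatchcase


def select_elements(data: dict, elements: list, default=None) -> dict:
    """Pick keys from data by name or wildcard, with optional renames and defaults."""
    default = default or {}
    result = {}
    for element in elements:
        # An element is either a key/pattern or a single {key: new_name} mapping
        if isinstance(element, dict):
            key, new_name = next(iter(element.items()))
        else:
            key, new_name = element, None
        matches = [k for k in data if fnmatchcase(k, key)]
        if matches:
            for k in matches:
                result[new_name or k] = data[k]
        elif key in default:
            # Fall back to the per-key default when nothing matched
            result[new_name or key] = default[key]
    return result


row = {"key1": "value1", "key2": "value2", "other": "x"}
selected = select_elements(row, ["key1", {"key2": "renamed_key2"}])
```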
- *select.left*/*select.right*
  - Enable integer lengths to work even if set as a string i.e. '1' behaves as 1.
  - Allow negative values to remove characters from the left/right respectively.
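A minimal sketch of the two changes, written in plain Python rather than taken from the library: the length is coerced to an integer, and a negative length is assumed to drop that many characters from the named side instead of keeping them. The function names `left` and `right` are hypothetical stand-ins for the wrangles.

```python
def left(text: str, length) -> str:
    """Keep `length` characters from the left, or drop them if negative."""
    length = int(length)  # '1' behaves as 1
    if length >= 0:
        return text[:length]
    # Negative: remove abs(length) characters from the left (assumed reading)
    return text[-length:]


def right(text: str, length) -> str:
    """Keep `length` characters from the right, or drop them if negative."""
    length = int(length)
    if length == 0:
        return ""  # text[-0:] would return the whole string
    if length > 0:
        return text[-length:]
    # Negative: remove abs(length) characters from the right (assumed reading)
    return text[:length]


kept = left("abcdef", "2")
trimmed = right("abcdef", -2)
```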
- *create.embeddings*
  - Give a clear error message if the API Key is missing or invalid.
  - Set the default model to 'text-embedding-3-small'.
- SFTP connector
  - Reuse the connection when transferring multiple files and ensure the connection is closed properly.
  - Add the filename to the error message if the file is not found when attempting to read.
- HTTP connector
  - Added write function to the connector.
  - Added an option to do a pre-request for OAuth authentication.
  - Added an orient parameter to define the JSON body structure.
  - Pass through kwargs to the request.
- Enable *extract.custom* to work with ai variants.
- Ensure *similarity* outputs a Python float.
- Add bcc parameter for *notification.email*.
- Improve the logic for *where* by filtering the original dataframe using the index, reducing subtle data issues caused by executing the SQL query.
- Bugfix: Don't try to make a directory when writing a file to memory using the file connector.
- Provide clearer and more concise error messages for custom functions.
- Minor tidying of warnings from string escaping.
- Remove use of inplace due to upcoming pandas behaviour changes.
- Allow batching logic to deal with pandas tight dataframe dict format.
- Preparatory edits for releasing lookup wrangles. Not yet widely available.
- Added a devcontainer config for Codespaces.