Unitxt

Latest version: v1.18.0


1.8.0

What's Changed

In this release, the main improvement focuses on introducing type checking within Unitxt tasks. Tasks are fundamental to the Unitxt protocol, acting as standardized blueprints for those integrating new datasets into Unitxt. They facilitate the use of task-specific templates and metrics. To guarantee precise dataset processing in line with the task schema, we've introduced explicit types to the task fields.

For example, consider the NER task in Unitxt, previously defined as follows:
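```python
from unitxt.blocks import FormTask
from unitxt.catalog import add_to_catalog

# Legacy form: field names only, no types.
add_to_catalog(
    FormTask(
        inputs=["text", "entity_types"],
        outputs=["spans_starts", "spans_ends", "text", "labels"],
        metrics=["metrics.ner"],
    ),
    "tasks.ner",
)
```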

Now, the NER task definition includes explicit types:
```python
from unitxt.blocks import FormTask
from unitxt.catalog import add_to_catalog

# Typed form: every field maps to a type string.
add_to_catalog(
    FormTask(
        inputs={"text": "str", "entity_types": "List[str]"},
        outputs={
            "spans_starts": "List[int]",
            "spans_ends": "List[int]",
            "text": "List[str]",
            "labels": "List[str]",
        },
        prediction_type="List[Tuple[str,str]]",
        metrics=["metrics.ner"],
    ),
    "tasks.ner",
)
```


This enhancement aligns with Unitxt's goal that definitions should be easily understandable and capable of facilitating validation processes with appropriate error messages to guide developers in identifying and solving issues.
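To make the idea of validating instances against typed task fields concrete, here is a minimal sketch in plain Python. It is not Unitxt's implementation: the helpers `matches` and `check_instance` are hypothetical, and they understand only `str`, `int`, `float` and `List[...]` of those, which is far less than real type hints allow.

```python
from typing import Any, Dict, List

# Hypothetical helpers for illustration only; not part of unitxt.
# Only these base types and List[...] nestings of them are supported.
_BASE_TYPES = {"str": str, "int": int, "float": float}


def matches(value: Any, type_string: str) -> bool:
    """Return True if value conforms to the given type string."""
    if type_string.startswith("List[") and type_string.endswith("]"):
        inner = type_string[len("List["):-1]
        return isinstance(value, list) and all(matches(v, inner) for v in value)
    # An unknown type string raises KeyError in this sketch.
    return isinstance(value, _BASE_TYPES[type_string])


def check_instance(instance: Dict[str, Any], schema: Dict[str, str]) -> List[str]:
    """Collect one readable error per missing or mistyped field."""
    errors = []
    for field, type_string in schema.items():
        if field not in instance:
            errors.append(f"missing field '{field}'")
        elif not matches(instance[field], type_string):
            errors.append(
                f"field '{field}' should be of type '{type_string}', "
                f"got {instance[field]!r}"
            )
    return errors


ner_inputs = {"text": "str", "entity_types": "List[str]"}
good = {"text": "IBM is in Armonk", "entity_types": ["ORG", "LOC"]}
bad = {"text": "IBM is in Armonk", "entity_types": "ORG"}

good_errors = check_instance(good, ner_inputs)
bad_errors = check_instance(bad, ner_inputs)
```

Returning a list of messages rather than raising on the first problem lets a dataset author see every schema mismatch in one pass.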

For now, the original definition format without types continues to work but generates a warning message. You should begin adapting your task definitions by adding types.


'inputs' field of Task should be a dictionary of field names and their types. For example, {'text': 'str', 'classes': 'List[str]'}. Instead only '['question', 'question_id', 'topic']' was passed. All types will be assumed to be 'Any'. In future version of unitxt this will raise an exception.
'outputs' field of Task should be a dictionary of field names and their types. For example, {'text': 'str', 'classes': 'List[str]'}. Instead only '['reference_answers', 'reference_contexts', 'reference_context_ids', 'is_answerable_label']' was passed. All types will be assumed to be 'Any'. In future version of unitxt this will raise an exception.
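The backward-compatible behavior described by these warnings (accept a plain list, warn, and assume `Any` for every field) could look roughly like the sketch below. The function `normalize_fields` and the choice of `FutureWarning` are assumptions made for illustration, not Unitxt's actual code.

```python
import warnings
from typing import Dict, List, Union


def normalize_fields(
    name: str, fields: Union[List[str], Dict[str, str]]
) -> Dict[str, str]:
    """Hypothetical helper: accept the legacy list form and warn about it."""
    if isinstance(fields, dict):
        return dict(fields)
    warnings.warn(
        f"'{name}' field of Task should be a dictionary of field names and "
        f"their types. Instead only '{fields}' was passed. "
        "All types will be assumed to be 'Any'.",
        FutureWarning,
    )
    # Untyped fields fall back to the most permissive type.
    return {field: "Any" for field in fields}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = normalize_fields("inputs", ["question", "question_id", "topic"])
    typed = normalize_fields("inputs", {"text": "str"})
```

Only the legacy call emits a warning; the typed definition passes through unchanged.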


Special thanks to pawelknes who implemented this important feature. It truly demonstrates the collective power of the Unitxt community and the invaluable contributions made by Unitxt users beyond the core development team. Such contributions are highly appreciated and encouraged.

* For more detailed information, please refer to https://github.com/IBM/unitxt/pull/710

Breaking Changes

* "metrics.spearman", "metrics.kendalltau_b", "metrics.roc_auc": prediction type is float.
* "metrics.f1_binary", "metrics.accuracy_binary", "metrics.precision_binary", "metrics.recall_binary", "metrics.max_f1_binary", "metrics.max_accuracy_binary": prediction type is Union[float, int], references must be equal to 0 or 1

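As an illustration of what the breaking change on the binary metrics means for callers, the following sketch rejects non-numeric predictions and references other than 0 or 1. The function `check_binary_metric_inputs` is hypothetical and assumes one flat reference per prediction; Unitxt performs its own checks through `prediction_type`.

```python
from typing import Sequence


def check_binary_metric_inputs(predictions: Sequence, references: Sequence) -> None:
    """Hypothetical check: numeric predictions, references equal to 0 or 1."""
    for prediction in predictions:
        if not isinstance(prediction, (float, int)):
            raise TypeError(
                f"prediction {prediction!r} is not of type Union[float, int]"
            )
    for reference in references:
        if reference not in (0, 1):
            raise ValueError(f"reference {reference!r} must be 0 or 1")


def raises(exc_type, predictions, references) -> bool:
    """Small helper to report whether the check fails with exc_type."""
    try:
        check_binary_metric_inputs(predictions, references)
    except exc_type:
        return True
    return False


valid_inputs_pass = not raises(Exception, [0.9, 0, 1], [1, 0, 1])
string_prediction_rejected = raises(TypeError, ["yes"], [1])
bad_reference_rejected = raises(ValueError, [0.3], [2])
```

In practice this means string predictions such as "yes" must be converted to numbers by a post-processor before they reach these metrics.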
Bug Fixes
* Set empty list if preprocess_steps is None by marukaz in https://github.com/IBM/unitxt/pull/780
* Fix UI load failure due to typo by yoavkatz in https://github.com/IBM/unitxt/pull/785
* Fix huggingface uploads by elronbandel in https://github.com/IBM/unitxt/pull/793
* Fix typo in error message by marukaz in https://github.com/IBM/unitxt/pull/777

New Assets
* add perplexity with Mistral model by lilacheden in https://github.com/IBM/unitxt/pull/713

New Features
* Type checking for task definition by pawelknes in https://github.com/IBM/unitxt/pull/710
* Add open and ibm_genai to llm as judge inference engine by OfirArviv in https://github.com/IBM/unitxt/pull/782
* Add negative class score for binary precision, recall, f1 and max f1 by lilacheden in https://github.com/IBM/unitxt/pull/788
    1. Add negative class score for binary precision, recall, f1 and max f1, e.g. f1_binary now also returns "f1_binary_neg".
    2. Support Unions in metric prediction_type
    3. Add processor cast_to_float_return_nan_if_failed
    4. Breaking change: Make prediction_type of metrics numeric:
        A. "metrics.kendalltau_b", "metrics.roc_auc": prediction type is float.
        B. "metrics.f1_binary", "metrics.accuracy_binary", "metrics.precision_binary", "metrics.recall_binary", "metrics.max_f1_binary", "metrics.max_accuracy_binary": prediction type is Union[float, int], references must be equal to 0 or 1
* Group shuffle by sam-data-guy-iam in https://github.com/IBM/unitxt/pull/639
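The negative class score and the cast-to-float processor mentioned in the binary metrics item above can be sketched as follows. This is an illustrative reimplementation, not the code from the pull request: the function names are hypothetical, and the 0.5 threshold for turning float predictions into classes is an assumption.

```python
from typing import Dict, List


def cast_to_float_or_nan(text) -> float:
    """Sketch of a cast-or-NaN post-processor: never raises on bad input."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return float("nan")


def f1_for_class(preds: List[int], refs: List[int], cls: int) -> float:
    """F1 score treating `cls` as the positive class."""
    tp = sum(p == cls and r == cls for p, r in zip(preds, refs))
    fp = sum(p == cls and r != cls for p, r in zip(preds, refs))
    fn = sum(p != cls and r == cls for p, r in zip(preds, refs))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_binary_scores(
    raw_predictions: List[str], references: List[int], threshold: float = 0.5
) -> Dict[str, float]:
    floats = [cast_to_float_or_nan(p) for p in raw_predictions]
    # NaN > threshold is False, so unparsable predictions count as class 0.
    preds = [int(p > threshold) for p in floats]
    return {
        "f1_binary": f1_for_class(preds, references, 1),
        "f1_binary_neg": f1_for_class(preds, references, 0),
    }


scores = f1_binary_scores(["0.9", "0.8", "0.2", "n/a"], [1, 1, 0, 1])
```

Reporting the score of the negative class alongside the positive one is useful when the classes are imbalanced and a single F1 value hides poor performance on one of them.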

Documentation
* Fix a small typo by dafnapension in https://github.com/IBM/unitxt/pull/779
* Update instructions to install HELM from PyPI by yifanmai in https://github.com/IBM/unitxt/pull/783
* Update few-shot instructions in Unitxt with HELM by yifanmai in https://github.com/IBM/unitxt/pull/774


**Full Changelog**: https://github.com/IBM/unitxt/compare/1.7.7...1.8.0

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.8.1...1.8.0

1.7.9

What's Changed
* Set empty list if preprocess_steps is None by marukaz in https://github.com/IBM/unitxt/pull/780
* fix a small typo by dafnapension in https://github.com/IBM/unitxt/pull/779
* Fix typo by marukaz in https://github.com/IBM/unitxt/pull/777
* Group shuffle by sam-data-guy-iam in https://github.com/IBM/unitxt/pull/639
* add perplexity with Mistral model by lilacheden in https://github.com/IBM/unitxt/pull/713
* Fix UI load failure due to typo by yoavkatz in https://github.com/IBM/unitxt/pull/785
* Type checking for task definition by pawelknes in https://github.com/IBM/unitxt/pull/710
* Add open and ibm_genai to llm as judge inference engine by OfirArviv in https://github.com/IBM/unitxt/pull/782
* Avoid creating a demo pool if num_demos is 0. by yoavkatz in https://github.com/IBM/unitxt/pull/787
* Update test_helm.yml by elronbandel in https://github.com/IBM/unitxt/pull/789
* Update instructions to install HELM from PyPI by yifanmai in https://github.com/IBM/unitxt/pull/783
* Update few-shot instructions in Unitxt with HELM by yifanmai in https://github.com/IBM/unitxt/pull/774
* Update version to 1.7.8 by elronbandel in https://github.com/IBM/unitxt/pull/790
* Fix huggingface uploads by elronbandel in https://github.com/IBM/unitxt/pull/793
* Update version to 1.7.9 by elronbandel in https://github.com/IBM/unitxt/pull/794


**Full Changelog**: https://github.com/IBM/unitxt/compare/1.7.7...1.7.9

1.7.8

What's Changed
* Set empty list if preprocess_steps is None by marukaz in https://github.com/IBM/unitxt/pull/780
* fix a small typo by dafnapension in https://github.com/IBM/unitxt/pull/779
* Fix typo by marukaz in https://github.com/IBM/unitxt/pull/777
* Group shuffle by sam-data-guy-iam in https://github.com/IBM/unitxt/pull/639
* add perplexity with Mistral model by lilacheden in https://github.com/IBM/unitxt/pull/713
* Fix UI load failure due to typo by yoavkatz in https://github.com/IBM/unitxt/pull/785
* Type checking for task definition by pawelknes in https://github.com/IBM/unitxt/pull/710
* Add open and ibm_genai to llm as judge inference engine by OfirArviv in https://github.com/IBM/unitxt/pull/782
* Avoid creating a demo pool if num_demos is 0. by yoavkatz in https://github.com/IBM/unitxt/pull/787
* Update test_helm.yml by elronbandel in https://github.com/IBM/unitxt/pull/789
* Update instructions to install HELM from PyPI by yifanmai in https://github.com/IBM/unitxt/pull/783
* Update few-shot instructions in Unitxt with HELM by yifanmai in https://github.com/IBM/unitxt/pull/774
* Update version to 1.7.8 by elronbandel in https://github.com/IBM/unitxt/pull/790


**Full Changelog**: https://github.com/IBM/unitxt/compare/1.7.7...1.7.8

1.7.7

What's Changed
* adding multi-lingual bert score model by assaftibm in https://github.com/IBM/unitxt/pull/755
* Add HELM Integration: Guide, Examples and Tests by elronbandel in https://github.com/IBM/unitxt/pull/743
* Add production-time recipe processing capability to unitxt by elronbandel in https://github.com/IBM/unitxt/pull/739
* Add tags and descriptions for assets on the website by elronbandel in https://github.com/IBM/unitxt/pull/760
* Changed HELM integration docs to point to output result file by yoavkatz in https://github.com/IBM/unitxt/pull/761
* Allow FilterByCondition to condition also on subfields by dafnapension in https://github.com/IBM/unitxt/pull/762
* fix a small bug in BinaryMaxAccuracy by dafnapension in https://github.com/IBM/unitxt/pull/757
* Fix Reward metric warnings by assaftibm in https://github.com/IBM/unitxt/pull/765
* Added post processor to take first line in quantization templates by yoavkatz in https://github.com/IBM/unitxt/pull/770
* Support for parsing all strings representing valid Python type hints by pawelknes in https://github.com/IBM/unitxt/pull/754
* simplify bitwiseor-to-union and show a scheme for Literal by dafnapension in https://github.com/IBM/unitxt/pull/772
* Adding NLI model via perplexity by assaftibm in https://github.com/IBM/unitxt/pull/766
* Implement LLM as judge metrics by eladven in https://github.com/IBM/unitxt/pull/771
* Return loading step to enforce loader limit. by yoavkatz in https://github.com/IBM/unitxt/pull/775
* Update formats by elronbandel in https://github.com/IBM/unitxt/pull/769


**Full Changelog**: https://github.com/IBM/unitxt/compare/1.7.6...1.7.7

1.7.6

What's Changed
The most significant change in this release is the addition of the `\N` notation (backslash followed by a capital N) to formats. Placing `\N` in a format marks a position where at most one newline should appear: any newlines immediately before it are collapsed into that single newline.

A very detailed explanation if you want to go deeper:
> The Capital New Line Notation (\N) transforms a given string by applying the Capital New Line Notation.
The Capital New Line Notation (\N) is designed to manage newline behavior in a string efficiently.
This custom notation aims to consolidate multiple newline characters (\n) into a single newline under
specific conditions, with tailored handling based on whether there's preceding text. The function
distinguishes between two primary scenarios:
1. If there's text (referred to as a prefix) followed by any number of \n characters and then one or
more \N, the entire sequence is replaced with a single \n. This effectively simplifies multiple
newlines and notation characters into a single newline when there's preceding text.
2. If the string starts with \n characters followed by \N without any text before this sequence, or if
\N is at the very beginning of the string, the sequence is completely removed. This case is
applicable when the notation should not introduce any newlines due to the absence of preceding text.


This allows two things:
First, to define system formats that contain no unnecessary newlines when the instruction or the system prompt is missing.
Second, to ignore any newlines created by the template, so that the number of newlines is set by the format only.

For example, if we define the system format in the following way:
```python
from unitxt.formats import SystemFormat

# Plain "\n" separators: a newline is always emitted, even after an empty field.
system_format = SystemFormat(model_input_format="{system_prompt}\n{instruction}\n|user|\n{source}\n|assistant|\n{target_prefix}")
```

We face two issues:
1. If the system prompt or the instruction is empty, the output starts with newlines that serve no purpose.
2. If the source ends with a newline (mostly due to the template structure), there is an unnecessary empty line before "|assistant|".

Both problems are solved with the \N notation:
```python
from unitxt.formats import SystemFormat

# "\\N" in a regular Python string is a literal backslash followed by N.
system_format = SystemFormat(model_input_format="{system_prompt}\\N{instruction}\\N|user|\n{source}\\N|assistant|\n{target_prefix}")
```
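The effect of the \N notation on a format string like the one above can be sketched with two regular expressions. This is a simplified illustration of the two cases described earlier, not necessarily how Unitxt implements it; the function name `collapse_capital_newlines` is hypothetical, and the format string is filled with `str.format` instead of going through `SystemFormat`.

```python
import re


def collapse_capital_newlines(text: str) -> str:
    """Hypothetical sketch of the \\N handling described in the release notes."""
    # Case 2: a run of "\n" and "\N" at the very start that ends in "\N" is dropped.
    text = re.sub(r"^(?:\n|\\N)*\\N", "", text)
    # Case 1: elsewhere, a run of "\n" and "\N" that ends in "\N" becomes one "\n".
    return re.sub(r"(?:\n|\\N)*\\N", "\n", text)


template = "{system_prompt}\\N{instruction}\\N|user|\n{source}\\N|assistant|\n{target_prefix}"

# Empty system prompt and instruction, and a source that ends with a newline.
sparse = template.format(
    system_prompt="", instruction="", source="Hi\n", target_prefix=""
)
rendered_sparse = collapse_capital_newlines(sparse)

# All fields present: every "\N" simply turns into one newline.
full = template.format(
    system_prompt="Be brief.", instruction="Answer:", source="Hi", target_prefix=""
)
rendered_full = collapse_capital_newlines(full)
```

With empty fields the rendered text starts directly at "|user|", and the trailing newline of the source does not produce an empty line before "|assistant|".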


Breaking changes

* Fix typo in the MultipleChoiceTemplate field name: choices_seperator -> choices_separator
* Deprecation of the use_query option in all operators. For now it only raises a warning, but it will be removed in the next major release. The new default behavior is equivalent to use_query=True.

All Changes

Bug Fixes:
* Fix error in unitxt versions conflict and improve message by elronbandel in https://github.com/IBM/unitxt/pull/730
* Fix wrong handling of list in dict_get by yoavkatz in https://github.com/IBM/unitxt/pull/733
* Fix classification datasets with wrong schema by elronbandel in https://github.com/IBM/unitxt/pull/735
* Fix codespell by elronbandel in https://github.com/IBM/unitxt/pull/742
* Fix UI errors caused by grammar tasks by elronbandel in https://github.com/IBM/unitxt/pull/750
* Fix src layout and enforce its rules with pre-commit hooks by elronbandel in https://github.com/IBM/unitxt/pull/753

Assets Fixes:
* Rename to correct model name by eladven in https://github.com/IBM/unitxt/pull/729

New Features:
* Add notion of \N to formats, to fix format new line clashes by elronbandel in https://github.com/IBM/unitxt/pull/751
* Ability to dynamically change InstanceMetric inputs + grammar metrics by arielge in https://github.com/IBM/unitxt/pull/736
* Add DeprecatedField for a more informative procedure for deprecating fields of artifacts by dafnapension in https://github.com/IBM/unitxt/pull/741

New Assets:
* Add rerank recall metric to unitxt by jlqibm in https://github.com/IBM/unitxt/pull/662
* Add many selection and human preference tasks and datasets by elronbandel in https://github.com/IBM/unitxt/pull/746
* Adding Detector metric for running any classifier from huggingface as a metric by mnagired in https://github.com/IBM/unitxt/pull/745
* Add operators: RegexSplit, TokensSplit, Chunk by elronbandel in https://github.com/IBM/unitxt/pull/749
* Add bert score large and base versions by assaftibm in https://github.com/IBM/unitxt/pull/748

Enhancements:
* Remove use_dpath parameter from dict_get and dict_set by dafnapension in https://github.com/IBM/unitxt/pull/727
* Add mock judge test to cohere for ai by perlitz in https://github.com/IBM/unitxt/pull/720

New Contributors
* mnagired made their first contribution in https://github.com/IBM/unitxt/pull/745

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.7.4...1.7.6

1.7.4

Breaking Changes
* Add generic mechanism to check prediction and reference types in metrics by yoavkatz in https://github.com/IBM/unitxt/pull/667 See the explanation in the previous sections for why this change is breaking.

New Features
* Add ability to fuse sources with disjoint splits by yoavkatz in https://github.com/IBM/unitxt/pull/707
* Allow max reduction type in metric to find the best overall score over all instances by yoavkatz in https://github.com/IBM/unitxt/pull/709
* Add string operators module with many standard string operators by elronbandel in https://github.com/IBM/unitxt/pull/721
* Allow disabling per group f1 scores in customF1 by yoavkatz in https://github.com/IBM/unitxt/pull/719
* Add improved type inference capabilities, inferring type_string from a given object, and infer_type therefrom via parse_type_string by dafnapension in https://github.com/IBM/unitxt/pull/706
* Add description and tags to every catalog artifact by elronbandel in https://github.com/IBM/unitxt/pull/725
* allow contexts not to be entered to metric by perlitz in https://github.com/IBM/unitxt/pull/653
* Add control over metrics and postprocessors through the recipe by elronbandel in https://github.com/IBM/unitxt/pull/663
* Add coqa and dialog processing capabilities by elronbandel in https://github.com/IBM/unitxt/pull/640
* Add pandas_load_args for LoadCSV by elronbandel in https://github.com/IBM/unitxt/pull/696
* Add safe and complete type parsing function to type_utils, for allowing better type checking. by elronbandel in https://github.com/IBM/unitxt/pull/688
* Add deprecation decorator for warning and errors for deprecation of functions and classes by elronbandel in https://github.com/IBM/unitxt/pull/689
* Add choices shuffling to MultipleChoiceTemplate by elronbandel in https://github.com/IBM/unitxt/pull/678
* Make settings utils type sensitive by elronbandel in https://github.com/IBM/unitxt/pull/674

New Assets
* Add intl to korean and arabic + improved packaged dependency checks by pklpriv in https://github.com/IBM/unitxt/pull/698
* Added BERT Score with new embedding model "distilbert-base-uncased" by shivangibithel in https://github.com/IBM/unitxt/pull/703
* Grammatical error correction task by arielge in https://github.com/IBM/unitxt/pull/718
* Add trec dataset by elronbandel in https://github.com/IBM/unitxt/pull/723
* Add templates for flan text similarity by elronbandel in https://github.com/IBM/unitxt/pull/728
* Add metrics for binary tasks with float predictions by lilacheden in https://github.com/IBM/unitxt/pull/654
* Add mistral format by elronbandel in https://github.com/IBM/unitxt/pull/660
* Added new metric for unsorted_list_exact_match by yoavkatz in https://github.com/IBM/unitxt/pull/685
* Add flan wnli truthfulness format by elronbandel in https://github.com/IBM/unitxt/pull/665
* DuplicateInstances operator by pawelknes in https://github.com/IBM/unitxt/pull/682
* introduce arabic to normalized sacrebleu by pklpriv in https://github.com/IBM/unitxt/pull/638
* 20newsgroup from sklearn by ilyashnil in https://github.com/IBM/unitxt/pull/659
* Add match_closest_option post processor for multiple choice qa by elronbandel in https://github.com/IBM/unitxt/pull/679
* Duplicate instance operator - new functionality by pawelknes in https://github.com/IBM/unitxt/pull/687
* Add babi qa dataset by elronbandel in https://github.com/IBM/unitxt/pull/666

Asset Fixes
* Add missing instruction in labrador zero shot format by alonh in https://github.com/IBM/unitxt/pull/716
* Fix title template for classification by elronbandel in https://github.com/IBM/unitxt/pull/722
* prevent cohere4ai using judge as default by perlitz in https://github.com/IBM/unitxt/pull/664
* fix summarization template by gitMichal in https://github.com/IBM/unitxt/pull/652

Bug Fixes
* Fix handling of boolean environment variables by arielge in https://github.com/IBM/unitxt/pull/711
* Handle all env variables with expected types by arielge in https://github.com/IBM/unitxt/pull/714
* Properly define the abstract fields by elronbandel in https://github.com/IBM/unitxt/pull/724
* Fix places not using general settings or logger by elronbandel in https://github.com/IBM/unitxt/pull/656
* removal of dpath -- ready for review by dafnapension in https://github.com/IBM/unitxt/pull/680
* fix: LoadFromIBMCloud empty data_dir breaks processing by jezekra1 in https://github.com/IBM/unitxt/pull/668
* Fix bug in references with none by elronbandel in https://github.com/IBM/unitxt/pull/677
* Validating that the prepare dir is consistent with catalog by eladven in https://github.com/IBM/unitxt/pull/683

New Contributors
* shivangibithel made their first contribution in https://github.com/IBM/unitxt/pull/703
* jezekra1 made their first contribution in https://github.com/IBM/unitxt/pull/668
* pklpriv made their first contribution in https://github.com/IBM/unitxt/pull/638
* pawelknes made their first contribution in https://github.com/IBM/unitxt/pull/682

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.7.1...1.7.4
