Unitxt

Latest version: v1.21.0

1.21.0

What's Changed
* add 'show more' button for imports from unitxt modules by dafnapension in https://github.com/IBM/unitxt/pull/1651
* Update head qa dataset by elronbandel in https://github.com/IBM/unitxt/pull/1658
* Update few slow datasets by elronbandel in https://github.com/IBM/unitxt/pull/1663
* MLCommons AILuminate card and related artifacts by bnayahu in https://github.com/IBM/unitxt/pull/1662
* Granite guardian: add raw prompt to the result by martinscooper in https://github.com/IBM/unitxt/pull/1671
* Add positional bias summary to the response by martinscooper in https://github.com/IBM/unitxt/pull/1640
* Return float instead of float32 in granite guardian metric by martinscooper in https://github.com/IBM/unitxt/pull/1669
* add qa template exact output by OfirArviv in https://github.com/IBM/unitxt/pull/1674
* LLM Judge: add prompts to the result by default by martinscooper in https://github.com/IBM/unitxt/pull/1670
* Safety eval updates by bnayahu in https://github.com/IBM/unitxt/pull/1668
* Add inference engine caching by eladven in https://github.com/IBM/unitxt/pull/1645
* BugFix: Handle cases where all sample scores are the same (yields nan) by elronbandel in https://github.com/IBM/unitxt/pull/1660
* CrossInferenceProvider: add more models by martinscooper in https://github.com/IBM/unitxt/pull/1676
* Implement get_engine_id where missing by martinscooper in https://github.com/IBM/unitxt/pull/1679
* Revisit base dependencies (specifically remove ipadic and absl-py) by elronbandel in https://github.com/IBM/unitxt/pull/1681
* Fix LoadHF.load_dataset() when mem-caching is off by yhwang in https://github.com/IBM/unitxt/pull/1683
* HFPipelineInferenceEngine - add loaded tokenizer to pipeline by eladven in https://github.com/IBM/unitxt/pull/1677
* Add default cache folder to .gitignore by martinscooper in https://github.com/IBM/unitxt/pull/1687
* Fix a bug in loading without trust remote code by elronbandel in https://github.com/IBM/unitxt/pull/1684
* Add sacrebleu[ja] to test dependencies by elronbandel in https://github.com/IBM/unitxt/pull/1685
* Allow evaluator name to be a string by martinscooper in https://github.com/IBM/unitxt/pull/1665
* Fix: AzureOpenAIInferenceEngine fails if api_version is not set by martinscooper in https://github.com/IBM/unitxt/pull/1680
* Fix some bugs in inference engine tests by elronbandel in https://github.com/IBM/unitxt/pull/1682
* Improved output message when using inference cache by yoavkatz in https://github.com/IBM/unitxt/pull/1686
* Changed API of Key Value Extraction task to use Dict and not List[Tuple] (NON BACKWARD COMPATIBLE CHANGE) by yoavkatz in https://github.com/IBM/unitxt/pull/1675
* Support for asynchronous requests for watsonx.ai chat by pawelknes in https://github.com/IBM/unitxt/pull/1666
* add tags information - url by BenjSz in https://github.com/IBM/unitxt/pull/1691
* Fixes to GraniteGuardian metric, safety evals cleanups by bnayahu in https://github.com/IBM/unitxt/pull/1690
* Add docstring to LLMJudge classes by martinscooper in https://github.com/IBM/unitxt/pull/1652
* Remove src.lock by elronbandel in https://github.com/IBM/unitxt/pull/1692
* Text2sql metrics update and optional caching by oktie in https://github.com/IBM/unitxt/pull/1672
* Llm judge use cross provider by martinscooper in https://github.com/IBM/unitxt/pull/1673
* Improve LLM as Judge consistency by martinscooper in https://github.com/IBM/unitxt/pull/1688
* Update version to 1.21.0 by elronbandel in https://github.com/IBM/unitxt/pull/1693
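
The Key Value Extraction change listed above (pull/1675) is marked as non backward compatible: values that were `List[Tuple]` are now `Dict`. This page does not say which fields are affected, so the following is only a minimal migration sketch under that assumption. The helper name `pairs_to_dict` and the sample data are hypothetical and are not part of Unitxt.

```python
from typing import Dict, List, Tuple


def pairs_to_dict(pairs: List[Tuple[str, str]]) -> Dict[str, str]:
    """Convert [(key, value), ...] into {key: value}.

    Hypothetical helper (not a Unitxt API). A list of tuples can repeat a
    key while a dict cannot, so duplicates are rejected instead of being
    silently overwritten.
    """
    result: Dict[str, str] = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


# Old-style data (list of tuples) converted to the new dict style.
old_style = [("name", "Alice"), ("city", "Haifa")]
new_style = pairs_to_dict(old_style)
print(new_style)
```

Rejecting duplicate keys is a design choice of this sketch: it surfaces data that cannot be represented faithfully as a dict rather than dropping it.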

New Contributors
* yhwang made their first contribution in https://github.com/IBM/unitxt/pull/1683

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.20.0...1.21.0

1.20.0

What's Changed
* Fix unnecessary attempts in LoadCSV by elronbandel in https://github.com/IBM/unitxt/pull/1630
* Fix LLM as Judge direct criteria typo by martinscooper in https://github.com/IBM/unitxt/pull/1631
* Fix of typo in usage of attributes inside IntersectCorrespondingFields by pklpriv in https://github.com/IBM/unitxt/pull/1637
* Added MILU and Indic BoolQ Support by murthyrudra in https://github.com/IBM/unitxt/pull/1639
* Vision bench by alfassy in https://github.com/IBM/unitxt/pull/1641
* Add Granite Guardian evaluation on HF example by martinscooper in https://github.com/IBM/unitxt/pull/1638
* present catalog entries as pieces of python code by dafnapension in https://github.com/IBM/unitxt/pull/1643
* Example for evaluating system message leakage by elronbandel in https://github.com/IBM/unitxt/pull/1609
* Benjams/add hotpotqa + change type of metadata field to dict (non backward compatible) by BenjSz in https://github.com/IBM/unitxt/pull/1633
* removed the leftout break_point by dafnapension in https://github.com/IBM/unitxt/pull/1646
* Added Indic ARC Challenge Support by murthyrudra in https://github.com/IBM/unitxt/pull/1654
* Minor bug fix affecting Text2SQL execution accuracy by oktie in https://github.com/IBM/unitxt/pull/1657
* WMLInferenceEngineChat fixes by pawelknes in https://github.com/IBM/unitxt/pull/1656
* Update version to 1.20.0 by elronbandel in https://github.com/IBM/unitxt/pull/1659

New Contributors
* murthyrudra made their first contribution in https://github.com/IBM/unitxt/pull/1639

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.19.0...1.20.0

1.19.0

What's Changed
* Add RagBench datasets by elronbandel in https://github.com/IBM/unitxt/pull/1580
* Fix prompts table benchmark by ShirApp in https://github.com/IBM/unitxt/pull/1581
* Fix attempt to missing arrow dataset by elronbandel in https://github.com/IBM/unitxt/pull/1582
* Wml comp by alfassy in https://github.com/IBM/unitxt/pull/1578
* Key value extraction improvements by yoavkatz in https://github.com/IBM/unitxt/pull/1573
* fix: minor bug when only space id is provided for WML inference by tsinggggg in https://github.com/IBM/unitxt/pull/1583
* Try fixing csv loader by elronbandel in https://github.com/IBM/unitxt/pull/1586
* Fix failing tests by elronbandel in https://github.com/IBM/unitxt/pull/1589
* Fix tests by elronbandel in https://github.com/IBM/unitxt/pull/1590
* Fix metrics formatting and style by elronbandel in https://github.com/IBM/unitxt/pull/1591
* Fix bird dataset by perlitz in https://github.com/IBM/unitxt/pull/1593
* Use Lazy Loaders by dafnapension in https://github.com/IBM/unitxt/pull/1536
* Fix loading without limit by elronbandel in https://github.com/IBM/unitxt/pull/1594
* [Breaking change] Add support for all Granite Guardian risks by martinscooper in https://github.com/IBM/unitxt/pull/1576
* Added api call example by yoavkatz in https://github.com/IBM/unitxt/pull/1587
* Make MultipleSourceLoader lazy and fix its use of fusion by elronbandel in https://github.com/IBM/unitxt/pull/1602
* Prioritize using default templates from card over task by elronbandel in https://github.com/IBM/unitxt/pull/1596
* Use faster model for examples by elronbandel in https://github.com/IBM/unitxt/pull/1607
* Add clear and minimal settings documentation by elronbandel in https://github.com/IBM/unitxt/pull/1606
* Fix some tests by elronbandel in https://github.com/IBM/unitxt/pull/1610
* Add download and etag timeout settings to workflow configurations by elronbandel in https://github.com/IBM/unitxt/pull/1613
* Allow read timeout error in preparation tests by elronbandel in https://github.com/IBM/unitxt/pull/1615
* Fix Ollama inference engine by eladven in https://github.com/IBM/unitxt/pull/1611
* Add verify as an option to LoadFromAPI by perlitz in https://github.com/IBM/unitxt/pull/1608
* Added example of custom metric by yoavkatz in https://github.com/IBM/unitxt/pull/1616
* Granite guardian minor changes by martinscooper in https://github.com/IBM/unitxt/pull/1605
* add ragbench faithfulness cards by lilacheden in https://github.com/IBM/unitxt/pull/1598
* Update tables benchmark name to torr by elronbandel in https://github.com/IBM/unitxt/pull/1617
* Add CoT to LLM as judge assessments by martinscooper in https://github.com/IBM/unitxt/pull/1612
* Simplify preparation tests with better error handling by elronbandel in https://github.com/IBM/unitxt/pull/1618
* Text2sql execution accuracy metric updates by oktie in https://github.com/IBM/unitxt/pull/1604
* Fix Azure OpenAI based LLM judges by martinscooper in https://github.com/IBM/unitxt/pull/1619
* Add correctness_based_on_ground_truth criteria by martinscooper in https://github.com/IBM/unitxt/pull/1623
* Enable offline mode for Hugging Face by using local pre-downloaded metrics, datasets and models by elronbandel in https://github.com/IBM/unitxt/pull/1603
* Add provider specific args and allow using unrecognized model names by elronbandel in https://github.com/IBM/unitxt/pull/1621
* Start implementing assessment for Unitxt assistant by eladven in https://github.com/IBM/unitxt/pull/1625
* small changes to profiler by dafnapension in https://github.com/IBM/unitxt/pull/1627
* Return MultiStream in lazy loaders to avoid copying by elronbandel in https://github.com/IBM/unitxt/pull/1628

New Contributors
* tsinggggg made their first contribution in https://github.com/IBM/unitxt/pull/1583

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.18.0...1.19.0

1.18.0

The main improvements in this version focus on **caching strategies, dataset loading, and speed optimizations**.

Hugging Face Datasets Caching Policy

We have **completely revised our caching policy** and how we handle Hugging Face datasets in order to improve performance.

1. **Hugging Face datasets are now cached by default.**

This means that the `LoadHF` loader caches downloaded datasets in the HF cache directory (typically ~/.cache/huggingface/datasets).

- To disable this caching mechanism, use:
```python
unitxt.settings.disable_hf_datasets_cache = True
```

2. **All Hugging Face datasets are first downloaded and then processed.**

- This means the entire dataset is downloaded before processing, which is faster for most datasets. However, if you want to process a huge dataset and the HF dataset supports streaming, you can load it **in streaming mode**:

```python
from unitxt.loaders import LoadHF

LoadHF(name="my-dataset", streaming=True)
```

3. **To enable streaming mode by default for all Hugging Face datasets, use:**
```python
unitxt.settings.stream_hf_datasets_by_default = True
```

While the **new defaults (full download & caching)** may make the **initial dataset load slower**, subsequent loads will be **significantly faster**.

Unitxt Datasets Caching Policy

By default, when loading datasets with `unitxt.load_dataset`, the dataset is **prepared from scratch** each time you call the function.
This ensures that any changes made to the card definition are reflected in the output.

- This process may take a few seconds, and for **large datasets**, repeated loading can accumulate overhead.
- If you are using fixed datasets from the catalog, you can **enable caching** for Unitxt datasets. Cached datasets are stored in the Hugging Face cache (typically ~/.cache/huggingface/datasets).

```python
from unitxt import load_dataset

ds = load_dataset(card="my_card", use_cache=True)
```
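
The trade-off between preparing from scratch and caching can be illustrated with a toy example. This is not Unitxt's implementation: `cached_prepare` and `toy_prepare` are hypothetical names, and the cache key here is simply a hash of the `repr` of the arguments. The sketch shows why a cached result is reused only while the arguments stay the same, and why a cache like this cannot notice changes made elsewhere, such as in a card definition.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cached_prepare(params: dict, prepare, cache_dir: Path):
    """Return prepare(**params), reusing an on-disk copy when one exists.

    Hypothetical helper: the cache key is a hash of the repr of the
    sorted arguments, so identical arguments map to the same file.
    """
    key = hashlib.sha256(repr(sorted(params.items())).encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        # Cache hit: skip preparation entirely.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    data = prepare(**params)
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    return data


calls = []  # records every real preparation, to make cache hits visible


def toy_prepare(card: str):
    """Stand-in for an expensive dataset preparation step."""
    calls.append(card)
    return [{"source": f"{card}-example-{i}"} for i in range(3)]


with tempfile.TemporaryDirectory() as tmp:
    first = cached_prepare({"card": "my_card"}, toy_prepare, Path(tmp))
    second = cached_prepare({"card": "my_card"}, toy_prepare, Path(tmp))

print(len(calls), first == second)
```

The second call returns the stored result without running `toy_prepare` again, which mirrors why caching suits fixed catalog datasets but not cards that are still being edited.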

Faster Unitxt Dataset Preparation

To improve dataset loading speed, we have optimized how Unitxt datasets are prepared.

Background:
Unitxt datasets are converted to Hugging Face datasets because they store data on disk while keeping only the necessary parts in memory (via **PyArrow**). This enables efficient handling of large datasets without excessive memory usage.

Previously, `unitxt.load_dataset` used **built-in Hugging Face methods** for dataset preparation, which included **unnecessary type handling and verification**, slowing down the process.

Key improvements:
- We now **create the Hugging Face dataset directly**, reducing preparation time by **almost 50%**.
- With this optimization, **Unitxt datasets are now faster than ever!**

What's Changed
* End of year summary blog post by elronbandel in https://github.com/IBM/unitxt/pull/1530
* Updated documentation and examples of LLM-as-Judge by tejaswini in https://github.com/IBM/unitxt/pull/1532
* Eval assist documentation by tejaswini in https://github.com/IBM/unitxt/pull/1537
* Update notification banner styles and add 2024 summary blog link by elronbandel in https://github.com/IBM/unitxt/pull/1538
* Add more granite llm as judge artifacts by martinscooper in https://github.com/IBM/unitxt/pull/1516
* Fix Australian legal qa dataset by elronbandel in https://github.com/IBM/unitxt/pull/1542
* Set use 1 shot for wikitq in tables_benchmark by yifanmai in https://github.com/IBM/unitxt/pull/1541
* Bugfix: indexed row major serialization fails with None cell values by yifanmai in https://github.com/IBM/unitxt/pull/1540
* Solve issue of expired token in Unitxt Assistant by eladven in https://github.com/IBM/unitxt/pull/1543
* Add Replicate inference support by elronbandel in https://github.com/IBM/unitxt/pull/1544
* add a filter to wikitq by ShirApp in https://github.com/IBM/unitxt/pull/1547
* Add text2sql tasks by perlitz in https://github.com/IBM/unitxt/pull/1414
* Add deduplicate operator by elronbandel in https://github.com/IBM/unitxt/pull/1549
* Fix the authentication problem by eladven in https://github.com/IBM/unitxt/pull/1550
* Attach assistant answers to their origins with url link by elronbandel in https://github.com/IBM/unitxt/pull/1528
* Add mtrag benchmark by elronbandel in https://github.com/IBM/unitxt/pull/1548
* Update end of year summary blog by elronbandel in https://github.com/IBM/unitxt/pull/1552
* Add data classification policy to CrossProviderInferenceEngine initialization based on selected model by elronbandel in https://github.com/IBM/unitxt/pull/1539
* Fix recently broken rag metrics by elronbandel in https://github.com/IBM/unitxt/pull/1554
* Renamed criterias in LLM-as-a-Judge metrics to criteria - Breaking change by tejaswini in https://github.com/IBM/unitxt/pull/1545
* Finqa hash to top by elronbandel in https://github.com/IBM/unitxt/pull/1555
* Refactor safety metric to be faster and updated by elronbandel in https://github.com/IBM/unitxt/pull/1484
* Improve assistant by elronbandel in https://github.com/IBM/unitxt/pull/1556
* Feature/add global mmlu cards by eliyahabba in https://github.com/IBM/unitxt/pull/1561
* Add quality dataset by eliyahabba in https://github.com/IBM/unitxt/pull/1563
* Add CollateInstanceByField operator to group data by specific field by sarathsgvr in https://github.com/IBM/unitxt/pull/1546
* Fix prompts table benchmark by ShirApp in https://github.com/IBM/unitxt/pull/1565
* Create new IntersectCorrespondingFields operator by pklpriv in https://github.com/IBM/unitxt/pull/1531
* Add granite documents format by elronbandel in https://github.com/IBM/unitxt/pull/1566
* Revisit huggingface cache policy - BREAKING CHANGE by elronbandel in https://github.com/IBM/unitxt/pull/1564
* Add global mmlu lite sensitivity cards by eliyahabba in https://github.com/IBM/unitxt/pull/1568
* Add schema-linking by KyleErwin in https://github.com/IBM/unitxt/pull/1533
* fix the printout of empty strings in the yaml cards of the catalog by dafnapension in https://github.com/IBM/unitxt/pull/1567
* Use repr instead of to_json for unitxt dataset caching by elronbandel in https://github.com/IBM/unitxt/pull/1570
* Added key value extraction evaluation and example with images by yoavkatz in https://github.com/IBM/unitxt/pull/1529

New Contributors
* tejaswini made their first contribution in https://github.com/IBM/unitxt/pull/1532
* KyleErwin made their first contribution in https://github.com/IBM/unitxt/pull/1533

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.17.0...1.18.0

1.17.2

What's Changed
* Feature/add global mmlu cards by eliyahabba in https://github.com/IBM/unitxt/pull/1561
* Add quality dataset by eliyahabba in https://github.com/IBM/unitxt/pull/1563
* Add CollateInstanceByField operator to group data by specific field by sarathsgvr in https://github.com/IBM/unitxt/pull/1546
* Fix prompts table benchmark by ShirApp in https://github.com/IBM/unitxt/pull/1565
* Create new IntersectCorrespondingFields operator by pklpriv in https://github.com/IBM/unitxt/pull/1531
* Add granite documents format by elronbandel in https://github.com/IBM/unitxt/pull/1566
* Revisit huggingface cache policy by elronbandel in https://github.com/IBM/unitxt/pull/1564
* Add global mmlu lite sensitivity cards by eliyahabba in https://github.com/IBM/unitxt/pull/1568
* Update version to 1.17.2 by elronbandel in https://github.com/IBM/unitxt/pull/1569

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.17.1...1.17.2

1.17.1

What's Changed

Non backward compatible change

* Renamed criterias in LLM-as-a-Judge metrics to criteria - Breaking change by tejaswini in https://github.com/IBM/unitxt/pull/1545

New features

* Add Replicate inference support by elronbandel in https://github.com/IBM/unitxt/pull/1544
* Add text2sql tasks by perlitz in https://github.com/IBM/unitxt/pull/1414
* Add deduplicate operator by elronbandel in https://github.com/IBM/unitxt/pull/1549

New Assets
* Add more granite llm as judge artifacts by martinscooper in https://github.com/IBM/unitxt/pull/1516
* Add mtrag benchmark by elronbandel in https://github.com/IBM/unitxt/pull/1548

Documentation
* End of year summary blog post by elronbandel in https://github.com/IBM/unitxt/pull/1530
* Update notification banner styles and add 2024 summary blog link by elronbandel in https://github.com/IBM/unitxt/pull/1538
* Updated documentation and examples of LLM-as-Judge by tejaswini in https://github.com/IBM/unitxt/pull/1532
* Eval assist documentation by tejaswini in https://github.com/IBM/unitxt/pull/1537

Bug Fixes
* Fix Australian legal qa dataset by elronbandel in https://github.com/IBM/unitxt/pull/1542
* Set use 1 shot for wikitq in tables_benchmark by yifanmai in https://github.com/IBM/unitxt/pull/1541
* Bugfix: indexed row major serialization fails with None cell values by yifanmai in https://github.com/IBM/unitxt/pull/1540
* Solve issue of expired token in Unitxt Assistant by eladven in https://github.com/IBM/unitxt/pull/1543
* add a filter to wikitq by ShirApp in https://github.com/IBM/unitxt/pull/1547
* Fix the authentication problem by eladven in https://github.com/IBM/unitxt/pull/1550
* Attach assistant answers to their origins with url link by elronbandel in https://github.com/IBM/unitxt/pull/1528
* Update end of year summary blog by elronbandel in https://github.com/IBM/unitxt/pull/1552
* Add data classification policy to CrossProviderInferenceEngine initialization based on selected model by elronbandel in https://github.com/IBM/unitxt/pull/1539
* Fix recently broken rag metrics by elronbandel in https://github.com/IBM/unitxt/pull/1554
* Finqa hash to top by elronbandel in https://github.com/IBM/unitxt/pull/1555
* Refactor safety metric to be faster and updated by elronbandel in https://github.com/IBM/unitxt/pull/1484
* Improve assistant by elronbandel in https://github.com/IBM/unitxt/pull/1556

New Contributors
* tejaswini made their first contribution in https://github.com/IBM/unitxt/pull/1532

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.17.0...1.17.1
