Unitxt

Latest version: v1.20.0


1.20.0

What's Changed
* Fix unnecessary attempts in LoadCSV by elronbandel in https://github.com/IBM/unitxt/pull/1630
* Fix LLM as Judge direct criteria typo by martinscooper in https://github.com/IBM/unitxt/pull/1631
* Fix of typo in usage of attributes inside IntersectCorrespondingFields by pklpriv in https://github.com/IBM/unitxt/pull/1637
* Added MILU and Indic BoolQ Support by murthyrudra in https://github.com/IBM/unitxt/pull/1639
* Vision bench by alfassy in https://github.com/IBM/unitxt/pull/1641
* Add Granite Guardian evaluation on HF example by martinscooper in https://github.com/IBM/unitxt/pull/1638
* present catalog entries as pieces of python code by dafnapension in https://github.com/IBM/unitxt/pull/1643
* Example for evaluating system message leakage by elronbandel in https://github.com/IBM/unitxt/pull/1609
* Benjams/add hotpotqa + change type of metadata field to dict (non backward compatible) by BenjSz in https://github.com/IBM/unitxt/pull/1633
* removed the leftout break_point by dafnapension in https://github.com/IBM/unitxt/pull/1646
* Added Indic ARC Challenge Support by murthyrudra in https://github.com/IBM/unitxt/pull/1654
* Minor bug fix affecting Text2SQL execution accuracy by oktie in https://github.com/IBM/unitxt/pull/1657
* WMLInferenceEngineChat fixes by pawelknes in https://github.com/IBM/unitxt/pull/1656
* Update version to 1.20.0 by elronbandel in https://github.com/IBM/unitxt/pull/1659

New Contributors
* murthyrudra made their first contribution in https://github.com/IBM/unitxt/pull/1639

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.19.0...1.20.0

1.19.0

What's Changed
* Add RagBench datasets by elronbandel in https://github.com/IBM/unitxt/pull/1580
* Fix prompts table benchmark by ShirApp in https://github.com/IBM/unitxt/pull/1581
* Fix attempt to missing arrow dataset by elronbandel in https://github.com/IBM/unitxt/pull/1582
* Wml comp by alfassy in https://github.com/IBM/unitxt/pull/1578
* Key value extraction improvements by yoavkatz in https://github.com/IBM/unitxt/pull/1573
* fix: minor bug when only space id is provided for WML inference by tsinggggg in https://github.com/IBM/unitxt/pull/1583
* Try fixing csv loader by elronbandel in https://github.com/IBM/unitxt/pull/1586
* Fix failing tests by elronbandel in https://github.com/IBM/unitxt/pull/1589
* Fix tests by elronbandel in https://github.com/IBM/unitxt/pull/1590
* Fix metrics formatting and style by elronbandel in https://github.com/IBM/unitxt/pull/1591
* Fix bird dataset by perlitz in https://github.com/IBM/unitxt/pull/1593
* Use Lazy Loaders by dafnapension in https://github.com/IBM/unitxt/pull/1536
* Fix loading without limit by elronbandel in https://github.com/IBM/unitxt/pull/1594
* [Breaking change] Add support for all Granite Guardian risks by martinscooper in https://github.com/IBM/unitxt/pull/1576
* Added api call example by yoavkatz in https://github.com/IBM/unitxt/pull/1587
* Make MultipleSourceLoader lazy and fix its use of fusion by elronbandel in https://github.com/IBM/unitxt/pull/1602
* Prioritize using default templates from card over task by elronbandel in https://github.com/IBM/unitxt/pull/1596
* Use faster model for examples by elronbandel in https://github.com/IBM/unitxt/pull/1607
* Add clear and minimal settings documentation by elronbandel in https://github.com/IBM/unitxt/pull/1606
* Fix some tests by elronbandel in https://github.com/IBM/unitxt/pull/1610
* Add download and etag timeout settings to workflow configurations by elronbandel in https://github.com/IBM/unitxt/pull/1613
* Allow read timeout error in preparation tests by elronbandel in https://github.com/IBM/unitxt/pull/1615
* Fix Ollama inference engine by eladven in https://github.com/IBM/unitxt/pull/1611
* Add verify as an option to LoadFromAPI by perlitz in https://github.com/IBM/unitxt/pull/1608
* Added example of custom metric by yoavkatz in https://github.com/IBM/unitxt/pull/1616
* Granite guardian minor changes by martinscooper in https://github.com/IBM/unitxt/pull/1605
* add ragbench faithfulness cards by lilacheden in https://github.com/IBM/unitxt/pull/1598
* Update tables benchmark name to torr by elronbandel in https://github.com/IBM/unitxt/pull/1617
* Add CoT to LLM as judge assessments by martinscooper in https://github.com/IBM/unitxt/pull/1612
* Simplify preparation tests with better error handling by elronbandel in https://github.com/IBM/unitxt/pull/1618
* Text2sql execution accuracy metric updates by oktie in https://github.com/IBM/unitxt/pull/1604
* Fix Azure OpenAI based LLM judges by martinscooper in https://github.com/IBM/unitxt/pull/1619
* Add correctness_based_on_ground_truth criteria by martinscooper in https://github.com/IBM/unitxt/pull/1623
* Enable offline mode for Hugging Face by using local pre-downloaded metrics, datasets and models by elronbandel in https://github.com/IBM/unitxt/pull/1603
* Add provider specific args and allow using unrecognized model names by elronbandel in https://github.com/IBM/unitxt/pull/1621
* Start implementing assessment for unitxt assistant by eladven in https://github.com/IBM/unitxt/pull/1625
* small changes to profiler by dafnapension in https://github.com/IBM/unitxt/pull/1627
* Return MultiStream in lazy loaders to avoid copying by elronbandel in https://github.com/IBM/unitxt/pull/1628

New Contributors
* tsinggggg made their first contribution in https://github.com/IBM/unitxt/pull/1583

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.18.0...1.19.0

1.18.0

The main improvements in this version focus on **caching strategies, dataset loading, and speed optimizations**.

Hugging Face Datasets Caching Policy

We have **completely revised our caching policy** and how we handle Hugging Face datasets in order to improve performance.

1. **Hugging Face datasets are now cached by default.**

This means that the LoadHF loader caches downloaded datasets in the HF cache directory (typically ~/.cache/huggingface/datasets).

- To disable this caching mechanism, use:
```python
import unitxt
unitxt.settings.disable_hf_datasets_cache = True
```


2. **All Hugging Face datasets are first downloaded and then processed.**

- This means the entire dataset is downloaded before processing, which is faster for most datasets. However, if you want to process a huge dataset and the HF dataset supports streaming, you can load it **in streaming mode**:

```python
from unitxt.loaders import LoadHF
LoadHF(path="my-dataset", streaming=True)
```


3. **To enable streaming mode by default for all Hugging Face datasets, use:**
```python
import unitxt
unitxt.settings.stream_hf_datasets_by_default = True
```


While the **new defaults (full download & caching)** may make the **initial dataset load slower**, subsequent loads will be **significantly faster**.
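
The trade-off between the two loading modes can be illustrated without Unitxt at all. The sketch below is a conceptual, standard-library-only model, not Unitxt or Hugging Face code: `fetch_records`, `load_eager`, and `load_streaming` are hypothetical names, and a simple counter stands in for network transfer.

```python
from itertools import islice
from typing import Iterator, List

fetched = 0  # counts records "downloaded" so far (stand-in for network transfer)


def fetch_records(total: int) -> Iterator[dict]:
    """Hypothetical remote source: yields one record per simulated download."""
    global fetched
    for i in range(total):
        fetched += 1
        yield {"id": i}


def load_eager(total: int) -> List[dict]:
    """Full-download mode: materialize every record before any processing."""
    return list(fetch_records(total))


def load_streaming(total: int) -> Iterator[dict]:
    """Streaming mode: records are fetched only as the consumer asks for them."""
    return fetch_records(total)


# Take the first 3 records of a 1000-record source in each mode.
fetched = 0
eager_head = load_eager(1000)[:3]
eager_cost = fetched

fetched = 0
streaming_head = list(islice(load_streaming(1000), 3))
streaming_cost = fetched

print(eager_cost, streaming_cost)  # 1000 3
```

Both modes return the same first three records, but the eager mode pays for the whole source up front. That cost is what caching amortizes on later loads, while streaming avoids it when only part of a huge dataset is needed.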

Unitxt Datasets Caching Policy

By default, when loading datasets with `unitxt.load_dataset`, the dataset is **prepared from scratch** each time you call the function.
This ensures that any changes made to the card definition are reflected in the output.

- This process may take a few seconds, and for **large datasets**, repeated loading can accumulate overhead.
- If you are using fixed datasets from the catalog, you can **enable caching** for Unitxt datasets so that they are prepared only once.
Cached datasets are stored in the Hugging Face cache (typically ~/.cache/huggingface/datasets).

```python
from unitxt import load_dataset

ds = load_dataset(card="my_card", use_cache=True)
```
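
To make the difference between the two behaviors concrete, here is a minimal sketch of an opt-in cache keyed on the `repr` of the call arguments. It is an illustration under stated assumptions, not Unitxt's implementation: `load_with_cache`, `prepare_dataset`, and the in-memory `_cache` dictionary are all hypothetical, and the real cache lives on disk.

```python
from typing import Any, Callable, Dict

_cache: Dict[str, Any] = {}  # in-memory stand-in for an on-disk cache


def load_with_cache(prepare: Callable[..., Any], use_cache: bool = False, **kwargs: Any) -> Any:
    """Call prepare(**kwargs), optionally reusing a previous result.

    The cache key is the repr of the sorted arguments, so any change to the
    arguments produces a different key and triggers a fresh preparation.
    """
    if not use_cache:
        return prepare(**kwargs)  # default: always prepare from scratch
    key = repr(sorted(kwargs.items()))
    if key not in _cache:
        _cache[key] = prepare(**kwargs)
    return _cache[key]


calls = []  # records every real preparation


def prepare_dataset(card: str) -> list:
    """Hypothetical, expensive preparation step."""
    calls.append(card)
    return [f"{card}-instance-{i}" for i in range(3)]


load_with_cache(prepare_dataset, card="my_card")                  # prepared
load_with_cache(prepare_dataset, card="my_card")                  # prepared again
load_with_cache(prepare_dataset, use_cache=True, card="my_card")  # prepared, then stored
load_with_cache(prepare_dataset, use_cache=True, card="my_card")  # served from the cache
print(len(calls))  # 3
```

In this sketch the key reflects only the arguments, so a card whose definition changes under the same name would return stale data. This is why caching suits fixed catalog datasets, while preparing from scratch remains the safer default during card development.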


Faster Unitxt Dataset Preparation

To improve dataset loading speed, we have optimized how Unitxt datasets are prepared.

Background:
Unitxt datasets are converted to Hugging Face datasets because Hugging Face datasets store data on disk while keeping only the necessary parts in memory (via **PyArrow**). This enables efficient handling of large datasets without excessive memory usage.

Previously, `unitxt.load_dataset` used **built-in Hugging Face methods** for dataset preparation, which included **unnecessary type handling and verification**, slowing down the process.

Key improvements:
- We now **create the Hugging Face dataset directly**, reducing preparation time by **almost 50%**.
- With this optimization, **Unitxt datasets are now faster than ever!**

What's Changed
* End of year summary blog post by elronbandel in https://github.com/IBM/unitxt/pull/1530
* Updated documentation and examples of LLM-as-Judge by tejaswini in https://github.com/IBM/unitxt/pull/1532
* Eval assist documentation by tejaswini in https://github.com/IBM/unitxt/pull/1537
* Update notification banner styles and add 2024 summary blog link by elronbandel in https://github.com/IBM/unitxt/pull/1538
* Add more granite llm as judge artifacts by martinscooper in https://github.com/IBM/unitxt/pull/1516
* Fix Australian legal qa dataset by elronbandel in https://github.com/IBM/unitxt/pull/1542
* Set use 1 shot for wikitq in tables_benchmark by yifanmai in https://github.com/IBM/unitxt/pull/1541
* Bugfix: indexed row major serialization fails with None cell values by yifanmai in https://github.com/IBM/unitxt/pull/1540
* Solve issue of expired token in Unitxt Assistant by eladven in https://github.com/IBM/unitxt/pull/1543
* Add Replicate inference support by elronbandel in https://github.com/IBM/unitxt/pull/1544
* add a filter to wikitq by ShirApp in https://github.com/IBM/unitxt/pull/1547
* Add text2sql tasks by perlitz in https://github.com/IBM/unitxt/pull/1414
* Add deduplicate operator by elronbandel in https://github.com/IBM/unitxt/pull/1549
* Fix the authentication problem by eladven in https://github.com/IBM/unitxt/pull/1550
* Attach assistant answers to their origins with url link by elronbandel in https://github.com/IBM/unitxt/pull/1528
* Add mtrag benchmark by elronbandel in https://github.com/IBM/unitxt/pull/1548
* Update end of year summary blog by elronbandel in https://github.com/IBM/unitxt/pull/1552
* Add data classification policy to CrossProviderInferenceEngine initialization based on selected model by elronbandel in https://github.com/IBM/unitxt/pull/1539
* Fix recently broken rag metrics by elronbandel in https://github.com/IBM/unitxt/pull/1554
* Renamed criterias in LLM-as-a-Judge metrics to criteria - Breaking change by tejaswini in https://github.com/IBM/unitxt/pull/1545
* Finqa hash to top by elronbandel in https://github.com/IBM/unitxt/pull/1555
* Refactor safety metric to be faster and updated by elronbandel in https://github.com/IBM/unitxt/pull/1484
* Improve assistant by elronbandel in https://github.com/IBM/unitxt/pull/1556
* Feature/add global mmlu cards by eliyahabba in https://github.com/IBM/unitxt/pull/1561
* Add quality dataset by eliyahabba in https://github.com/IBM/unitxt/pull/1563
* Add CollateInstanceByField operator to group data by specific field by sarathsgvr in https://github.com/IBM/unitxt/pull/1546
* Fix prompts table benchmark by ShirApp in https://github.com/IBM/unitxt/pull/1565
* Create new IntersectCorrespondingFields operator by pklpriv in https://github.com/IBM/unitxt/pull/1531
* Add granite documents format by elronbandel in https://github.com/IBM/unitxt/pull/1566
* Revisit huggingface cache policy - BREAKING CHANGE by elronbandel in https://github.com/IBM/unitxt/pull/1564
* Add global mmlu lite sensitivity cards by eliyahabba in https://github.com/IBM/unitxt/pull/1568
* Add schema-linking by KyleErwin in https://github.com/IBM/unitxt/pull/1533
* fix the printout of empty strings in the yaml cards of the catalog by dafnapension in https://github.com/IBM/unitxt/pull/1567
* Use repr instead of to_json for unitxt dataset caching by elronbandel in https://github.com/IBM/unitxt/pull/1570
* Added key value extraction evaluation and example with images by yoavkatz in https://github.com/IBM/unitxt/pull/1529

New Contributors
* tejaswini made their first contribution in https://github.com/IBM/unitxt/pull/1532
* KyleErwin made their first contribution in https://github.com/IBM/unitxt/pull/1533

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.17.0...1.18.0

1.17.2

What's Changed
* Feature/add global mmlu cards by eliyahabba in https://github.com/IBM/unitxt/pull/1561
* Add quality dataset by eliyahabba in https://github.com/IBM/unitxt/pull/1563
* Add CollateInstanceByField operator to group data by specific field by sarathsgvr in https://github.com/IBM/unitxt/pull/1546
* Fix prompts table benchmark by ShirApp in https://github.com/IBM/unitxt/pull/1565
* Create new IntersectCorrespondingFields operator by pklpriv in https://github.com/IBM/unitxt/pull/1531
* Add granite documents format by elronbandel in https://github.com/IBM/unitxt/pull/1566
* Revisit huggingface cache policy by elronbandel in https://github.com/IBM/unitxt/pull/1564
* Add global mmlu lite sensitivity cards by eliyahabba in https://github.com/IBM/unitxt/pull/1568
* Update version to 1.17.2 by elronbandel in https://github.com/IBM/unitxt/pull/1569


**Full Changelog**: https://github.com/IBM/unitxt/compare/1.17.1...1.17.2

1.17.1

What's Changed

Non backward compatible change

* Renamed criterias in LLM-as-a-Judge metrics to criteria - Breaking change by tejaswini in https://github.com/IBM/unitxt/pull/1545

New features

* Add Replicate inference support by elronbandel in https://github.com/IBM/unitxt/pull/1544
* Add text2sql tasks by perlitz in https://github.com/IBM/unitxt/pull/1414
* Add deduplicate operator by elronbandel in https://github.com/IBM/unitxt/pull/1549

New Assets
* Add more granite llm as judge artifacts by martinscooper in https://github.com/IBM/unitxt/pull/1516
* Add mtrag benchmark by elronbandel in https://github.com/IBM/unitxt/pull/1548

Documentation
* End of year summary blog post by elronbandel in https://github.com/IBM/unitxt/pull/1530
* Update notification banner styles and add 2024 summary blog link by elronbandel in https://github.com/IBM/unitxt/pull/1538
* Updated documentation and examples of LLM-as-Judge by tejaswini in https://github.com/IBM/unitxt/pull/1532
* Eval assist documentation by tejaswini in https://github.com/IBM/unitxt/pull/1537


Bug Fixes
* Fix Australian legal qa dataset by elronbandel in https://github.com/IBM/unitxt/pull/1542
* Set use 1 shot for wikitq in tables_benchmark by yifanmai in https://github.com/IBM/unitxt/pull/1541
* Bugfix: indexed row major serialization fails with None cell values by yifanmai in https://github.com/IBM/unitxt/pull/1540
* Solve issue of expired token in Unitxt Assistant by eladven in https://github.com/IBM/unitxt/pull/1543
* add a filter to wikitq by ShirApp in https://github.com/IBM/unitxt/pull/1547
* Fix the authentication problem by eladven in https://github.com/IBM/unitxt/pull/1550
* Attach assistant answers to their origins with url link by elronbandel in https://github.com/IBM/unitxt/pull/1528
* Update end of year summary blog by elronbandel in https://github.com/IBM/unitxt/pull/1552
* Add data classification policy to CrossProviderInferenceEngine initialization based on selected model by elronbandel in https://github.com/IBM/unitxt/pull/1539
* Fix recently broken rag metrics by elronbandel in https://github.com/IBM/unitxt/pull/1554
* Finqa hash to top by elronbandel in https://github.com/IBM/unitxt/pull/1555
* Refactor safety metric to be faster and updated by elronbandel in https://github.com/IBM/unitxt/pull/1484
* Improve assistant by elronbandel in https://github.com/IBM/unitxt/pull/1556

New Contributors
* tejaswini made their first contribution in https://github.com/IBM/unitxt/pull/1532

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.17.0...1.17.1

1.17.0

Important Changes
This release covers the following topics:
- **Criteria based LLM as Judges** - An improved class of LLM-as-judge metrics with customizable judging criteria [(read more)](https://www.unitxt.ai/en/latest/docs/llm_as_judge.html)
- **Unitxt assistant** - A text-based assistant with Unitxt expertise to help developers [(read more)](https://www.unitxt.ai/en/latest/docs/llm_as_judge.html)
- **New benchmarks: Tables, Vision** - Benchmarks for table understanding and image understanding compiled by the community and collaborators [(read more)](https://www.unitxt.ai/en/latest/docs/llm_as_judge.html)
- **Support for all major inference providers** - Inference for evaluation or for LLM-as-judge metrics can be routed to any inference provider, such as Azure, AWS, and watsonx [(read more)](https://www.unitxt.ai/en/latest/docs/llm_as_judge.html)

Detailed Changes
* Fix typing notation for python 3.8 by elronbandel in https://github.com/IBM/unitxt/pull/1453
* Instance_metric and apply_metric keep only one instance at a time in mem, at the expense of repeated passes over input stream (2 times for instance_metric, metrics for apply_metric) by dafnapension in https://github.com/IBM/unitxt/pull/1448
* simplify class parameter listing on web page by dafnapension in https://github.com/IBM/unitxt/pull/1454
* Bring code coverage tests back to life by elronbandel in https://github.com/IBM/unitxt/pull/1455
* Fix coverage tests by elronbandel in https://github.com/IBM/unitxt/pull/1456
* make demos_pool a local var rather than a separate stream by dafnapension in https://github.com/IBM/unitxt/pull/1436
* Adding upper case and last non empty line processor by antonpibm in https://github.com/IBM/unitxt/pull/1458
* performance by bluebench by dafnapension in https://github.com/IBM/unitxt/pull/1457
* Add UNITXT_MOCK_INFERENCE_MODE environment variable to performance workflow by elronbandel in https://github.com/IBM/unitxt/pull/1461
* remove redundant lines from performance.yml by dafnapension in https://github.com/IBM/unitxt/pull/1462
* Benjams/add bioasq miniwiki datasets by BenjSz in https://github.com/IBM/unitxt/pull/1460
* Add SocialIQA dataset by elronbandel in https://github.com/IBM/unitxt/pull/1468
* Add parallelization to RITS inference by arielge in https://github.com/IBM/unitxt/pull/1441
* Fix the type handling for tasks to support string types by elronbandel in https://github.com/IBM/unitxt/pull/1470
* Update version to 1.16.1 by elronbandel in https://github.com/IBM/unitxt/pull/1472
* extend choices arrangement functionality with ReorderableMultipleChoi… by eliyahabba in https://github.com/IBM/unitxt/pull/1464
* Add GPQA dataset by elronbandel in https://github.com/IBM/unitxt/pull/1474
* Add simple QA dataset by elronbandel in https://github.com/IBM/unitxt/pull/1475
* Add LongBench V2 dataset by elronbandel in https://github.com/IBM/unitxt/pull/1476
* Adding typed recipe test by antonpibm in https://github.com/IBM/unitxt/pull/1473
* Add place_correct_choice_position to set the correct choice index and… by eliyahabba in https://github.com/IBM/unitxt/pull/1481
* Add MapReduceMetric a new base class to integrate all metrics into by elronbandel in https://github.com/IBM/unitxt/pull/1459
* Add multi document support and FRAMES benchmark by elronbandel in https://github.com/IBM/unitxt/pull/1477
* Update version to 1.16.2 by elronbandel in https://github.com/IBM/unitxt/pull/1483
* Add Azure support and expand OpenAI model options in inference engine by elronbandel in https://github.com/IBM/unitxt/pull/1485
* Benjams/fix bioasq card by BenjSz in https://github.com/IBM/unitxt/pull/1486
* add separator to csv loader by BenjSz in https://github.com/IBM/unitxt/pull/1488
* Fix bug in metrics loading in tasks by elronbandel in https://github.com/IBM/unitxt/pull/1487
* Update version to 1.16.3 by elronbandel in https://github.com/IBM/unitxt/pull/1489
* Fix bootstrap condition to handle cases with insufficient instances by elronbandel in https://github.com/IBM/unitxt/pull/1490
* Update version to 1.16.4 by elronbandel in https://github.com/IBM/unitxt/pull/1491
* Simplify artifact link [Non Backward Compatible!] by elronbandel in https://github.com/IBM/unitxt/pull/1494
* Added NER example by yoavkatz in https://github.com/IBM/unitxt/pull/1492
* Add example for evaluating tables as images using Unitxt APIs by elronbandel in https://github.com/IBM/unitxt/pull/1495
* Mm updates by alfassy in https://github.com/IBM/unitxt/pull/1465
* Fix wrong saving of artifact initial dict by elronbandel in https://github.com/IBM/unitxt/pull/1499
* Accelerate and improve RAG Metrics by elronbandel in https://github.com/IBM/unitxt/pull/1497
* Make clinc preparation faster by elronbandel in https://github.com/IBM/unitxt/pull/1501
* Fix templates lists in vision cards by elronbandel in https://github.com/IBM/unitxt/pull/1500
* Add vision benchmark example by elronbandel in https://github.com/IBM/unitxt/pull/1502
* Update vis bench by elronbandel in https://github.com/IBM/unitxt/pull/1505
* Add Balance operator by elronbandel in https://github.com/IBM/unitxt/pull/1507
* Fix for demos_pool with images. by elronbandel in https://github.com/IBM/unitxt/pull/1509
* Remove new balance operator and use existing implementation by elronbandel in https://github.com/IBM/unitxt/pull/1510
* Fixes and adjustment in rag metrics and related inference engines by lilacheden in https://github.com/IBM/unitxt/pull/1466
* Tables bench by ShirApp in https://github.com/IBM/unitxt/pull/1506
* Keep metadata over main unitxt stages by eladven in https://github.com/IBM/unitxt/pull/1512
* Fix: Improved handling of `place_correct_choice_position` for flexibl… by eliyahabba in https://github.com/IBM/unitxt/pull/1511
* Fixes in LLMJudge by lilacheden in https://github.com/IBM/unitxt/pull/1498
* Verify metrics prediction_type without loading metric by elronbandel in https://github.com/IBM/unitxt/pull/1519
* Add Unitxt Assistant beta by elronbandel in https://github.com/IBM/unitxt/pull/1513
* Ensure fusion do not call streams before use by elronbandel in https://github.com/IBM/unitxt/pull/1518
* Minor llm as judge fix/changes by martinscooper in https://github.com/IBM/unitxt/pull/1467
* Fix: Selected option for supporting negative indexes in place_correct… by eliyahabba in https://github.com/IBM/unitxt/pull/1522
* Refactor rag metrics and judges by lilacheden in https://github.com/IBM/unitxt/pull/1515
* Add Llama 3.1 on Vertex AI to CrossProviderInferenceEngine by yifanmai in https://github.com/IBM/unitxt/pull/1525
* fix external_rag example by lilacheden in https://github.com/IBM/unitxt/pull/1526
* Add search to assistant for much faster response by elronbandel in https://github.com/IBM/unitxt/pull/1524
* fixed division by 0 in compare performance results by dafnapension in https://github.com/IBM/unitxt/pull/1523
* Add two criteria based direct llm judges by lilacheden in https://github.com/IBM/unitxt/pull/1527
* Update version to 1.17.0 by elronbandel in https://github.com/IBM/unitxt/pull/1535

New Contributors
* eliyahabba made their first contribution in https://github.com/IBM/unitxt/pull/1464

**Full Changelog**: https://github.com/IBM/unitxt/compare/1.16.0...1.17.0
