Argilla

Latest version: v2.8.0

1.24.0

>[!Note]
> This release does not contain any new features, but it includes a major change in the argilla server.
> The package is using the `argilla-server` dependency defined [here](https://github.com/argilla-io/argilla-server).

- The [Argilla Changelog](https://github.com/argilla-io/argilla/blob/v1.24.0/CHANGELOG.md)
- The [Argilla Server Changelog](https://github.com/argilla-io/argilla-server/blob/v1.24.0/CHANGELOG.md)

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v1.23.1...v1.24.0

1.23.1

[1.23.1](https://github.com/argilla-io/argilla/compare/v1.23.0...v1.23.1)

Fixed

- Fixed Responsive view for Feedback Datasets. ([4579](https://github.com/argilla-io/argilla/pull/4579))

New Contributors
* CpHaddock made their first contribution in https://github.com/argilla-io/argilla/pull/4484
* julien-c made their first contribution in https://github.com/argilla-io/argilla/pull/4582

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v1.23.0...v1.23.1

1.23.0

🔆 Release highlights

Hugging Face OAuth

You can now set up OAuth in your Argilla Hugging Face Spaces. This is a simple way to have your team members or collaborators in crowdsourced projects sign up and log in to your Space using their Hugging Face accounts.

To learn how to set up Hugging Face OAuth for your Argilla Space, go to [our docs](https://docs.argilla.io/en/latest/getting_started/installation/deployments/huggingface-spaces.html).

Bulk actions for filter results

We’ve added an improvement for our bulk view so you can perform actions on all results from a filter (or a combination of them!).

To use this, go to the bulk view and apply one or more filters. If there are more results than records shown on the current page, clicking the checkbox gives you the option to select all of the results. You can then give responses, discard, save a draft, or even submit all of the records at once!

![Screenshot of the UI with the bulk selector for all filter results](https://github.com/argilla-io/argilla/assets/126158523/3f287185-5177-4b75-9d7b-160e3947ccbf)

Embed PDFs in a TextField

We’ve added the `pdf_to_html` function in our utilities so you can easily embed a PDF reader within a TextField using markdown.

This function accepts a **file path**, a **URL**, or the **file's byte data** and returns the corresponding HTML to render the PDF within the Argilla user interface.

Learn more about how to use this feature [here](https://docs.argilla.io/en/v1.23.0/tutorials_and_integrations/tutorials/feedback/making-most-of-markdown.html#Inspecting-PDFs).
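
The general idea behind embedding a PDF in a markdown-enabled field can be sketched with the standard library alone. The helper below is hypothetical: it is not Argilla's `pdf_to_html`, whose exact signature and output are described in the linked docs. It assumes the PDF is already available as bytes and wraps it as a base64 data URL inside an `<object>` tag.

```python
import base64


def pdf_bytes_to_html(pdf_bytes: bytes, width: str = "100%", height: str = "600px") -> str:
    """Hypothetical stand-in: embed PDF bytes as a base64 data URL in an <object> tag."""
    encoded = base64.b64encode(pdf_bytes).decode("utf-8")
    data_url = f"data:application/pdf;base64,{encoded}"
    return (
        f'<object data="{data_url}" type="application/pdf" '
        f'width="{width}" height="{height}"></object>'
    )


# Minimal fake PDF payload, used only to exercise the helper.
html = pdf_bytes_to_html(b"%PDF-1.4 example")
print(html[:60])
```

The resulting string could then be used as the value of a text field that has markdown rendering enabled.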

[Changelog 1.23.0](https://github.com/argilla-io/argilla/compare/v1.22.0...v1.23.0)

Added

- Added bulk annotation by filter criteria. ([4516](https://github.com/argilla-io/argilla/pull/4516))
- Automatically fetch new datasets on focus tab. ([4514](https://github.com/argilla-io/argilla/pull/4514))
- API v1 responses returning `Record` schema now always include `dataset_id` as attribute. ([4482](https://github.com/argilla-io/argilla/pull/4482))
- API v1 responses returning `Response` schema now always include `record_id` as attribute. ([4482](https://github.com/argilla-io/argilla/pull/4482))
- API v1 responses returning `Question` schema now always include `dataset_id` attribute. ([4487](https://github.com/argilla-io/argilla/pull/4487))
- API v1 responses returning `Field` schema now always include `dataset_id` attribute. ([4488](https://github.com/argilla-io/argilla/pull/4488))
- API v1 responses returning `MetadataProperty` schema now always include `dataset_id` attribute. ([4489](https://github.com/argilla-io/argilla/pull/4489))
- API v1 responses returning `VectorSettings` schema now always include `dataset_id` attribute. ([4490](https://github.com/argilla-io/argilla/pull/4490))
- Added `pdf_to_html` function to `.html_utils` module that converts PDFs to data URLs so they can be rendered in the Argilla UI. ([4481](https://github.com/argilla-io/argilla/issues/4481#issuecomment-1903695755))
- Added `ARGILLA_AUTH_SECRET_KEY` environment variable. ([4539](https://github.com/argilla-io/argilla/pull/4539))
- Added `ARGILLA_AUTH_ALGORITHM` environment variable. ([4539](https://github.com/argilla-io/argilla/pull/4539))
- Added `ARGILLA_AUTH_TOKEN_EXPIRATION` environment variable. ([4539](https://github.com/argilla-io/argilla/pull/4539))
- Added `ARGILLA_AUTH_OAUTH_CFG` environment variable. ([4546](https://github.com/argilla-io/argilla/pull/4546))
- Added OAuth2 support for HuggingFace Hub. ([4546](https://github.com/argilla-io/argilla/pull/4546))

Deprecated

- Deprecated `ARGILLA_LOCAL_AUTH_*` environment variables. Will be removed in the release v1.25.0. ([4539](https://github.com/argilla-io/argilla/pull/4539))
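
Before upgrading, it may help to check whether a deployment still sets any of the deprecated variables. The following is a minimal sketch: the function name and the sample variable `ARGILLA_LOCAL_AUTH_EXAMPLE` are made up for illustration, and only the `ARGILLA_LOCAL_AUTH_` prefix comes from the entry above.

```python
import os
from typing import List, Mapping

DEPRECATED_PREFIX = "ARGILLA_LOCAL_AUTH_"  # prefix named in the changelog entry


def find_deprecated_auth_vars(env: Mapping[str, str]) -> List[str]:
    """Return the names of environment variables that use the deprecated prefix."""
    return sorted(name for name in env if name.startswith(DEPRECATED_PREFIX))


# Example with a fake environment.
sample_env = {
    "ARGILLA_LOCAL_AUTH_EXAMPLE": "x",  # hypothetical variable name
    "ARGILLA_AUTH_SECRET_KEY": "y",
}
deprecated = find_deprecated_auth_vars(sample_env)
for name in deprecated:
    print(f"{name} uses the deprecated ARGILLA_LOCAL_AUTH_ prefix")

# The same check against the real process environment.
in_process = find_deprecated_auth_vars(os.environ)
```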

Changed

- Changed regex pattern for `username` attribute in `UserCreate`. Now uppercase letters are allowed. ([4544](https://github.com/argilla-io/argilla/pull/4544))

Removed

- Removed sending `Authorization` header from python SDK requests. ([4535](https://github.com/argilla-io/argilla/pull/4535))

Fixed

- Fixed keyboard shortcut for label questions. ([4530](https://github.com/argilla-io/argilla/pull/4530))

New Contributors

* gardner made their first contribution in https://github.com/argilla-io/argilla/pull/4527

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v1.22.0...v1.23.0

1.22.0

🔆 Release Highlights

Bulk actions in Feedback Task datasets

Our signature bulk actions are now available for Feedback datasets!

https://user-images.githubusercontent.com/126158523/297772506-97d83a54-ea3f-4700-acd6-ff9e349ade63.mp4

Switch between *Focus* and *Bulk* depending on your needs:

- In the *Focus* view, you can navigate and respond to records individually. This is ideal for closely examining and giving responses to each record.
- The *Bulk* view allows you to see multiple records on the same page. You can select all or some of them and perform actions in bulk, such as applying a label, saving responses, submitting, or discarding. You can use this feature along with filters and similarity search to process a list of records in bulk.

For now, this is only available in the *Pending* queue, but rest assured, bulk actions will be improved and extended to other queues in upcoming releases.

Read more about our *Focus* and *Bulk* views [here](https://docs.argilla.io/en/latest/practical_guides/annotate_dataset.html#focus-vs-bulk-view).

Sorting rating values

We now support sorting records in the Argilla UI based on the values of Rating questions (both suggestions and responses):
![Screenshot of the sorting by Rating question value options](https://user-images.githubusercontent.com/126158523/297764458-5204a09d-7060-4ff7-83f1-93b7acf5d74b.png)

Learn about this and other filters [in our docs](https://docs.argilla.io/en/latest/practical_guides/filter_dataset.html#feedback-dataset).

Out-of-the-box embedding support

It’s now easier than ever to add vector embeddings to your records with the new Sentence Transformers integration.

Just choose a model from the Hugging Face hub and use our `SentenceTransformersExtractor` to add vectors to your dataset:

```python
import argilla as rg
from argilla.client.feedback.integrations.sentencetransformers import SentenceTransformersExtractor

# Connect to Argilla
rg.init(
    api_url="http://localhost:6900",
    api_key="owner.apikey",
    workspace="my_workspace"
)

# Initialize the SentenceTransformersExtractor
ste = SentenceTransformersExtractor(
    model="TaylorAI/bge-micro-v2",  # Use a model from https://huggingface.co/models?library=sentence-transformers
    show_progress=False,
)

# Load a dataset from your Argilla instance
ds_remote = rg.FeedbackDataset.from_argilla("my_dataset")

# Update the dataset
ste.update_dataset(
    dataset=ds_remote,
    fields=["context"],  # Only update the context field
    update_records=True,  # Update the records in the dataset
    overwrite=False,  # Overwrite existing fields
)
```

Learn more about this functionality in [this tutorial](https://docs.argilla.io/en/latest/tutorials_and_integrations/integrations/add_sentence_transformers_embeddings_as_vectors.html).

[Changelog 1.22.0](https://github.com/argilla-io/argilla/compare/v1.21.0...v1.22.0)

Added

- Added Bulk annotation support. ([4333](https://github.com/argilla-io/argilla/pull/4333))
- Restore filters from feedback dataset settings. ([4461](https://github.com/argilla-io/argilla/pull/4461))
- Warning on feedback dataset settings when leaving page with unsaved changes. ([4461](https://github.com/argilla-io/argilla/pull/4461))
- Added pydantic v2 support using the python SDK. ([4459](https://github.com/argilla-io/argilla/pull/4459))
- Added `vector_settings` to the `__repr__` method of the `FeedbackDataset` and `RemoteFeedbackDataset`. ([4454](https://github.com/argilla-io/argilla/pull/4454))
- Added integration for `sentence-transformers` using `SentenceTransformersExtractor` to configure `vector_settings` in `FeedbackDataset` and `FeedbackRecord`. ([4454](https://github.com/argilla-io/argilla/pull/4454))

Changed

- Module `argilla.cli.server` definitions have been moved to `argilla.server.cli` module. ([4472](https://github.com/argilla-io/argilla/pull/4472))
- [breaking] Changed `vector_settings_by_name` for generic `property_by_name` usage, which will return `None` instead of raising an error. ([4454](https://github.com/argilla-io/argilla/pull/4454))
- The constant definition `ES_INDEX_REGEX_PATTERN` in module `argilla._constants` is now private. ([4472](https://github.com/argilla-io/argilla/pull/4474))
- `nan` values in metadata properties will raise a 422 error when creating/updating records. ([4300](https://github.com/argilla-io/argilla/issues/4300))
- `None` values are now allowed in metadata properties. ([4300](https://github.com/argilla-io/argilla/issues/4300))
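
Since `nan` metadata values are now rejected while `None` is accepted, one option is to normalize metadata before creating or updating records. A minimal sketch, assuming metadata is a flat dict of scalar values; `clean_metadata` is a hypothetical helper, not part of the Argilla SDK:

```python
import math
from typing import Any, Dict


def clean_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Replace float NaN values with None; leave everything else untouched."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in metadata.items()
    }


cleaned = clean_metadata({"length": float("nan"), "source": "web", "score": 0.5})
print(cleaned)  # {'length': None, 'source': 'web', 'score': 0.5}
```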

Fixed

- Paginating to a new record now automatically scrolls down to the selected form area. ([4333](https://github.com/argilla-io/argilla/pull/4333))

Deprecated

- The `missing` response status for filtering records is deprecated and will be removed in the release v1.24.0. Use `pending` instead. ([4433](https://github.com/argilla-io/argilla/pull/4433))

Removed

- The deprecated `python -m argilla database` command has been removed. ([4472](https://github.com/argilla-io/argilla/pull/4472))

New Contributors
* Piyush-Kumar-Ghosh made their first contribution in https://github.com/argilla-io/argilla/pull/4463

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v1.21.0...v1.22.0

1.21.0

🔆 Release highlights

Draft queue

We’ve added a new queue in the Feedback Task UI so that you can save your drafts and have them all together in a separate view. This allows you to save your responses and come back to them before submission.

Note that responses are no longer autosaved: to save your changes, click “Save as draft” or use the shortcut `command ⌘` + `S` (macOS), `Ctrl` + `S` (other).

Improved shortcuts

We’ve been working to improve the keyboard shortcuts within the Feedback Task UI to make them more productive and user-friendly.

You can now select labels in Label and Multi-label questions using the numerical keys on your keyboard. To see which number corresponds to each label, show or hide the helpers by pressing `command ⌘` (macOS) or `Ctrl` (other) for 2 seconds. You will then see the numbers next to the corresponding labels.

We’ve also simplified shortcuts for navigation and actions, so that they use as few keys as possible.

Check all available shortcuts [here](https://docs.argilla.io/en/latest/practical_guides/annotate_dataset.html#shortcuts).

New `metrics` module

We've added a new module to analyze the annotations, both in terms of agreement between the annotators and in terms of data and model drift monitoring.

Agreement metrics

Easily measure the inter-annotator agreement to explore the quality of the annotation guidelines and consistency between annotators:

```python
import argilla as rg
from argilla.client.feedback.metrics import AgreementMetric

feedback_dataset = rg.FeedbackDataset.from_argilla("...", workspace="...")
metric = AgreementMetric(dataset=feedback_dataset, question_name="question_name")
agreement_metrics = metric.compute("alpha")
# >>> agreement_metrics
# [AgreementMetricResult(metric_name='alpha', count=1000, result=0.467889)]
```

Read more [here](https://docs.argilla.io/en/latest/practical_guides/collect_responses.html#agreement-metrics).

Model metrics

You can use `ModelMetric` to monitor model performance for data and model drift:

```python
import argilla as rg
from argilla.client.feedback.metrics import ModelMetric

feedback_dataset = rg.FeedbackDataset.from_argilla("...", workspace="...")
metric = ModelMetric(dataset=feedback_dataset, question_name="question_name")
annotator_metrics = metric.compute("accuracy")
# >>> annotator_metrics
# {'00000000-0000-0000-0000-000000000001': [ModelMetricResult(metric_name='accuracy', count=3, result=0.5)], '00000000-0000-0000-0000-000000000002': [ModelMetricResult(metric_name='accuracy', count=3, result=0.25)], '00000000-0000-0000-0000-000000000003': [ModelMetricResult(metric_name='accuracy', count=3, result=0.5)]}
```

Read more [here](https://docs.argilla.io/en/latest/practical_guides/collect_responses.html#model-metrics).
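
The result shown above is a dict keyed by annotator ID, each entry holding a list of results. A small sketch of how that structure could be summarized follows. It uses a stand-in dataclass that mirrors only the three fields visible in the printed output; the real `ModelMetricResult` class may differ, and `rank_annotators` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MetricResult:
    """Stand-in mirroring the fields shown in the printed output."""
    metric_name: str
    count: int
    result: float


def rank_annotators(
    metrics: Dict[str, List[MetricResult]], metric_name: str
) -> List[Tuple[str, float]]:
    """Return (annotator_id, result) pairs for one metric, highest result first."""
    pairs = [
        (annotator_id, item.result)
        for annotator_id, results in metrics.items()
        for item in results
        if item.metric_name == metric_name
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Invented sample data in the same shape as the output above.
sample = {
    "annotator-1": [MetricResult("accuracy", 3, 0.5)],
    "annotator-2": [MetricResult("accuracy", 3, 0.25)],
}
ranking = rank_annotators(sample, "accuracy")
print(ranking)  # [('annotator-1', 0.5), ('annotator-2', 0.25)]
```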

List aggregation support for `TermsMetadataProperty`

You can now pass a list of terms within a record’s metadata that will be aggregated and filterable as part of a `TermsMetadataProperty`.

Here is an example:

```python
import argilla as rg

dataset = rg.FeedbackDataset(
    fields = ...,
    questions = ...,
    metadata_properties = [rg.TermsMetadataProperty(name="annotators")]
)

record = rg.FeedbackRecord(
    fields = ...,
    metadata = {"annotators": ["user_1", "user_2"]}
)
```
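
To illustrate what aggregating and filtering list-valued terms means, the sketch below counts each term once per record and selects the records that contain a given term. It is a plain-Python illustration of the idea, not the search engine's implementation, and the sample records are invented.

```python
from collections import Counter
from typing import Dict, List

# Invented sample metadata; a value may be a single term or a list of terms.
records_metadata: List[Dict[str, object]] = [
    {"annotators": ["user_1", "user_2"]},
    {"annotators": "user_1"},
    {"annotators": ["user_3"]},
]


def as_terms(value: object) -> List[str]:
    """Normalize a single term or a list of terms into a list."""
    return list(value) if isinstance(value, list) else [str(value)]


# Aggregation: how many records mention each term.
term_counts = Counter(
    term for meta in records_metadata for term in set(as_terms(meta["annotators"]))
)

# Filtering: indices of records whose terms include the requested one.
matches = [
    index
    for index, meta in enumerate(records_metadata)
    if "user_1" in as_terms(meta["annotators"])
]

print(term_counts["user_1"])  # 2
print(matches)  # [0, 1]
```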

Reindex from CLI

Reindex all entities in your Argilla instance (datasets, records, responses, etc.) with a simple CLI command.

```bash
argilla server reindex
```

This is useful when you are working with existing feedback datasets and you want to update the search engine info.

[Changelog 1.21.0](https://github.com/argilla-io/argilla/compare/v1.20.0...v1.21.0)

Added

- Added new draft queue for annotation view ([4334](https://github.com/argilla-io/argilla/pull/4334))
- Added annotation metrics module for the `FeedbackDataset` (`argilla.client.feedback.metrics`). ([4175](https://github.com/argilla-io/argilla/pull/4175)).
- Added strategy to handle and translate errors from the server for `401` HTTP status code. ([4362](https://github.com/argilla-io/argilla/pull/4362))
- Added integration for `textdescriptives` using `TextDescriptivesExtractor` to configure `metadata_properties` in `FeedbackDataset` and `FeedbackRecord`. ([4400](https://github.com/argilla-io/argilla/pull/4400)). Contributed by m-newhauser
- Added `POST /api/v1/me/responses/bulk` endpoint to create responses in bulk for current user. ([4380](https://github.com/argilla-io/argilla/pull/4380))
- Added list support for term metadata properties. (Closes [4359](https://github.com/argilla-io/argilla/issues/4359))
- Added new CLI task to reindex datasets and records into the search engine. ([4404](https://github.com/argilla-io/argilla/pull/4404))
- Added `httpx_extra_kwargs` argument to `rg.init` and `Argilla` to allow passing extra arguments to `httpx.Client` used by `Argilla`. ([4440](https://github.com/argilla-io/argilla/pull/4441))

Changed

- More productive and simpler shortcuts system ([4215](https://github.com/argilla-io/argilla/pull/4215))
- Move `ArgillaSingleton`, `init` and `active_client` to a new module `singleton`. ([4347](https://github.com/argilla-io/argilla/pull/4347))
- Updated `argilla.load` functions to also work with `FeedbackDataset`s. ([4347](https://github.com/argilla-io/argilla/pull/4347))
- [breaking] Updated `argilla.delete` functions to also work with `FeedbackDataset`s. It now raises an error if the dataset does not exist. ([4347](https://github.com/argilla-io/argilla/pull/4347))
- Updated `argilla.list_datasets` functions to also work with `FeedbackDataset`s. ([4347](https://github.com/argilla-io/argilla/pull/4347))

Fixed

- Fixed error in `TextClassificationSettings.from_dict` method in which the `label_schema` created was a list of `dict` instead of a list of `str`. ([4347](https://github.com/argilla-io/argilla/pull/4347))
- Fixed total records on pagination component ([4424](https://github.com/argilla-io/argilla/pull/4424))

Removed

- Removed `draft` auto save for annotation view ([4334](https://github.com/argilla-io/argilla/pull/4334))

1.20.0

Added

- Added `GET /api/v1/datasets/:dataset_id/records/search/suggestions/options` endpoint to return suggestion available options for searching. ([4260](https://github.com/argilla-io/argilla/pull/4260))
- Added `metadata_properties` to the `__repr__` method of the `FeedbackDataset` and `RemoteFeedbackDataset`. ([4192](https://github.com/argilla-io/argilla/pull/4192)).
- Added `get_model_kwargs`, `get_trainer_kwargs`, `get_trainer_model`, `get_trainer_tokenizer` and `get_trainer` methods to the `ArgillaTrainer` to improve interoperability across frameworks. ([4214](https://github.com/argilla-io/argilla/pull/4214)).
- Added additional formatting checks to the `ArgillaTrainer` to allow for better interoperability of `defaults` and `formatting_func` usage. ([4214](https://github.com/argilla-io/argilla/pull/4214)).
- Added a warning to the `update_config` method of `ArgillaTrainer` to emphasize if the `kwargs` were updated correctly. ([4214](https://github.com/argilla-io/argilla/pull/4214)).
- Added `argilla.client.feedback.utils` module with `html_utils` and `assignments`. `html_utils` mainly includes `video/audio/image_to_html`, which convert media to data URLs so they can be rendered in the Argilla UI, and `create_token_highlights`, which highlights tokens in a custom way; both work on `TextQuestion` and `TextField` with `use_markdown=True`. `assignments` mainly includes `assign_records`, which assigns records according to a number of annotators and records, an overlap, and the shuffle option, and `assign_workspace`, which assigns (and creates if needed) a workspace according to the record assignment. ([4121](https://github.com/argilla-io/argilla/pull/4121))
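
The assignment idea described in the last entry (a number of annotators, an overlap, and optional shuffling) can be sketched without the library. The function below is a hypothetical round-robin illustration, not Argilla's `assign_records`, whose signature and return value may differ:

```python
import random
from typing import Dict, List, Optional, Sequence


def assign_round_robin(
    annotators: Sequence[str],
    records: Sequence[str],
    overlap: int = 1,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> Dict[str, List[str]]:
    """Give each record to `overlap` distinct annotators, rotating through the list."""
    if not 1 <= overlap <= len(annotators):
        raise ValueError("overlap must be between 1 and the number of annotators")
    pool = list(records)
    if shuffle:
        random.Random(seed).shuffle(pool)
    assignments: Dict[str, List[str]] = {name: [] for name in annotators}
    for index, record in enumerate(pool):
        # Consecutive positions modulo the annotator count are distinct
        # because overlap never exceeds the number of annotators.
        for offset in range(overlap):
            annotator = annotators[(index * overlap + offset) % len(annotators)]
            assignments[annotator].append(record)
    return assignments


result = assign_round_robin(["ana", "ben", "cy"], ["r1", "r2", "r3"], overlap=2)
print(result)
# {'ana': ['r1', 'r2'], 'ben': ['r1', 'r3'], 'cy': ['r2', 'r3']}
```

With an overlap of 2, every record is seen by two annotators, which is what makes agreement between annotators measurable later on.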

Fixed

- Fixed error in `ArgillaTrainer`, with numerical labels, using `RatingQuestion` instead of `RankingQuestion` ([4171](https://github.com/argilla-io/argilla/pull/4171))
- Fixed error in `ArgillaTrainer`, now we can train for `extractive_question_answering` using a validation sample ([4204](https://github.com/argilla-io/argilla/pull/4204))
- Fixed error in `ArgillaTrainer`, when training for `sentence-similarity` it didn't work with a list of values per record ([4211](https://github.com/argilla-io/argilla/pull/4211))
- Fixed error in the unification strategy for `RankingQuestion` ([4295](https://github.com/argilla-io/argilla/pull/4295))
- Fixed `TextClassificationSettings.labels_schema` order not being preserved. Closes [3828](https://github.com/argilla-io/argilla/issues/3828) ([#4332](https://github.com/argilla-io/argilla/pull/4332))
- Fixed error when requesting non-existing API endpoints. Closes [4073](https://github.com/argilla-io/argilla/issues/4073) ([#4325](https://github.com/argilla-io/argilla/pull/4325))
- Fixed error when passing `draft` responses to create records endpoint. ([4354](https://github.com/argilla-io/argilla/pull/4354))

Changed

- [breaking] Suggestions `agent` field now only accepts some specific characters and a limited length. ([4265](https://github.com/argilla-io/argilla/pull/4265))
- [breaking] Suggestions `score` field now only accepts float values in the range `0` to `1`. ([4266](https://github.com/argilla-io/argilla/pull/4266))
- Updated `POST /api/v1/dataset/:dataset_id/records/search` endpoint to support optional `query` attribute. ([4327](https://github.com/argilla-io/argilla/pull/4327))
- Updated `POST /api/v1/dataset/:dataset_id/records/search` endpoint to support `filter` and `sort` attributes. ([4327](https://github.com/argilla-io/argilla/pull/4327))
- Updated `POST /api/v1/me/datasets/:dataset_id/records/search` endpoint to support optional `query` attribute. ([4270](https://github.com/argilla-io/argilla/pull/4270))
- Updated `POST /api/v1/me/datasets/:dataset_id/records/search` endpoint to support `filter` and `sort` attributes. ([4270](https://github.com/argilla-io/argilla/pull/4270))
- Changed the logging style while pulling and pushing `FeedbackDataset` to Argilla from `tqdm` style to `rich`. ([4267](https://github.com/argilla-io/argilla/pull/4267)). Contributed by zucchini-nlp.
- Updated `push_to_argilla` to print `repr` of the pushed `RemoteFeedbackDataset` after push and changed `show_progress` to True by default. ([4223](https://github.com/argilla-io/argilla/pull/4223))
- Changed `models` and `tokenizer` for the `ArgillaTrainer` to explicitly allow for changing them when needed. ([4214](https://github.com/argilla-io/argilla/pull/4214)).
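
Client code that builds suggestions may want to check scores before sending them, since the breaking change above restricts `score` to the range `0` to `1`. A minimal sketch with a hypothetical helper; it assumes both bounds are inclusive, and the server still performs its own validation:

```python
def validate_suggestion_score(score: float) -> float:
    """Return the score as a float if it lies in [0, 1]; raise ValueError otherwise."""
    value = float(score)
    # Assumption: both 0 and 1 are accepted. NaN fails this comparison and is rejected.
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {value}")
    return value


print(validate_suggestion_score(0.87))  # 0.87

try:
    validate_suggestion_score(1.5)
except ValueError as error:
    print(error)  # score must be between 0 and 1, got 1.5
```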
