Argilla

Latest version: v2.8.0



2.3.1

What's Changed

This is a patch release fixing an error listing current user datasets:

- Fixed error listing current user datasets and not filtering by current user id. ([5583](https://github.com/argilla-io/argilla/pull/5583))

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v2.3.0...v2.3.1

2.3.0

🌟 Release highlights

Custom Fields: the most powerful way to build custom annotation tasks
We heard you. This new type of field gives you full control over how data is presented to annotators.

With custom fields, you can use your own CSS, HTML, and even JavaScript (welcome, interactive fields!). Moreover, you can populate your fields with custom structures like `custom_field={"image_1": ..., "image_2": ..., etc.}`.

Here's an example:

> Imagine you want to show two images and a prompt to your users.

With a custom field

With the new custom field, you can configure something like this:

<img width="952" alt="Screenshot 2024-10-04 at 13 04 28" src="https://github.com/user-attachments/assets/1e85a5e5-7e35-4912-8f32-aeed4e32adfe">

And you can set this up with a few lines of code:

```python
import argilla as rg

# Styles for the side-by-side layout; "#container" matches the div id in the HTML below.
css_template = """
<style>
#container {
    display: flex;
    flex-direction: column;
    font-family: Arial, sans-serif;
}
.prompt {
    margin-bottom: 10px;
    font-size: 16px;
    line-height: 1.4;
    color: #333;
    background-color: #f8f8f8;
    padding: 10px;
    border-radius: 5px;
    box-shadow: 0 1px 3px rgba(0,0,0,0.1);
}
.image-container {
    display: flex;
    gap: 10px;
}
.column {
    flex: 1;
    position: relative;
}
img {
    max-width: 100%;
    height: auto;
    display: block;
}
.image-label {
    position: absolute;
    top: 10px;
    right: 10px;
    background-color: rgba(255, 255, 255, 0.7);
    color: black;
    padding: 5px 10px;
    border-radius: 5px;
    font-weight: bold;
}
</style>
"""

# Markup with placeholders that are filled from each record's "images" field.
html_template = """
<div id="container">
    <div class="prompt"><strong>Prompt:</strong> {{record.fields.images.prompt}}</div>
    <div class="image-container">
        <div class="column">
            <img src="{{record.fields.images.image_1}}" />
            <div class="image-label">Image 1</div>
        </div>
        <div class="column">
            <img src="{{record.fields.images.image_2}}" />
            <div class="image-label">Image 2</div>
        </div>
    </div>
</div>
"""

custom_field = rg.CustomField(
    name="images",
    template=css_template + html_template,
)
```

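and then log records like this:

```python
rg.Record(
    fields={
        # Values are nested under the custom field's name ("images"),
        # matching the {{record.fields.images.*}} placeholders in the template.
        "images": {
            "prompt": prompt,
            "image_1": schnell_uri,
            "image_2": dev_uri,
        }
    }
)
```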
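
To see why the record values need to sit under the field name, it can help to treat each `{{record.fields.images.prompt}}` placeholder as a dotted path into the record. The sketch below is a simplified stand-in written with the standard library only; it does not reproduce the template engine used by the Argilla UI, and the `resolve_path` and `render` helpers, the sample URLs, and the prompt text are all hypothetical.

```python
import re


def resolve_path(data, dotted_path):
    """Follow a dotted path such as 'record.fields.images.prompt' through nested dicts."""
    value = data
    for key in dotted_path.split("."):
        value = value[key]
    return value


def render(template, context):
    """Replace every {{ path }} placeholder with the value found at that path."""
    return re.sub(
        r"\{\{\s*([\w.]+)\s*\}\}",
        lambda match: str(resolve_path(context, match.group(1))),
        template,
    )


# Hypothetical record payload: the custom field is named "images",
# so its values are nested under that key.
context = {
    "record": {
        "fields": {
            "images": {
                "prompt": "A red bicycle",
                "image_1": "https://example.com/a.png",
                "image_2": "https://example.com/b.png",
            }
        }
    }
}

snippet = '<img src="{{record.fields.images.image_1}}" /> {{ record.fields.images.prompt }}'
rendered = render(snippet, context)
print(rendered)
```

A placeholder whose path is missing from the record raises a `KeyError` in this sketch, which is a quick way to spot a mismatch between the template and the logged fields.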

Before the custom field

Before this release, you were forced to use two `ImageField` and a `TextField`, which would be displayed sequentially, limiting the ability to compare the images side-by-side, with clear labels, prompt text, etc. It would look like this:

<img width="736" alt="Screenshot 2024-10-04 at 14 13 52" src="https://github.com/user-attachments/assets/03ac0a7d-04a6-4f53-96f9-40a070d1c130">

How to get started with custom fields

Here we've shown a basic, presentation-oriented custom field, but you can set up anything you can think of using JavaScript, HTML, and CSS. Imagination is the limit!

To get started, check the docs: https://docs.argilla.io/v2.3/how_to_guides/custom_fields/

Other features
- Support for similarity search [from the SDK](https://docs.argilla.io/latest/how_to_guides/query/#similarity-search) and other search and [filtering improvements](https://docs.argilla.io/latest/how_to_guides/query/#available-fields).
- New Helm chart [deployment configuration](https://github.com/argilla-io/argilla/tree/v2.3.0/examples/deployments/k8s/argilla-chart).
- Support for credentials from Colab secrets.

Other changes and fixes

Changed

- Changed the `__repr__` method for `SettingsProperties` to display the details of all the properties in the `Settings` object. ([5380](https://github.com/argilla-io/argilla/issues/5380))
- Changed error messages when creating datasets with insufficient permissions. ([5540](https://github.com/argilla-io/argilla/pull/5554))

Fixed

- Fixed serialization of `ChatField` when collecting records from the hub and exporting to `datasets`. ([5554](https://github.com/argilla-io/argilla/pull/5553))
- Fixed error when creating default user with existing default workspace. ([5558](https://github.com/argilla-io/argilla/pull/5558))
- Fixed the deployment yaml used to create a new Argilla server in K8s. Added `USERNAME` and `PASSWORD` to the environment variables of pod template. ([5434](https://github.com/argilla-io/argilla/issues/5434))
- Fixed autofill form on sign-in page. ([5522](https://github.com/argilla-io/argilla/pull/5522))
- Added support for copying to the clipboard in non-secure contexts. ([5535](https://github.com/argilla-io/argilla/pull/5535))

New Contributors
* not-lain made their first contribution in https://github.com/argilla-io/argilla/pull/5541

Thanks to
* bikash119 for Helm chart in https://github.com/argilla-io/argilla/pull/5512

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v2.2.2...v2.3.0

2.2.2

What's Changed

This is a patch release with fixes and changes to the SDK:

Fixed

- Fixed `from_hub` with unsupported column names. ([5524](https://github.com/argilla-io/argilla/pull/5524))
- Fixed `from_hub` with missing dataset `subset` configuration value. ([5524](https://github.com/argilla-io/argilla/pull/5524))

Changed

- Changed `from_hub` to generate only fields, not questions, for string columns in the dataset. ([5524](https://github.com/argilla-io/argilla/pull/5524))

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v2.2.1...v2.2.2

2.2.1

What's Changed

This is a patch release with certain fixes to the SDK:

- Fixed `from_hub` errors when column names contain uppercase letters. ([5523](https://github.com/argilla-io/argilla/pull/5523))
- Fixed `from_hub` errors when class feature values contain unlabelled values. ([5523](https://github.com/argilla-io/argilla/pull/5523))
- Fixed `from_hub` errors when loading cached datasets. ([5523](https://github.com/argilla-io/argilla/pull/5523))

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v2.2.0...v2.2.1

2.2.0

🌟 Release highlights

> [!IMPORTANT]
> Argilla server `2.2.0` adds support for **background jobs**, which let the server run operations that would take too long to complete at request time. For this reason, Argilla now relies on [Redis](https://redis.io) and [Python RQ](https://python-rq.org) workers.
>
> So to upgrade your Argilla instance to version `2.2.0` you need to have an available Redis server. See the [Redis get-started documentation](https://redis.io/docs/latest/get-started/) for more information or the [Argilla server configuration documentation](https://docs.argilla.io/latest/reference/argilla-server/configuration/).
>
> If you have deployed Argilla server using the docker-compose.yaml, you should download the [docker-compose.yaml](https://github.com/argilla-io/argilla/blob/main/examples/deployments/docker/docker-compose.yaml) file again to pick up the latest changes, which set up Redis and the Argilla workers.
>
> Workers are needed to process Argilla's background jobs. You can run Argilla workers with the following command:
> ```sh
> python -m argilla_server worker
> ```
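
Since the upgrade depends on a reachable Redis server, a quick connectivity check before starting the workers may save some debugging. The following is a minimal sketch using only the standard library. The `ARGILLA_REDIS_URL` variable name and its default value are assumptions here, so check them against the server configuration documentation; the helper names are hypothetical.

```python
import os
import socket
from urllib.parse import urlparse


def redis_endpoint(url):
    """Extract (host, port) from a redis:// URL, defaulting to port 6379."""
    parsed = urlparse(url)
    return parsed.hostname or "localhost", parsed.port or 6379


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed variable name and default; adjust to match your deployment.
    url = os.environ.get("ARGILLA_REDIS_URL", "redis://localhost:6379/0")
    host, port = redis_endpoint(url)
    print(f"Redis at {host}:{port} reachable: {is_reachable(host, port)}")
```

This only confirms that the port accepts TCP connections; it does not verify authentication or that the service behind it is actually Redis.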

ChatField: working with text conversations in Argilla

https://github.com/user-attachments/assets/563dd57e-6f99-4b04-9bfa-c930b2a1625c

You can now work with text conversations natively in Argilla using the new `ChatField`. It is especially designed to make it easier to build datasets for conversational Large Language Models (LLMs), displaying conversational data in the form of a chat.

Here's how you can create a dataset with a `ChatField`:
```python
import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

settings = rg.Settings(
    fields=[rg.ChatField(name="chat")],
    questions=[...]
)

dataset = rg.Dataset(
    name="chat_dataset",
    settings=settings,
    workspace="my_workspace",
    client=client
)

dataset.create()

record = rg.Record(
    fields={
        "chat": [
            {"role": "user", "content": "Hello World, how are you?"},
            {"role": "assistant", "content": "I'm doing great, thank you!"}
        ]
    }
)

dataset.records.log([record])
```

Read more about how to use this new field type [here](https://docs.argilla.io/latest/how_to_guides/dataset/#fields) and [here](https://docs.argilla.io/dev/how_to_guides/record/#add-records).

Adjust task distribution settings
You can now modify task distribution settings at any time, and Argilla will automatically recalculate the completed and pending records. When you update this setting, records will be removed from or added to the pending queues of your team accordingly.

You can make this change in the dataset settings page or using the SDK:
```python
import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

dataset = client.datasets("my_dataset")
dataset.settings.distribution.min_submitted = 2
dataset.update()
```

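
To illustrate what the recalculation means in practice, the sketch below treats a record as completed once it has at least `min_submitted` submitted responses and as pending otherwise. It is a simplified model of the behavior described above, not Argilla's server code; the `recalculate_status` helper and the record ids are hypothetical.

```python
def recalculate_status(submitted_counts, min_submitted):
    """Map each record id to 'completed' or 'pending' for a given threshold."""
    return {
        record_id: "completed" if count >= min_submitted else "pending"
        for record_id, count in submitted_counts.items()
    }


# Hypothetical data: number of submitted responses per record.
submitted_counts = {"rec-1": 1, "rec-2": 2, "rec-3": 0}

before = recalculate_status(submitted_counts, min_submitted=1)
after = recalculate_status(submitted_counts, min_submitted=2)

print(before)
print(after)
```

Raising the threshold from 1 to 2 moves `rec-1` back to pending, which is why records can reappear in your team's queues after the setting changes.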
Track team progress from the SDK
The Argilla SDK now provides a way to retrieve data on annotation progress. This feature allows you to monitor the number of completed and pending records in a dataset and also the number of responses made by each user:
```python
import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

dataset = client.datasets("my_dataset")

progress = dataset.progress(with_users_distribution=True)
```
The expected output looks like this:
```json
{
  "total": 100,
  "completed": 50,
  "pending": 50,
  "users": {
    "user1": {
      "completed": { "submitted": 10, "draft": 5, "discarded": 5 },
      "pending": { "submitted": 5, "draft": 10, "discarded": 10 }
    },
    "user2": {
      "completed": { "submitted": 20, "draft": 10, "discarded": 5 },
      "pending": { "submitted": 2, "draft": 25, "discarded": 0 }
    },
    ...
  }
}
```
Read more about this feature [here](https://docs.argilla.io/latest/how_to_guides/distribution/#track-your-teams-progress).
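
Once you have the progress data, a small helper can turn it into the numbers a team lead usually wants. This sketch assumes the result behaves like a plain dictionary with the shape shown in the expected output; the `summarize_progress` helper is hypothetical and the sample values are copied from that output.

```python
def summarize_progress(progress):
    """Return the completion ratio and the submitted responses per user."""
    total = progress["total"]
    ratio = progress["completed"] / total if total else 0.0
    submitted_per_user = {
        user: stats["completed"]["submitted"] + stats["pending"]["submitted"]
        for user, stats in progress.get("users", {}).items()
    }
    return ratio, submitted_per_user


# Sample data with the same shape as the expected output.
progress = {
    "total": 100,
    "completed": 50,
    "pending": 50,
    "users": {
        "user1": {
            "completed": {"submitted": 10, "draft": 5, "discarded": 5},
            "pending": {"submitted": 5, "draft": 10, "discarded": 10},
        },
        "user2": {
            "completed": {"submitted": 20, "draft": 10, "discarded": 5},
            "pending": {"submitted": 2, "draft": 25, "discarded": 0},
        },
    },
}

ratio, submitted_per_user = summarize_progress(progress)
print(f"{ratio:.0%} completed", submitted_per_user)
```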

Automatic settings inference
When you import a dataset using the `from_hub` method, Argilla will automatically infer the settings, such as the fields and questions, based on the dataset `Features`. This will save you time and effort when working with datasets from the Hub.

```python
import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

dataset = rg.Dataset.from_hub("yahma/alpaca-cleaned")
```
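
As a rough mental model of settings inference, you can picture a pass over the dataset's column types that decides which columns become fields and which become questions. The sketch below is only an illustration under assumed rules (string and image columns become fields, class-label columns become label questions, everything else is left unmapped). It is not Argilla's implementation, and the `infer_settings` helper and the plain-dict feature format are hypothetical.

```python
def infer_settings(features):
    """Split dataset columns into fields, questions, and unmapped columns by type."""
    fields, questions, unmapped = [], [], []
    for column, feature in features.items():
        kind = feature["type"]
        if kind in ("string", "image"):
            # Assumed rule: text and image columns are shown to annotators as fields.
            fields.append({"name": column, "type": kind})
        elif kind == "class_label":
            # Assumed rule: class-label columns become label questions.
            questions.append({"name": column, "labels": feature["names"]})
        else:
            unmapped.append(column)
    return {"fields": fields, "questions": questions, "unmapped": unmapped}


# Hypothetical feature description for a small dataset.
features = {
    "instruction": {"type": "string"},
    "sentiment": {"type": "class_label", "names": ["positive", "negative"]},
    "score": {"type": "float"},
}

inferred = infer_settings(features)
print(inferred)
```

If the inferred result does not match your task, passing your own settings remains the more predictable option.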

Task templates
We've added pre-built templates for common dataset types, including text classification, ranking, and rating tasks. These templates provide a starting point for your dataset creation, with pre-configured settings. You can use these templates to get started quickly, without having to configure everything from scratch.
```python
import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

settings = rg.Settings.for_classification(labels=["positive", "negative"])

dataset = rg.Dataset(
    name="my_dataset",
    settings=settings,
    client=client,
    workspace="my_workspace",
)

dataset.create()
```
Read more about templates [here](https://docs.argilla.io/latest/reference/argilla/settings/settings/#creating-settings-using-built-in-templates).

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v2.1.0...v2.2.0

2.1.0

🌟 Release highlights

Image Field
![Screenshot showing Argilla's new Image Field and Dark Mode](https://github.com/user-attachments/assets/b55c029b-0902-4f1a-ab99-fd68765975cb)
Argilla now supports multimodal datasets with the introduction of a native `ImageField`. This new type of field allows you to work seamlessly with image data, making it easier to annotate and curate datasets that combine text and images.

Here's an example of a dataset with an image field:
```python
import argilla as rg

client = rg.Argilla(...)

settings = rg.Settings(
    fields=[
        rg.ImageField(name="image"),
        rg.TextField(name="caption")
    ],
    questions=[
        rg.LabelQuestion(
            name="good_or_bad",
            title="Is the caption good or bad",
            labels=["good", "bad"]
        ),
        rg.TextQuestion(name="comments")
    ]
)

dataset = rg.Dataset(name="image_captions", settings=settings)
dataset.create()

record = rg.Record(
    fields={
        "image": "https://docs.argilla.io/dev/assets/logo.svg",
        "caption": "This is the Argilla logo"
    }
)
dataset.records.log([record])
```

[Read more](https://docs.argilla.io/latest/how_to_guides/dataset/#fields)

Dark Mode
Does Argilla seem too bright for you? You can now try our new Dark Mode: a theme designed to reduce eye strain and give the app a new, modern look. You can enable Dark Mode under "My Settings".

Spanish Translation

<img width="1510" alt="Captura de pantalla 2024-09-05 a las 17 28 29" src="https://github.com/user-attachments/assets/0f82e3ce-3654-4e99-9055-db9173619f2f">

We're committed to making Argilla accessible to a broader audience. With the addition of Spanish translation, we're taking another step towards breaking language barriers and enabling more teams to collaborate on data curation projects.
There's nothing you need to do to enable it: Argilla will automatically switch to Spanish when your browser's main language is set to Spanish. ¡Disfrutadla!

Import any dataset from the Hugging Face Hub
The `from_hub` method just got a major boost! You can now input your own settings, allowing you to use this method with almost any dataset from the Hugging Face Hub, not just Argilla datasets.

Here's how easy it is to import a dataset from the Hub:
```python
import argilla as rg

client = rg.Argilla(...)

settings = rg.Settings(
    fields=[
        rg.TextField(name="input"),
    ],
    questions=[
        rg.TextQuestion(name="output"),
    ],
)

dataset = rg.Dataset.from_hub(
    repo_id="yahma/alpaca-cleaned",
    settings=settings,
)
```

[Read more](https://docs.argilla.io/latest/reference/argilla/datasets/datasets/?h=from_hub#src.argilla.datasets._export._hub.HubImportExportMixin.from_hub)

Other Notable Fixes and Improvements

* Adaptable text areas for `TextQuestion`s, providing a better user experience in the UI.
* Enhanced messaging for empty queues, keeping you informed when no records are available in the UI.

**Full Changelog**: https://github.com/argilla-io/argilla/compare/v2.0.1...v2.1.0
