Distilabel

Latest version: v1.0.3

1.0.3

What's Changed
* Add `stop` and `stop_sequences` in `LLM.generate` subclasses by alvarobartt in https://github.com/argilla-io/distilabel/pull/585
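
For context, stop sequences generally end a generation at the first occurrence of any listed string. The sketch below illustrates that general idea in plain Python. `truncate_at_stop` is a hypothetical helper written for this note, not a distilabel API; in practice the stop arguments are usually forwarded to the underlying provider, whose exact behaviour may differ.

```python
from typing import Optional, Sequence


def truncate_at_stop(text: str, stop_sequences: Optional[Sequence[str]] = None) -> str:
    """Cut `text` at the earliest occurrence of any stop sequence.

    Hypothetical helper for illustration only; not part of distilabel.
    """
    if not stop_sequences:
        return text
    # Locate each non-empty stop sequence; str.find returns -1 when absent.
    positions = [text.find(stop) for stop in stop_sequences if stop]
    positions = [pos for pos in positions if pos != -1]
    # Keep everything before the earliest match, or the full text if none matched.
    return text[: min(positions)] if positions else text


if __name__ == "__main__":
    raw = "Answer: 42\nUser: next question"
    print(truncate_at_stop(raw, ["\nUser:", "###"]))
```

The helper leaves the text untouched when no stop sequence is given or none is found, which mirrors the common convention that stop sequences are optional.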


**Full Changelog**: https://github.com/argilla-io/distilabel/compare/1.0.2...1.0.3

1.0.2

What's Changed

* Fix `RuntimeParameter` validation when provided as `_Step` attr by alvarobartt in https://github.com/argilla-io/distilabel/pull/564
* Add `seed` with `random.randint` to ensure cache is not used by alvarobartt in https://github.com/argilla-io/distilabel/pull/571

**Full Changelog**: https://github.com/argilla-io/distilabel/compare/1.0.1...1.0.2

1.0.1

What's Changed
* Fix typo in readme and remove the ToArgilla step by dvsrepo in https://github.com/argilla-io/distilabel/pull/548
* Fix `model_validator` in `InferenceEndpoints` due to `Pipeline` pickling by alvarobartt in https://github.com/argilla-io/distilabel/pull/552


**Full Changelog**: https://github.com/argilla-io/distilabel/compare/1.0.0...1.0.1

1.0.0

What's Changed
* Add `Step` abstract class and new `Pipeline` by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/338
* Add runtime parameters validation by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/345
* Pipeline local execution by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/346
* Add `Task` (minimal implementation) by alvarobartt in https://github.com/argilla-io/distilabel/pull/347
* Refactor `_BatchManager` to have list of batches per step by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/353
* Refactor getting parameters from `Step.process` method by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/355
* Add `LLM`, `OpenAILLM`, `TransformersLLM`, and `LlamaCppLLM` by alvarobartt in https://github.com/argilla-io/distilabel/pull/354
* Fix `Task` and `TextGeneration` by alvarobartt in https://github.com/argilla-io/distilabel/pull/356
* Add `combine_dicts` function and `CombineColumns` class by alvarobartt in https://github.com/argilla-io/distilabel/pull/358
* Add `PushToHub` step and fix `typing` by alvarobartt in https://github.com/argilla-io/distilabel/pull/357
* Add serialization for the new components by plaguss in https://github.com/argilla-io/distilabel/pull/349
* Fix `OpenAILLM.api_key` due to `SecretStr` and `StepInput` wrong imports by alvarobartt in https://github.com/argilla-io/distilabel/pull/359
* Add `GlobalStep`, fix `_BatchManager`, and add `logging` by alvarobartt in https://github.com/argilla-io/distilabel/pull/362
* Migrate vllm to the new API by plaguss in https://github.com/argilla-io/distilabel/pull/361
* Update `_BatchManager` to work with `GlobalStep`s and `input_batch_size` per step by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/366
* Clean up outdated / unused files by alvarobartt in https://github.com/argilla-io/distilabel/pull/369
* Add `input_mappings` and `output_mappings` attributes by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/367
* Move batching from `Task` to `LLM`, fix `vLLM.generate` and add `DISTILABEL_LOG_LEVEL` by alvarobartt in https://github.com/argilla-io/distilabel/pull/371
* Improve runtime parameter definition by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/372
* Add `AsyncOpenAI` and update `OpenAILLM` accordingly by alvarobartt in https://github.com/argilla-io/distilabel/pull/381
* Update serde by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/382
* Add `MistralLLM` and add `generation_kwargs` as `RuntimeParameters` by alvarobartt in https://github.com/argilla-io/distilabel/pull/383
* Move `steps` out of `pipeline` by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/384
* Add tests and docstring for `Task` and subclasses by alvarobartt in https://github.com/argilla-io/distilabel/pull/385
* Add `step` decorator by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/387
* Add `input` propagation through `Task.process` by alvarobartt in https://github.com/argilla-io/distilabel/pull/399
* Improve `Pipeline` error handling by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/400
* Fix `combine_dicts` and `StepInput` import in `PushToHub` by alvarobartt in https://github.com/argilla-io/distilabel/pull/401
* Improve `GlobalStep` error handling by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/402
* Replace quotation marks with italics in the EvolInstruct tutorial, where one quotation mark was missing, by ignacioct in https://github.com/argilla-io/distilabel/pull/398
* Add `get_last_hidden_states` method and update `TransformersLLM` by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/414
* docs: correct small typos in tutorial by sdiazlor in https://github.com/argilla-io/distilabel/pull/419
* docs: readme positioning by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/386
* Add `num_generations` and `group_generations` parameters to `Task` by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/416
* Add `Argilla` and `PromptCompletionToArgilla` by alvarobartt in https://github.com/argilla-io/distilabel/pull/420
* Add `EvolInstruct` and `EvolInstructGenerator` tasks by alvarobartt in https://github.com/argilla-io/distilabel/pull/407
* Wrap optional `LLM` dependencies under `load` by alvarobartt in https://github.com/argilla-io/distilabel/pull/428
* Add `ComplexityScorer` task by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/421
* Implement caching mechanism for the pipelines by plaguss in https://github.com/argilla-io/distilabel/pull/370
* Add method to Pipeline to handle keyboard interruptions via ctrl+c by plaguss in https://github.com/argilla-io/distilabel/pull/406
* Add `GenerateEmbeddings` task by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/427
* Add `api_key` within `LLM.load` and add `llm_kwargs` as `RuntimeParameter` by alvarobartt in https://github.com/argilla-io/distilabel/pull/432
* Add `GeneratorStep.process` validation in `DAG` and smaller fixes by alvarobartt in https://github.com/argilla-io/distilabel/pull/435
* Add `EvolComplexity` task by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/415
* Add `QualityScorer` Task by ignacioct in https://github.com/argilla-io/distilabel/pull/425
* Add `CudaDevicePlacementMixin` class by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/436
* Return `distiset` from `Pipeline.run` by plaguss in https://github.com/argilla-io/distilabel/pull/417
* Update README.md by strickvl in https://github.com/argilla-io/distilabel/pull/451
* Add `InferenceEndpointsLLM` by alvarobartt in https://github.com/argilla-io/distilabel/pull/439
* Fix `Distiset` after `PushToHub` and smaller fixes by alvarobartt in https://github.com/argilla-io/distilabel/pull/452
* Fix `Step.process_applying_mappings` by alvarobartt in https://github.com/argilla-io/distilabel/pull/453
* Add `AnyscaleLLM` by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/447
* Add general function to obtain schema for parquet writer by plaguss in https://github.com/argilla-io/distilabel/pull/454
* Add `TogetherLLM` by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/449
* Fix `LLM` subclasses based on `OpenAILLM` by alvarobartt in https://github.com/argilla-io/distilabel/pull/455
* Improve batching and caching by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/457
* Add `EvolQuality` task by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/429
* Add `VertexAILLM` by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/445
* Add `use_cache` to `BasePipeline` by plaguss in https://github.com/argilla-io/distilabel/pull/463
* Add `AnthropicLLM` by sdiazlor in https://github.com/argilla-io/distilabel/pull/444
* Add `multiprocess` dependency by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/467
* Add `UltraFeedback` by alvarobartt in https://github.com/argilla-io/distilabel/pull/464
* Add `OllamaLLM` by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/405
* Add `RuntimeParametersMixin` and `LLM` runtime parameters by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/466
* Add `LiteLLM` by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/441
* Add CLI by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/471
* Set `_batch_manager` to `None` after run by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/473
* Add create_distiset function by plaguss in https://github.com/argilla-io/distilabel/pull/480
* Add `overload` to `step` decorator by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/474
* Move Enum to Dict[str, str] to avoid serialization errors during caching by plaguss in https://github.com/argilla-io/distilabel/pull/482
* Include a dataset card and the `pipeline.yaml` on `Distiset.push_to_hub` by plaguss in https://github.com/argilla-io/distilabel/pull/479
* Add `PairRM` task for ranking responses by plaguss in https://github.com/argilla-io/distilabel/pull/450
* Update `_WriteBuffer` to write several parquet files by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/483
* Extend `Argilla` integration `TextGeneration`, `Preference`, and more by alvarobartt in https://github.com/argilla-io/distilabel/pull/472
* Add `DeitaFiltering` step by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/481
* Add `InstructionBacktranslation` by alvarobartt in https://github.com/argilla-io/distilabel/pull/486
* Fix huggingface_hub TextGenerationError import by Wauplin in https://github.com/argilla-io/distilabel/pull/485
* Improve Azure OpenAI support by BramVanroy in https://github.com/argilla-io/distilabel/pull/461
* Add `SelfInstruct` task by ignacioct in https://github.com/argilla-io/distilabel/pull/456
* Use `QueueHandler` for `Pipeline` logging by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/489
* Improve `_stop` and `logging` by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/491
* Fix creating empty `Dataset` in `create_distiset` function by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/492
* Add imports from `__init__` modules by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/493
* `batch_size` and `input_batch_size` runtime parameters by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/495
* Update serialization method of _BatchManager to write each step on its own file by plaguss in https://github.com/argilla-io/distilabel/pull/496
* Fix `asyncio` in `AsyncLLM` to use the running event loop if any by alvarobartt in https://github.com/argilla-io/distilabel/pull/501
* Added authentication header to allow private/gated dataset use by bjoernpl in https://github.com/argilla-io/distilabel/pull/498
* Fix generator yielding batches all at once if `batch_size` == `input_batch_size` by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/510
* Run output queue loop in thread and improve stop by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/511
* Update `docs` for `distilabel` v1.0 with `mkdocs-material` by plaguss in https://github.com/argilla-io/distilabel/pull/476
* Add `CohereLLM` by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/508
* `distilabel` v1.0 by alvarobartt in https://github.com/argilla-io/distilabel/pull/352
* Remove draft comment by plaguss in https://github.com/argilla-io/distilabel/pull/515
* Fix `docs/sections/papers/*.md` and add example in `docs/index.md` by alvarobartt in https://github.com/argilla-io/distilabel/pull/516
* Small fixes for the docs (images and nav bar) by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/519
* Fix CTRL + C when still loading steps by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/521
* Empty input queues when `CTRL + C` by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/528
* Add `filelock` and `flash-attn` to `vllm` extra by alvarobartt in https://github.com/argilla-io/distilabel/pull/529
* Fix error in README.md when pushing the custom dataset card by plaguss in https://github.com/argilla-io/distilabel/pull/530
* Fix pipeline stuck when empty batches by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/531
* Add `EvolQuality` to `tasks.__init__.py` by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/525
* Show information about subprocess exception by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/532
* Update `TextGeneration.format_input` method to allow OpenAI format by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/533
* Improve create_distiset by plaguss in https://github.com/argilla-io/distilabel/pull/534
* Fixes regarding `RuntimeParameter`s and `pydantic` model attributes by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/535
* Fix parsing `LLM` generation kwargs by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/537
* pass on Distiset's kwargs to Dataset.push_to_hub() by rasdani in https://github.com/argilla-io/distilabel/pull/522
* Set `config="default"` in `Distiset` when only one leaf `Step` by alvarobartt in https://github.com/argilla-io/distilabel/pull/540
* docs: update documentation for huggingface inference endpoints. by burtenshaw in https://github.com/argilla-io/distilabel/pull/539
* Remove `flash-attn` from `vllm` extra by alvarobartt in https://github.com/argilla-io/distilabel/pull/542
* Docs fix argilla imports by burtenshaw in https://github.com/argilla-io/distilabel/pull/541
* Fix not all exceptions being able to be pickled by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/543
* Update CLI example by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/544
* Check that `Step.name` doesn't contain dots or spaces by gabrielmbmb in https://github.com/argilla-io/distilabel/pull/545
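
The last entry above describes a simple naming rule: a step name must not contain dots or spaces. As a rough illustration of such a check, the sketch below uses a hypothetical `validate_step_name` function; the actual validation in distilabel may be implemented differently (for example, as a model validator), and this sketch also rejects other whitespace such as tabs, which is an assumption beyond what the entry states.

```python
import re

# Matches a dot or any whitespace character (assumption: tabs and newlines
# are treated like spaces).
_INVALID_CHARS = re.compile(r"[.\s]")


def validate_step_name(name: str) -> str:
    """Return `name` unchanged, or raise if it contains dots or whitespace.

    Hypothetical helper for illustration only; not a distilabel API.
    """
    if _INVALID_CHARS.search(name):
        raise ValueError(
            f"Step name {name!r} must not contain dots or spaces."
        )
    return name


if __name__ == "__main__":
    print(validate_step_name("generate_text"))
    for bad in ("my step", "parent.child"):
        try:
            validate_step_name(bad)
        except ValueError as err:
            print(err)
```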

New Contributors
* strickvl made their first contribution in https://github.com/argilla-io/distilabel/pull/451
* Wauplin made their first contribution in https://github.com/argilla-io/distilabel/pull/485
* BramVanroy made their first contribution in https://github.com/argilla-io/distilabel/pull/461
* bjoernpl made their first contribution in https://github.com/argilla-io/distilabel/pull/498
* rasdani made their first contribution in https://github.com/argilla-io/distilabel/pull/522

**Full Changelog**: https://github.com/argilla-io/distilabel/compare/0.6.0...1.0.0

0.6.0

What's Changed
* Fix typo in docstring of to_argilla metrics_ to metric_ by burtenshaw in https://github.com/argilla-io/distilabel/pull/334
* Implement a JSON responding OpenAI LLM as JSONOpenAILLM by burtenshaw in https://github.com/argilla-io/distilabel/pull/331
* Add examples for the deita paper tasks by plaguss in https://github.com/argilla-io/distilabel/pull/329
* Add checkpoint strategy to automatically push to hub by plaguss in https://github.com/argilla-io/distilabel/pull/321
* docs: update tutorials avoid argilla installation error by sdiazlor in https://github.com/argilla-io/distilabel/pull/337
* Fix `CustomDataset.load_from_disk` with `str`/`Path` objects by plaguss in https://github.com/argilla-io/distilabel/pull/341
* Clarify number of generations produced when using LLMPool in docs by davanstrien in https://github.com/argilla-io/distilabel/pull/339
* Refactor _build_dataset piece for speed by plaguss in https://github.com/argilla-io/distilabel/pull/344
* Fix documentation and type variables in `CustomDataset` checkpoint methods by plaguss in https://github.com/argilla-io/distilabel/pull/342
* US Spelling and other typo correction on Distilabel tutorials by ignacioct in https://github.com/argilla-io/distilabel/pull/324
* docs: add a tutorial for evolinstruct by sdiazlor in https://github.com/argilla-io/distilabel/pull/327
* Fix OpenAI API error with OpenAI-compatible providers by jphme in https://github.com/argilla-io/distilabel/pull/351
* Add fix for labels not returned by OpenAI API by plaguss in https://github.com/argilla-io/distilabel/pull/364
* Refactor model availability check in is_serverless_endpoint_available by davanstrien in https://github.com/argilla-io/distilabel/pull/363

New Contributors
* burtenshaw made their first contribution in https://github.com/argilla-io/distilabel/pull/334
* jphme made their first contribution in https://github.com/argilla-io/distilabel/pull/351

**Full Changelog**: https://github.com/argilla-io/distilabel/compare/0.5.0...0.6.0

0.5.0

What's Changed
* fix: Correct import error by plaguss in https://github.com/argilla-io/distilabel/pull/279
* fix: Filter examples for which len generations != len ratings by plaguss in https://github.com/argilla-io/distilabel/pull/284
* feat: Add sentence transformers support for the to argilla method by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/262
* feat: Add text descriptives support to the to argilla methods by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/271
* feat: Add `to_argilla` method to `EvolInstructTask` generated datasets by plaguss in https://github.com/argilla-io/distilabel/pull/291
* docs: Shorten titles tutorials and update core example by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/289
* feat: Add new serialization strategy by plaguss in https://github.com/argilla-io/distilabel/pull/288
* feat: Review `OllamaLLM` and `TogetherInferenceLLM` by alvarobartt in https://github.com/argilla-io/distilabel/pull/305
* refactor: Remove Metadata for Ratings by ignacioct in https://github.com/argilla-io/distilabel/pull/303
* docs: Add missing VertexAI information within `README.md` and `docs/index.md` by alvarobartt in https://github.com/argilla-io/distilabel/pull/308
* feat: Add functionality to push tasks to the HuggingFace hub and download them automatically. by plaguss in https://github.com/argilla-io/distilabel/pull/297
* feat: Add `ComplexityScorer` and `QualityScorer` tasks from Deita by plaguss in https://github.com/argilla-io/distilabel/pull/302
* fix: Fix logging visualization of labeller pipelines by plaguss in https://github.com/argilla-io/distilabel/pull/310
* feat: Add `Improving Text Embeddings with LLMs` tutorial by alvarobartt in https://github.com/argilla-io/distilabel/pull/313
* feat: Add `EvolComplexity` and `EvolQuality` by davidberenstein1957 in https://github.com/argilla-io/distilabel/pull/299
* feat: Add `validate_prompts` method to LLMs to help validating the prompts by plaguss in https://github.com/argilla-io/distilabel/pull/314
* fix: typo in clean an existing preference dataset by sdiazlor in https://github.com/argilla-io/distilabel/pull/312
* feat: Add new column for sft fine tuning with `prepare_dataset` by plaguss in https://github.com/argilla-io/distilabel/pull/309
* docs: Custom Task Documentation by ignacioct in https://github.com/argilla-io/distilabel/pull/275
* refactor: Align the `LLM` subclasses args by alvarobartt in https://github.com/argilla-io/distilabel/pull/315
* feat: Include rationale of the model responses on `prepare_dataset` if available by plaguss in https://github.com/argilla-io/distilabel/pull/317
* feat: Add embedding tutorial to docs by ignacioct in https://github.com/argilla-io/distilabel/pull/319
* feat: Add `MistralAILLM` by plaguss in https://github.com/argilla-io/distilabel/pull/293
* feat: Use `ollama` Python client within `OllamaLLM` by sdiazlor in https://github.com/argilla-io/distilabel/pull/307


**Full Changelog**: https://github.com/argilla-io/distilabel/compare/0.4.0...0.5.0
