Fondant

Latest version: v1.0.0

0.6.0

Not secure
Highlights

- **Vertex AI is now supported as a backend for pipeline execution.**

Simply run `fondant run vertex <pipeline.py>` to submit your pipeline.
Run `fondant run vertex --help` to see the possible configuration options.

- **The reusable components are now available on DockerHub under the `fndnt` organization.**

DockerHub is more broadly supported than the GitHub Container Registry, which we used before.

- **Previously executed components are now cached when re-executed with the same arguments.**
- This makes it easier to iterate on the development of downstream components
- This allows you to resume a failed pipeline from the step that failed
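
As a rough illustration of argument-based caching, the sketch below derives a key from a component's name, image, and arguments, and skips execution when that key has been seen before. It is a minimal sketch under stated assumptions, not Fondant's actual implementation: `cache_key`, `run_pipeline`, the in-memory `cache` dict, and the step tuples are all hypothetical names invented for this example.

```python
import hashlib
import json


def cache_key(name, image, arguments, upstream_key=""):
    """Hash everything that defines a component run into a deterministic key.

    Hypothetical helper: the upstream key is folded in so that a change in an
    earlier step also invalidates every step after it.
    """
    payload = json.dumps(
        {"name": name, "image": image, "arguments": arguments, "upstream": upstream_key},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_pipeline(steps, cache):
    """Run steps in order and return the names of the steps that actually executed.

    `steps` is a list of (name, image, arguments, func) tuples and `cache` is a
    dict standing in for persistent storage. A step whose key is already in the
    cache is skipped.
    """
    executed = []
    upstream_key = ""
    for name, image, arguments, func in steps:
        key = cache_key(name, image, arguments, upstream_key)
        if key not in cache:
            # Results are stored only after func returns, so a step that raises
            # leaves all earlier steps cached and re-runnable from this point.
            cache[key] = func(**arguments)
            executed.append(name)
        upstream_key = key
    return executed
```

With this scheme, re-running an unchanged pipeline executes nothing, and changing the arguments of one step re-executes that step and everything downstream of it, which matches the two benefits listed above.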

- **Added the `fondant build` command, which lets you build fondant components easily.**

Run `fondant build <component_dir>`. Check `fondant build -h` for options.
The command will also update the image reference in the `fondant_component.yaml` to the newly built one.
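
For scripted workflows, the two CLI invocations named in these highlights (`fondant build <component_dir>` and `fondant run vertex <pipeline.py>`) could be wrapped from Python roughly as follows. This is a hedged sketch: the helper names and the `dry_run` switch are hypothetical, the paths in the usage note are placeholders, and any additional options the commands may need are not shown here (check `fondant build -h` and `fondant run vertex --help`).

```python
import subprocess


def build_component_cmd(component_dir):
    """Return the argv list for `fondant build <component_dir>`."""
    return ["fondant", "build", component_dir]


def run_vertex_cmd(pipeline_path):
    """Return the argv list for `fondant run vertex <pipeline.py>`."""
    return ["fondant", "run", "vertex", pipeline_path]


def execute(cmd, dry_run=True):
    """Print the command, or run it when dry_run is False.

    Running for real assumes the `fondant` CLI is installed and on PATH;
    check=True raises CalledProcessError if the command exits non-zero.
    """
    if dry_run:
        print(" ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)
```

For example, `execute(build_component_cmd("components/my_component"))` prints the build command without running it, where `components/my_component` is a placeholder directory.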

- **We migrated from KfP v1 to KfP v2**. This means:
- We now benefit from the latest KfP developments
- We compile fondant pipelines to the IR YAML format, which is supported by other execution engines such as Vertex
- You need a KfP v2 cluster to run fondant pipelines

Fixes

- Fix data explorer for usage on Windows
- Fix propagation of `client_kwargs` argument to configure Dask Client

Components

- Every reusable component now has a clear README describing its usage
- Add `load_from_parquet` component to load parquet files as input data
- Add `embed_text` component to embed documents and other text
- Add `chunk_text` component to chunk documents into passages
- Add `index_weaviate` component to index data in a weaviate vector store
- Fix issue with mixed type ids in LAION retrieval components
- Improve success rate of `download_images` component
- Fix OOM issues for inference components using GPU
- Limit data read by `load_from_hub` component to used columns

Detailed changes
* Add contribution segment by GeorgesLorre in https://github.com/ml6team/fondant/pull/463
* Update sample pipeline by mrchtr in https://github.com/ml6team/fondant/pull/464
* Update project description by RobbeSneyders in https://github.com/ml6team/fondant/pull/465
* Disable caching in the image retrieval sample pipeline by mrchtr in https://github.com/ml6team/fondant/pull/467
* Improve download images logs by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/466
* Add CC-25M announcement to docs by RobbeSneyders in https://github.com/ml6team/fondant/pull/468
* Update release announcements by mrchtr in https://github.com/ml6team/fondant/pull/471
* Add dataset link to press release by mrchtr in https://github.com/ml6team/fondant/pull/472
* Create load from parquet by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/474
* Fix caching writes by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/469
* Add caching dependency by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/479
* Add memory request and limit to components by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/482
* Improve hit rate of download images component by RobbeSneyders in https://github.com/ml6team/fondant/pull/470
* Cast id to string laion by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/485
* Bugfix partitioning by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/478
* Generate READMEs for all components using a script by RobbeSneyders in https://github.com/ml6team/fondant/pull/484
* Add component hub doc page by RobbeSneyders in https://github.com/ml6team/fondant/pull/487
* explorer small fix by Hakimovich99 in https://github.com/ml6team/fondant/pull/481
* Optimize GPU components by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/489
* Update Pillow to 10.0.1 to fix security issues by RobbeSneyders in https://github.com/ml6team/fondant/pull/493
* Update documentation regarding feedback by mrchtr in https://github.com/ml6team/fondant/pull/473
* Restructure-cli by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/488
* Add empty requirements.txt to load_from_parquet component by RobbeSneyders in https://github.com/ml6team/fondant/pull/504
* Use s3 client instead of http to access common crawl by mrchtr in https://github.com/ml6team/fondant/pull/501
* Fix run CLI by RobbeSneyders in https://github.com/ml6team/fondant/pull/507
* Migrate to KfpV2 by GeorgesLorre in https://github.com/ml6team/fondant/pull/477
* Remove abstract component test by mrchtr in https://github.com/ml6team/fondant/pull/510
* Only keep columns in produces by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/490
* Run black on components in pre-commit by RobbeSneyders in https://github.com/ml6team/fondant/pull/511
* Run bandit on components by RobbeSneyders in https://github.com/ml6team/fondant/pull/513
* Move container registry to DockerHub by RobbeSneyders in https://github.com/ml6team/fondant/pull/514
* Update component docs by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/516
* Vertex cli by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/519
* Refactor compile method for kfp and vertex by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/522
* Modify arg default by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/524
* Propagate `client_kwargs` argument and lower extract_images python version by RobbeSneyders in https://github.com/ml6team/fondant/pull/525
* Revert fsspec changes by mrchtr in https://github.com/ml6team/fondant/pull/523
* Add resource limits for Vertex by RobbeSneyders in https://github.com/ml6team/fondant/pull/529
* Update vertex and general docs by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/526
* Component/generate embeddings by tillwenke in https://github.com/ml6team/fondant/pull/520
* Add fondant build command by RobbeSneyders in https://github.com/ml6team/fondant/pull/527
* Fix explorer build script for DockerHub by RobbeSneyders in https://github.com/ml6team/fondant/pull/531
* Chunker component by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/528
* Update text embedding component by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/532
* Add IndexWeaviate component by tillwenke in https://github.com/ml6team/fondant/pull/521
* Build command: raise errors when pushing and make tag optional by RobbeSneyders in https://github.com/ml6team/fondant/pull/533
* Update component readmes by RobbeSneyders in https://github.com/ml6team/fondant/pull/538
* Add network argument to vertex runner by RobbeSneyders in https://github.com/ml6team/fondant/pull/537

New Contributors
* Hakimovich99 made their first contribution in https://github.com/ml6team/fondant/pull/481

**Full Changelog**: https://github.com/ml6team/fondant/compare/0.5.0...0.6.0

0.5.0

Not secure
What's Changed
* Small fixes explorer by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/446
* Add guides by GeorgesLorre in https://github.com/ml6team/fondant/pull/445
* Image retrieval sample pipeline by mrchtr in https://github.com/ml6team/fondant/pull/441
* Update readme by mrchtr in https://github.com/ml6team/fondant/pull/459
* Convert readme to html by mrchtr in https://github.com/ml6team/fondant/pull/460
* Update roadmap in readme by GeorgesLorre in https://github.com/ml6team/fondant/pull/462
* Bugfix/sample pipeline cc 25m by shayorshay in https://github.com/ml6team/fondant/pull/461


**Full Changelog**: https://github.com/ml6team/fondant/compare/0.4.0...0.5.0

0.4.0

Not secure
What's Changed
* Add missing nodepool label by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/389
* Implement caching by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/387
* Preserve divisions when writing and reading by RobbeSneyders in https://github.com/ml6team/fondant/pull/391
* Add commoncrawl pipeline that starts from warc paths by RobbeSneyders in https://github.com/ml6team/fondant/pull/392
* Standarize fsspec file access by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/397
* Correct default pipeline output by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/399
* Add output default to cli runner by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/405
* Update caching strategy by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/407
* [DataComp] Add T-MARS by NielsRogge in https://github.com/ml6team/fondant/pull/374
* Update image embedding component by NielsRogge in https://github.com/ml6team/fondant/pull/428
* Improve commoncrawl components by RobbeSneyders in https://github.com/ml6team/fondant/pull/403
* Detect pipeline attribute during compile/run by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/398
* Add option to setup preemptible VMs by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/408
* change meta estimation by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/409
* Update custom_component.md by tillwenke in https://github.com/ml6team/fondant/pull/425
* Handle different base paths explorer by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/427
* Incorporate dask client by mrchtr in https://github.com/ml6team/fondant/pull/410
* Set dask local scheduler as default by mrchtr in https://github.com/ml6team/fondant/pull/438
* Use auto_mkdir in fs_open calls by mrchtr in https://github.com/ml6team/fondant/pull/442

New Contributors
* tillwenke made their first contribution in https://github.com/ml6team/fondant/pull/425

**Full Changelog**: https://github.com/ml6team/fondant/compare/0.3.2...0.4.0

0.3.2

Not secure
What's Changed
* Add `index_column` and unique index creation to load_from_hf_hub component by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/345
* Add method to estimate caching key by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/318
* Adjust bandit settings by mrchtr in https://github.com/ml6team/fondant/pull/360
* Hide executor from users by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/362
* Create separate class for metadata by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/372
* Modify kfp command by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/378
* Modify gitignore by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/379
* Bugfix partitions Load from Hub by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/380
* Enable GPU for local runner by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/377
* Strip url in download_images before downloading by RobbeSneyders in https://github.com/ml6team/fondant/pull/383
* Redesign base path file structure by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/373
* [Datacomp] Add clean_captions and filter_clip_score components by alexanderremmerie in https://github.com/ml6team/fondant/pull/381
* Add disable caching argument by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/320
* Fix missing slash manifest evolution by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/385
* Set image pull policy to always by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/386
* Bugfix data-explorer images by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/382

New Contributors
* alexanderremmerie made their first contribution in https://github.com/ml6team/fondant/pull/381

**Full Changelog**: https://github.com/ml6team/fondant/compare/0.3.1...0.3.2

0.3.1

Not secure
What's Changed
* Add kfp constraint by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/341
* [DataComp] Update pipeline name, remove DockerCompiler by NielsRogge in https://github.com/ml6team/fondant/pull/340
* Deactivate dask string conversion by RobbeSneyders in https://github.com/ml6team/fondant/pull/349
* [Commoncrawl pipeline] Add component download_commoncrawl_segments by shayorshay in https://github.com/ml6team/fondant/pull/273
* Add kfp compiler by GeorgesLorre in https://github.com/ml6team/fondant/pull/291
* Remove node_pool_name arguments in examples by RobbeSneyders in https://github.com/ml6team/fondant/pull/350
* Small improvements to tox configuration by RobbeSneyders in https://github.com/ml6team/fondant/pull/343
* [CommonCrawl pipeline] Improve html extraction by mrchtr in https://github.com/ml6team/fondant/pull/351
* [DataComp] Add download images component by NielsRogge in https://github.com/ml6team/fondant/pull/348
* Add AWS credential arguments to commoncrawl download components. by mrchtr in https://github.com/ml6team/fondant/pull/353
* [LLM pipeline] Update text normalization component by mrchtr in https://github.com/ml6team/fondant/pull/335
* Remove output_partition_size argument and logic by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/355
* [Commoncrawl pipeline] Add metadata for target_language by shayorshay in https://github.com/ml6team/fondant/pull/357
* [Commoncrawl pipeline] Add offset to load component by shayorshay in https://github.com/ml6team/fondant/pull/358
* Remove obsolete args from ComponentOp by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/356
* Implement kfp runner with tests by GeorgesLorre in https://github.com/ml6team/fondant/pull/359
* Make download_component concurrent by RobbeSneyders in https://github.com/ml6team/fondant/pull/354
* Define kfp as extra and update error messages by GeorgesLorre in https://github.com/ml6team/fondant/pull/361
* Expand cli to support kfp compiling and running by GeorgesLorre in https://github.com/ml6team/fondant/pull/366
* Update docs with the new CLI commands by GeorgesLorre in https://github.com/ml6team/fondant/pull/370
* Update test setup of text_normalization component by RobbeSneyders in https://github.com/ml6team/fondant/pull/369


**Full Changelog**: https://github.com/ml6team/fondant/compare/0.3.0...0.3.1

0.3.0

Not secure
What's Changed
* [DataComp] Add cluster component by NielsRogge in https://github.com/ml6team/fondant/pull/239
* Enable building specified components by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/265
* Order output columns in PandasTransformComponent by RobbeSneyders in https://github.com/ml6team/fondant/pull/276
* Always pull images in local runner by RobbeSneyders in https://github.com/ml6team/fondant/pull/279
* Fix test warnings by RobbeSneyders in https://github.com/ml6team/fondant/pull/280
* Large scale controlnet by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/260
* Make components cloud agnostic by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/281
* Bump jsonschema version to 4.18.0 by RobbeSneyders in https://github.com/ml6team/fondant/pull/284
* Run tests against fondant package with tox by RobbeSneyders in https://github.com/ml6team/fondant/pull/283
* [LLM pipeline] Add filter out short texts component by mrchtr in https://github.com/ml6team/fondant/pull/247
* Fix running tox on the inferior OS by GeorgesLorre in https://github.com/ml6team/fondant/pull/287
* Update getting_started.md by janvanlooy in https://github.com/ml6team/fondant/pull/286
* Add defaults to components by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/289
* Remove obsolete packages by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/293
* Update pre-commit config with new folder structure by GeorgesLorre in https://github.com/ml6team/fondant/pull/294
* Add fsspec as explicit dependency by RobbeSneyders in https://github.com/ml6team/fondant/pull/299
* Revert src/fondant/components after testing with tox by RobbeSneyders in https://github.com/ml6team/fondant/pull/298
* Don't use from_registry for generic components by RobbeSneyders in https://github.com/ml6team/fondant/pull/285
* [LLM pipeline] MinHash generation for deduplication by mrchtr in https://github.com/ml6team/fondant/pull/295
* Split component implementation and execution by RobbeSneyders in https://github.com/ml6team/fondant/pull/302
* Bugfix default 0 values by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/304
* Update script to work with macos by GeorgesLorre in https://github.com/ml6team/fondant/pull/308
* Bugfix: Data explorer local runner usage by mrchtr in https://github.com/ml6team/fondant/pull/307
* Add --build-arg argument to compile and run commands by RobbeSneyders in https://github.com/ml6team/fondant/pull/306
* Bugfix: data explorer artifact mounting by mrchtr in https://github.com/ml6team/fondant/pull/310
* [Commoncrawl pipeline] Add component extract free-to-use images by shayorshay in https://github.com/ml6team/fondant/pull/282
* Introduce repartitioning by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/309
* Bugfix/partitioning by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/312
* Add code for reusable load from files component 290 by satishjasthi in https://github.com/ml6team/fondant/pull/296
* Unify manifest save path by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/322
* Bugfix basepath by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/324
* Add test cases for caption_images component and fixed bug in this com… by satishjasthi in https://github.com/ml6team/fondant/pull/311
* Remove local images in build script to conserve space by GeorgesLorre in https://github.com/ml6team/fondant/pull/326
* Change base image to smaller version by GeorgesLorre in https://github.com/ml6team/fondant/pull/330
* [Scripts] Fix build_components by NielsRogge in https://github.com/ml6team/fondant/pull/332
* Change subset merging method by PhilippeMoussalli in https://github.com/ml6team/fondant/pull/334
* Add node pool label by shayorshay in https://github.com/ml6team/fondant/pull/327
* Update docs link to stable version by RobbeSneyders in https://github.com/ml6team/fondant/pull/336
* Add int64 dtype by NielsRogge in https://github.com/ml6team/fondant/pull/338
* [load_from_hf_hub] Add dataset_length, set_index by NielsRogge in https://github.com/ml6team/fondant/pull/339

New Contributors
* janvanlooy made their first contribution in https://github.com/ml6team/fondant/pull/286
* satishjasthi made their first contribution in https://github.com/ml6team/fondant/pull/296

**Full Changelog**: https://github.com/ml6team/fondant/compare/0.2.1...0.3.0
