⚡️ Introducing the dataset-first interface
We have removed the pipeline interface and redesigned the `Dataset` class. Datasets are still built from load components as before, but you now use the `Dataset` class instead of `Pipeline`.
```python
from fondant.dataset import Dataset

dataset = Dataset.create(
    "load_from_parquet",
    arguments={
        ...
    },
)
dataset = dataset.apply(...)
```
Additionally, datasets can now be initialized from the manifest of a previous workflow run. This makes Fondant datasets shareable: to share a dataset, share its manifest file.
```python
from fondant.dataset import Dataset

dataset = Dataset.read("gs://.../manifest.json")
dataset = dataset.apply(...)
```
**🛠️ Working directory**
Since the pipeline no longer exists, we added a new CLI option to define a working directory. All workflow-related artifacts are stored in the working directory.
```bash
fondant run local dataset --working-directory ./data
```
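If you launch runs from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of Fondant: `build_run_command` is a hypothetical helper, `"dataset"` is the placeholder reference from the command above, and resolving the working directory to an absolute path is an assumption made here so the command does not depend on the caller's current directory.

```python
import shlex
from pathlib import Path


def build_run_command(dataset_ref: str, working_directory: str) -> list[str]:
    """Build the argument list for `fondant run local ... --working-directory ...`.

    Hypothetical helper for illustration; it only mirrors the CLI shape shown
    in these release notes.
    """
    # Resolve to an absolute path so the result is independent of the caller's cwd.
    workdir = Path(working_directory).expanduser().resolve()
    return [
        "fondant", "run", "local", dataset_ref,
        "--working-directory", str(workdir),
    ]


if __name__ == "__main__":
    command = build_run_command("dataset", "./data")
    # Print the command for inspection. To execute it, pass the list to
    # subprocess.run(command, check=True).
    print(shlex.join(command))
```

Returning an argument list rather than a single string avoids shell quoting issues when the working directory contains spaces.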
**⚠️ Attention:**
Pipelines created with earlier Fondant versions are not compatible with Fondant >=0.12.0. To migrate an existing pipeline, initialize your dataset using `Dataset.create(...)` instead of `Pipeline.read(...)`, and pass the former `base_path` as the working directory when you materialize your dataset.
What's Changed
* Refactor pipeline interface by mrchtr in https://github.com/ml6team/fondant/pull/901
* Update dataset documentation by mrchtr in https://github.com/ml6team/fondant/pull/918
* Remove pipeline references by mrchtr in https://github.com/ml6team/fondant/pull/923
* Update documentation dataset first interface by mrchtr in https://github.com/ml6team/fondant/pull/921
* Empty produces leading into list index out of range by mrchtr in https://github.com/ml6team/fondant/pull/924
* Remove working directory from user arguments by mrchtr in https://github.com/ml6team/fondant/pull/925
* Fix navigation documentation by mrchtr in https://github.com/ml6team/fondant/pull/926
* Fix link in the README file by Philmod in https://github.com/ml6team/fondant/pull/930
* Update readme with dataset focus by GeorgesLorre in https://github.com/ml6team/fondant/pull/928
* Mount absolute path of working dir to local runner by mrchtr in https://github.com/ml6team/fondant/pull/931
* Fixing cicd by mrchtr in https://github.com/ml6team/fondant/pull/929
* Fix arch link in readme by GeorgesLorre in https://github.com/ml6team/fondant/pull/933
* Set session duration to 5h in prep release pipeline by mrchtr in https://github.com/ml6team/fondant/pull/934
New Contributors
* Philmod made their first contribution in https://github.com/ml6team/fondant/pull/930
**Full Changelog**: https://github.com/ml6team/fondant/compare/0.11.2...0.12.0