|>=1.14|>=1.14|NEW serialization format! |
By introducing a new serialization format (which you will see in the Flyte console as `msgpack`), Flyte enables you to leverage robust data structures without writing glue code or sacrificing accuracy or reliability.
Notebooks support ([5907](https://github.com/flyteorg/flyte/issues/5907))
You can now use Flyte from a Jupyter Notebook (or any other notebook) without resorting to a plugin. When you use FlyteRemote, Flyte automatically detects requests coming from a notebook environment and executes them accordingly, giving notebook users access to execution outputs, versioning, reproducibility, and all the infrastructure abstractions that Flyte provides.
Learn how it works in [this blog](https://medium.com/mecoli1219/develop-run-flyte-workflows-with-jupyter-notebook-163eaac2d363) by main contributor [mecoli1219](https://github.com/mecoli1219).
> Currently, dynamic workflows are not supported in Notebooks. This is a planned enhancement as part of the improved eager mode, coming out early next year.
Flyte now leverages asyncio to speed up executions ([2829](https://github.com/flyteorg/flytekit/pull/2829))
Both the type engine and the data persistence layer have been updated to support asynchronous, non-blocking I/O operations. These changes aim to improve the performance and scalability of I/O-bound operations. Examples include tasks that return large lists of FlyteFiles, which used to be serialized in batches but now benefit from better performance without any code changes.
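To give a feel for why non-blocking I/O helps when a task returns many files, here is a minimal sketch that uses only the standard library. It is not flytekit's implementation: `upload`, the `tracker` dictionary, and the `s3://example-bucket` destination are hypothetical stand-ins, and the network call is simulated with `asyncio.sleep`.

```python
import asyncio


async def upload(name: str, tracker: dict) -> str:
    """Hypothetical stand-in for a non-blocking file upload."""
    tracker["in_flight"] += 1
    tracker["peak"] = max(tracker["peak"], tracker["in_flight"])
    await asyncio.sleep(0.01)  # simulated network latency
    tracker["in_flight"] -= 1
    return f"s3://example-bucket/{name}"  # hypothetical destination


async def upload_all(names: list[str]) -> tuple[list[str], int]:
    tracker = {"in_flight": 0, "peak": 0}
    # gather() starts every upload before waiting and preserves input order.
    uris = await asyncio.gather(*(upload(n, tracker) for n in names))
    return list(uris), tracker["peak"]


uris, peak = asyncio.run(upload_all([f"file_{i}.txt" for i in range(5)]))
print(uris)
print(f"uploads in flight at the same time: {peak}")
```

Because all uploads wait on the network at the same time, the total time is close to that of the slowest single upload instead of the sum of all of them.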
Changed
Offloading of literals ([2872](https://github.com/flyteorg/flytekit/pull/2872))
Flyte automates data movement between tasks using gRPC as the communication protocol. When users need to move large amounts of data or use MapTasks that produce a large literal collection output, they typically hit a limit in the payload size gRPC can handle, getting an error like the following:
```
[LIMIT_EXCEEDED] limit exceeded. 2.903926mb > 2mb
```
This has forced users to split up MapTasks, refactor their workflows to offload outputs to a FlyteFile or FlyteDirectory rather than returning literal values directly, or bump the `storage.limits.maxDownloadMBs` parameter up to arbitrary sizes. All of these workarounds are inconvenient or hard to maintain.
For example, before upgrading flytekit, a simple workflow like the following:
```python
import flytekit as fl


@fl.task
def print_arrays(arr1: str) -> None:
    print(f"Array 1: {arr1}")


@fl.task
def increase_size_of_arrays(n: int) -> str:
    # Build a string of n * 1024 characters.
    arr1 = "a" * n * 1024
    return arr1


# Workflow: orchestrate the tasks
@fl.workflow
def simple_pipeline(n: int) -> int:
    arr1 = increase_size_of_arrays(n=n)
    print_arrays(arr1=arr1)  # pass task inputs as keyword arguments
    return 2


if __name__ == "__main__":
    print(f"Running simple_pipeline() {simple_pipeline(n=11000)}")
```
fails with the following message:
```
output is too large [11264029] bytes, max allowed [2097152] bytes
```
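The numbers in this message line up with the example workflow, as the short check below shows. The reading of the 29 leftover bytes as serialization overhead around the string is an assumption, not something the message states.

```python
KIB = 1024
n = 11000  # the value passed to simple_pipeline in the example

payload_bytes = len("a" * n * KIB)  # size of the string the task returns
limit_bytes = 2 * KIB * KIB         # the 2 MiB limit quoted in the message

print(payload_bytes)                # 11264000
print(limit_bytes)                  # 2097152
print(payload_bytes > limit_bytes)  # True
print(11264029 - payload_bytes)     # 29 bytes beyond the raw string
```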
`flytekit>=1.14` automatically offloads any object larger than 10 MB (the gRPC limit) to blob storage, so you can manage larger data and achieve higher degrees of parallelism while continuing to use literal values.
After upgrading to 1.14, the above example runs and the outputs are stored in the metadata bucket:
```
s3://my-s3-bucket/metadata/propeller/flytesnacks-development-af5xxxkcqzzmnjhv2n4r/n0/data/0/outputs.pb
```
This feature is enabled by default. If you need to turn it off, set `propeller.literalOffloadingConfigEnabled` to `false` in your Helm values.
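The size-based decision behind offloading can be pictured with the following sketch. It is not Flyte's implementation: the function names, the record layout, and the local directory standing in for blob storage are all hypothetical, and the threshold simply mirrors the 10 MB figure quoted above.

```python
import hashlib
import tempfile
from pathlib import Path

# Assumption: mirrors the 10 MB figure quoted above; the real threshold
# is a backend setting and is not defined by this sketch.
OFFLOAD_THRESHOLD_BYTES = 10 * 1024 * 1024


def store_output(payload: bytes, blob_dir: Path) -> dict:
    """Return the payload inline, or a reference to a file holding it."""
    if len(payload) <= OFFLOAD_THRESHOLD_BYTES:
        return {"inline": payload}
    # Content-addressed file name, so identical payloads map to one file.
    path = blob_dir / f"{hashlib.sha256(payload).hexdigest()}.bin"
    path.write_bytes(payload)
    return {"offloaded_path": str(path), "size_bytes": len(payload)}


def load_output(record: dict) -> bytes:
    """Resolve either form of record back to the original bytes."""
    if "inline" in record:
        return record["inline"]
    return Path(record["offloaded_path"]).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    blob_dir = Path(tmp)
    small = store_output(b"a" * 1024, blob_dir)
    large = store_output(b"a" * 11000 * 1024, blob_dir)
    roundtrip_ok = load_output(large) == b"a" * 11000 * 1024
    print("inline" in small, "offloaded_path" in large, roundtrip_ok)
```

The consumer only ever calls `load_output`, which is why a reader of the stored record needs read access to wherever the large payload was written.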
> The role you use to authenticate to your infrastructure provider will need to have read access to the metadata bucket so flytekit can retrieve the offloaded literal.
> This feature does not work when you use Flyte from a Jupyter Notebook, use fast registration (`pyflyte run`), or launch executions from the console. Support for these cases is a planned future enhancement.
Breaking
BatchSize is removed ([2857](https://github.com/flyteorg/flytekit/pull/2857))
This change affects MapTasks that relied on the `PickleTransformer` and the `BatchSize` class to optimize the serial uploading of big lists.
It was removed because the feature was not widely used and the asynchronous handling of pickles, introduced in this release, reduces the need for batching.
ArrayNode is not experimental anymore ([2900](https://github.com/flyteorg/flytekit/pull/2900))
Because [ArrayNode](https://docs.flyte.org/en/latest/user_guide/advanced_composition/map_tasks.html#arraynode) has been the default MapTask since flytekit 1.12, the feature is no longer located under `flytekit.experimental.arraynode`. Use it as a base import instead, like `flytekit.arraynode`.
Full changelog
* Fix array node map task for offloaded literal by pmahindrakar-oss in https://github.com/flyteorg/flytekit/pull/2772
* Support default label/annotation for the default launch plan creating from workflow definition by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2776
* [FlyteClient][FlyteDeck] Get Downloaded Artifact Signed URL via Data Proxy by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2777
* Expose Options in Flytekit for Direct User Access by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2785
* Adds a simple async utility that manages an async loop in another thread by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2784
* Adds a random DOCSEARCH_API_KEY to get monodocs build to succeed by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2787
* Binary IDL With MessagePack by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2760
* Pickle remote task for Jupyter Notebook Environment by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2733
* Fix getting started link, remove extra parenthesis by deepyaman in https://github.com/flyteorg/flytekit/pull/2788
* update bigquery plugin reqs by dansola in https://github.com/flyteorg/flytekit/pull/2790
* Related to flyteorg/flyte5805 [Flyte Deck] Extras has been added by 101rakibulhasan in https://github.com/flyteorg/flytekit/pull/2786
* Async type engine by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2752
* add support for mapping over remote launch plans by pvditt in https://github.com/flyteorg/flytekit/pull/2761
* Fixes boundary conditions for literal convertor by kumare3 in https://github.com/flyteorg/flytekit/pull/2596
* Fix assertion in test_type_engine_binary_idl by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2801
* Add unit test for pickling by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2805
* Update task.py by RaghavMangla in https://github.com/flyteorg/flytekit/pull/2791
* Instance generic empty case (2802) by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2807
* ensure a space is added if both args are set in ImageSpec by blaketastic2 in https://github.com/flyteorg/flytekit/pull/2806
* add links to register by dansola in https://github.com/flyteorg/flytekit/pull/2804
* Fix mypy errors caught in 1.11.2 by eapolinario in https://github.com/flyteorg/flytekit/pull/2808
* Fix dependabot alerts as of 2024-10-11 by eapolinario in https://github.com/flyteorg/flytekit/pull/2809
* Run active launchplan when available to launch, else run the latest one by kumare3 in https://github.com/flyteorg/flytekit/pull/2796
* Revise Pickle Remote Task for Jupyter Notebook Environment by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2799
* DOC-648 Add pages for Neptune and W&B plugins by neverett in https://github.com/flyteorg/flytekit/pull/2803
* [Flytekit] Envd builder with extra copy commands by mao3267 in https://github.com/flyteorg/flytekit/pull/2774
* More instance generic checks by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2813
* Read inputs file from stdin iff the last argument to `pyflyte run` is a dash by eapolinario in https://github.com/flyteorg/flytekit/pull/2814
* Add PERIAN agent docs by otarabai in https://github.com/flyteorg/flytekit/pull/2816
* Fix docs URL in pyflyte init by bryan-hunted in https://github.com/flyteorg/flytekit/pull/2822
* Add FLYTE_FAIL_ON_ERROR env to the databricks job by pingsutw in https://github.com/flyteorg/flytekit/pull/2819
* [BUG] `BQToPandasDecodingHandler` only reads partial data when dataset > 100MB by mao3267 in https://github.com/flyteorg/flytekit/pull/2789
* Cross type binding by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2820
* [BUG] fix is_optional_type to not return true for all union types by pvditt in https://github.com/flyteorg/flytekit/pull/2824
* Run tests on merges to release branches by eapolinario in https://github.com/flyteorg/flytekit/pull/2827
* Flytekitplugin pandera update: use entrypoint and structured dataset by cosmicBboy in https://github.com/flyteorg/flytekit/pull/2821
* Make sure user errors contain the entire chain of the stack trace by bgedik in https://github.com/flyteorg/flytekit/pull/2795
* Improved Type engine for generic types and performance by kumare3 in https://github.com/flyteorg/flytekit/pull/2815
* Supports importing modules in current path by kumare3 in https://github.com/flyteorg/flytekit/pull/2830
* Enable Resolve Attr Path for List or Dict of Promise by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2828
* Adds actual current working directory path by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2832
* Catch mistake in structured dataset by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2834
* Small change to clean up unit test. by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2835
* Fix tree printing by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2837
* handle case where error may not have args by blaketastic2 in https://github.com/flyteorg/flytekit/pull/2831
* Bump pyspark from 3.3.1 to 3.3.2 in /plugins/flytekit-greatexpectations by dependabot in https://github.com/flyteorg/flytekit/pull/2818
* Pull secrets from environment when running locally by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2800
* Support executing launchplans from CLI by kumare3 in https://github.com/flyteorg/flytekit/pull/2839
* Add top-level access to FlyteRemote, FlyteFile, and FlyteDirectory and convenience class methods for FlyteRemote by granthamtaylor in https://github.com/flyteorg/flytekit/pull/2836
* Config for_endpoint doesn't respect config file by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2843
* Union/enum handling by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2845
* update docs for FlyteRemote by granthamtaylor in https://github.com/flyteorg/flytekit/pull/2847
* add great_tables renderer by cosmicBboy in https://github.com/flyteorg/flytekit/pull/2846
* Restrict Dynamic Workflow for Interactive Mode by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2849
* Async/data persistence by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2829
* [TypeTransformer] Support frozen dataclasses by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2823
* add class methods, unit tests for flytefile and flytedirectory by granthamtaylor in https://github.com/flyteorg/flytekit/pull/2852
* Remove pickle batching by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2857
* Update comments in _make_dataclass_serializable by mao3267 in https://github.com/flyteorg/flytekit/pull/2856
* Added V5E tpu and slices to accelerators by pryce-turner in https://github.com/flyteorg/flytekit/pull/2838
* add `__hash__` method to `FlyteFile` to fix bug during interactive mode by granthamtaylor in https://github.com/flyteorg/flytekit/pull/2853
* Updated jupyter interaction by kumare3 in https://github.com/flyteorg/flytekit/pull/2858
* [Docs] Flytekit README link not working in the File an Issue section by 400Ping in https://github.com/flyteorg/flytekit/pull/2864
* Show traceback by default by pingsutw in https://github.com/flyteorg/flytekit/pull/2862
* Support Identifier in generate_console_url by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2868
* Support overriding node metadata for array node by pvditt in https://github.com/flyteorg/flytekit/pull/2865
* Fix Jupyter Versioning by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2866
* improved output handling in notebooks by kumare3 in https://github.com/flyteorg/flytekit/pull/2869
* Restrict Eager Task for Interactive Mode by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2871
* Async/Batching of coroutines by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2855
* fix enum type assertion with python versions less than 3.12 by dansola in https://github.com/flyteorg/flytekit/pull/2873
* Pydantic Transformer V2 by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2792
* Add support for ContainerTask in PERIAN agent + os-storage parameter by otarabai in https://github.com/flyteorg/flytekit/pull/2867
* Restrict Python Version Mismatch between Pickled Object and Remote Environment by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2848
* Default nb task resolver msg by cosmicBboy in https://github.com/flyteorg/flytekit/pull/2889
* [BUG] `Blob` `uri` isn't converted to `str` when source path is used as `uri` by JiangJiaWei1103 in https://github.com/flyteorg/flytekit/pull/2881
* Remove `_fix_structured_dataset_type` to deprecate python 3.8 by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2893
* Agent - missing type hint by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2896
* Map/setup exec by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2898
* Async/exists check should use async function by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2901
* [Client][API] get control plane version by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2874
* Remove array node map task from experimental by eapolinario in https://github.com/flyteorg/flytekit/pull/2900
* Make it easier to use commands with uv by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2897
* Kill the vscode server itself when resume the task by pingsutw in https://github.com/flyteorg/flytekit/pull/2890
* Disable pytest live logs by eapolinario in https://github.com/flyteorg/flytekit/pull/2905
* [MSGPACK IDL] Gate feature by setting ENV by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2894
* pod template inplace operations by dansola in https://github.com/flyteorg/flytekit/pull/2899
* Type Mismatching while Serializing Dataclass with Union by mao3267 in https://github.com/flyteorg/flytekit/pull/2859
* [Core feature] Flytekit should support `unsafe` mode for types by Mecoli1219 in https://github.com/flyteorg/flytekit/pull/2419
* Adds support for wait for execution with a configurable interval by kumare3 in https://github.com/flyteorg/flytekit/pull/2913
* [Housekeeping] stop support the python3.8 by Terryhung in https://github.com/flyteorg/flytekit/pull/2909
* Improve error message for nested tasks by pingsutw in https://github.com/flyteorg/flytekit/pull/2910
* Fix Flyte Types Upload Issues in Default Input by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2907
* Bump snowflake-connector-python from 3.12.1 to 3.12.3 by dependabot in https://github.com/flyteorg/flytekit/pull/2863
* Add memray plugin by fiedlerNr9 in https://github.com/flyteorg/flytekit/pull/2875
* [Test] Cleanup tmp dirs in test dir by JiangJiaWei1103 in https://github.com/flyteorg/flytekit/pull/2917
* [Test] Use context manager for auto tmp dirs cleanup by JiangJiaWei1103 in https://github.com/flyteorg/flytekit/pull/2922
* Make Flytefile and Flytedirectory's copilot local execution work correctly by wayner0628 in https://github.com/flyteorg/flytekit/pull/2887
* Copy ca-certificates from micromamba by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2923
* [Flytekit][BUG] Failed to serialize FlyteTypes within Union alongside other non-None variants by mao3267 in https://github.com/flyteorg/flytekit/pull/2918
* Decouple ray submitter, worker, and head resources by Sovietaced in https://github.com/flyteorg/flytekit/pull/2924
* Multiple error files support for entry point by bgedik in https://github.com/flyteorg/flytekit/pull/2797
* Add Flyte Backend Version to pyflyte info command by davidlin20dev in https://github.com/flyteorg/flytekit/pull/2938
* Update pytest skip condition for delete_on_close to Python 3.12 by davidlin20dev in https://github.com/flyteorg/flytekit/pull/2939
* [BUG] Support creation and reading of StructuredDataset with local or remote uri by JiangJiaWei1103 in https://github.com/flyteorg/flytekit/pull/2914
* [BUG] Convert protobuf to literal as remote exec by JiangJiaWei1103 in https://github.com/flyteorg/flytekit/pull/2925
* Offload literals by eapolinario in https://github.com/flyteorg/flytekit/pull/2872
* fix: catch API errors for tags as well as whole image repos by dylanspag-lmco in https://github.com/flyteorg/flytekit/pull/2945
* Bump apache-airflow from 2.10.2 to 2.10.3 in /plugins/flytekit-airflow by dependabot in https://github.com/flyteorg/flytekit/pull/2915
* Bump aiohttp from 3.9.4 to 3.10.11 in /plugins/flytekit-spark by dependabot in https://github.com/flyteorg/flytekit/pull/2936
* Move copy up in image spec by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2948
* Skip mashumaro 3.15 temporarily by eapolinario in https://github.com/flyteorg/flytekit/pull/2957
* Add support for refresh_token to device flow auth by Sovietaced in https://github.com/flyteorg/flytekit/pull/2947
* [BUG] Timezone difference in error timestamp unit tests by mao3267 in https://github.com/flyteorg/flytekit/pull/2953
* The container image of spark task should be immutable by pingsutw in https://github.com/flyteorg/flytekit/pull/2956
* [TypeEngine] Schema version priority for union dataclass comparison by mao3267 in https://github.com/flyteorg/flytekit/pull/2959
* chore: Replace print statement with logger.debug in type_engine.py by davidlin20dev in https://github.com/flyteorg/flytekit/pull/2966
* [Bug][TypeEngine] Unit tests for dataclass in union with more than two non-None variants by mao3267 in https://github.com/flyteorg/flytekit/pull/2952
* [BUG] Use models literal StructuredDataset to enable sd bypass task by JiangJiaWei1103 in https://github.com/flyteorg/flytekit/pull/2954
* Auto activating launchplans that have been marked to be auto activated by kumare3 in https://github.com/flyteorg/flytekit/pull/2968
* mashumaro>=3.15 by Future-Outlier in https://github.com/flyteorg/flytekit/pull/2970
* Add ImageSpec.from_env by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2895
* Update array node map task to support an additional plugin by pvditt in https://github.com/flyteorg/flytekit/pull/2934
* Adds experimental support for uv.lock in ImageSpec by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2929
* mount cache dir to the NIM model server container by samhita-alla in https://github.com/flyteorg/flytekit/pull/2965
* Skip clear-action-cache gh action on windows by eapolinario in https://github.com/flyteorg/flytekit/pull/2951
* Skip grpcio versions due to unwanted output by eapolinario in https://github.com/flyteorg/flytekit/pull/2977
* Set map task metadata only for subnode by pvditt in https://github.com/flyteorg/flytekit/pull/2979
* Skip socket files during fast registration by eapolinario in https://github.com/flyteorg/flytekit/pull/2980
* Add cache key metadata by eapolinario in https://github.com/flyteorg/flytekit/pull/2974
* Move digest computation by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2983
* Follow up to 2980 - Do not copy socket files by eapolinario in https://github.com/flyteorg/flytekit/pull/2984
* remove master idl by wild-endeavor in https://github.com/flyteorg/flytekit/pull/2988
* Use `sys.base_prefix` and `sys.prefix` to filter out modules in fast registry by thomasjpfan in https://github.com/flyteorg/flytekit/pull/2985
New Contributors
* deepyaman made their first contribution in https://github.com/flyteorg/flytekit/pull/2788
* 101rakibulhasan made their first contribution in https://github.com/flyteorg/flytekit/pull/2786
* RaghavMangla made their first contribution in https://github.com/flyteorg/flytekit/pull/2791
* blaketastic2 made their first contribution in https://github.com/flyteorg/flytekit/pull/2806
* bryan-hunted made their first contribution in https://github.com/flyteorg/flytekit/pull/2822
* 400Ping made their first contribution in https://github.com/flyteorg/flytekit/pull/2864
* JiangJiaWei1103 made their first contribution in https://github.com/flyteorg/flytekit/pull/2881
* Terryhung made their first contribution in https://github.com/flyteorg/flytekit/pull/2909
* Sovietaced made their first contribution in https://github.com/flyteorg/flytekit/pull/2924
* davidlin20dev made their first contribution in https://github.com/flyteorg/flytekit/pull/2938
**Full Changelog**: https://github.com/flyteorg/flytekit/compare/v1.13.7...v1.14.0