Dagster



1.0.0

Major Changes

- A docs site overhaul! Along with tons of additional content, the existing pages have been significantly edited and reorganized to improve readability.
- All Dagster [examples](https://github.com/dagster-io/dagster/tree/master/examples) have been revamped with a consistent project layout, descriptive names, and more helpful README files.
- A new `dagster project` CLI contains commands for bootstrapping new Dagster projects and repositories:
  - `dagster project scaffold` creates a folder structure with a single Dagster repository and other files such as `workspace.yaml`. This CLI enables you to quickly start building a new Dagster project with everything set up.
  - `dagster project from-example` downloads one of the Dagster examples. This CLI helps you quickly bootstrap your project with an officially maintained example. You can find the available examples via `dagster project list-examples`.
  - Check out [Create a New Project](https://docs.dagster.io/getting-started/create-new-project) for more details.
- A `default_executor_def` argument has been added to the `repository` decorator. If specified, this will be used for any jobs (asset or op) which do not explicitly set an `executor_def`.
- A `default_logger_defs` argument has been added to the `repository` decorator, which works in the same way as `default_executor_def`. See the first sketch after this list.
- A new `execute_job` function presents a Python API for kicking off runs of your jobs; see the second sketch after this list.
- Run status sensors may now yield `RunRequests`, allowing you to kick off a job in response to the status of another job.
- When loading an upstream asset or op output as an input, you can now set custom loading behavior using the `input_manager_key` argument to AssetIn and In.
- In the UI, the global lineage graph has been brought back and reworked! The graph keeps assets in the same group visually clustered together, and the query bar allows you to visualize a custom slice of your asset graph.
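
As a quick illustration of the new repository-level defaults, here is a minimal sketch; the asset, job, and repository names are hypothetical, and `in_process_executor` is just one possible default:

```python
from dagster import asset, define_asset_job, in_process_executor, repository

@asset
def my_asset():
    return 1

# This job does not set executor_def, so the repository default applies.
my_job = define_asset_job("my_job", selection="*")

@repository(default_executor_def=in_process_executor)
def my_repo():
    # default_logger_defs works the same way, taking a dict of logger definitions.
    return [my_asset, my_job]
```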
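
And a second sketch showing `execute_job`; the op and job are hypothetical, and `DagsterInstance.get()` assumes `DAGSTER_HOME` points at a persistent instance:

```python
from dagster import DagsterInstance, execute_job, job, op, reconstructable

@op
def say_hello():
    return "hello"

@job
def hello_job():
    say_hello()

if __name__ == "__main__":
    instance = DagsterInstance.get()  # requires DAGSTER_HOME to be set
    result = execute_job(reconstructable(hello_job), instance=instance)
    assert result.success
```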

Breaking Changes and Deprecations

Legacy API Removals

In 1.0.0, a large number of previously-deprecated APIs have been fully removed. A full list of breaking changes and deprecations, alongside instructions on how to migrate older code, can be found in [MIGRATION.md](https://github.com/dagster-io/dagster/blob/master/MIGRATION.md). At a high level:

- The `solid` and `pipeline` APIs have been removed, along with references to them in extension libraries, arguments, and the CLI _(deprecated in `0.13.0`)_.
- The `AssetGroup` and `build_asset_job` APIs, and a host of deprecated arguments to asset-related functions, have been removed _(deprecated in `0.15.0`)_.
- The `EventMetadata` and `EventMetadataEntryData` APIs have been removed _(deprecated in `0.15.0`)_.

Deprecations

- `dagster_type_materializer` and `DagsterTypeMaterializer` have been marked experimental and will likely be removed within a 1.x release. Instead, use an `IOManager`.
- `FileManager` and `FileHandle` have been marked experimental and will likely be removed within a 1.x release.

Other Changes

- As of 1.0.0, Dagster no longer guarantees support for Python 3.6. This is in line with [PEP 494](https://peps.python.org/pep-0494/), which outlines that 3.6 has reached end of life.
- **[planned]** In an upcoming 1.x release, we plan to make a change that renders values supplied to `configured` in Dagit. Up through this point, values provided to `configured` have not been sent anywhere outside the process where they were used. This change will mean that, like other places you can supply configuration, `configured` is not a good place to put secrets: **You should not include any values in configuration that you don't want to be stored in the Dagster database and displayed inside Dagit.**
- `fs_io_manager`, `s3_pickle_io_manager`, `gcs_pickle_io_manager`, and `adls_pickle_io_manager` no longer write out a file or object when handling an output with the `None` or `Nothing` type.
- The `custom_path_fs_io_manager` has been removed, as its functionality is entirely subsumed by the `fs_io_manager`, where a custom path can be specified via config.
- The default `typing_type` of a `DagsterType` is now `typing.Any` instead of `None`.
- Dagster’s integration libraries haven’t yet achieved the same API maturity as Dagster core. For this reason, all integration libraries will remain on a pre-1.0 (0.16.x) versioning track for the time being. However, 0.16.x library releases remain fully compatible with Dagster 1.x. In the coming months, we will graduate integration libraries one-by-one to the 1.x versioning track as they achieve API maturity. If you have installs of the form:

  ```
  pip install dagster=={DAGSTER_VERSION} dagster-somelibrary=={DAGSTER_VERSION}
  ```

  this should be converted to:

  ```
  pip install dagster=={DAGSTER_VERSION} dagster-somelibrary
  ```

  to make sure the correct library version is installed.

0.15.8

New

- Software-defined asset config schemas are no longer restricted to `dict`s.
- The `OpDefinition` constructor now accepts `ins` and `outs` arguments, to make direct construction easier (see the sketch after this list).
- `define_dagstermill_op` accepts `ins` and `outs` in order to make direct construction easier.
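
A rough sketch of direct construction with the new arguments, assuming the directly-supplied `compute_fn` receives a context plus a dict of input values and yields `Output` events:

```python
from dagster import In, OpDefinition, Out, Output

def _add_one(_context, inputs):
    # Directly constructed ops yield Output events rather than returning values.
    yield Output(inputs["num"] + 1, output_name="result")

add_one = OpDefinition(
    name="add_one",
    ins={"num": In(int)},
    outs={"result": Out(int)},
    compute_fn=_add_one,
)
```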

Bugfixes

- Fixed a bug where default configuration was not applied when assets were selected for materialization in Dagit.
- Fixed a bug where `RunRequests` returned from `run_status_sensors` caused the sensor to error.
- When supplying config to `define_asset_job`, an error would occur when selecting most asset subsets. This has been fixed.
- Fixed an error introduced in 0.15.7 that would prevent viewing the execution plan for a job re-execution from 0.15.0 → 0.15.6.
- [dagit] The Dagit server now returns `500` http status codes for GraphQL requests that encountered an unexpected server error.
- [dagit] Fixed a bug that made it impossible to kick off materializations of partitioned asset if the `day_offset`, `hour_offset`, or `minute_offset` parameters were set on the asset’s partitions definition.
- [dagster-k8s] Fixed a bug where overriding the Kubernetes command to use to run a Dagster job by setting the `dagster-k8s/config` didn’t actually override the command.
- [dagster-datahub] Pinned version of `acryl-datahub` to avoid build error.

Breaking Changes

- The constructor of `JobDefinition` objects now accepts a `config` argument, and the `preset_defs` argument has been removed.

Deprecations

- `DagsterPipelineRunMetadataValue` has been renamed to `DagsterRunMetadataValue`. `DagsterPipelineRunMetadataValue` will be removed in 1.0.

Community Contributions

- Thanks to hassen-io for fixing a broken link in the docs!

Documentation

- `MetadataEntry` static methods are now marked as deprecated in the docs.
- `PartitionMapping`s are now included in the API reference.
- A dbt example and memoization example using legacy APIs have been removed from the docs site.

0.15.7

New

- `DagsterRun` now has a `job_name` property, which should be used instead of `pipeline_name`.
- `TimeWindowPartitionsDefinition` now has a `get_partition_keys_in_range` method which returns a sequence of all the partition keys between two partition keys.
- `OpExecutionContext` now has `asset_partitions_def_for_output` and `asset_partitions_def_for_input` methods.
- Dagster now errors immediately with an informative message when two `AssetsDefinition` objects with the same key are provided to the same repository.
- `build_output_context` now accepts a `partition_key` argument that can be used when testing the `handle_output` method of an IO manager.
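
For instance, a minimal sketch of exercising a partition-aware `handle_output` in a test, using a hypothetical IO manager:

```python
from dagster import IOManager, build_output_context

class PartitionedIOManager(IOManager):
    def handle_output(self, context, obj):
        # Hypothetical behavior: key the write by the partition being handled.
        print(f"storing {obj!r} under partition {context.partition_key}")

    def load_input(self, context):
        raise NotImplementedError

def test_handle_output_with_partition():
    context = build_output_context(partition_key="2022-07-01")
    PartitionedIOManager().handle_output(context, {"rows": 3})
```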

Bugfixes

- Fixed a bug that made it impossible to load inputs using a `DagsterTypeLoader` if the `InputDefinition` had an `asset_key` set.
- Ops created with the `asset` and `multi_asset` decorators no longer have a top-level “assets” entry in their config schema. This entry was unused.
- In 0.15.6, a bug was introduced that made it impossible to load repositories if assets that had non-standard metadata attached to them were present. This has been fixed.
- [dagster-dbt] In some cases, using `load_assets_from_dbt_manifest` with a `select` parameter that included sources would result in an error. This has been fixed.
- [dagit] Fixed an error where a race condition of a sensor/schedule page load and the sensor/schedule removal caused a GraphQL exception to be raised.
- [dagit] The “Materialize” button no longer changes to “Rematerialize” in some scenarios.
- [dagit] The live overlays on asset views, showing latest materialization and run info, now load faster.
- [dagit] Typing whitespace into the launchpad YAML editor no longer causes execution to fail to start.
- [dagit] The explorer sidebar no longer displays “mode” label and description for jobs, since modes are deprecated.

Community Contributions

- An error will now be raised if a `repository` decorated function expects parameters. Thanks roeij!

Documentation

- The non-asset version of the Hacker News example, which lived inside `examples/hacker_news/`, has been removed, because it hadn’t received updates in a long time and had drifted from best practices. The asset version is still there and has an updated README. Check it out [here](https://github.com/dagster-io/dagster/tree/master/examples/hacker_news_assets).

0.15.6

New

- When an exception is wrapped by another exception and raised within an op, Dagit will now display the full chain of exceptions, instead of stopping after a single exception level.
- A `default_logger_defs` argument has been added to the `repository` decorator. Check out [the docs](https://docs.dagster.io/concepts/logging/loggers#specifying-default-repository-loggers) on specifying default loggers to learn more.
- `AssetsDefinition.from_graph` and `AssetsDefinition.from_op` now both accept a `partition_mappings` argument.
- `AssetsDefinition.from_graph` and `AssetsDefinition.from_op` now both accept a `metadata_by_output_name` argument.
- `define_asset_job` now accepts an `executor_def` argument.
- Removed package pin for `gql` in `dagster-graphql`.
- You can now apply a group name to assets produced with the `multi_asset` decorator, either by supplying a `group_name` argument (which will apply to all of the output assets), or by setting the `group_name` argument on individual `AssetOut`s (see the sketch after this list).
- `InputContext` and `OutputContext` now each have an `asset_partitions_def` property, which returns the `PartitionsDefinition` of the asset that’s being loaded or stored.
- `build_schedule_from_partitioned_job` now raises a more informative error when provided a non-partitioned asset job.
- `PartitionMapping`, `IdentityPartitionMapping`, `AllPartitionMapping`, and `LastPartitionMapping` are exposed at the top-level `dagster` package. They're currently marked experimental.
- When a non-partitioned asset depends on a partitioned asset, you can now control which partitions of the upstream asset are used by the downstream asset, by supplying a `PartitionMapping`.
- You can now set `PartitionMappings` on `AssetIn`.
- [dagit] Made performance improvements to the loading of the partitions and backfill pages.
- [dagit] The Global Asset Graph is back by popular demand, and can be reached via a new “View global asset lineage” link on asset group and asset catalog pages! The global graph keeps assets in the same group visually clustered together and the query bar allows you to visualize a custom slice of your asset graph.
- [dagit] Simplified the Content Security Policy and removed `frame-ancestors` restriction.
- [dagster-dbt] `load_assets_from_dbt_project` and `load_assets_from_dbt_manifest` now support a `node_info_to_group_name_fn` parameter, allowing you to customize which group Dagster will assign each dbt asset to.
- [dagster-dbt] When you supply a `runtime_metadata_fn` when loading dbt assets, this metadata is added to the default metadata that dagster-dbt generates, rather than replacing it entirely.
- [dagster-dbt] When you load dbt assets with `use_build_command=True`, seeds and snapshots will now be represented as Dagster assets. Previously, only models would be loaded as assets.
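
Regarding the `multi_asset` group names mentioned above, a minimal sketch of both styles; the asset and group names are hypothetical:

```python
from dagster import AssetOut, multi_asset

# Option 1: one group name applied to every output asset.
@multi_asset(
    group_name="ecommerce",
    outs={"orders": AssetOut(), "users": AssetOut()},
)
def ecommerce_assets():
    return 1, 2

# Option 2: set group names on individual outputs.
@multi_asset(
    outs={
        "clicks": AssetOut(group_name="events"),
        "pageviews": AssetOut(group_name="events"),
    },
)
def event_assets():
    return 1, 2
```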

Bugfixes

- Fixed an issue where runs that were launched using the `DockerRunLauncher` would sometimes use Dagit’s Python environment as the entrypoint to launch the run, even if that environment did not exist in the container.
- Dagster no longer raises a “Duplicate definition found” error when a schedule definition targets a partitioned asset job.
- Silenced some erroneous warnings that arose when using software-defined assets.
- When returning multiple outputs as a tuple, empty list values no longer cause unexpected exceptions.
- [dagit] Fixed an issue with graph-backed assets causing a GraphQL error when graph inputs were type-annotated.
- [dagit] Fixed an issue where attempting to materialize graph-backed assets caused a GraphQL error.
- [dagit] Fixed an issue where partitions could not be selected when materializing partitioned assets with associated resources.
- [dagit] Attempting to materialize assets with required resources now only presents the launchpad modal if at least one resource defines a config schema.

Breaking Changes

- An op with a non-optional `DynamicOutput` will now error if no outputs are returned or yielded for that dynamic output.
- If an `Output` object is used to type annotate the return of an op, an Output object must be returned or an error will result.
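
For example, a minimal sketch of an op whose return is annotated with `Output` and therefore must return one:

```python
from dagster import Output, op

@op
def annotated_op() -> Output[int]:
    # Returning a bare int here would now raise an error, because the
    # annotation promises an Output object.
    return Output(5)
```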

Community Contributions

- Dagit now displays the path of the output handled by `PickledObjectS3IOManager` in run logs and Asset view. Thanks danielgafni!

Documentation

- The Hacker News example now uses stable 0.15+ asset APIs, instead of the deprecated 0.14.x asset APIs.
- Fixed the build command in the instructions for contributing docs changes.
- [dagster-dbt] The dagster-dbt integration guide now contains information on using dbt with Software-Defined Assets.

0.15.5

New

- Added documentation and helm chart configuration for threaded sensor evaluations.
- Added documentation and helm chart configuration for tick retention policies.
- Added descriptions for default config schema. Fields like `execution`, `loggers`, `ops`, and `resources` are now documented.
- `UnresolvedAssetJob` objects can now be passed to run status sensors.
- [dagit] A new global asset lineage view, linked from the Asset Catalog and Asset Group pages, allows you to view a graph of assets in all loaded asset groups and filter by query selector and repo.
- [dagit] A new option on Asset Lineage pages allows you to choose how many layers of the upstream / downstream graph to display.
- [dagit] Dagit's DAG view now collapses large sets of edges between the same ops for improved readability and rendering performance.

Bugfixes

- Fixed a bug with `materialize` that would cause required resources to not be applied correctly.
- Fixed issue that caused repositories to fail to load when `build_schedule_from_partitioned_job` and `define_asset_job` were used together.
- Fixed a bug that caused auto run retries to always use the `FROM_FAILURE` strategy.
- Previously, it was possible to construct Software-Defined Assets from graphs whose leaf ops were not mapped to assets. This is invalid, as these ops are not required for the production of any assets, and would cause confusing behavior or errors on execution. This will now result in an error at definition time, as intended.
- Fixed issue where the run monitoring daemon could mark completed runs as failed if they transitioned quickly between STARTING and SUCCESS status.
- Fixed stability issues with the sensor daemon introduced in 0.15.3 that caused the daemon to fail heartbeat checks if the sensor evaluation took too long.
- Fixed issues with the thread pool implementation of the sensor daemon where race conditions caused the sensor to fire more frequently than the minimum interval.
- Fixed an issue with storage implementations using MySQL server version 5.6 which caused SQL syntax exceptions to surface when rendering the Instance overview pages in Dagit.
- Fixed a bug with the `default_executor_def` argument on repository where asset jobs that defined executor config would result in errors.
- Fixed a bug where an erroneous exception would be raised if an empty list was returned for a list output of an op.
- [dagit] Clicking the "Materialize" button for assets with configurable resources will now present the asset launchpad.
- [dagit] If you have an asset group and no jobs, Dagit will display it by default rather than directing you to the asset catalog.
- [dagit] DAG renderings of software-defined assets now display only the last component of the asset's key for improved readability.
- [dagit] Fixed a regression where clicking on a source asset would trigger a GraphQL error.
- [dagit] Fixed an issue where the “Unloadable” section on the sensors / schedules pages in Dagit was populated erroneously with loadable sensors and schedules.
- [dagster-dbt] Fixed an issue where an exception would be raised when using the dbt build command with Software-Defined Assets if a test was defined on a source.

Deprecations

- Removed the deprecated `dagster-daemon health-check` CLI command.

Community Contributions

- TimeWindow is now exported from the dagster package (Thanks [nvinhphuc](https://github.com/nvinhphuc)!)
- Added a fix to allow customization of slack messages (Thanks [solarisa21](https://github.com/solarisa21)!)
- [dagster-databricks] The `databricks_pyspark_step_launcher` now allows you to configure the following (Thanks [Phazure](https://github.com/Phazure)!):
  - the `aws_attributes` of the cluster that will be spun up for the step.
  - arbitrary environment variables to be copied over to Databricks from the host machine, rather than requiring these variables to be stored as secrets.
  - job and cluster permissions, allowing users to view the completed runs through the Databricks console, even if they’re kicked off by a service account.

Experimental

- [dagster-k8s] Added `k8s_job_op` to launch a Kubernetes Job with an arbitrary image and CLI command. This is in contrast with the `k8s_job_executor`, which runs each Dagster op in a Dagster job in its own k8s job. This op may be useful when you need to orchestrate a command that isn't a Dagster op (or isn't written in Python). Usage:

  ```python
  from dagster_k8s import k8s_job_op

  my_k8s_op = k8s_job_op.configured(
      {
          "image": "busybox",
          "command": ["/bin/sh", "-c"],
          "args": ["echo HELLO"],
      },
      name="my_k8s_op",
  )
  ```


- [dagster-dbt] The dbt asset-loading functions now support `partitions_def` and `partition_key_to_vars_fn` parameters, adding preliminary support for partitioned dbt assets. To learn more, check out the [Github issue](https://github.com/dagster-io/dagster/issues/7683#issuecomment-1175593637)!
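
A rough sketch of what partitioned dbt assets might look like with these parameters; the project paths and the `run_date` dbt var are hypothetical, and the feature is experimental:

```python
from dagster import DailyPartitionsDefinition
from dagster_dbt import load_assets_from_dbt_project

dbt_assets = load_assets_from_dbt_project(
    project_dir="path/to/dbt_project",
    profiles_dir="path/to/profiles",
    partitions_def=DailyPartitionsDefinition(start_date="2022-06-01"),
    # Hypothetical dbt var: maps each partition key to vars passed to dbt.
    partition_key_to_vars_fn=lambda partition_key: {"run_date": partition_key},
)
```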

0.15.4

- Reverted sensor threadpool changes from 0.15.3 to address daemon stability issues.
