Major Changes
- Software-defined assets are now marked fully stable and are ready for prime time - we recommend using them whenever your goal in using Dagster is to build and maintain data assets.
- You can now organize software-defined assets into groups by providing a `group_name` on your asset definition. These assets will be grouped together in Dagit.
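For example (a minimal sketch; the asset and group names are illustrative):
```python
from dagster import asset

@asset(group_name="marketing")
def orders():
    ...
```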
- Software-defined assets now accept configuration, similar to ops. E.g.
```python
from dagster import asset

@asset(config_schema={"iterations": int})
def my_asset(context):
    for i in range(context.op_config["iterations"]):
        ...
```
- Asset definitions can now be created from graphs via `AssetsDefinition.from_graph`:
```python
from dagster import AssetsDefinition, GraphOut, graph

@graph(out={"asset_one": GraphOut(), "asset_two": GraphOut()})
def my_graph(input_asset):
    ...

graph_asset = AssetsDefinition.from_graph(my_graph)
```
- `execute_in_process` and `GraphDefinition.to_job` now both accept an `input_values` argument, so you can pass arbitrary Python objects to the root inputs of your graphs and jobs.
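For example (a minimal sketch; the op, graph, and input names are illustrative):
```python
from dagster import graph, op

@op
def double(x: int) -> int:
    return x * 2

@graph
def my_graph(x):
    return double(x)

# Pass an arbitrary Python object to the graph's root input.
result = my_graph.execute_in_process(input_values={"x": 21})
assert result.output_value() == 42
```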
- Ops that return `Output`s and `DynamicOutput`s now work well with Python type annotations. You no longer need to sacrifice static type checking just because you want to include metadata on an output. E.g.
```python
from dagster import Output, op

@op
def my_op() -> Output[int]:
    return Output(5, metadata={"a": "b"})
```
- You can now automatically re-execute runs from failure. This is analogous to op-level retries, except at the job level.
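A minimal sketch, assuming run retries are enabled on your Dagster instance and that the `dagster/max_retries` tag caps automatic retries for a job's runs:
```python
from dagster import job, op

@op
def flaky_op():
    ...

# Assumes run retries are enabled on the instance; the dagster/max_retries
# tag (an assumption here) limits how many times a failed run is retried.
@job(tags={"dagster/max_retries": "3"})
def my_retrying_job():
    flaky_op()
```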
- You can now supply arbitrary structured metadata on jobs, which will be displayed in Dagit.
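For example, a minimal sketch assuming the `metadata` argument to the `job` decorator (the keys and values are illustrative):
```python
from dagster import job, op

@op
def do_something():
    ...

# Arbitrary structured metadata attached to the job; displayed in Dagit.
@job(metadata={"owner": "data-team", "wiki": "https://example.com/wiki/my_job"})
def my_documented_job():
    do_something()
```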
- The partitions and backfills pages in Dagit have been redesigned to be faster and show the status of all partitions, instead of just the last 30 or so.
- The left navigation pane in Dagit is now grouped by repository, which makes it easier to work with when you have large numbers of jobs, especially when jobs in different repositories have the same name.
- The Asset Details page for a software-defined asset now includes a Lineage tab, which makes it easy to see all the assets that are upstream or downstream of an asset.
Breaking Changes and Deprecations
Software-defined assets
This release marks the official transition of software-defined assets from experimental to stable. We made some final changes to incorporate feedback and make the APIs as consistent as possible:
- Support for adding tags to asset materializations, which was previously marked as experimental, has been removed.
- Some of the properties of the previously-experimental `AssetsDefinition` class have been renamed: `group_names` is now `group_names_by_key`, `asset_keys_by_input_name` is now `keys_by_input_name`, `asset_keys_by_output_name` is now `keys_by_output_name`, `asset_key` is now `key`, and `asset_keys` is now `keys`.
- Removed the previously-experimental IO manager `fs_asset_io_manager`, merging its functionality into `fs_io_manager`. `fs_io_manager` is now the default IO manager for asset jobs and stores asset outputs in a directory named with the asset key. Similarly, removed `adls2_pickle_asset_io_manager`, `gcs_pickle_asset_io_manager`, and `s3_pickle_asset_io_manager`; instead, `adls2_pickle_io_manager`, `gcs_pickle_io_manager`, and `s3_pickle_io_manager` now support software-defined assets.
- _(deprecation)_ The `namespace` argument on the `asset` decorator and `AssetIn` has been deprecated. Users should use `key_prefix` instead.
- _(deprecation)_ `AssetGroup` has been deprecated. Users should instead place assets directly on repositories, optionally attaching resources using `with_resources`. Asset jobs should be defined using `define_asset_job` (replacing `AssetGroup.build_job`), and arbitrary sets of assets can be materialized using the standalone `materialize` function (replacing `AssetGroup.materialize`). See the migration sketch after this list.
- _(deprecation)_ The `outs` property of the previously-experimental `multi_asset` decorator now prefers a dictionary whose values are `AssetOut` objects instead of a dictionary whose values are `Out` objects. The latter still works, but is deprecated.
- The previously-experimental property on `OpExecutionContext` called `output_asset_partition_key` is now deprecated in favor of `asset_partition_key_for_output`.
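A migration sketch for the `AssetGroup` deprecation (the asset, resource, and job names are illustrative):
```python
from dagster import asset, define_asset_job, repository, resource, with_resources

@resource
def warehouse_resource(init_context):
    ...

@asset(required_resource_keys={"warehouse"})
def my_asset(context):
    ...

# Replaces AssetGroup.build_job.
all_assets_job = define_asset_job("all_assets_job", selection="*")

@repository
def my_repo():
    # Assets are placed directly on the repository, with resources
    # attached via with_resources (replacing AssetGroup's resource_defs).
    return [
        *with_resources([my_asset], resource_defs={"warehouse": warehouse_resource}),
        all_assets_job,
    ]
```
For ad hoc materialization outside a job, the standalone `materialize` function accepts a list of assets (replacing `AssetGroup.materialize`).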
Event records
- The `get_event_records` method on `DagsterInstance` now requires a non-`None` `event_records_filter` argument. Passing a `None` value for this argument now raises an exception; previously it generated a deprecation warning. See the sketch after this list.
- Removed the methods `events_for_asset_key` and `get_asset_events`, which had been deprecated since 0.12.0.
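A minimal sketch of the now-required filter argument (assuming `EventRecordsFilter` is importable from the top-level `dagster` package):
```python
from dagster import DagsterEventType, DagsterInstance, EventRecordsFilter

instance = DagsterInstance.get()

# event_records_filter is now required; passing None raises an exception.
records = instance.get_event_records(
    EventRecordsFilter(event_type=DagsterEventType.ASSET_MATERIALIZATION),
    limit=10,
)
```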
Extension libraries
- [dagster-dbt] (breaks previously-experimental API) When using `load_assets_from_dbt_project` or `load_assets_from_dbt_manifest`, the AssetKeys generated for dbt sources are now the union of the source name and the table name, and the AssetKeys generated for models are now the union of the configured schema name for a given model (if any) and the model name. To revert to the old behavior: `dbt_assets = load_assets_from_dbt_project(..., node_info_to_asset_key=lambda node_info: AssetKey(node_info["name"]))`.
- [dagster-k8s] In the Dagster Helm chart, user code deployment configuration (like secrets, configmaps, or volumes) is now automatically included in any runs launched from that code. Previously, this behavior was opt-in. In most cases, this will not be a breaking change, but in less common cases where a user code deployment was running in a different Kubernetes namespace or using a different service account, this could result in missing secrets or configmaps in a launched run that previously worked. You can return to the previous behavior, in which config on the user code deployment was not applied to launched runs, by setting the `includeConfigInLaunchedRuns.enabled` field to false for the user code deployment. See the [Kubernetes Deployment docs](https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm#configure-your-user-deployment) for more details.
- [dagster-snowflake] dagster-snowflake has dropped support for Python 3.6. The library it is built on, `snowflake-connector-python`, dropped 3.6 support in its 2.7.5 release.
Other
- The `prior_attempts_count` parameter has been removed from step-launching APIs. This parameter was unused, as the information it held was stored elsewhere in all cases; it can safely be removed from invocations without changing behavior.
- The `FileCache` class has been removed.
- Previously, when a schedule or sensor targeted a job with the same name as another job in the repository, the job on the schedule/sensor would silently overwrite the other job. Now, this raises an error.