Dagster

Latest version: v1.8.7

Safety actively analyzes 663899 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 45 of 49

0.7.11

Not secure
**Bugfix**

- Fixed an issue with strict snapshot ID matching when loading historical snapshots, which caused
errors on the Runs page when viewing historical runs.
- Fixed an issue where `dagster_celery` had introduced a spurious dependency on `dagster_k8s`
(2435)
- Fixed an issue where our Airflow, Celery, and Dask integrations required S3 or GCS storage and
prevented use of filesystem storage. Filesystem storage is now also permitted, to enable use of
these integrations with distributed filesystems like NFS (2436).

0.7.10

Not secure
**New**

- `RepositoryDefinition` now takes `schedule_defs` and `partition_set_defs` directly. The loading
scheme for these definitions via `repository.yaml` under the `scheduler:` and `partitions:` keys
is deprecated and expected to be removed in 0.8.0.
- Mark published modules as python 3.8 compatible.
- The dagster-airflow package supports loading all Airflow DAGs within a directory path, file path,
or Airflow DagBag.
- The dagster-airflow package supports loading all 23 DAGs in Airflow example_dags folder and
execution of 17 of them (see: `make_dagster_repo_from_airflow_example_dags`).
- The dagster-celery CLI tools now allow you to pass additional arguments through to the underlying
celery CLI, e.g., running `dagster-celery worker start -n my-worker -- --uid=42` will pass the
`--uid` flag to celery.
- It is now possible to create a `PresetDefinition` that has no environment defined.
- Added `dagster schedule debug` command to help debug scheduler state.
- The `SystemCronScheduler` now verifies that a cron job has been successfully been added to the
crontab when turning a schedule on, and shows an error message if unsuccessful.

**Breaking Changes**

- A `dagster instance migrate` is required for this release to support the new experimental assets
view.
- Runs created prior to 0.7.8 will no longer render their execution plans as DAGs. We are only
rendering execution plans that have been persisted. Logs are still available.
- `Path` is no longer valid in config schemas. Use `str` or `dagster.String` instead.
- Removed the `pyspark_solid` decorator - its functionality, which was experimental, is subsumed by
requiring a StepLauncher resource (e.g. emr_pyspark_step_launcher) on the solid.

**Dagit**

- Merged "re-execute", "single-step re-execute", "resume/retry" buttons into one "re-execute" button
with three dropdown selections on the Run page.

**Experimental**

- Added new `asset_key` string parameter to Materializations and created a new “Assets” tab in Dagit
to view pipelines and runs associated with these keys. The API and UI of these asset-based are
likely to change, but feedback is welcome and will be used to inform these changes.
- Added an `emr_pyspark_step_launcher` that enables launching PySpark solids in EMR. The
"simple_pyspark" example demonstrates how it’s used.

**Bugfix**

- Fixed an issue when running Jupyter notebooks in a Python 2 kernel through dagstermill with
Dagster running in Python 3.
- Improved error messages produced when dagstermill spins up an in-notebook context.
- Fixed an issue with retrieving step events from `CompositeSolidResult` objects.

0.7.9

Not secure
**Breaking Changes**

- If you are launching runs using `DagsterInstance.launch_run`, this method now takes a run id
instead of an instance of `PipelineRun`. Additionally, `DagsterInstance.create_run` and
`DagsterInstance.create_empty_run` have been replaced by `DagsterInstance.get_or_create_run` and
`DagsterInstance.create_run_for_pipeline`.
- If you have implemented your own `RunLauncher`, there are two required changes:
- `RunLauncher.launch_run` takes a pipeline run that has already been created. You should remove
any calls to `instance.create_run` in this method.
- Instead of calling `startPipelineExecution` (defined in the
`dagster_graphql.client.query.START_PIPELINE_EXECUTION_MUTATION`) in the run launcher, you
should call `startPipelineExecutionForCreatedRun` (defined in
`dagster_graphql.client.query.START_PIPELINE_EXECUTION_FOR_CREATED_RUN_MUTATION`).
- Refer to the `RemoteDagitRunLauncher` for an example implementation.

**New**

- Improvements to preset and solid subselection in the playground. An inline preview of the pipeline
instead of a modal when doing subselection, and the correct subselection is chosen when selecting
a preset.
- Improvements to the log searching. Tokenization and autocompletion for searching messages types
and for specific steps.
- You can now view the structure of pipelines from historical runs, even if that pipeline no longer
exists in the loaded repository or has changed structure.
- Historical execution plans are now viewable, even if the pipeline has changed structure.
- Added metadata link to raw compute logs for all StepStart events in PipelineRun view and Step
view.
- Improved error handling for the scheduler. If a scheduled run has config errors, the errors are
persisted to the event log for the run and can be viewed in Dagit.

**Bugfix**

- No longer manually dispose sqlalchemy engine in dagster-postgres
- Made boto3 dependency in dagster-aws more flexible (2418)
- Fixed tooltip UI cleanup in partitioned schedule view

**Documentation**

- Brand new documentation site, available at https://docs.dagster.io
- The tutorial has been restructured to multiple sections, and the examples in intro_tutorial have
been rearranged to separate folders to reflect this.

0.7.8

Not secure
**Breaking Changes**

- The `execute_pipeline_with_mode` and `execute_pipeline_with_preset` APIs have been dropped in
favor of new top level arguments to `execute_pipeline`, `mode` and `preset`.
- The use of `RunConfig` to pass options to `execute_pipeline` has been deprecated, and `RunConfig`
will be removed in 0.8.0.
- The `execute_solid_within_pipeline` and `execute_solids_within_pipeline` APIs, intended to support
tests, now take new top level arguments `mode` and `preset`.

**New**

- The dagster-aws Redshift resource now supports providing an error callback to debug failed
queries.
- We now persist serialized execution plans for historical runs. They will render correctly even if
the pipeline structure has changed or if it does not exist in the current loaded repository.
- Clicking on a pipeline tag in the `Runs` view will apply that tag as a filter.

**Bugfix**

- Fixed a bug where telemetry logger would create a log file (but not write any logs) even when
telemetry was disabled.

**Experimental**

- The dagster-airflow package supports ingesting Airflow dags and running them as dagster pipelines
(see: `make_dagster_pipeline_from_airflow_dag`). This is in the early experimentation phase.
- Improved the layout of the experimental partition runs table on the `Schedules` detailed view.

**Documentation**

- Fixed a grammatical error (Thanks flowersw!)

0.7.7

Not secure
**Breaking Changes**

- The default sqlite and `dagster-postgres` implementations have been altered to extract the
event `step_key` field as a column, to enable faster per-step queries. You will need to run
`dagster instance migrate` to update the schema. You may optionally migrate your historical event
log data to extract the `step_key` using the `migrate_event_log_data` function. This will ensure
that your historical event log data will be captured in future step-key based views. This
`event_log` data migration can be invoked as follows:

python
from dagster.core.storage.event_log.migration import migrate_event_log_data
from dagster import DagsterInstance

migrate_event_log_data(instance=DagsterInstance.get())


- We have made pipeline metadata serializable and persist that along with run information.
While there are no user-facing features to leverage this yet, it does require an instance
migration. Run `dagster instance migrate`. If you have already run the migration for the
`event_log` changes above, you do not need to run it again. Any unforeseen errors related to the
new `snapshot_id` in the `runs` table or the new `snapshots` table are related to this migration.
- dagster-pandas `ColumnTypeConstraint` has been removed in favor of `ColumnDTypeFnConstraint` and
`ColumnDTypeInSetConstraint`.

**New**

- You can now specify that dagstermill output notebooks be yielded as an output from dagstermill
solids, in addition to being materialized.
- You may now set the extension on files created using the `FileManager` machinery.
- dagster-pandas typed `PandasColumn` constructors now support pandas 1.0 dtypes.
- The Dagit Playground has been restructured to make the relationship between Preset, Partition
Sets, Modes, and subsets more clear. All of these buttons have be reconciled and moved to the
left side of the Playground.
- Config sections that are required but not filled out in the Dagit playground are now detected
and labeled in orange.
- dagster-celery config now support using `env:` to load from environment variables.

**Bugfix**

- Fixed a bug where selecting a preset in `dagit` would not populate tags specified on the pipeline
definition.
- Fixed a bug where metadata attached to a raised `Failure` was not displayed in the error modal in
`dagit`.
- Fixed an issue where reimporting dagstermill and calling `dagstermill.get_context()` outside of
the parameters cell of a dagstermill notebook could lead to unexpected behavior.
- Fixed an issue with connection pooling in dagster-postgres, improving responsiveness when using
the Postgres-backed storages.

**Experimental**

- Added a longitudinal view of runs for on the `Schedule` tab for scheduled, partitioned pipelines.
Includes views of run status, execution time, and materializations across partitions. The UI is
in flux and is currently optimized for daily schedules, but feedback is welcome.

0.7.6

Not secure
**Breaking Changes**

- `default_value` in `Field` no longer accepts native instances of python enums. Instead
the underlying string representation in the config system must be used.
- `default_value` in `Field` no longer accepts callables.
- The `dagster_aws` imports have been reorganized; you should now import resources from
`dagster_aws.<AWS service name>`. `dagster_aws` provides `s3`, `emr`, `redshift`, and `cloudwatch`
modules.
- The `dagster_aws` S3 resource no longer attempts to model the underlying boto3 API, and you can
now just use any boto3 S3 API directly on a S3 resource, e.g.
`context.resources.s3.list_objects_v2`. (2292)

**New**

- New `Playground` view in `dagit` showing an interactive config map
- Improved storage and UI for showing schedule attempts
- Added the ability to set default values in `InputDefinition`
- Added CLI command `dagster pipeline launch` to launch runs using a configured `RunLauncher`
- Added ability to specify pipeline run tags using the CLI
- Added a `pdb` utility to `SolidExecutionContext` to help with debugging, available within a solid
as `context.pdb`
- Added `PresetDefinition.with_additional_config` to allow for config overrides
- Added resource name to log messages generated during resource initialization
- Added grouping tags for runs that have been retried / reexecuted.

**Bugfix**

- Fixed a bug where date range partitions with a specified end date was clipping the last day
- Fixed an issue where some schedule attempts that failed to start would be marked running forever.
- Fixed the `weekly` partitioned schedule decorator
- Fixed timezone inconsistencies between the runs view and the schedules view
- Integers are now accepted as valid values for Float config fields
- Fixed an issue when executing dagstermill solids with config that contained quote characters.

**dagstermill**

- The Jupyter kernel to use may now be specified when creating dagster notebooks with the `--kernel`
flag.

**dagster-dbt**

- `dbt_solid` now has a `Nothing` input to allow for sequencing

**dagster-k8s**

- Added `get_celery_engine_config` to select celery engine, leveraging Celery infrastructure

**Documentation**

- Improvements to the airline and bay bikes demos
- Improvements to our dask deployment docs (Thanks jswaney!!)

Page 45 of 49

Links

Releases

Has known vulnerabilities

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.