Dstack

Latest version: v0.19.1


Page 2 of 15

0.18.44

GPU utilization policy

To avoid wasting resources, you can now specify a minimum required GPU utilization for a run. If any GPU's utilization stays below the threshold in every sample within the time window, the run is terminated.

yaml
type: task

utilization_policy:
  min_gpu_utilization: 30
  time_window: 30m

resources:
  gpu: nvidia:8:24GB


In this example, if any of the 8 GPUs has utilization below 30% in all samples during the last 30 minutes, the run will be terminated.
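The termination rule can be illustrated with a minimal Python sketch. This is not `dstack`'s implementation; the function name `should_terminate` and the shape of the sample data are assumptions made for illustration only.

```python
from typing import Mapping, Sequence


def should_terminate(
    samples_by_gpu: Mapping[int, Sequence[float]],
    min_gpu_utilization: float,
) -> bool:
    """Return True if any GPU stayed below the threshold in every sample.

    `samples_by_gpu` maps a GPU index to the utilization samples (percent)
    collected inside the time window. GPUs without samples are ignored,
    which is an assumption of this sketch.
    """
    return any(
        len(samples) > 0 and all(s < min_gpu_utilization for s in samples)
        for samples in samples_by_gpu.values()
    )


# GPU 1 never reaches 30% in the window, so the run would be terminated.
idle_window = {0: [85.0, 90.0, 40.0], 1: [5.0, 12.0, 29.0]}
print(should_terminate(idle_window, min_gpu_utilization=30))

# One sample at or above the threshold is enough to keep the run alive.
busy_window = {0: [85.0, 90.0, 40.0], 1: [5.0, 31.0, 29.0]}
print(should_terminate(busy_window, min_gpu_utilization=30))
```

Note that the rule is per GPU: a single idle GPU triggers termination even if the other seven are fully loaded.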

DCGM metrics

`dstack` can now collect and export [NVIDIA DCGM](https://developer.nvidia.com/dcgm) metrics from running jobs on supported backends (AWS, Azure, GCP, OCI) and SSH fleets.

Metrics are disabled by default. See the [documentation](https://dstack.ai/docs/guides/server-deployment/#metrics) for how to enable and scrape them.

RunPod Community Cloud

In addition to Secure Cloud, `dstack` will now use [Community Cloud](https://docs.runpod.io/references/faq/#secure-cloud-vs-community-cloud) offers in the `runpod` backend. Community Cloud offers are usually cheaper and can be identified by a two-letter region code.


$ dstack apply -f .dstack.yml -b runpod
 #  BACKEND  REGION    INSTANCE               SPOT  PRICE
 1  runpod   CA        NVIDIA A100 80GB PCIe  yes   $0.6
 2  runpod   CA-MTL-3  NVIDIA A100 80GB PCIe  yes   $0.82


It is possible to opt out of using Community Cloud in the [backend settings](https://dstack.ai/docs/concepts/backends/#runpod).
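If you post-process offers in a script, the two-letter convention can be checked with a small heuristic. The sketch below assumes the region code consists of exactly two uppercase letters, as in the `CA` example above; the function name is hypothetical and not part of `dstack`.

```python
import re


def looks_like_community_cloud(region: str) -> bool:
    """Heuristic: Community Cloud offers use a two-letter region code."""
    return re.fullmatch(r"[A-Z]{2}", region) is not None


for region in ("CA", "CA-MTL-3"):
    kind = "Community Cloud" if looks_like_community_cloud(region) else "Secure Cloud"
    print(f"{region}: {kind}")
```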

> [!NOTE]
> If you've previously configured the `runpod` backend via the `dstack` UI, your backend settings will likely contain a fixed set of regions. Previous `dstack` versions used to add it automatically. You can remove the `regions` property to allow all regions, including two-letter Community Cloud regions.

What's Changed
* Show `inactivity_duration` in run plan in CLI by jvstme in https://github.com/dstackai/dstack/pull/2366
* Minor fixes noticed by aider by r4victor in https://github.com/dstackai/dstack/pull/2367
* Reexport DCGM metrics from instances by un-def in https://github.com/dstackai/dstack/pull/2364
* [Internal]: Update backend contributing docs by jvstme in https://github.com/dstackai/dstack/pull/2369
* Allow global admins to edit user emails via the UI by olgenn in https://github.com/dstackai/dstack/pull/2377
* Support sign-in via Microsoft EntraID for dstack Enterprise 251 by olgenn in https://github.com/dstackai/dstack/pull/2376
* Add `utilization_policy` by un-def in https://github.com/dstackai/dstack/pull/2375
* Support RunPod Community Cloud by jvstme in https://github.com/dstackai/dstack/pull/2378
* Add ORDER BY when selecting multiple rows with FOR UPDATE by r4victor in https://github.com/dstackai/dstack/pull/2379
* Allow global admins to edit user emails via the UI by olgenn in https://github.com/dstackai/dstack/pull/2381
* Support `inactivity_duration` in-place update by jvstme in https://github.com/dstackai/dstack/pull/2380
* Improve error message if pulling fails by jvstme in https://github.com/dstackai/dstack/pull/2382
* Fix `utilization_policy` in profiles by un-def in https://github.com/dstackai/dstack/pull/2385
* Set lower and upper limits of `utilization_policy.time_window` by un-def in https://github.com/dstackai/dstack/pull/2386
* Try more offers when starting a job by jvstme in https://github.com/dstackai/dstack/pull/2387


**Full Changelog**: https://github.com/dstackai/dstack/compare/0.18.43...0.18.44

0.18.43

CLI autocompletion

The `dstack` CLI now supports shell autocompletion for `bash` and `zsh`. It suggests completions for subcommands:

shell
✗ dstack s
server -- Start a server
stats -- Show run stats
stop -- Stop a run


and dynamic completions for resource names:


✗ dstack logs m
mighty-chicken-1 mighty-crab-1 my-dev --


To set up the CLI autocompletion for your shell, follow the [Installation guide](https://dstack.ai/docs/installation/#optional-cli-autocompletion).

`max_duration` set to `off` by default

The `max_duration` parameter that controls how long a run is allowed to run before stopping automatically is now set to `off` by default for all run configuration types. This means that `dstack` won't stop runs automatically unless `max_duration` is specified explicitly.

Previously, the `max_duration` defaults were `72h` for tasks, `6h` for dev environments, and `off` for services. This led to unintended run terminations and confused users who were unaware of `max_duration`. The new default makes `max_duration` opt-in and thus predictable.

If you relied on the previous `max_duration` defaults, ensure you've added `max_duration` to your run configurations.
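To see which configurations are affected, the old and new defaults can be compared with a short sketch. The helper below is hypothetical and only models the defaults described in this section; it does not read or validate real `dstack` configurations.

```python
# Defaults before this release, as listed above.
LEGACY_DEFAULTS = {"task": "72h", "dev-environment": "6h", "service": "off"}


def effective_max_duration(config: dict, legacy: bool = False) -> str:
    """Return the max_duration that applies to a run configuration."""
    if "max_duration" in config:
        # An explicit value always wins.
        return str(config["max_duration"])
    if legacy:
        return LEGACY_DEFAULTS[config["type"]]
    # New behavior: runs are never stopped automatically by default.
    return "off"


task = {"type": "task"}
print(effective_max_duration(task, legacy=True))  # 72h
print(effective_max_duration(task))               # off

# Restoring the old behavior requires setting the value explicitly.
print(effective_max_duration({"type": "task", "max_duration": "72h"}))  # 72h
```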

GCP Logging for run logs

The `dstack` server requires storing run logs externally for multi-replica server deployments. Previously, the only supported external storage was AWS CloudWatch, which limited production server deployments to AWS. The `dstack` server now also supports GCP Logging for storing run logs. Follow the [Server deployment guide](https://dstack.ai/docs/guides/server-deployment/#gcp-logging) for more information.

Custom IAM instance profile for AWS

The AWS backend config has a new `iam_instance_profile` parameter that specifies the IAM instance profile to associate with provisioned EC2 instances. You can also specify the IAM role name for roles created via the AWS console, since AWS automatically creates an instance profile with the same name as the role:

yaml
projects:
- name: main
  backends:
  - type: aws
    iam_instance_profile: dstack-test-role
    creds:
      type: default


This can be used to access AWS resources from runs without passing credentials explicitly.

Oracle Cloud spot instances

The `oci` backend can now provision interruptible spot instances, providing more cost-effective GPUs for workloads that can recover from interruptions.

shell
> dstack apply --gpu 1.. --spot -b oci
 #  BACKEND  REGION          INSTANCE   RESOURCES                                    SPOT  PRICE
 1  oci      eu-frankfurt-1  VM.GPU2.1  24xCPU, 72GB, 1xP100 (16GB), 50.0GB (disk)   yes   $0.6375
 2  oci      eu-frankfurt-1  VM.GPU3.1  12xCPU, 90GB, 1xV100 (16GB), 50.0GB (disk)   yes   $1.475
 3  oci      eu-frankfurt-1  VM.GPU3.2  24xCPU, 180GB, 2xV100 (16GB), 50.0GB (disk)  yes   $2.95


Breaking changes

* Dropped support for `python: 3.8` in run configuration.
* Set `max_duration` to `off` by default for all run configuration types.

What's Changed
* Replace pagination with lazy loading in dstack UI by olgenn in https://github.com/dstackai/dstack/pull/2309
* Dynamic CLI completion by solovyevt in https://github.com/dstackai/dstack/pull/2285
* Remove excessive project_id check for GCP by r4victor in https://github.com/dstackai/dstack/pull/2312
* [Docs] GPU blocks and proxy jump blog post (WIP) by peterschmidt85 in https://github.com/dstackai/dstack/pull/2307
* [Docs] Add `blocks` description to Concepts/Fleets by un-def in https://github.com/dstackai/dstack/pull/2308
* Replace pagination with lazy loading on Fleet list by olgenn in https://github.com/dstackai/dstack/pull/2320
* Improve GCP creds validation by r4victor in https://github.com/dstackai/dstack/pull/2322
* [UI]: Fix job details for multi-job runs by jvstme in https://github.com/dstackai/dstack/pull/2321
* Fix instance filtering by backend to use base backend by r4victor in https://github.com/dstackai/dstack/pull/2324
* [Docs]: Fix inactivity duration blog post by jvstme in https://github.com/dstackai/dstack/pull/2327
* Fix CLI instance status for instances with blocks by jvstme in https://github.com/dstackai/dstack/pull/2332
* Partially fixes openapi spec by haringsrob in https://github.com/dstackai/dstack/pull/2330
* [Bug]: UI does not show logs of distributed tasks and replicated services by olgenn in https://github.com/dstackai/dstack/pull/2334
* [Feature]: Replace pagination with lazy loading in Instances list by olgenn in https://github.com/dstackai/dstack/pull/2335
* [Feature]: Replace pagination with lazy loading in volume list by olgenn in https://github.com/dstackai/dstack/pull/2336
* [Bug]: Finished jobs included in run price by olgenn in https://github.com/dstackai/dstack/pull/2338
* Fix `DSTACK_GPUS_PER_NODE`|`DSTACK_GPUS_NUM` when blocks are used by un-def in https://github.com/dstackai/dstack/pull/2333
* Support storing run logs using GCP Logging by r4victor in https://github.com/dstackai/dstack/pull/2340
* Support OCI spot instances by jvstme in https://github.com/dstackai/dstack/pull/2337
* [Feature]: Replace pagination with lazy loading in models list by olgenn in https://github.com/dstackai/dstack/pull/2351
* [UI] Remember filter settings in local storage by olgenn in https://github.com/dstackai/dstack/pull/2352
* [Internal]: Minor tweaks in packer docs and CI by jvstme in https://github.com/dstackai/dstack/pull/2356
* Use unique names for backend resources by r4victor in https://github.com/dstackai/dstack/pull/2350
* Set max_duration to off by default for all run configurations by r4victor in https://github.com/dstackai/dstack/pull/2357
* Print message on dstack attach exit by r4victor in https://github.com/dstackai/dstack/pull/2358
* Forbid `python: 3.8` in run configurations by jvstme in https://github.com/dstackai/dstack/pull/2354
* Fix Fabric Manager in AWS/GCP/Azure/OCI OS images by jvstme in https://github.com/dstackai/dstack/pull/2355
* Install DCGM Exporter on dstack-built OS images by un-def in https://github.com/dstackai/dstack/pull/2360
* Fix volume detachment for runs started before 0.18.41 by r4victor in https://github.com/dstackai/dstack/pull/2362
* Increase Lambda provisioning timeout and refactor by jvstme in https://github.com/dstackai/dstack/pull/2353
* Bump default OS image version by jvstme in https://github.com/dstackai/dstack/pull/2363
* Support iam_instance_profile for AWS by r4victor in https://github.com/dstackai/dstack/pull/2365

New Contributors
* haringsrob made their first contribution in https://github.com/dstackai/dstack/pull/2330

**Full Changelog**: https://github.com/dstackai/dstack/compare/0.18.42...0.18.43

0.18.42

Volume attachments

It's now possible to see volume attachments when listing volumes. The `dstack volume -v` command shows which fleets the volumes are attached to in the `ATTACHED` column:


✗ dstack volume -v
 NAME             BACKEND  REGION                  STATUS  ATTACHED  CREATED      ERROR
 my-gcp-volume-1  gcp      europe-west4            active  my-dev    1 weeks ago
                           (europe-west4-c)
 my-aws-volume-1  aws      eu-west-1 (eu-west-1a)  active  -         3 days ago



This can help you decide if you should use an existing volume for a run or create a new volume if all volumes are occupied.

You can also check which volumes are currently attached and which are not via the API:

python
import os
import requests

url = os.environ["DSTACK_URL"]
token = os.environ["DSTACK_TOKEN"]
project = os.environ["DSTACK_PROJECT"]

print("Getting volumes...")
resp = requests.post(
    url=f"{url}/api/project/{project}/volumes/list",
    headers={"Authorization": f"Bearer {token}"},
)
volumes = resp.json()

print("Checking volumes attachments...")
for volume in volumes:
    is_attached = len(volume["attachments"]) > 0
    print(f"Volume {volume['name']} attached: {is_attached}")



✗ python check_attachments.py
Getting volumes...
Checking volumes attachments...
Volume my-gcp-volume-1 attached: True
Volume my-aws-volume-1 attached: False


Bugfixes

This release contains several important bugfixes, including a bugfix for fleets with `placement: cluster` (#2302).

What's Changed
* Add Deepseek and Intel Examples by Bihan in https://github.com/dstackai/dstack/pull/2291
* Add volume attachments info to the API and CLI by r4victor in https://github.com/dstackai/dstack/pull/2298
* Fix and test offers and pool instances filtering by r4victor in https://github.com/dstackai/dstack/pull/2303


**Full Changelog**: https://github.com/dstackai/dstack/compare/0.18.41...0.18.42

0.18.41

GPU blocks

Previously, `dstack` could process only one workload per instance at a time, even if the instance had enough resources to handle multiple workloads. With a new `blocks` [fleet](https://dstack.ai/docs/reference/dstack.yml/fleet/) property you can split the instance into blocks (virtual subinstances), allowing users to run workloads simultaneously, utilizing a fraction of GPU, CPU and memory resources.

Cloud fleet

yaml
type: fleet

name: my-fleet
nodes: 1

resources:
  gpu: 8:24GB

blocks: 4  # split into 4 blocks, 2 GPUs per block


SSH fleet

yaml
type: fleet

name: my-fleet

ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  hosts:
    - hostname: 3.255.177.51
      blocks: auto  # as many as possible, e.g., 8 GPUs -> 8 blocks


You can see how many instance blocks are currently busy in the `dstack fleet` output:


$ dstack fleet
 FLEET         INSTANCE  BACKEND       RESOURCES                                          PRICE  STATUS    CREATED
 fleet-gaudi2  0         ssh (remote)  152xCPU, 1007GB, 8xGaudi2 (96GB), 387.0GB (disk)   $0.0   3/8 busy  56 sec ago


The remaining blocks can be used for new runs.
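The block arithmetic from the examples above can be sketched in Python. The helpers are hypothetical: they assume GPUs are divided evenly between blocks and that the `STATUS` value has the `N/M busy` form shown in the output above.

```python
import re
from typing import Union


def gpus_per_block(total_gpus: int, blocks: Union[int, str]) -> int:
    """Number of GPUs available to each block of an instance."""
    if blocks == "auto":
        # "auto" means as many blocks as possible: one block per GPU.
        return 1
    if not isinstance(blocks, int) or blocks < 1:
        raise ValueError("blocks must be a positive integer or 'auto'")
    if total_gpus % blocks != 0:
        # Assumption of this sketch: GPUs are split evenly between blocks.
        raise ValueError("GPU count must be divisible by the number of blocks")
    return total_gpus // blocks


def free_blocks(status: str) -> int:
    """Parse a STATUS value such as '3/8 busy' and return the idle block count."""
    match = re.fullmatch(r"(\d+)/(\d+) busy", status.strip())
    if match is None:
        raise ValueError(f"unrecognized status: {status!r}")
    busy, total = int(match.group(1)), int(match.group(2))
    return total - busy


print(gpus_per_block(8, 4))       # 2 GPUs per block, as in the cloud fleet example
print(gpus_per_block(8, "auto"))  # 1 GPU per block
print(free_blocks("3/8 busy"))    # 5 blocks left for new runs
```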

SSH fleets with head node

With a new [`proxy_jump`](https://dstack.ai/docs/reference/dstack.yml/fleet/#proxy_jump) fleet property, `dstack` now supports network configurations where worker nodes are located behind a head node and are not reachable directly:

yaml
type: fleet
name: my-fleet

ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/worker_node_key
  hosts:
    # worker nodes
    - 3.255.177.51
    - 3.255.177.52
  # head node proxy; can also be configured per worker node
  proxy_jump:
    hostname: 3.255.177.50
    user: ubuntu
    identity_file: ~/.ssh/head_node_key


Check [the documentation](https://dstack.ai/docs/concepts/fleets/#head-node) for details.

Inactivity duration

You can now configure dev environments to automatically stop after a period of inactivity by specifying `inactivity_duration`:

yaml
type: dev-environment
ide: vscode
# Stop if inactive for 2 hours
inactivity_duration: 2h


A dev environment is considered inactive if there are no SSH connections to it, including VS Code connections, `ssh <run name>` shells, and attached `dstack apply` or `dstack attach` commands. For more details on using `inactivity_duration`, see the [docs](https://dstack.ai/docs/concepts/dev-environments/#inactivity-duration).
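The stop condition can be sketched as follows. This is an illustration, not `dstack`'s implementation: the function names are hypothetical, and the parser accepts only a number followed by `s`, `m`, `h`, or `d`, which is an assumption about the duration format.

```python
import re

_SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(value: str) -> int:
    """Convert a duration such as '2h' or '30m' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _SECONDS_PER_UNIT[match.group(2)]


def should_stop(ssh_connections: int, idle_seconds: int, inactivity_duration: str) -> bool:
    """Stop only when nothing is connected and the idle time reached the limit."""
    return ssh_connections == 0 and idle_seconds >= parse_duration(inactivity_duration)


print(should_stop(ssh_connections=0, idle_seconds=2 * 3600, inactivity_duration="2h"))  # True
print(should_stop(ssh_connections=1, idle_seconds=5 * 3600, inactivity_duration="2h"))  # False
```

An open VS Code window or an attached `dstack apply` counts as a connection, so the second call never triggers a stop regardless of how long the session has been idle.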


Multiple EFA interfaces

`dstack` now attaches the maximum possible number of EFA interfaces when provisioning AWS instances with EFA support. For example, when provisioning a p5.48xlarge instance, `dstack` configures an [optimal setup with 32 interfaces](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-acc-inst-types.html#efa-for-p5), providing a total network bandwidth of 3,200 Gbps, of which up to 800 Gbps can be used for IP network traffic.

Note: Multiple EFA interfaces are enabled only if the `aws` backend config has `public_ips: false` set. If instances have public IPs, only one EFA interface is enabled per instance due to AWS limitations.

Volumes for distributed tasks

You can now use single-attach volumes such as AWS EBS with distributed tasks by attaching different volumes to different nodes. This is done using `dstack` variable interpolation:

yaml
type: task
nodes: 8
commands:
  - ...
volumes:
  - name: data-volume-${{ dstack.node_rank }}
    path: /volume_data


Tip: To create volumes for all nodes using one volume configuration, specify volume name with `-n`:

shell
$ for i in {0..7}; do dstack apply -f vol.dstack.yml -n data-volume-$i -y; done
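To check which volume each node will look for, the interpolation can be mimicked in a few lines of Python. This sketch only substitutes the `${{ dstack.node_rank }}` placeholder; it is not how `dstack` resolves variables internally, and the function name is hypothetical.

```python
import re

# Matches the placeholder, tolerating whitespace inside the braces.
_NODE_RANK = re.compile(r"\$\{\{\s*dstack\.node_rank\s*\}\}")


def volume_name_for_node(template: str, node_rank: int) -> str:
    """Substitute the node rank into a volume name template."""
    return _NODE_RANK.sub(str(node_rank), template)


names = [
    volume_name_for_node("data-volume-${{ dstack.node_rank }}", rank)
    for rank in range(8)
]
print(names[0], names[-1])  # data-volume-0 data-volume-7
```

The resulting names match those created by the shell loop above, so every node of the 8-node task finds its own volume.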


Availability zones

It's now possible to specify `availability_zone` in volume configurations:

yaml
type: volume
name: my-volume
backend: aws
region: eu-west-1
availability_zone: eu-west-1c
size: 100GB


and `availability_zones` in fleet and run configurations:

yaml
type: fleet
name: my-fleet
nodes: 2
availability_zones: [eu-west-1c]


This has multiple use cases:

* Specify the same availability zone when provisioning volumes and fleets to ensure they can be used together.
* Specify a volume availability zone that has instance types that you work with.
* Create volumes for all availability zones to be able to use any zone and improve GPU availability.

The `dstack fleet -v` and `dstack volumes -v` commands now display availability zones along with regions.

Deployment considerations
* If you deploy the `dstack` server using rolling deployments (old and new replicas co-exist), it's advised to stop runs and fleets before deploying 0.18.41. Otherwise, you may see error logs from the old replica. It should not have major implications.

What's Changed
* Implement GPU `blocks` property by un-def in https://github.com/dstackai/dstack/pull/2253
* Show deleted runs in the UI by olgenn in https://github.com/dstackai/dstack/pull/2272
* [Bug]: The UI issues many API requests when stopping multiple runs by olgenn in https://github.com/dstackai/dstack/pull/2273
* Ensure frontend displays errors when getting 400 from the server by olgenn in https://github.com/dstackai/dstack/pull/2275
* Support --name for all configurations by r4victor in https://github.com/dstackai/dstack/pull/2269
* Support per-job volumes by r4victor in https://github.com/dstackai/dstack/pull/2276
* Full EFA attachment for non-public IPs by solovyevt in https://github.com/dstackai/dstack/pull/2271
* Return deleted runs in /api/runs/list by r4victor in https://github.com/dstackai/dstack/pull/2158
* Fix process_submitted_jobs instance lock by un-def in https://github.com/dstackai/dstack/pull/2279
* Change `dstack fleet` `STATUS` for block instances by un-def in https://github.com/dstackai/dstack/pull/2280
* [Docs] Restructure concept pages to ensure dstack apply is not lost at the end of the page by peterschmidt85 in https://github.com/dstackai/dstack/pull/2283
* Allow specifying Azure resource_group by r4victor in https://github.com/dstackai/dstack/pull/2288
* Allow configuring availability zones by r4victor in https://github.com/dstackai/dstack/pull/2266
* Track SSH connections in dstack-runner by jvstme in https://github.com/dstackai/dstack/pull/2287
* Add the `inactivity_timeout` configuration option by jvstme in https://github.com/dstackai/dstack/pull/2289
* Show dev environment inactivity in `dstack ps -v` by jvstme in https://github.com/dstackai/dstack/pull/2290
* Support non-root Docker images in RunPod by jvstme in https://github.com/dstackai/dstack/pull/2286
* Fix terminating runs when job is terminated by jvstme in https://github.com/dstackai/dstack/pull/2295
* [Docs]: Dev environment inactivity duration by jvstme in https://github.com/dstackai/dstack/pull/2296
* [Docs]: Add `availability_zones` to offer filters by jvstme in https://github.com/dstackai/dstack/pull/2297
* Add head node support for SSH fleets by un-def in https://github.com/dstackai/dstack/pull/2292
* Support services with head node setup by un-def in https://github.com/dstackai/dstack/pull/2299


**Full Changelog**: https://github.com/dstackai/dstack/compare/0.18.40...0.18.41

0.18.40

Volumes

Optional instance volumes

[Instance volumes](https://dstack.ai/docs/concepts/volumes/#instance-volumes) can now be made [`optional`](https://dstack.ai/docs/reference/dstack.yml/dev-environment/#instance-volumes). When a volume is marked as optional, it will be mounted only if the backend supports instance volumes; otherwise, it will not be mounted.

yaml

type: dev-environment

ide: vscode

volumes:
  - instance_path: /dstack-cache
    path: /root/.cache/
    optional: true


Optional instance volumes are useful for caching, allowing runs to work with backends that don’t support them, such as `runpod`, `vastai`, and `kubernetes`.

Services

Path prefix

Previously, if you were running services without a gateway, it was not possible to deploy certain web apps, such as Dash. This was due to the path prefix `/proxy/services/<project name>/<run name>/` in the endpoint URL.

With this new update, it’s now possible to configure a service so that such web apps work without a gateway. To do this, set the [`strip_prefix`](https://dstack.ai/docs/reference/dstack.yml/service/#strip_prefix) property to `false` and pass the prefix to the web app. Here’s an example with a Dash app:

yaml
type: service
name: my-dash-app

gateway: false

# Disable authorization
auth: false

# Do not strip the path prefix
strip_prefix: false

env:
  # Configure Dash to work with a path prefix
  - DASH_ROUTES_PATHNAME_PREFIX=/proxy/services/main/my-dash-app/

commands:
  - pip install dash
  - python app.py

port: 8050
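The effect of `strip_prefix` on the path an app receives can be sketched as follows. Both helpers are hypothetical and only model the behavior described above; they are not taken from `dstack`'s proxy code.

```python
def service_path_prefix(project: str, run_name: str) -> str:
    """Build the prefix used for services running without a gateway."""
    return f"/proxy/services/{project}/{run_name}/"


def path_seen_by_app(request_path: str, prefix: str, strip_prefix: bool) -> str:
    """Return the path forwarded to the app for an incoming request."""
    if strip_prefix and request_path.startswith(prefix):
        # The prefix is removed, so the app sees a root-relative path.
        return "/" + request_path[len(prefix):]
    # The app receives the full path and must be configured with the prefix.
    return request_path


prefix = service_path_prefix("main", "my-dash-app")
request = prefix + "assets/app.css"

print(path_seen_by_app(request, prefix, strip_prefix=True))   # /assets/app.css
print(path_seen_by_app(request, prefix, strip_prefix=False))  # /proxy/services/main/my-dash-app/assets/app.css
```

With the prefix kept, the app receives the same path it uses when generating its own links, which is why Dash is given `DASH_ROUTES_PATHNAME_PREFIX` in the configuration above.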


Git

Branches

When you run `dstack apply`, before `dstack` starts a container, it fetches the code from the repository where `dstack apply` was invoked. If the repository is a remote Git repo, `dstack` clones it using the user’s Git credentials.

Previously, `dstack` always cloned only a single branch in this scenario (to ensure faster startup).

With this update, for development environments, `dstack` now clones all branches by default. You can override this behavior using the new [`single_branch`](https://dstack.ai/docs/reference/dstack.yml/dev-environment/#single_branch) property.

SSH

If you override the [`user`](https://dstack.ai/docs/reference/dstack.yml/dev-environment/#user) property in your run configuration, `dstack` runs the container as that user. Previously, when accessing the dev environment via VS Code or connecting to the run with the `ssh <run name>` command, you were still logged in as the root user and had to switch manually. Now, you are automatically logged in as the configured user.

What's changed

* Update contributing guide on runs and jobs by r4victor in https://github.com/dstackai/dstack/pull/2247
* Remove no offers warnings for SSH fleets by jvstme in https://github.com/dstackai/dstack/pull/2249
* Allow configuring single_branch by r4victor in https://github.com/dstackai/dstack/pull/2256
* Fix SSH fleet and gateway configuration change detection by r4victor in https://github.com/dstackai/dstack/pull/2258
* Update contributing guide on shim by un-def in https://github.com/dstackai/dstack/pull/2255
* Configuring if service path prefix is stripped by jvstme in https://github.com/dstackai/dstack/pull/2254
* [`dstack-runner`] Back up and restore `~/.ssh` files by un-def in https://github.com/dstackai/dstack/pull/2261
* Use `user` property as the default user for SSH by un-def in https://github.com/dstackai/dstack/pull/2263
* Support optional instance volumes by jvstme in https://github.com/dstackai/dstack/pull/2260
* Extend request/response logging by jvstme in https://github.com/dstackai/dstack/pull/2265

**Full changelog**: https://github.com/dstackai/dstack/compare/0.18.39...0.18.40

0.18.39

This release fixes a backward compatibility bug introduced in 0.18.38. The bug caused CLI version 0.18.38 to fail with older servers when applying fleet configurations.

What's Changed
* Handle stop_duration backward compatibility for fleet spec by r4victor in https://github.com/dstackai/dstack/pull/2243


**Full Changelog**: https://github.com/dstackai/dstack/compare/0.18.38...0.18.39
