Dashclaw

Latest version: v4.21.0



8.5.10

the patched version. `npm audit` clears to 0 vulnerabilities.

Tooling — vitest excludes `.worktrees/`

Git worktrees can hold sibling-branch copies of the test suite with their
own divergent state. Adding `.worktrees/**` to `exclude` in `vitest.config.js`
stops the runner from inadvertently picking up tests from co-located
worktrees. Before the exclude landed, a `.worktrees/codex-parity/` worktree
on the host added 73 false-positive failures to the local run.
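
A minimal sketch of what such an exclude could look like. The project's actual config is not shown in these notes, so the surrounding shape is an assumption. Setting `test.exclude` replaces Vitest's default list rather than extending it, so the sketch spreads `configDefaults.exclude` to keep the built-in exclusions.

```javascript
// vitest.config.js (illustrative sketch, not the project's actual file)
import { configDefaults, defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    // Keep Vitest's default exclusions and add co-located git worktrees.
    exclude: [...configDefaults.exclude, '.worktrees/**'],
  },
});
```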

Weekly pricing-refresh workflow

`.github/workflows/refresh-model-pricing.yml` runs every Sunday at 05:00
UTC and on `workflow_dispatch`. It captures the dry-run diff for the PR body,
applies `npm run pricing:refresh:apply`, runs the pricing-adjacent test
suite (gating against regressions), and opens a PR on
`chore/pricing-refresh` via `peter-evans/create-pull-request@v6` only when
something actually changed.

One-time repo setup: under Settings → Actions → General → Workflow permissions,
select "Read and write permissions" and enable "Allow GitHub Actions to create
and approve pull requests".

Dynamic model pricing — driven by LiteLLM's community JSON

`npm run pricing:refresh` now syncs `DEFAULT_PRICING` in `app/lib/billing.js` and
`PRICES_PER_MTOK` in `app/lib/claude-code/pricing.js` against [LiteLLM's
`model_prices_and_context_window.json`](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json),
the de-facto industry pricing source. Anthropic / OpenAI / Google don't
publish machine-readable rates; LiteLLM is the most widely trusted
community-maintained mirror (~50K developers, weekly updates).

- Script writes to marker-bounded blocks (`MODEL_PRICING_GENERATED:*:START/END`)
so hand-curated rows (unversioned family defaults, Codex, Llama variants
LiteLLM doesn't track) stay outside the regen path.
- Dry-run by default; `--apply` to commit. Prints a per-pattern diff so
rate changes are visible before the file write.
- Registry mapping in the script defines DashClaw-pattern → LiteLLM-key
candidates per family. First match wins; misses are logged but don't
fail the run.
- `__tests__/unit/refresh-model-pricing.test.js` locks in: per-million
conversion, multi-candidate fallback, placeholder-entry skip, no-cache-
columns handling, REGISTRY coverage, and the marker-replace contract.
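
The behaviors the test file locks in can be sketched roughly as follows. This is not the script's actual code: the helper names and the `//` comment form of the markers are assumptions. The `*_cost_per_token` and `cache_read_input_token_cost` field names follow LiteLLM's JSON.

```javascript
// Hypothetical helpers; names and marker comment syntax are assumptions.

// Convert a per-token USD rate to USD per million tokens, rounding away
// floating-point noise from the multiplication.
function perMillion(costPerToken) {
  return Number((costPerToken * 1e6).toFixed(6));
}

// Walk the candidate keys for one pattern; the first usable entry wins.
// Entries without numeric rates (placeholders) are skipped. Returns null on
// a miss so the caller can log it without failing the run.
function resolveRates(candidates, litellm) {
  for (const key of candidates) {
    const entry = litellm[key];
    if (
      !entry ||
      typeof entry.input_cost_per_token !== 'number' ||
      typeof entry.output_cost_per_token !== 'number'
    ) {
      continue;
    }
    return {
      source: key,
      input: perMillion(entry.input_cost_per_token),
      output: perMillion(entry.output_cost_per_token),
      // Cache columns are optional upstream; null when absent.
      cacheRead:
        typeof entry.cache_read_input_token_cost === 'number'
          ? perMillion(entry.cache_read_input_token_cost)
          : null,
    };
  }
  return null;
}

// Replace only the text between the START and END marker lines, leaving
// hand-curated rows outside the markers untouched.
function replaceGeneratedBlock(source, name, body) {
  const start = `// MODEL_PRICING_GENERATED:${name}:START`;
  const end = `// MODEL_PRICING_GENERATED:${name}:END`;
  const from = source.indexOf(start);
  const to = source.indexOf(end);
  if (from === -1 || to === -1 || to < from) {
    throw new Error(`markers for ${name} not found`);
  }
  return source.slice(0, from + start.length) + '\n' + body + '\n' + source.slice(to);
}
```

Throwing when a marker pair is missing is a design choice of this sketch: a silent no-op would let a renamed marker quietly freeze the table.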

Applied the first refresh; the live diff vs. the prior hand-maintained
table surfaced these real provider updates:

- **o3: input \$10 → \$2, output \$40 → \$8** (OpenAI's mid-2025 price cut).
- **o3-pro: input \$150 → \$20, output \$600 → \$80** (same cut).
- **GPT-4o / GPT-4o-mini / GPT-4.1 family: cache_read rates added**
(previously \$0 — we were under-counting cache-heavy spend for those
models the same way we did for opus-4-6).
- **o3-mini / o4-mini: cache_read rates added** (\$0.55 / \$0.275).
- **Gemini 2.5 Flash: input \$0.15 → \$0.30, output \$0.60 → \$2.50**,
cache_read added at \$0.03.
- **Gemini 2.5 Pro: cache_read added at \$0.125**.

The next operator can re-run `npm run pricing:refresh` weekly (manually or
via a GitHub Action — workflow scaffolding is straightforward but not
included in this commit) to keep the table fresh.

Pricing accuracy fix — Claude 4.5/4.6/4.7 family

Pre-LiteLLM-integration cleanup of the same root cause that drove the
Code Sessions vs Mission Control 6× cost divergence (see below). Both
pricing tables carried legacy Opus 4.1 rates (\$15/\$75) for every Opus
4-x — Anthropic dropped Opus 4.5/4.6/4.7 to \$5/\$25 (with \$6.25 cache
write, \$0.50 cache read). Sonnet 4.5 and Haiku 4.5 cache columns were
also missing; Haiku 4.5 input/output had \$0.80/\$4 (Anthropic publishes
\$1/\$5). All corrected to match
[platform.claude.com/docs/en/about-claude/pricing](https://platform.claude.com/docs/en/about-claude/pricing).
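
A rough sketch of how much the stale rates inflated an Opus bill. The rates come from the entry above; the helper name and the token counts are hypothetical. The legacy cache rates are not stated here, so only input and output are compared.

```javascript
// Rates in USD per million tokens, taken from the changelog entry above.
const LEGACY_OPUS = { input: 15, output: 75 };
const CORRECTED_OPUS = { input: 5, output: 25, cacheWrite: 6.25, cacheRead: 0.5 };

// Hypothetical helper: a missing rate column or token count contributes zero.
function costUsd(tokens, rates) {
  const columns = ['input', 'output', 'cacheWrite', 'cacheRead'];
  const total = columns.reduce(
    (sum, col) => sum + (tokens[col] || 0) * (rates[col] || 0),
    0,
  );
  return total / 1e6;
}

const usage = { input: 1_000_000, output: 200_000 };
console.log(costUsd(usage, LEGACY_OPUS));    // 30
console.log(costUsd(usage, CORRECTED_OPUS)); // 10
```

The zero default for a missing column also shows the under-counting failure mode: a table without a `cacheRead` rate prices cache-heavy sessions as if those tokens were free.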

`scripts/backfill-code-session-cache-cost.mjs` is the path to recompute
historical `cost_usd` against the corrected rates — opt-in, dry-run by
default. The detail-page divergence flag now points operators at the
script.

Bugfix — backfill script needed env loading

`scripts/backfill-code-session-cache-cost.mjs` silently returned 0 rows
when `DATABASE_URL` was unset (mock driver fallback). Switched to the
sibling-script pattern (`import './_load-env.mjs'` plus `createSqlFromEnv()`),
which auto-loads `.env.local` and errors out with a clear message when the
env is missing.
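
The fail-loud half of that pattern could look roughly like this. `requireDatabaseUrl` is a hypothetical stand-in: the real `createSqlFromEnv()` is not shown in these notes, and the error wording is an assumption.

```javascript
// Hypothetical stand-in for the env check inside createSqlFromEnv().
function requireDatabaseUrl(env = process.env) {
  const url = env.DATABASE_URL;
  if (!url || url.trim() === '') {
    // Fail before any query runs, instead of falling back to a mock driver
    // that silently returns 0 rows.
    throw new Error(
      'DATABASE_URL is not set. Add it to .env.local or export it before running this script.',
    );
  }
  return url;
}
```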

4.21.0

One-command local install: `npx dashclaw up`.

Added
- **`npx dashclaw up` — one command from nothing to a running, governed local DashClaw.** Installs the app to `~/.dashclaw`, provisions Postgres (Docker if present → embedded Postgres, no accounts → paste a `postgresql://` URL), generates secrets, mints the admin API key, applies migrations, builds, starts on `:3000`, and offers to wire Claude Code hooks. Re-running boots an existing install; `npx dashclaw down` stops it; `--update` upgrades; failures checkpoint and resume. Flags: `--yes --no-browser --db docker|embedded|url --dir --port --source-dir --update`.
- **SDK bin shim:** the `dashclaw` npm package now exposes a `dashclaw` bin that forwards `npx dashclaw <args>` to `dashclaw/cli` (`dashclaw/cli` 0.5.0 carries the `up`/`down` commands).
- **`scripts/setup.mjs` non-interactive mode** (`--yes --database-url --json --skip-install --skip-build`) so the installer can drive it as a child process with a single-line JSON contract on stdout.
- **3-OS CI smoke** (`.github/workflows/up-smoke.yml`): end-to-end `up` against embedded Postgres on ubuntu/windows/macos.

Fixed
- `up` installer hardening from review + a real end-to-end run: stdin-`ignore` so a stray prompt can't hang setup; `--database-url` overrides a stale `.env.local`; non-interactive setup fails loudly on an unreachable DB; `--json` never echoes a pre-existing admin password; url-mode reuses the saved DB URL on resume instead of re-prompting; boot detects a live server and reuses it instead of spawning a duplicate; setup skips its redundant install+build when driven by `up`; Windows docker-filter caret + embedded-init cleanup + tarball-failure cleanup.

4.20.2

Security + reliability hardening from an adversarial review and a security pass. Platform-only — no SDK source change, so the Node + Python SDKs are intentionally not republished at this version.

Security
- **Org kill-switch (halt) can no longer be bypassed by the idempotency replay (CRITICAL).** A halted org's retried action carrying a matching `idempotency_key` was served its cached pre-halt decision (allow/warn/require_approval) for up to the 10-minute replay window — and with `?record=true` recorded as running. Halt is now read before the replay short-circuit (new `getOrgHaltState`, sharing the cached settings read + eager invalidation), so every evaluation under a halt blocks as documented.
- **`/api/webhooks/stripe` is reachable for Stripe's unauthenticated signed POST** — added to the public routes so billing events stop 401ing before signature verification (dormant until `STRIPE_SECRET_KEY` is set, but would have silently desynced billing from entitlements).
- **Public-route matching is boundary-aware** (`pathname === route || startsWith(route + '/')`) so a future sibling of a public prefix (e.g. `/api/cron-report`) cannot ship unauthenticated — the foot-gun that once exposed the whole `/api/prompts` surface.
- **Local admin login is brute-force resistant** — a DB-backed per-target failure counter locks the login after repeated failures (previously only the per-instance in-memory rate limiter), fail-open so a broken store can't lock the operator out.
- **CLI and MCP client warn on a plaintext-`http` base URL** to a non-local host, where the API key would travel unencrypted.
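
The boundary-aware match from the public-route fix above, as a small sketch. The comparison is the one quoted in the entry; the function name and the route list are assumptions (`/api/cron` is inferred from the `/api/cron-report` example).

```javascript
// Route list is illustrative; only /api/webhooks/stripe is named in the notes.
const PUBLIC_ROUTES = ['/api/webhooks/stripe', '/api/cron'];

function isPublicRoute(pathname, routes = PUBLIC_ROUTES) {
  // Match the route exactly or as a path-segment prefix, never as a raw
  // string prefix.
  return routes.some(
    (route) => pathname === route || pathname.startsWith(route + '/'),
  );
}

console.log(isPublicRoute('/api/cron'));        // true
console.log(isPublicRoute('/api/cron/daily'));  // true
console.log(isPublicRoute('/api/cron-report')); // false
```

A plain `pathname.startsWith(route)` would have returned `true` for the last call, which is the foot-gun the entry describes.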

Fixed
- **Context-menu governance actions surface server failures instead of silently succeeding (MAJOR).** Site-wide right-click Approve/Deny/Delete/Revoke checked no response status and refreshed unconditionally, so a 401/403/500 looked like success; they now throw on `!res.ok` and surface the failure, matching the hardened approvals page.
- **Vercel preview deployments build again** — `auto-migrate` skips on a non-production build with no `DATABASE_URL` (the preview environment has none) instead of hard-failing, while a production build missing it still fails loudly. Stops failed-preview emails on every Dependabot PR.
- **HITL approvals are honored on guard re-evaluation**, and the hook text scorer is calibrated.
- **`node -e` / `python -c` are no longer blocked by accident** — the bash classifier gained an interpreter intent so inline eval lands in the warn band instead of inheriting the worst-case unknown-command risk that pushed it into the block band.
- Repo hygiene: the marketing "Run live demo" button is wired to the live-demo anchor; stale gate logs and one-off reports were cleaned up; the 32 MB marketing video was untracked.
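
The context-menu fix above comes down to checking `res.ok` before treating an action as done. A minimal sketch, assuming a fetch-style client; the helper name is hypothetical and `fetchImpl` is injected only so the example runs without a server.

```javascript
// Hypothetical helper; not the app's actual context-menu handler.
async function runGovernanceAction(fetchImpl, url, method = 'POST') {
  const res = await fetchImpl(url, { method });
  if (!res.ok) {
    // Surface 401/403/500 instead of refreshing as if the action succeeded.
    throw new Error(`Action failed: ${method} ${url} returned ${res.status}`);
  }
  return res;
}
```

The caller would refresh the list only after the promise resolves and show the error message otherwise.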

4.20.1

Launch-readiness patch (Show HN prep): MCP read-path fixes + doc hygiene.

Fixed
- **MCP read tools honor explicit `agent_id` filters** (`dashclaw/mcp-server` 2.0.1): on the 8 query tools (`loop_list`, `learning_query`, `decisions_recent`, `handoff_latest`, `secret_list`, `secret_due`, `inbox_list`, `behavior_suggestions`) the server-configured agent id no longer silently rewrites an explicit per-call filter — "show me agent X's loops" used to return the caller's own rows. Write tools keep server-priority identity pinning (impersonation guard, unchanged) and their tool descriptions now say so instead of promising an override.
- `GET /api/actions/loops` actually filters by `action_id` — the MCP `loop_list` tool has always advertised and sent the param, but the route silently ignored it and returned every loop.
- README: dropped the stale `(v2.13.3)` version label from the Durable execution finality section (platform versions are 4.x; the label read as the current release).
- Removed `docs/homepage-draft-claude-code.md` — a superseded Phase-3 homepage draft whose maintainer checklist (unpublished screencast URL) was visible in the public repo.
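
The read/write split in the `agent_id` fix above can be sketched as a single resolver. The function and parameter names are hypothetical, and the fallback when no identity is configured is an assumption.

```javascript
// Hypothetical resolver; tool kinds and parameter names are assumptions.
function resolveAgentId({ kind, explicitAgentId, configuredAgentId }) {
  if (kind === 'write') {
    // Impersonation guard: the server-configured identity takes priority.
    return configuredAgentId ?? explicitAgentId;
  }
  // Read tools: an explicit per-call filter is honored as given.
  return explicitAgentId ?? configuredAgentId;
}

console.log(resolveAgentId({ kind: 'read', explicitAgentId: 'agent-x', configuredAgentId: 'me' }));  // agent-x
console.log(resolveAgentId({ kind: 'write', explicitAgentId: 'agent-x', configuredAgentId: 'me' })); // me
```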

Notes
- The May 2026 smoke-test reports of `loop_list`/`learning_query` returning 500s were re-verified against the live instance: both were fixed by the earlier loops-route join fix and are healthy; the agent_id filter rewrite above was the remaining real defect.
- Republish owed: npm `dashclaw/mcp-server` 2.0.1, plus the SDK 4.20.x republish carried over from 4.20.0 (registries last at 4.11.0).

4.20.0

Guard Enforcement Contract (Organ 3 / One-System program, Phase 1): the trust spine now fails closed. Full reference: `docs/guard-enforcement-contract.md`.

Added
- **Evaluation deadline** — guard policy evaluation is bounded (default 3500ms, `DASHCLAW_GUARD_DEADLINE_MS`); on overrun a degraded decision is built from accumulated state (never downgrading an already-found block), still persisted through the audit gate, with recovery marked partial. The hooks' 5s/zero-retry HTTP budget can no longer be bricked by a slow webhook or LLM phase.
- **Org kill switch** — `POST/GET /api/halt` (admin-only, both transitions audited via activity_logs) + `dashclaw halt on|off|status [--reason]`. While halted, every guard evaluation for the org returns an immediate audited block across hook/MCP/SDK/API; eager cache invalidation makes it effective on the very next call (no 30s TTL lag); the halt read piggybacks the existing hot-path settings query.
- **End-to-end idempotency** — every auto-retrying client derives an idempotency key (one convention, reference `sdk/dashclaw.js deriveIdempotencyKey`, pinned by cross-language golden vectors): hooks key on `tool_use_id`, MCP/SDKs on content + hour bucket; SDK `createAction` auto-derives when the caller didn't supply one (explicit key wins). `/api/guard` accepts `idempotency_key`; `?record=true` short-circuits on the existing action row; a duplicate guard call inside a 10-minute window replays the prior decision (`idempotent_replay: true`) and writes NO new guard_decisions row, keeping approval-flood/signal/digest counts honest.
- MCP guard context enrichment toward hook parity: optional `target`, `write_paths`, `content` (capped 20k), `tool_name` inputs let protected-path, secret-scan, and content policies fire on MCP-originated calls.
- `docs/guard-enforcement-contract.md` — degradation precedence, deadline, cross-surface unavailable policy, idempotency derivation, kill switch.
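
A simplified sketch of how the kill switch and the idempotent replay interact. It uses an in-memory map in place of the real decision store, and every name other than `idempotency_key` and `idempotent_replay` is hypothetical. The halt check runs first, matching the ordering the 4.20.2 security fix describes.

```javascript
const REPLAY_WINDOW_MS = 10 * 60 * 1000; // 10-minute window from the notes

// Hypothetical in-memory stand-in for the real guard and decision store.
function createGuard({ isHalted, evaluate, now = Date.now }) {
  const recent = new Map(); // idempotency_key -> { result, at }

  return function guard(request) {
    // 1. Halt is read first, so a cached pre-halt decision is never replayed.
    if (isHalted()) return { decision: 'block', reason: 'org_halted' };

    // 2. A duplicate inside the window replays the prior decision and does
    //    not run (or record) a new evaluation.
    const key = request.idempotency_key;
    const prior = key ? recent.get(key) : undefined;
    if (prior && now() - prior.at < REPLAY_WINDOW_MS) {
      return { ...prior.result, idempotent_replay: true };
    }

    // 3. Otherwise evaluate and remember the result for later replays.
    const result = evaluate(request);
    if (key) recent.set(key, { result, at: now() });
    return result;
  };
}
```

Swapping steps 1 and 2 would reproduce the bypass: a retried request with a matching key would get its cached allow even while the org is halted.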

Changed
- **Fail-closed degradation defaults** — webhook `on_timeout` and semantic-check `fallback` defaults flipped from `allow` to the global contract: per-policy override → `DASHCLAW_GUARD_FALLBACK` → `require_approval`. `DASHCLAW_GUARD_FALLBACK=allow` is the explicit self-hoster escape hatch; the env enum now accepts `require_approval`. Policy-builder UI defaults flipped to match (existing policies with explicit values are untouched).
- **MCP fail-closed mapping** — `dashclaw_guard` maps transport errors / non-2xx / malformed responses to an explicit fail-closed result governed by `DASHCLAW_GUARD_UNAVAILABLE_POLICY` (default `block`, same env name + default as the Python hooks); `dashclaw_record` fails loud ("NOT written to the audit ledger") instead of returning a raw error blob.
- Hook HTTP retries are transient-only: non-transient 4xx fail immediately (408/429/5xx still retry); the AUTH_FAILED sentinel is preserved.
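
Two of the rules above are small enough to sketch directly: the fallback precedence and the transient-only retry classification. The function names are hypothetical, and the hooks themselves are Python, so this only illustrates the logic.

```javascript
// Precedence from the notes: per-policy override, then the
// DASHCLAW_GUARD_FALLBACK env var, then require_approval.
function resolveFallback(policyOverride, env = process.env) {
  return policyOverride || env.DASHCLAW_GUARD_FALLBACK || 'require_approval';
}

// Transient-only retries: 408, 429, and 5xx retry; other 4xx fail immediately.
function isRetryableStatus(status) {
  return status === 408 || status === 429 || (status >= 500 && status <= 599);
}

console.log(resolveFallback(undefined, {}));                                   // require_approval
console.log(resolveFallback(undefined, { DASHCLAW_GUARD_FALLBACK: 'allow' })); // allow
console.log(isRetryableStatus(404));                                           // false
console.log(isRetryableStatus(503));                                           // true
```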

Fixed
- Livingcode mirror pipeline: plugin hook mirrors (`plugins/dashclaw/hooks/*.py` + `dashclaw_agent_intel/`) and the platform-intelligence skill mirrors are now auto-staged into the SAME commit as their canonical source (previously they landed in follow-up sync commits); `dashclaw_session_digest.py` added to the living-merge post-merge regen manifest.

4.19.1

Docs/media patch.

Added
- README overhaul: a Remotion-rendered governance-loop animation (intent → guard → approve → record, in the product's token palette) plus a "control plane, running" tour with live screenshots of the Decisions Ledger, Mission Control, Analytics, and Governance Posture. Animation source lives in `media/remotion/` (standalone subproject, not part of the platform dependency tree); render with `npm run render:gif`.

