Skip to content

Confirm before destructive Galaxy history delete/purge (#338)#344

Merged
dannon merged 2 commits into
galaxyproject:mainfrom
dannon:fix/338-destructive-delete-confirm
Jun 20, 2026
Merged

Confirm before destructive Galaxy history delete/purge (#338)#344
dannon merged 2 commits into
galaxyproject:mainfrom
dannon:fix/338-destructive-delete-confirm

Conversation

@dannon

@dannon dannon commented Jun 19, 2026

Copy link
Copy Markdown
Member

Closes #338.

The problem

A weak model, asked to delete two datasets, reached for the history-level delete tool (galaxy_update_history(deleted=true)) and then raw curl -- and nothing gated either. The brain exec-guard's policy ends in a catch-all that allows every non-local tool (Galaxy MCP included), and the web-mode gate allows every galaxy_* tool by prefix. So a whole ~10GB history was marked deleted with no are-you-sure. There's also no per-dataset delete tool (#228), which is what pushed the model toward the history-level tool and curl in the first place.

The fix

A shared classifier, shared/galaxy-destructive.js, that both tool_call gates consume so "what counts as destructive" has one home:

  • Orbit/CLI (exec-guard/policy.ts + gate.ts): a whole-history delete/purge routes to a new galaxy:destructive decision that always confirms, regardless of model tier -- it deliberately bypasses the usual weak->deny downgrade, because the danger in Destructive Galaxy history delete/purge runs with no confirmation -- agent purged a whole history when asked to delete two datasets #338 was silent execution, not the model proposing the action. Non-interactive sessions deny (no one to approve). The confirm is a yes/no that is never cached and never offered "allow for this session", with honest wording that distinguishes a recoverable soft-delete from an irreversible purge.
  • Remote/web (web/extensions/web-mode-gate.ts): destructive ops are hard-blocked -- that gate has no confirmation UI. A remote-confirm flow is a follow-up.

The classifier reads two reliable, structured shapes exactly -- a direct tool call and the adapter's generic mcp({tool, args}) proxy -- and adds best-effort guardrails for the free-form paths: a raw curl/wget DELETE against /api/histories/, and code mode's run_galaxy_tool(code=...) Python script. (Code mode isn't wired into this stack yet -- galaxy-mcp ships it behind --discovery-mode code and loom's adoption is separate -- so that branch is anticipatory.) The curl and code matchers are guardrails, not security boundaries: they catch the obvious reach-for-the-nearest-tool case, not deliberate obfuscation or a different HTTP client.

Review

The diff went through an adversarial review (a cross-family pass plus several focused lenses). It confirmed end-to-end that the gate fires against the real emitted tool names and that the always-ask-even-weak / non-interactive-deny / uncacheable / honest-wording properties hold on the reliable paths. It also drove a round of hardening to the best-effort matchers, now in this PR: string-quoted booleans (deleted='True', {"purge":"true"}), the code-mode kwargs / galaxy_-prefixed call_tool forms, purge carried in the DELETE body (not just the query), shell-variable history IDs, a whitespace-padded proxied tool name, and curl-regex over/under-matching (requires a curl/wget verb, excludes dataset-level /contents/, handles wget --method=DELETE).

Scope / follow-ups

Tests

TDD throughout. shared/galaxy-destructive unit tests (direct / mcp-proxy / code-mode / curl vectors, including the review findings); exec-guard-policy (asks even weak, denies non-interactive, proxy + code-mode, curl); exec-guard-gate (uncacheable confirm, purge vs delete wording); web-mode-gate (hard-block incl. code-mode). Full suite green; root + app typecheck clean.

…#338)

A weak model asked to delete two datasets reached for the history-level
delete tool and a raw curl instead, and nothing gated it: the exec-guard's
catch-all allowed every Galaxy MCP call and the web-mode gate allowed every
galaxy_ tool, so a whole ~10GB history got marked deleted with no are-you-sure.

This adds a shared classifier (shared/galaxy-destructive.js) that both
tool_call gates consume, so "what counts as destructive" lives in one place.
A whole-history delete/purge now always prompts regardless of model tier --
it deliberately skips the usual weak->deny downgrade, since the danger was
silent execution, not the model proposing it -- and a non-interactive session
denies because there's no one to approve. Orbit/CLI get a yes/no confirm
that's never cached and never offered "allow for this session", with honest
wording that distinguishes a recoverable soft-delete from an irreversible
purge. Remote/web mode hard-blocks these for now; its gate has no confirmation
UI, so a remote-confirm flow is left as a follow-up.

The classifier covers the direct tool call, the adapter's generic
mcp({tool,args}) proxy, and code mode's run_galaxy_tool(code=...) script, plus
a best-effort guardrail for a raw curl/wget DELETE against /api/histories/.
The curl and code matchers are guardrails, not boundaries -- they catch the
obvious reach-for-the-nearest-tool case, not deliberate obfuscation.

This doesn't add a per-dataset delete tool (galaxyproject#228). That missing tool is what
drove the dangerous workaround in the first place, so a proper delete_dataset
op upstream, plus populating galaxy-ops destructive metadata the classifier
can later defer to, are the complementary root-cause fixes.
@dannon dannon marked this pull request as ready for review June 20, 2026 01:08
A review pass over the gate turned up a few cheap gaps in the best-effort
matchers worth closing: string-quoted booleans (deleted='True',
{"purge":"true"}) slipped past the bare true checks; the code-mode matcher
only caught a positional call_tool('update_history', ...), so the kwargs form
and the galaxy_-prefixed name got through; and normalize() didn't trim, so a
whitespace-padded tool name defeated the mcp-proxy unwrap. Also lists the new
galaxy:destructive category in the PolicyResult comment.

The reliable structured paths were already solid -- this just tightens the
free-form curl/code guardrails against ordinary (non-adversarial) shapes.
@dannon dannon merged commit 9b88c18 into galaxyproject:main Jun 20, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Destructive Galaxy history delete/purge runs with no confirmation -- agent purged a whole history when asked to delete two datasets

1 participant