feat(runbooks): runbook execution engine + pup workflows PoC #146
Merged
platinummonkey merged 10 commits into main on Mar 31, 2026
Implements the runbooks PoC as specified:
- `pup runbooks list/describe/run/validate/import` — execute YAML
runbooks from ~/.config/pup/runbooks/ with {{ VAR }} templating,
sequential step execution, poll loops, and confirm gates
- `pup workflows run/instances list/get` — trigger Datadog Workflows
and poll to completion via raw REST (POST/GET /api/v2/workflows/...)
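A minimal sketch of the `{{ VAR }}` templating mentioned above, including the `| default: "x"` form this PR describes. The `render` function and its error behaviour for undefined variables are assumptions for illustration, not the code in src/runbooks/template.rs:

```rust
use std::collections::HashMap;

/// Render `{{ VAR }}` and `{{ VAR | default: "x" }}` placeholders.
/// Hypothetical helper; the real signature in template.rs may differ.
fn render(input: &str, vars: &HashMap<String, String>) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find("{{") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after.find("}}").ok_or_else(|| "unclosed `{{`".to_string())?;
        let expr = after[..end].trim();
        // Split an optional `| default: "x"` suffix off the variable name.
        let (name, default) = match expr.split_once('|') {
            Some((n, filter)) => {
                let value = filter
                    .trim()
                    .strip_prefix("default:")
                    .ok_or_else(|| format!("unsupported filter in `{expr}`"))?;
                (n.trim(), Some(value.trim().trim_matches('"')))
            }
            None => (expr, None),
        };
        // A supplied variable wins over the default; assumption: a variable
        // with neither a value nor a default is an error.
        match vars.get(name).map(String::as_str).or(default) {
            Some(v) => out.push_str(v),
            None => return Err(format!("undefined variable `{name}`")),
        }
        rest = &after[end + 2..];
    }
    out.push_str(rest);
    Ok(out)
}
```

This sketch does not handle a `|` inside a quoted default value; a real parser would need to.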
New files:
- src/runbooks/mod.rs — Runbook, Step, VarDef, PollConfig types
- src/runbooks/template.rs — {{ VAR }} and | default: "x" rendering
- src/runbooks/loader.rs — scan runbooks dir, load/import by name
- src/runbooks/engine.rs — sequential executor with polling, confirm,
on_failure handling, variable capture
- src/commands/runbooks.rs — list/describe/run/validate/import CLI
- src/commands/workflows.rs — trigger + watch, instances list/get
- docs/examples/runbooks/ — deploy-service, incident-triage,
maintenance-window reference templates
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
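The poll loops handled by the engine can be sketched roughly as follows, using the condition names the PR summary lists (`empty`, `status == X`, `value < N`, `decreasing`). The `Until` enum, the `poll_until` function, and the choice to compare against the whole trimmed step output are assumptions; how engine.rs actually extracts a status or value is not shown in this PR text:

```rust
use std::time::{Duration, Instant};

/// Poll conditions named in the PR summary (hypothetical representation).
#[derive(Debug, Clone, PartialEq)]
enum Until {
    Empty,
    StatusEq(String),
    ValueLt(f64),
    Decreasing,
}

/// Run `probe` repeatedly until `until` holds or `timeout` elapses.
/// Returns the number of attempts made on success.
fn poll_until<F>(
    mut probe: F,
    until: &Until,
    interval: Duration,
    timeout: Duration,
) -> Result<u32, String>
where
    F: FnMut() -> String,
{
    let started = Instant::now();
    let mut previous: Option<f64> = None;
    let mut attempts = 0;
    loop {
        attempts += 1;
        let output = probe();
        let output = output.trim();
        let done = match until {
            Until::Empty => output.is_empty(),
            Until::StatusEq(want) => output == want.as_str(),
            Until::ValueLt(limit) => output.parse::<f64>().map_or(false, |v| v < *limit),
            Until::Decreasing => {
                // True once the current numeric value is below the previous one.
                let current = output.parse::<f64>().ok();
                let falling = matches!((previous, current), (Some(p), Some(c)) if c < p);
                previous = current;
                falling
            }
        };
        if done {
            return Ok(attempts);
        }
        if started.elapsed() >= timeout {
            return Err(format!("poll timed out after {attempts} attempts"));
        }
        std::thread::sleep(interval);
    }
}
```

Passing the probe as a closure keeps the loop independent of whether the step runs a `pup` command or an HTTP call.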
…hints
Each step now shows:
- Header: ► Step N/M · <name> · <kind> [HH:MM:SS] with command preview
- Labeled sections: ── stdout ── and ── stderr ── blocks wrapping output
- Footer: ✓/✗/⊘ <elapsed> · next: step N/M — <name> (<kind>)
- Summary line with total elapsed time and pass/fail count
Shell steps surface non-empty stderr even on success so warnings from curl,
grep, etc. aren't silently dropped.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Remove RULE constant and all long separator lines
- Timestamps now show full UTC date+time: 2026-03-02 18:11:01 UTC
- Step header: [N/M] <name> (<kind>) <timestamp>
- Output labeled with indented "stdout:" / "stderr:" markers
- Summary line: ✓/⚠ done <name> N/M steps <elapsed> <timestamp>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ch = "wasm32"))]
Neither feature is compatible with wasm targets:
- loader.rs uses std::fs and reqwest::Client::new()
- engine.rs uses tokio::process::Command and chrono::Utc
- both require dirs-based config path resolution (native-only)
Gated items:
- mod runbooks; in main.rs
- pub mod runbooks/workflows; in commands/mod.rs
- Commands::Runbooks/Workflows variants and their subcommand enums
- dispatch arms in main_inner()
Verified: native build, wasm32-wasip2 (wasi feature), and
wasm32-unknown-unknown --lib (browser feature) all pass.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
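For readers unfamiliar with this kind of gating, here is a small sketch of how a command variant and its dispatch arm can both sit behind `#[cfg(not(target_arch = "wasm32"))]`. The `Monitors` variant and the `dispatch` function are placeholders, not the actual contents of main.rs:

```rust
/// Simplified stand-in for the CLI's command enum.
#[derive(Debug)]
enum Commands {
    // Placeholder for a command that is available on every target.
    Monitors,
    // Only compiled on native targets; absent from wasm builds.
    #[cfg(not(target_arch = "wasm32"))]
    Runbooks,
}

fn dispatch(cmd: &Commands) -> &'static str {
    match cmd {
        Commands::Monitors => "monitors",
        // The arm must carry the same gate as the variant it matches,
        // otherwise wasm builds would reference a variant that does not exist.
        #[cfg(not(target_arch = "wasm32"))]
        Commands::Runbooks => "runbooks",
    }
}
```

Keeping the gate on both the variant and the arm lets the `match` stay exhaustive on every target.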
- workflows.rs: take main's typed SDK implementation (get/create/update/
  delete/run/instance_list/instance_get/instance_cancel) over the runbooks
  branch's raw HTTP prototype
- commands/mod.rs: drop wasm32 cfg gate on pub mod workflows (no longer
  needed with typed SDK impl)
- main.rs: combine module declarations (runbooks + skills + tunnel), keep
  Runbooks command variant, use main's Workflows doc + dispatch, add LlmObs
  and ReferenceTables variants from main; remove duplicate
  WorkflowActions/WorkflowInstanceActions enums and stale Workflows dispatch
  arm introduced by HEAD
- runbooks/loader.rs: replace serde_yaml with serde_norway (project std)
- runbooks/engine.rs: pass empty query slice to raw_get (signature gained
  a query param in main)
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…headers
- Step gains `body` (JSON template string) and `headers` (key/value map,
  templates rendered) fields alongside the existing `url` and `method`
- client: add raw_request(cfg, method, path, body, extra_headers) covering
  GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS (any method reqwest
  supports); returns Null for 204/empty responses
- engine: rewrite execute_http to render body + header templates, dispatch
  to raw_request for DD API paths (/...) and plain reqwest for external
  URLs; both paths honour all methods and extra headers
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
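The dispatch rule in this commit (DD API paths starting with `/` versus external URLs) might look like the sketch below. `Target`, `normalize_method`, and `classify_url` are hypothetical names; treating methods as case-insensitive and rejecting anything that is neither a path nor an `http(s)` URL are assumptions:

```rust
/// Methods the commit message lists as supported.
const METHODS: [&str; 7] = ["GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"];

#[derive(Debug, PartialEq)]
enum Target {
    /// Path starting with `/`: goes through the Datadog client with auth headers.
    DatadogApi,
    /// Absolute URL: sent with a plain HTTP client, no Datadog auth.
    External,
}

/// Default to GET when the step omits `method`.
fn normalize_method(raw: Option<&str>) -> Result<String, String> {
    let method = raw.unwrap_or("GET").trim().to_ascii_uppercase();
    if METHODS.contains(&method.as_str()) {
        Ok(method)
    } else {
        Err(format!("unsupported HTTP method `{method}`"))
    }
}

/// Decide which client should send the request.
fn classify_url(url: &str) -> Result<Target, String> {
    if url.starts_with('/') {
        Ok(Target::DatadogApi)
    } else if url.starts_with("http://") || url.starts_with("https://") {
        Ok(Target::External)
    } else {
        Err(format!("expected an API path or absolute URL, got `{url}`"))
    }
}
```

Classifying before sending keeps credentials off requests that leave the Datadog API.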
Suggested in #143 (comment): --arg SERVICE=payments reads more naturally
than --set for passing runtime variables to a runbook.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
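Parsing a value such as `--arg SERVICE=payments` could be done along these lines. `parse_arg` is a hypothetical helper, and splitting on the first `=` only (so values may themselves contain `=`) is an assumption rather than documented behaviour:

```rust
/// Split a `KEY=VALUE` argument into its two parts.
fn parse_arg(raw: &str) -> Result<(String, String), String> {
    match raw.split_once('=') {
        // Require a non-empty key; the value may be empty or contain `=`.
        Some((key, value)) if !key.trim().is_empty() => {
            Ok((key.trim().to_string(), value.to_string()))
        }
        _ => Err(format!("expected KEY=VALUE, got `{raw}`")),
    }
}
```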
Templates live in ~/.config/pup/runbooks/_templates/<name>.yaml and define
any subset of step fields. A step references one with `template: <name>`;
its own fields take precedence over the template's (step wins on all scalar
fields; headers are merged per-key so step headers override template headers
with the same key).
# _templates/check-slo.yaml
kind: pup
run: slos get --id={{ SLO_ID }}
on_failure: warn
# runbook step
steps:
  - name: Verify production SLO
    template: check-slo   # inherits kind + run + on_failure
    on_failure: fail      # overrides template's on_failure
- StepTemplate: new all-optional struct for template deserialization
- Step: kind now has #[serde(default)] so it can be omitted when a
template supplies it; new `template: Option<String>` field
- loader: templates_dir(), load_template(), apply_template(), resolve_steps()
helpers; load_runbook() now resolves templates before returning;
list_runbooks() skips the _templates/ subdir and any _-prefixed files
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
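The precedence rule described in this commit (step wins on scalar fields, headers merged per key) can be illustrated with a reduced struct. This sketch uses a single simplified `Step` type for both sides and a free function named `apply_template`; the PR itself uses a separate `StepTemplate` struct and a `fill!` macro, whose exact shape is not shown here:

```rust
use std::collections::BTreeMap;

/// Reduced stand-in for a step; the real struct has many more fields.
#[derive(Debug, Clone, Default, PartialEq)]
struct Step {
    kind: Option<String>,
    run: Option<String>,
    on_failure: Option<String>,
    headers: BTreeMap<String, String>,
}

/// Fill any field the step left unset from the template.
fn apply_template(step: Step, template: &Step) -> Step {
    // Start from the template's headers, then let the step's entries
    // overwrite any key they share.
    let mut headers = template.headers.clone();
    headers.extend(step.headers);
    Step {
        kind: step.kind.or_else(|| template.kind.clone()),
        run: step.run.or_else(|| template.run.clone()),
        on_failure: step.on_failure.or_else(|| template.on_failure.clone()),
        headers,
    }
}
```

With the `check-slo` example above, the step inherits `kind` and `run` from the template while its own `on_failure: fail` replaces the template's `warn`.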
- Prefix unused error binding with _ in runbooks engine match arm - fmt reformatted loader.rs Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Request side:
- `content_type` — sets Content-Type header (default: application/json
when `body` is set, application/octet-stream when `body_file` is set)
- `body_file` — read request body from a file (template-rendered path);
takes precedence over `body`; intended for binary or large payloads
- `accept` — controls the Accept header (default: application/json)
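The request-side defaults above could be resolved as in this sketch. Both function names are hypothetical, and sending no Content-Type when there is neither a `body` nor a `body_file` is an assumption:

```rust
/// Choose the Content-Type header for a request.
fn resolve_content_type(
    explicit: Option<&str>,
    has_body: bool,
    has_body_file: bool,
) -> Option<String> {
    // An explicit `content_type` always wins.
    if let Some(ct) = explicit {
        return Some(ct.to_string());
    }
    if has_body_file {
        // `body_file` takes precedence over `body`.
        Some("application/octet-stream".to_string())
    } else if has_body {
        Some("application/json".to_string())
    } else {
        // Assumption: no payload means no Content-Type header.
        None
    }
}

/// Choose the Accept header, defaulting to JSON.
fn resolve_accept(explicit: Option<&str>) -> String {
    explicit.unwrap_or("application/json").to_string()
}
```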
Response side:
- JSON responses are pretty-printed as before
- YAML, CSV, plain-text and other text/* types are returned as-is as UTF-8 text
- Binary responses require `output_file` to be set; an actionable error
is returned if it isn't
- `output_file` — write the raw response bytes to a file (template-rendered
path); returns "written N bytes to <path>" as step output
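A rough sketch of the response-side rules, under stated assumptions: `BodyKind`, `classify_content_type`, and `decode` are illustrative names rather than the PR's `decode_http_response`, the `xml` and `html` matches come from the PR summary, and JSON pretty-printing is left out because it would need a JSON library:

```rust
#[derive(Debug, PartialEq)]
enum BodyKind {
    Json,
    Text,
    Binary,
}

/// Classify a response by its Content-Type header value.
fn classify_content_type(content_type: &str) -> BodyKind {
    let ct = content_type.to_ascii_lowercase();
    if ct.contains("json") {
        BodyKind::Json
    } else if ct.starts_with("text/")
        || ["yaml", "csv", "xml", "html"].iter().any(|m| ct.contains(m))
    {
        BodyKind::Text
    } else {
        BodyKind::Binary
    }
}

/// Turn a response into step output, writing to `output_file` when set.
fn decode(content_type: &str, bytes: &[u8], output_file: Option<&str>) -> Result<String, String> {
    // `output_file` is checked first: raw bytes go to disk regardless of type.
    if let Some(path) = output_file {
        std::fs::write(path, bytes).map_err(|e| format!("cannot write {path}: {e}"))?;
        return Ok(format!("written {} bytes to {}", bytes.len(), path));
    }
    match classify_content_type(content_type) {
        // The real engine pretty-prints JSON; this sketch returns it unchanged.
        BodyKind::Json | BodyKind::Text => String::from_utf8(bytes.to_vec())
            .map_err(|e| format!("response is not valid UTF-8: {e}")),
        BodyKind::Binary => Err(format!(
            "binary response ({content_type}); set `output_file` to save it"
        )),
    }
}
```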
client: raw_request now takes raw bytes + explicit content_type/accept and
returns HttpResponse { content_type, bytes } instead of serde_json::Value.
engine: execute_http builds body bytes per content-type then delegates
response decoding to decode_http_response.
Both StepTemplate and the fill! macro in loader updated with the four new fields.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Summary
Runbook execution engine + `pup workflows` command, proposed in #143. Adds two new command groups with no new dependencies.
pup runbooks
YAML runbooks live in `~/.config/pup/runbooks/`. Each file defines sequential steps that mix `pup` commands, shell tools, Datadog Workflow triggers, HTTP calls, and interactive confirm gates.
Example runbook (`~/.config/pup/runbooks/hello.yaml`):
Output while running:
pup workflows
Full CRUD + execution for Datadog Workflow Automation via the typed SDK client:
`--wait` polls every 2 s until terminal state. Requires `DD_API_KEY` + `DD_APP_KEY`.
Step kinds
- `pup`: runs the `pup` binary with `--output json`; supports `poll:` loops
- `shell`: runs `sh -c "..."` with template rendering; surfaces stderr even on success
- `datadog-workflow`: triggers a Datadog Workflow
- `confirm`: interactive `[y/N]` gate; bypassed in agent mode
- `http`: HTTP call (see HTTP step)
HTTP step
Supports all methods (GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS), configurable content types for both request and response, file-based bodies, and binary response handling.
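Request fields:
- `method`: default `GET`
- `body`: request body as a JSON template string
- `body_file`: read the request body from a file; takes precedence over `body`
- `content_type`: default `application/json` when `body` set; `application/octet-stream` when `body_file` set; sets the `Content-Type` header
- `accept`: default `application/json`; sets the `Accept` header
- `headers`: extra headers as a key/value map; values are template-rendered
Response decoding:
- `output_file` set → raw bytes written to disk; step output is `"written N bytes to <path>"`
- `*json*` → pretty-printed JSON
- `text/*`, `*yaml*`, `*csv*`, `*xml*`, `*html*` → raw UTF-8 string
- binary without `output_file` → actionable error suggesting `output_file`
For DD API paths (starting with `/`), auth headers are added automatically. External URLs are sent without auth.
Reusable step templates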
Common step patterns live in `~/.config/pup/runbooks/_templates/<name>.yaml`. Steps reference them with `template:`; step fields override template fields; `headers` are merged per-key.
Templates support all step fields including the new `content_type`, `accept`, `body_file`, and `output_file`. The `_templates/` directory and any `_`-prefixed files are excluded from `pup runbooks list`.
Control flow
- `{{ VAR }}` and `{{ VAR | default: "x" }}` template substitution in all string fields
- `on_failure: warn | confirm | fail` per step
- `when: always | on_success` to run cleanup steps after failure
- `optional: true` to swallow errors silently
- `capture: VAR_NAME` to pipe step stdout into a variable for later steps
- `poll: { interval, timeout, until }` with conditions: `empty`, `status == X`, `value < N`, `decreasing`
Reference runbooks
Three annotated examples in `docs/examples/runbooks/`:
- `deploy-service.yaml`: SLO check → incident gate → DD Workflow trigger → monitor poll → Slack notify
- `incident-triage.yaml`: fetch incident → search logs → check monitors → auto-mitigation workflow → shell diagnostics
- `maintenance-window.yaml`: create downtime (capture ID) → drain → metric poll → confirm → delete downtime
Not in this PoC
Platform support
`pup runbooks` is native-only, excluded from wasm builds via `#[cfg(not(target_arch = "wasm32"))]`. `pup workflows` uses the typed SDK and works on all targets.
Testing
All existing tests pass (`cargo test`, `cargo clippy -- -D warnings`, `cargo fmt --check`).
Discussion: #143
🤖 Generated with Claude Code