chore: conform to workspace-config standard (.beads + .automaker)#161
Conversation
promote: dev -> staging
promote: staging -> main
First tagged release. Contents of community-improvements project: M1 — Security Hardening (A2A bearer auth, audit redaction, origin verification) M2 — Memory On By Default (session persistence + load-on-start) M3 — Skill Loop (skill-v1 emission + SQLite FTS5 index + curator) Plus: .gitignore cleanup for .automaker-lock + .worktrees, docs coverage of security layer, skill-loop architecture, and new env vars. Manual bump because prepare-release.yml requires GH_PAT secret (not configured). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
promote: dev -> staging (bug fixes v0.2.1)
promote: staging -> main (bug fixes v0.2.1)
Bug fixes from v0.2.0 smoke testing: - Agent card now advertises bearer scheme when A2A_AUTH_TOKEN is set - Session memory persistence actually fires (moved from unreachable on_session_end to after_agent) - Test suite collects cleanly in fresh Docker env - MemoryMiddleware activates standalone (without knowledge_store) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Writes the deployed GitHub Pages URL back to the repo's `homepage` field so it renders in the About sidebar on the repo page. Co-authored-by: Automaker <automaker@localhost> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Elevates langgraph-config.yaml + SOUL.md into a typed form inside the Gradio sidebar so forks can iterate on model / temperature / tools / middleware / persona without a code edit + restart. Save rebuilds the compiled graph in place; in-flight turns finish on the prior graph. The model dropdown is populated from the connected gateway's `/v1/models` endpoint so forks always see what's actually available through the configured api_base + api_key, no hardcoded list to drift. graph/config_io.py is the new I/O layer: YAML round-trip preserves comments and unknown top-level sections (the shipped YAML's memory/skills blocks that the dataclass doesn't model), dual-location SOUL.md handling writes to both /sandbox/SOUL.md (runtime) and config/SOUL.md (source), and gateway model discovery returns a readable error string instead of raising when the endpoint is down. Also exposes GET/POST /api/config + GET /api/config/models for external control, and falls SOUL back to config/SOUL.md in graph/prompts.py so local dev without a /sandbox mount still picks up drawer edits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Turns the "fork a template and edit code" onboarding into a download-and-run flow. A fresh clone boots without any env vars, lands in a 4-step wizard (Connect / Identity / Tools / Profile), and writes out config + SOUL.md + a .setup-complete marker on Launch — the chat UI then appears on the same page, drawer pre-populated with the wizard's values. Key pieces: - Wizard UI in chat_ui.py: visibility-toggled wizard pane vs chat pane, populated from the live config so re-runs pre-fill. 4 ship-with presets in config/soul-presets/ (generic-assistant, research, coding, blank) power the persona dropdown. - Lazy graph init in server.py: no model required at boot. The chat endpoints return a friendly "setup required" message until the wizard completes. After wizard save, the marker is flipped BEFORE the graph reload so the rebuild actually compiles (this order matters — earlier iteration reloaded before marking complete and left _graph=None). - Identity/auth/runtime sections added to LangGraphConfig so the wizard-captured name, operator, A2A token, and autostart flag round-trip through the existing YAML infrastructure. agent_name() resolver prefers YAML identity.name → env → "protoagent" so the agent card + OpenAI-compat model id reflect the wizard value without a process restart. - autostart.py: macOS LaunchAgent install/uninstall with Linux/Windows stubs. Captures sys.executable at install time so venv-based runs survive a reboot. Opt-in via wizard checkbox; toggle from drawer anytime. - Dockerfile: config volume declared so setup persistence survives docker run without a -v flag. - Docs: first-agent.md rewritten for clone→pip→run→wizard flow; old fork/sed/docker content moved to new customize-and-deploy.md guide. Tests: 29 passing (7 new — setup marker lifecycle, preset discovery, preset content shape, shipped starter presence). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without this, a browser refresh after the marker was written externally (POST /api/config/setup, or /api/config/reset-setup from another tab) kept Gradio serving its initial visibility snapshot — wizard visible even though setup is done, or vice versa. app.load runs per page visit so visibility tracks is_setup_complete() live. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
promote: dev → staging (WAF UA fix)
Cloudflare's managed WAF in front of api.proto-labs.ai (and likely other gateways behind a default WAF config) blocks the OpenAI Python SDK's `OpenAI/Python <ver>` User-Agent with a 403 "Your request was blocked". /v1/models went through fine because the gateway's model-list handler doesn't gate on UA the same way — only /v1/chat/completions 403'd, which made this look like a key or model-alias problem rather than what it actually was. tools/lg_tools.py already sets a custom UA on its outbound httpx fetches for exactly this reason; graph/llm.py had no equivalent, so ChatOpenAI fell back to the SDK default. Threading the same identifier through default_headers makes every protoAgent egress present a consistent allowlisted UA. Verified: product-director wizard → chat turn → 200 OK from api.proto-labs.ai with the groq-llama-70b alias. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
promote: staging → main (WAF UA fix)
Critical — path traversal in preset loader (graph/config_io.py:
read_soul_preset):
Inputs like "../secret" escaped config/soul-presets/ and read
arbitrary .md files anywhere on disk. Resolve both the preset
root and the candidate path and require the latter live inside
the former before reading. 7 parametrized tests cover the
malicious inputs I could think of (../, ../../, absolute paths,
bare "..", mid-path ../../).
Major — YAML auth.token was non-functional for A2A bearer:
register_a2a_routes captured _a2a_token at register time, so
wizard-set tokens were ignored until process restart. Promoted
_a2a_token to a module-level mutable holder (_A2A_TOKEN: list)
that the closure reads on every request, added set_a2a_token()
as the public mutator, and a new auth_token= arg to
register_a2a_routes as the seed source (env still the fallback).
server.py's reload path now calls set_a2a_token on every YAML
change so the wizard → live bearer enforcement flow works with
no restart — verified: fresh boot open → wizard token set →
401 on wrong token / 200 on right → drawer clears token → open
again.
Major — plist XML injection in autostart.py:
Agent names containing <, >, & produced malformed plists (and
could theoretically inject nodes). xml.sax.saxutils.escape()
every interpolated string field before embedding.
Major — install_autostart defaulted to port 7870 regardless of
--port flag (autostart.py / server.py):
Captured the active port in a module-level _active_port at
_main() time and threaded it through both finish_setup's
autostart sync and the drawer's toggle_autostart callback. The
generated LaunchAgent now reboots on whatever port the operator
launched with.
Minor — chat_ui polish:
* Numeric fields (max_tokens, max_iterations, worker_max_turns)
fall back to sensible defaults (4096/50/20) instead of 0 when
cleared — validate_config_dict rejects zero, so "or 0" blocked
legitimate saves with a confusing validation error.
* _sync_visibility no longer aliases the sidebar output slot to
wizard_pane when the sidebar is absent; split into two closures
with matching output arities so Gradio doesn't receive duplicate
updates to the same component.
* Legacy load_provider_choices handler guards get_current_provider
existence — KeyError risk when a fork provides get_provider_choices
alone.
Nitpicks:
* Remove unused _FIELD_MAP from config_io.py.
* ASCII hyphen (U+002D) instead of en-dash (U+2013) in the
temperature validation error.
* Pin ruamel.yaml>=0.18 in Dockerfile to match requirements.txt.
* Document the VOLUME anonymous-volume lifecycle and named-volume
recommendation in the Dockerfile comment.
Not addressed (deliberate):
* CodeRabbit flagged test_list_gateway_models_http_error as
expecting httpx.ConnectError to be caught by except httpx.HTTPError
— false positive, ConnectError → NetworkError → TransportError →
RequestError → HTTPError, test already passes.
* "Reuse config_io.read_soul() in graph/prompts.py" — kept the
inline check to avoid introducing an import dependency from
prompts.py (loaded early, widely used) into config_io.py.
* "Use tuple form for @pytest.parametrize" — stylistic; comma-
separated string works identically.
Test surface: 36 passing (7 new — the path-traversal parametrize set).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Major — atomic graph reload (server.py::_reload_langgraph_agent):
Previously swapped _graph_config + set_a2a_token BEFORE calling
create_agent_graph, so a failed build would leave the running
_graph pinned to the OLD agent but reporting the NEW config and
rotated bearer token. Now build first; commit config/auth/graph
state only on success.
Major — rollback marker on failed first-run reload (finish_setup):
mark_setup_complete fires before the reload so the graph compiles.
If the reload fails, the marker stays and the next page load
drops the user into chat with _graph=None and no obvious recovery
path. finish_setup now reset_setup()s on reload failure, so the
wizard returns for a retry.
Major — sanitize agent_name for plist path (autostart.py::_macos_label):
Prior sanitization only lowercased + replaced spaces. `/` and `..`
survived, so an agent name with path-traversal chars could target
arbitrary paths relative to ~/Library/LaunchAgents/ on install /
status / uninstall. Strip input to [a-z0-9_.-] and fall back to
"protoagent" when the result is empty/dots-only. Verified resolved
plist path stays inside LaunchAgents/ for every path-traversal
payload I could think of.
Major — gateway api_key off query string (POST /api/config/models):
GET with ?api_key=... leaks credentials into browser history,
reverse-proxy access logs, and uvicorn's own request log. Switched
to POST taking a small ModelsProbeRequest body. Empty body still
falls back to stored config so the drawer's initial render works.
Major — round-trip identity/security/autostart through the drawer
(chat_ui.py + server.py):
Drawer previously only edited model/worker/middleware/knowledge/
SOUL, leaving the wizard's agent name, operator, bearer token,
and autostart flag with no post-setup edit path. Added three new
accordion sections (Identity / Security / Autostart), wired them
into _config_components, _load_all, and _save.
Extracted _sync_autostart_with_config so the wizard's finish_setup
and the drawer's save_all both drive the LaunchAgent install/
uninstall from the same code path — flipping the drawer's
Autostart checkbox + Save & Reload now installs/removes the plist
the same way the wizard does.
Verified end-to-end on product-director:
* Fresh clone → wizard → "pd-renamed" via drawer → agent card
says pd-renamed, old bearer token → 401, new bearer token → 200
* Invalid temperature (99) → rejected at validation; YAML +
marker untouched
* Path traversal via /api/config/presets/../secret → 404
Tests: 36 passing (no new cases — existing coverage was already
sufficient for these fixes).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(ui): first-run setup wizard + live-edit config drawer
The template now ships a working memory loop and end-to-end eval suite on day one so forks have a green baseline before they touch a single line of code. Closes #154. What lands: - knowledge/store.py — sqlite + FTS5 (LIKE fallback). One ``chunks`` table backs operator notes (memory_ingest), daily-log entries (daily_log), and conversation findings extracted by MemoryMiddleware (domain='finding'). Path resolves env > config > default with an automatic ~/.protoagent/ fallback when /sandbox isn't writable. - tools/lg_tools.py — five new memory tools (memory_ingest, memory_recall, memory_list, memory_stats, daily_log) bound to the store via a closure factory so tests get a fresh store per run. ``echo`` removed; ``get_all_tools(knowledge_store)`` actually uses its parameter now. - server.py — _build_knowledge_store() constructs the store and threads it through both initial init and the drawer reload path. Defaults flipped: knowledge_middleware + memory_middleware now ON by default (config/langgraph-config.yaml + graph/config.py). - evals/ — A2A client + runner + verify helpers + 15 starter cases (tasks.json) covering agent card discovery, bearer auth gating, abstention, every shipped tool, KB recall, a chained two-tool case, and KnowledgeMiddleware injection. Side-effect-verified: audit log + reply text + KB chunks all checked independently so hallucinated tool results get caught. - docs/guides/evals.md — full how-to. README/TEMPLATE/configuration/ starter-tools/first-agent updated to reflect the new defaults and the additional five memory tools. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bugs: - evals/runner.py: teardown now runs in a finally block so seeded KB rows get cleaned up even when the verifier or client.ask() raises. expected_tools=[] now means "assert no tools fired" (was conflated with "no key" via the `or []` short-circuit, making the abstention case a no-op). - evals/runner.py + tasks.json: added a `stream` runner kind so AgentClient.stream() is reachable from tasks.json — new streaming_status_updates case asserts the SSE event sequence. - knowledge/store.py: PRAGMA journal_mode=WAL is now best-effort (read-only DBs no longer break _connect). FTS5 rebuild after schema install so an existing chunks table populated before FTS was added gets indexed. find_chunk_containing/delete_by_content reject empty/whitespace-only inputs to prevent LIKE '%%' wildcards from matching every row. Hardening: - tools/lg_tools.py: clamp memory_recall(k) to [1, 20] and memory_list(limit) to [1, 200] so the agent can't request arbitrarily large slices of the KB. Doc cleanup: - docs/guides/subagents.md: LangGraphConfig snippet had a stale "echo" reference; replaced with the new memory-tool list. - docs/tutorials/first-tool.md: WORKER_CONFIG example now appends git_sha alongside the bundled defaults instead of replacing them and dropping the memory tools. - docs/reference/starter-tools.md: "adding your own" snippet now preserves the conditional _build_memory_tools(knowledge_store) extension. - tests/test_config_io.py: starter-tool contract assertion now also covers web_search. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- evals/runner.py: collapse redundant nested teardown guard into a single `if "teardown" in case:` (SIM102); remove now-unused `setup_applied` flag - knowledge/store.py: use `datetime.UTC` alias (Python 3.11+, UP017) - tools/lg_tools.py: add `-> list` return annotation to `_build_memory_tools` (ANN202); replace explicit loop with list comprehension in `memory_recall` (PERF401) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> https://claude.ai/code/session_01148o8ppbuQwuZBsVGTQWwQ
Real bugs: - evals/runner.py: setup is now inside the try block so a partial setup failure (e.g. step 2 of 3 errors) still triggers the finally teardown — rows from steps that did succeed no longer leak into the next case. Was flagged as a duplicate from round 2. - knowledge/store.py: LIKE patterns now escape % and _ via ESCAPE '\' on every clause that takes user input (find_chunk_containing, delete_by_content, _search_like). A query for "100%" or "hello_world" no longer silently matches every row containing "100" or any single character between "hello" and "world". - knowledge/store.py: FTS5 MATCH tokens are now double-quoted via _fts_quote() so user-supplied query terms can't smuggle FTS5 operators (column filters, prefix wildcards, NEAR, AND/OR/NOT). Defence in depth — the [\w']+ tokenizer already filters most special chars. Hardening: - evals/runner.py: the fixed 0.3s asyncio.sleep waiting for the audit log to flush is gone. _await_audit_assertion now polls every 50ms up to a 2s deadline and returns as soon as the assertion passes — exits early on success, only burns the full deadline when the tool genuinely never fired. - evals/runner.py: _run_auth_check accepts case["headers"] so cases can override the default bearer-only header set and exercise X-API-Key auth scenarios (or both auths together). - knowledge/store.py: per-method exception handlers broadened from sqlite3.OperationalError to sqlite3.DatabaseError. Catches IntegrityError, ProgrammingError, and corruption variants too without crashing the agent loop. _has_fts5 (probe) and _connect (connection-time errors only) keep the narrower OperationalError. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- evals/runner.py: use asyncio.get_running_loop() instead of the deprecated get_event_loop() inside the _await_audit_assertion coroutine - evals/runner.py: prefix unused _entries return value with underscore - evals/runner.py: use datetime.UTC alias (consistent with store.py), drop now-unused timezone import - knowledge/store.py: broaden _get_db exception catch from OperationalError to DatabaseError so corrupt-DB errors are swallowed per the module's no-crash contract - knowledge/store.py: replace log.error with log.exception in all three DatabaseError handlers (schema init, _get_db, add_chunk) so tracebacks appear in error logs Co-Authored-By: Claude <claude@anthropic.com> https://claude.ai/code/session_01YW5U6mtpLy4rzKmqd4trkH
…ness feat: ship default knowledge store + eval harness for forks
Adds a default scheduler so agents can defer work to themselves —
"remind me tomorrow", recurring sweeps, deadline check-ins. Three
new tools land in get_all_tools() when a backend is wired up:
schedule_task, list_schedules, cancel_schedule.
Two backends ship behind a single SchedulerBackend protocol:
- LocalScheduler (default): sqlite + asyncio polling. Per-agent
jobs.db at /sandbox/scheduler/<agent_name>/ with a
~/.protoagent/scheduler/<agent_name>/ fallback. Fires by POSTing
message/send to the running agent's own /a2a endpoint, going
through bearer + X-API-Key auth like a real caller (audit log +
cost-v1 capture work the same). Cron expressions reschedule via
croniter; ISO datetimes are one-shot. Missed-fire recovery:
within 24h fires immediately, older fires roll forward without
firing.
- WorkstaceanScheduler: HTTP adapter to a Workstacean install's
POST /publish. Activated automatically when
WORKSTACEAN_API_BASE and WORKSTACEAN_API_KEY env vars are set.
Topic and job IDs are namespaced cron.<agent_name>.<job_id>
so a single Workstacean can serve N protoAgent forks safely.
Multi-agent isolation is the headline architectural property —
spinning up gina-personal alongside gina-work on the same box (or
sharing one Workstacean) won't cross-fire scheduled prompts.
Verified with explicit tests in test_scheduler_local.py.
Wiring:
- scheduler/{__init__,interface,local,workstacean}.py — module
- tools/lg_tools.py — _build_scheduler_tools factory; get_all_tools
takes a new optional scheduler= kwarg
- graph/agent.py — create_agent_graph and create_simple_agent
accept scheduler=
- server.py — _build_scheduler() picks backend at boot,
on_event("startup"/"shutdown") drives the polling task lifecycle,
reload path reuses the running scheduler instance
- config/langgraph-config.yaml + graph/{config,subagents/config}.py
— worker subagent gets the three new tools in its allowlist
- requirements.txt — croniter>=2.0
Tests: 48 new (test_scheduler_local.py covers add/list/cancel,
multi-agent isolation, reschedule-vs-delete, missed-fire recovery,
and an end-to-end fire path with httpx mocked; test_scheduler_workstacean.py
covers all the publish payload assertions, namespacing,
custom topic prefix, and HTTP error handling).
Docs: docs/guides/scheduler.md (Diataxis how-to with the firing
model, multi-agent story, env reference, and notes on the
Workstacean A2A-bridge gap), plus index/configuration/README/
TEMPLATE updates.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- docs/guides/scheduler.md: replace jargon "multiple ginas" with "multiple agents" - scheduler/__init__.py: sort __all__ lexicographically (RUF022) - scheduler/local.py: log.error → log.exception in _init_db to preserve traceback (TRY400) - scheduler/workstacean.py: correct stale module docstring that claimed list_jobs() issues a list command — it returns [] unconditionally - server.py: add -> None return annotations to _scheduler_startup/_scheduler_shutdown (ANN202) - tests: add match= to two bare pytest.raises(ValueError) calls (PT011) - tools/lg_tools.py: wrap blocking scheduler calls in asyncio.to_thread() to avoid blocking the event loop under concurrent load; fix cancel_schedule error message to not conflate transport/DB failures with "no such job" Co-Authored-By: claude-code <claude@anthropic.com> https://claude.ai/code/session_01JmFYJSYRMRndZ43g3AYW2q
Real bugs: - scheduler/local.py: _fire() now returns bool (True on 2xx, False on HTTP error or network exception). _tick() only reschedules / deletes when _fire() succeeds, so a transient failure leaves the job in place for the next tick to retry. Previously a one-shot job hit by a 5xx silently vanished. - server.py: the API key env var name now uses AGENT_NAME_ENV.upper() to match the auth handler at L893. The previous code read agent_name() (which returns the wizard-set identity.name when set), so a wizard rename pointed the scheduler at <RENAMED>_API_KEY while the auth handler still expected <ENV>_API_KEY → self-invocation 401'd silently after every wizard rename. - server.py: reload path now constructs a scheduler when _scheduler is None (first-run case: server boots pre-setup, wizard finishes, drawer triggers reload — this is when we *first* construct the scheduler). Existing instances are still reused — drawer saves don't tear down the polling loop. Surface: - tools/lg_tools.py: exported SCHEDULER_TOOL_NAMES and MEMORY_TOOL_NAMES as module constants. - graph/config_io.py::list_available_tools: now exposes scheduler + memory tool names to the wizard's checkbox group even when the runtime hasn't yet constructed the underlying backends. Otherwise the wizard would hide tools that the runtime registers as soon as the user finishes setup. Declined: - scheduler/local.py L141-149: CodeRabbit asked to re-raise sqlite3.DatabaseError from _init_db. The store is intentionally fail-soft (matches knowledge/store.py + audit.py): _resolve_db_path already falls back to ~/.protoagent/scheduler/ when /sandbox is unwritable, and re-raising would crash boot in exactly the scenario the fallback is designed to handle. The graceful degradation contract is "scheduler tools return errors when storage is broken, agent keeps serving everything else". Tests: - tests/test_scheduler_local.py: new test_fire_failure_leaves_job_in_place regression guard + test_fire_returns_bool contract test. - tests/test_config_io.py: list_available_tools assertions now check for memory + scheduler tools and no duplicates. 86 scheduler-scope tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…h knowledge/memory)
Scheduler had asymmetric opt-out — only env-based (SCHEDULER_DISABLED=1).
The knowledge and memory subsystems already exposed YAML toggles
(middleware.knowledge, middleware.memory) so forks could flip them
through the drawer or wizard. Scheduler now matches:
- LangGraphConfig.scheduler_enabled: bool = True (default-on)
- from_yaml() reads middleware.scheduler
- config_to_dict() emits it for the drawer round-trip
- config/langgraph-config.yaml has middleware.scheduler: true
- server.py::_build_scheduler honors the YAML toggle first, env second
Both subsystems are now genuinely opt-out:
middleware:
knowledge: true # was already so
memory: true # was already so
scheduler: true # NEW — was env-only
audit: true
Drawer/wizard can flip any of them without restart (the existing
reload path already rebuilds on config change). The env opt-out
(SCHEDULER_DISABLED=1) stays as a runtime escape hatch for fleet
operators who can't edit YAML in the moment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bugs:
- scheduler/local.py: stop() now suppresses only asyncio.CancelledError
(the expected outcome of cancelling our own task) and logs any
other exception via log.exception. Previously every exception
was silently swallowed, so a polling-loop crash during shutdown
would vanish without a trace.
- server.py: reload path now honors the new middleware.scheduler
toggle. Three states:
- flipped OFF (was on) → stop + drop the running scheduler;
new graph builds with scheduler=None.
- flipped ON (was off / first run) → construct + start.
- unchanged → reuse the running instance.
Helpers _start_scheduler_async / _stop_scheduler_async fire
start()/stop() onto the active loop without forcing the entire
reload chain to become async.
Type / nits:
- server.py: added `-> "SchedulerBackend | None"` return type to
_build_scheduler, with a TYPE_CHECKING import to avoid runtime
cycles.
- tests/test_scheduler_local.py: raw-string regex for `|`
alternation (test_malformed_raises); added match= to the two
bare ValueError tests (test_empty_prompt_rejected,
test_malformed_schedule_rejected) so they only pass for the
intended error message.
- tests/test_config_io.py: assert list_schedules in names alongside
schedule_task / cancel_schedule.
- docs/reference/configuration.md: clarified the scheduler opt-out
description — middleware.scheduler is canonical, SCHEDULER_DISABLED
is a runtime escape hatch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bugs:
- scheduler/local.py::_fire(): metadata moved from
params.message.metadata to params.metadata. The A2A handler
reads custom metadata from params.metadata
(a2a_handler.py L1244 — `msg_metadata = params.get("metadata")`),
so the previous nesting silently dropped the
scheduler_job_id / scheduler_kind breadcrumb. Observers now get
it as intended.
- server.py reload path: scheduler swap is now planned before the
graph rebuild and only committed after rebuild succeeds. A
failed graph rebuild used to leave the scheduler torn down or a
fresh one already started, dis-aligning runtime state. The new
ordering: build candidate, rebuild graph (rollback-safe on
failure), commit graph + scheduler atomically.
- scheduler/local.py: _resolve_db_path now sanitizes agent_name
via a new _safe_segment() helper. Strips path separators, ``..``,
and absolute-path prefixes; falls back to "default" when nothing
usable remains. Defence in depth — the value comes from operator-
controlled env / YAML, but a typo or copy-paste shouldn't be able
to put a sqlite file outside the configured scheduler dir.
Tests:
- tests/test_scheduler_local.py::test_cron_rescheduled_after_fire:
pinned to a fixed fired_at timestamp so the assertion is exact
(next_fire == "2026-04-29T09:00:00+00:00") instead of a
"different from original" near-tautology that depends on
datetime.now().
Docs:
- docs/reference/configuration.md: clarified that the scheduler's
enable/disable lives in YAML (middleware.scheduler), while
backend selection and runtime knobs are env-driven. Repositioned
SCHEDULER_DISABLED as the runtime escape hatch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat: pluggable scheduler (local sqlite + Workstacean adapter)
After PRs #155 (default KB store + memory tools) and #156 (default scheduler), the docs claimed nine tools, missed scheduler tools entirely in the reference, and skipped scheduler env vars. This pass syncs every stale claim flagged by the audit. Updates: - docs/reference/starter-tools.md - Corrected count: nine → twelve - New tool sections: schedule_task, list_schedules, cancel_schedule (signatures, output formats, multi-agent isolation notes) - "adding your own" snippet now threads scheduler= through get_all_tools alongside knowledge_store= - Related links include the scheduler guide - docs/reference/environment-variables.md - New "Knowledge store" section: KNOWLEDGE_DB_PATH override + the ~/.protoagent fallback - New "Audit log" section: AUDIT_PATH (used by evals/verify.py) - New "Scheduler" section: WORKSTACEAN_API_BASE/KEY/TOPIC_PREFIX, SCHEDULER_DB_DIR/INVOKE_URL/DISABLED, plus the protoLabs operators callout pointing at the ava node + secrets manager for the actual key - docs/tutorials/first-agent.md - Wizard description now mentions all twelve tools and the four middleware toggles (added Scheduler alongside Audit/Memory/ Knowledge) - docs/tutorials/first-tool.md - "Where to go next" link copy: five → twelve - docs/guides/fork-the-template.md - Tool list paragraph corrected to all twelve, with the binding-by-backend split called out - docs/guides/customize-and-deploy.md - "Add domain tools" section now mentions memory + scheduler tool binding and the middleware.* toggles for opt-out - README.md - Starter tools row now lists all twelve, grouped 4+5+3 with backend bindings shown Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…line) Scaffolds .beads/issues.jsonl + .automaker/settings.json + standard .gitignore (narrowing any blanket ignore) via release-tools init-workspace-config. Any remaining runner-rule errors need per-job migration to namespace-profile-protolabs-linux.
mabry1985
left a comment
There was a problem hiding this comment.
QA Audit — PR #161 | chore: conform to workspace-config standard (.beads + .automaker)
VERDICT: PASS
CI Status
- No CI checks configured for this PR (acceptable for config-only scaffold changes)
Diff Review
- Adds
.automaker/settings.json— minimal{"version": 1}baseline conforming to workspace-config standard. - Updates
.gitignore— replaces blanket/.automaker/ignore with 4 targeted entries. Correct trade-off: runtime subdirs (features/, checkpoints/, trajectory/, beads.db) remain ignored while the settings config can be committed.
Observations
- LOW: No CI checks on PR. Acceptable for infrastructure/config-only changes; no behavior change to test.
- LOW:
release-tools#18tracks runner-rule errors separately — file-level verification errors are cleared here.
Safety rail — Quinn cannot self-approve. This PR appears to be from the same deployment context as Quinn. Formal review is COMMENTED (PASS) rather than APPROVED. Please have a human reviewer confirm and merge manually.
Checks: 1
Passed: 1 (diff review)
Failed: 0
Gaps: 0
— Quinn, QA Engineer
|
Submitted COMMENT review on |
Migrate plain-Linux jobs to namespace-profile-protolabs-linux; annotate jobs that genuinely require GitHub-hosted infra. MIGRATED: - docs.yml build (vitepress docs build + upload-pages-artifact) - prepare-release.yml prepare (python version bump + gh PR ops) ANNOTATED (allow-hosted-runner): - docker-publish.yml build-and-push — docker buildx + registry push - docs.yml deploy — GitHub Pages deploy needs hosted Pages environment - release.yml release — docker buildx + registry push (+ OIDC attestation) Also finishes the .beads scaffold that the conform commit referenced but never landed: narrow the blanket *.jsonl ignore so .beads/issues.jsonl is committed (workspace-config requires the git-friendly export tracked). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mabry1985
left a comment
There was a problem hiding this comment.
QA Audit — PR #161 | chore: conform to workspace-config standard (.beads + .automaker)
VERDICT: PASS (filed as COMMENT — GitHub rejected APPROVE as self-review; human approval required to merge)
CI Status
- No checks reported (head
ccd6656) — expected: chore touches.automaker/,.beads/,.gitignore, and workflow YAML; existing workflows are path-filtered or tag/push-triggered and don't fire on this diff.
Diff Review
- Scaffolds baseline: new
.automaker/settings.json({"version":1}) and empty.beads/issues.jsonlplaceholder. .gitignorecorrectly narrowed: removes blanket.automaker/ignore, adds targeted.automaker/{features,checkpoints,trajectory}/+.beads/beads.db, preserves!.beads/issues.jsonlnegation so the git-friendly export stays committable.- Workflow runner standardization:
docs.ymlbuildandprepare-release.ymlprepareswitch tonamespace-profile-protolabs-linux; jobs that legitimately need GitHub-hosted runners (docker registry push indocker-publish.yml/release.yml, GitHub Pages deploy indocs.yml) keepubuntu-latestwith explicit# workspace-config: allow-hosted-runnerannotations explaining why. - Remainder is quote-style normalization (single→double) — cosmetic, no behavior change.
Observations
- BLOCKING (process): GitHub rejected automated APPROVE — PR author matches reviewer identity. Needs a human approver before merge.
- LOW:
!.beads/issues.jsonlnegation references a blanket*.jsonlrule not visible in the diff context; if no such rule exists elsewhere it's harmless/defensive. - LOW: Runner switch assumes
namespace-profile-protolabs-linuxis provisioned for this repo — consistent with workspace-config standard but unverified from the diff alone. - LOW: No CodeRabbit findings; clawpatch skipped (protoAgent not in mounted set; diff is config-only).
- Runner-rule errors deferred to release-tools#18 per PR description — appropriate scope split.
— Quinn, QA Engineer
Conform to workspace-config standard
Brings protoAgent into fleet workspace-config conformance:
.beads+.automakerbaseline and GitHub Actions runner migration to the org-owned runner (namespace-profile-protolabs-linux).Runner migration
Every job runs on the org-owned runner unless it genuinely needs GitHub-hosted infra (annotated with
# workspace-config: allow-hosted-runner <reason>).docker-publish.yml→build-and-pushdocs.yml→builddocs.yml→deployprepare-release.yml→preparerelease.yml→release.beads scaffold fix
The prior conform commit's message referenced scaffolding
.beads/issues.jsonl, but it was never committed — the blanket*.jsonlignore swallowed it. This PR narrows that ignore with!.beads/issues.jsonland commits the empty export, so the git-friendly issue store is tracked per the standard.Verify gate
verify-workspace-config --root .→ conformant, 0 errors (3 recognized hosted-runner exceptions).🤖 Generated with Claude Code