chore: conform to workspace-config standard (.beads + .automaker) #161

Merged
mabry1985 merged 34 commits into dev from chore/conform-workspace-config
May 25, 2026

@mabry1985 mabry1985 commented May 25, 2026

Conform to workspace-config standard

Brings protoAgent into fleet workspace-config conformance: .beads + .automaker baseline and GitHub Actions runner migration to the org-owned runner (namespace-profile-protolabs-linux).

Runner migration

Every job runs on the org-owned runner unless it genuinely needs GitHub-hosted infra (annotated with # workspace-config: allow-hosted-runner <reason>).

| Workflow / Job | Decision | Reason |
| --- | --- | --- |
| docker-publish.yml / build-and-push | ANNOTATED | docker buildx build + registry push |
| docs.yml / build | MIGRATED | plain npm vitepress build + upload-pages-artifact |
| docs.yml / deploy | ANNOTATED | GitHub Pages deploy requires the hosted Pages environment |
| prepare-release.yml / prepare | MIGRATED | python version bump + git/gh PR ops, no Docker |
| release.yml / release | ANNOTATED | docker buildx build + registry push (also OIDC build provenance/attestation) |

.beads scaffold fix

The prior conform commit's message referenced scaffolding .beads/issues.jsonl, but it was never committed — the blanket *.jsonl ignore swallowed it. This PR narrows that ignore with !.beads/issues.jsonl and commits the empty export, so the git-friendly issue store is tracked per the standard.

Verify gate

verify-workspace-config --root . → conformant, 0 errors (3 recognized hosted-runner exceptions).

🤖 Generated with Claude Code

mabry1985 and others added 30 commits April 19, 2026 14:30
First tagged release. Contents of community-improvements project:

M1 — Security Hardening (A2A bearer auth, audit redaction, origin verification)
M2 — Memory On By Default (session persistence + load-on-start)
M3 — Skill Loop (skill-v1 emission + SQLite FTS5 index + curator)

Plus: .gitignore cleanup for .automaker-lock + .worktrees, docs coverage of
security layer, skill-loop architecture, and new env vars.

Manual bump because prepare-release.yml requires GH_PAT secret (not configured).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
promote: dev -> staging (bug fixes v0.2.1)
promote: staging -> main (bug fixes v0.2.1)
Bug fixes from v0.2.0 smoke testing:
- Agent card now advertises bearer scheme when A2A_AUTH_TOKEN is set
- Session memory persistence actually fires (moved from unreachable on_session_end to after_agent)
- Test suite collects cleanly in fresh Docker env
- MemoryMiddleware activates standalone (without knowledge_store)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Writes the deployed GitHub Pages URL back to the repo's `homepage`
field so it renders in the About sidebar on the repo page.

Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Elevates langgraph-config.yaml + SOUL.md into a typed form inside the
Gradio sidebar so forks can iterate on model / temperature / tools /
middleware / persona without a code edit + restart. Save rebuilds the
compiled graph in place; in-flight turns finish on the prior graph.

The model dropdown is populated from the connected gateway's
`/v1/models` endpoint so forks always see what's actually available
through the configured api_base + api_key, no hardcoded list to drift.

graph/config_io.py is the new I/O layer: YAML round-trip preserves
comments and unknown top-level sections (the shipped YAML's
memory/skills blocks that the dataclass doesn't model), dual-location
SOUL.md handling writes to both /sandbox/SOUL.md (runtime) and
config/SOUL.md (source), and gateway model discovery returns a
readable error string instead of raising when the endpoint is down.

Also exposes GET/POST /api/config + GET /api/config/models for
external control, and falls SOUL back to config/SOUL.md in
graph/prompts.py so local dev without a /sandbox mount still picks
up drawer edits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Turns the "fork a template and edit code" onboarding into a
download-and-run flow. A fresh clone boots without any env vars,
lands in a 4-step wizard (Connect / Identity / Tools / Profile),
and writes out config + SOUL.md + a .setup-complete marker on
Launch — the chat UI then appears on the same page, drawer
pre-populated with the wizard's values.

Key pieces:

- Wizard UI in chat_ui.py: visibility-toggled wizard pane vs
  chat pane, populated from the live config so re-runs pre-fill.
  4 ship-with presets in config/soul-presets/ (generic-assistant,
  research, coding, blank) power the persona dropdown.
- Lazy graph init in server.py: no model required at boot. The
  chat endpoints return a friendly "setup required" message
  until the wizard completes. After wizard save, the marker is
  flipped BEFORE the graph reload so the rebuild actually
  compiles (this order matters: an earlier iteration reloaded
  before marking complete and left _graph=None).
- Identity/auth/runtime sections added to LangGraphConfig so
  the wizard-captured name, operator, A2A token, and autostart
  flag round-trip through the existing YAML infrastructure.
  agent_name() resolver prefers YAML identity.name → env →
  "protoagent" so the agent card + OpenAI-compat model id
  reflect the wizard value without a process restart.
- autostart.py: macOS LaunchAgent install/uninstall with
  Linux/Windows stubs. Captures sys.executable at install time
  so venv-based runs survive a reboot. Opt-in via wizard
  checkbox; toggle from drawer anytime.
- Dockerfile: config volume declared so setup persistence
  survives docker run without a -v flag.
- Docs: first-agent.md rewritten for clone→pip→run→wizard
  flow; old fork/sed/docker content moved to new
  customize-and-deploy.md guide.
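
The agent_name() precedence described in the list above (YAML identity.name, then env, then "protoagent") could look roughly like the sketch below. The function name, the config dict shape, and the AGENT_NAME variable name are assumptions made for illustration, not the project's actual code.

```python
import os

DEFAULT_AGENT_NAME = "protoagent"


def resolve_agent_name(config: dict, env_var: str = "AGENT_NAME") -> str:
    """Return the agent name: YAML identity.name, then env, then the default."""
    # Hypothetical config shape: {"identity": {"name": "..."}}
    yaml_name = (config.get("identity") or {}).get("name") or ""
    if yaml_name.strip():
        return yaml_name.strip()
    # env_var is an assumed name; the real variable may differ.
    env_name = os.environ.get(env_var, "").strip()
    return env_name or DEFAULT_AGENT_NAME
```

Because the lookup runs on every call, a value saved by the wizard would be picked up without a process restart.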

Tests: 29 passing (7 new — setup marker lifecycle, preset
discovery, preset content shape, shipped starter presence).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without this, a browser refresh after the marker was written
externally (POST /api/config/setup, or /api/config/reset-setup
from another tab) kept Gradio serving its initial visibility
snapshot — wizard visible even though setup is done, or vice
versa. app.load runs per page visit so visibility tracks
is_setup_complete() live.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
promote: dev → staging (WAF UA fix)
Cloudflare's managed WAF in front of api.proto-labs.ai (and likely
other gateways behind a default WAF config) blocks the OpenAI
Python SDK's `OpenAI/Python <ver>` User-Agent with a 403 "Your
request was blocked". /v1/models went through fine because the
gateway's model-list handler doesn't gate on UA the same way —
only /v1/chat/completions 403'd, which made this look like a key
or model-alias problem rather than what it actually was.

tools/lg_tools.py already sets a custom UA on its outbound httpx
fetches for exactly this reason; graph/llm.py had no equivalent,
so ChatOpenAI fell back to the SDK default. Threading the same
identifier through default_headers makes every protoAgent egress
present a consistent allowlisted UA.

Verified: product-director wizard → chat turn → 200 OK from
api.proto-labs.ai with the groq-llama-70b alias.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
promote: staging → main (WAF UA fix)
Critical — path traversal in preset loader (graph/config_io.py:
read_soul_preset):
  Inputs like "../secret" escaped config/soul-presets/ and read
  arbitrary .md files anywhere on disk. Resolve both the preset
  root and the candidate path and require the latter live inside
  the former before reading. 7 parametrized tests cover the
  malicious inputs I could think of (../, ../../, absolute paths,
  bare "..", mid-path ../../).
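
  A minimal sketch of the containment check described above, assuming presets are stored as <name>.md files; read_preset and its signature are illustrative and are not the real read_soul_preset:

```python
from pathlib import Path


def read_preset(preset_root: Path, name: str) -> str:
    """Read <preset_root>/<name>.md, refusing names that escape the root."""
    root = preset_root.resolve()
    candidate = (root / f"{name}.md").resolve()
    # is_relative_to (Python 3.9+) is True only when candidate lives under root.
    if not candidate.is_relative_to(root):
        raise ValueError(f"preset name escapes preset directory: {name!r}")
    return candidate.read_text(encoding="utf-8")
```

  Resolving both paths before comparing matters: a purely textual prefix check would be fooled by ".." segments or symlinks.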

Major — YAML auth.token was non-functional for A2A bearer:
  register_a2a_routes captured _a2a_token at register time, so
  wizard-set tokens were ignored until process restart. Promoted
  _a2a_token to a module-level mutable holder (_A2A_TOKEN: list)
  that the closure reads on every request, added set_a2a_token()
  as the public mutator, and a new auth_token= arg to
  register_a2a_routes as the seed source (env still the fallback).
  server.py's reload path now calls set_a2a_token on every YAML
  change so the wizard → live bearer enforcement flow works with
  no restart — verified: fresh boot open → wizard token set →
  401 on wrong token / 200 on right → drawer clears token → open
  again.
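
  The holder pattern described above can be sketched as follows. Only _A2A_TOKEN and set_a2a_token are names from this commit; is_authorized and the header parsing are assumptions standing in for the real request closure.

```python
import hmac
from typing import Optional

# Module-level holder: the checker reads it on every call,
# so updates take effect without re-registering routes.
_A2A_TOKEN: list = [""]


def set_a2a_token(token: Optional[str]) -> None:
    """Public mutator; an empty or None token disables enforcement."""
    _A2A_TOKEN[0] = token or ""


def is_authorized(authorization_header: Optional[str]) -> bool:
    """Hypothetical per-request check against the current token."""
    expected = _A2A_TOKEN[0]
    if not expected:
        return True  # no token configured: endpoint is open
    presented = (authorization_header or "").removeprefix("Bearer ")
    # Constant-time comparison avoids leaking the token via timing.
    return hmac.compare_digest(presented.encode(), expected.encode())
```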

Major — plist XML injection in autostart.py:
  Agent names containing <, >, & produced malformed plists (and
  could theoretically inject nodes). xml.sax.saxutils.escape()
  every interpolated string field before embedding.
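
  For illustration, escaping a single plist string field with the standard library looks like this; plist_string is a hypothetical helper, not a function from autostart.py:

```python
from xml.sax.saxutils import escape


def plist_string(value: str) -> str:
    """Render one <string> element with &, <, and > escaped."""
    return f"<string>{escape(value)}</string>"


print(plist_string("R&D <bot>"))  # <string>R&amp;D &lt;bot&gt;</string>
```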

Major — install_autostart defaulted to port 7870 regardless of
--port flag (autostart.py / server.py):
  Captured the active port in a module-level _active_port at
  _main() time and threaded it through both finish_setup's
  autostart sync and the drawer's toggle_autostart callback. The
  generated LaunchAgent now reboots on whatever port the operator
  launched with.

Minor — chat_ui polish:
  * Numeric fields (max_tokens, max_iterations, worker_max_turns)
    fall back to sensible defaults (4096/50/20) instead of 0 when
    cleared — validate_config_dict rejects zero, so "or 0" blocked
    legitimate saves with a confusing validation error.
  * _sync_visibility no longer aliases the sidebar output slot to
    wizard_pane when the sidebar is absent; split into two closures
    with matching output arities so Gradio doesn't receive duplicate
    updates to the same component.
  * Legacy load_provider_choices handler guards get_current_provider
    existence — KeyError risk when a fork provides get_provider_choices
    alone.

Nitpicks:
  * Remove unused _FIELD_MAP from config_io.py.
  * ASCII hyphen (U+002D) instead of en-dash (U+2013) in the
    temperature validation error.
  * Pin ruamel.yaml>=0.18 in Dockerfile to match requirements.txt.
  * Document the VOLUME anonymous-volume lifecycle and named-volume
    recommendation in the Dockerfile comment.

Not addressed (deliberate):
  * CodeRabbit flagged test_list_gateway_models_http_error as
    expecting httpx.ConnectError to be caught by except httpx.HTTPError
    — false positive, ConnectError → NetworkError → TransportError →
    RequestError → HTTPError, test already passes.
  * "Reuse config_io.read_soul() in graph/prompts.py" — kept the
    inline check to avoid introducing an import dependency from
    prompts.py (loaded early, widely used) into config_io.py.
  * "Use tuple form for @pytest.parametrize" — stylistic; comma-
    separated string works identically.

Test surface: 36 passing (7 new — the path-traversal parametrize set).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Major — atomic graph reload (server.py::_reload_langgraph_agent):
  Previously swapped _graph_config + set_a2a_token BEFORE calling
  create_agent_graph, so a failed build would leave the running
  _graph pinned to the OLD agent but reporting the NEW config and
  rotated bearer token. Now build first; commit config/auth/graph
  state only on success.
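
  The build-then-commit ordering can be sketched like this. RuntimeState, reload_agent, and the auth.token lookup are assumptions for illustration; the real _reload_langgraph_agent works on module-level state.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RuntimeState:
    config: dict
    token: str
    graph: Any


def reload_agent(state: RuntimeState, new_config: dict,
                 build_graph: Callable[[dict], Any]) -> bool:
    """Build the new graph first; touch live state only if the build succeeds."""
    try:
        candidate = build_graph(new_config)
    except Exception:
        return False  # live config, token, and graph are all left untouched
    # Commit phase: plain assignments that cannot fail halfway.
    state.config = new_config
    state.token = (new_config.get("auth") or {}).get("token", "")
    state.graph = candidate
    return True
```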

Major — rollback marker on failed first-run reload (finish_setup):
  mark_setup_complete fires before the reload so the graph compiles.
  If the reload fails, the marker stays and the next page load
  drops the user into chat with _graph=None and no obvious recovery
  path. finish_setup now reset_setup()s on reload failure, so the
  wizard returns for a retry.

Major — sanitize agent_name for plist path (autostart.py::_macos_label):
  Prior sanitization only lowercased + replaced spaces. `/` and `..`
  survived, so an agent name with path-traversal chars could target
  arbitrary paths relative to ~/Library/LaunchAgents/ on install /
  status / uninstall. Strip input to [a-z0-9_.-] and fall back to
  "protoagent" when the result is empty/dots-only. Verified resolved
  plist path stays inside LaunchAgents/ for every path-traversal
  payload I could think of.
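
  A minimal sketch of the allowlist approach, assuming disallowed characters are simply dropped (the real _macos_label may treat spaces differently); safe_label is an illustrative name:

```python
import re

_DISALLOWED = re.compile(r"[^a-z0-9_.-]")


def safe_label(agent_name: str, fallback: str = "protoagent") -> str:
    """Reduce a name to [a-z0-9_.-]; use the fallback if nothing usable is left."""
    cleaned = _DISALLOWED.sub("", agent_name.lower())
    # Results such as "" or ".." are still unsafe as a path segment.
    if not cleaned.strip("."):
        return fallback
    return cleaned
```

  With "/" removed, the result is always a single path segment, so the plist path cannot leave the LaunchAgents directory.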

Major — gateway api_key off query string (POST /api/config/models):
  GET with ?api_key=... leaks credentials into browser history,
  reverse-proxy access logs, and uvicorn's own request log. Switched
  to POST taking a small ModelsProbeRequest body. Empty body still
  falls back to stored config so the drawer's initial render works.

Major — round-trip identity/security/autostart through the drawer
(chat_ui.py + server.py):
  Drawer previously only edited model/worker/middleware/knowledge/
  SOUL, leaving the wizard's agent name, operator, bearer token,
  and autostart flag with no post-setup edit path. Added three new
  accordion sections (Identity / Security / Autostart), wired them
  into _config_components, _load_all, and _save.

  Extracted _sync_autostart_with_config so the wizard's finish_setup
  and the drawer's save_all both drive the LaunchAgent install/
  uninstall from the same code path — flipping the drawer's
  Autostart checkbox + Save & Reload now installs/removes the plist
  the same way the wizard does.

Verified end-to-end on product-director:
  * Fresh clone → wizard → "pd-renamed" via drawer → agent card
    says pd-renamed, old bearer token → 401, new bearer token → 200
  * Invalid temperature (99) → rejected at validation; YAML +
    marker untouched
  * Path traversal via /api/config/presets/../secret → 404

Tests: 36 passing (no new cases — existing coverage was already
sufficient for these fixes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(ui): first-run setup wizard + live-edit config drawer
The template now ships a working memory loop and end-to-end eval suite
on day one so forks have a green baseline before they touch a single
line of code. Closes #154.

What lands:

- knowledge/store.py — sqlite + FTS5 (LIKE fallback). One ``chunks``
  table backs operator notes (memory_ingest), daily-log entries
  (daily_log), and conversation findings extracted by
  MemoryMiddleware (domain='finding'). Path resolves env >
  config > default with an automatic ~/.protoagent/ fallback when
  /sandbox isn't writable.

- tools/lg_tools.py — five new memory tools (memory_ingest,
  memory_recall, memory_list, memory_stats, daily_log) bound to the
  store via a closure factory so tests get a fresh store per run.
  ``echo`` removed; ``get_all_tools(knowledge_store)`` actually uses
  its parameter now.

- server.py — _build_knowledge_store() constructs the store and
  threads it through both initial init and the drawer reload path.
  Defaults flipped: knowledge_middleware + memory_middleware now
  ON by default (config/langgraph-config.yaml + graph/config.py).

- evals/ — A2A client + runner + verify helpers + 15 starter cases
  (tasks.json) covering agent card discovery, bearer auth gating,
  abstention, every shipped tool, KB recall, a chained two-tool
  case, and KnowledgeMiddleware injection. Side-effect-verified:
  audit log + reply text + KB chunks all checked independently so
  hallucinated tool results get caught.

- docs/guides/evals.md — full how-to. README/TEMPLATE/configuration/
  starter-tools/first-agent updated to reflect the new defaults and
  the additional five memory tools.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bugs:
- evals/runner.py: teardown now runs in a finally block so seeded KB
  rows get cleaned up even when the verifier or client.ask() raises.
  expected_tools=[] now means "assert no tools fired" (was conflated
  with "no key" via the `or []` short-circuit, making the abstention
  case a no-op).
- evals/runner.py + tasks.json: added a `stream` runner kind so
  AgentClient.stream() is reachable from tasks.json — new
  streaming_status_updates case asserts the SSE event sequence.
- knowledge/store.py: PRAGMA journal_mode=WAL is now best-effort
  (read-only DBs no longer break _connect). FTS5 rebuild after
  schema install so an existing chunks table populated before FTS
  was added gets indexed. find_chunk_containing/delete_by_content
  reject empty/whitespace-only inputs to prevent LIKE '%%' wildcards
  from matching every row.
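
The expected_tools fix comes down to telling a missing key apart from an empty list, which `case.get("expected_tools") or []` cannot do. A sketch, with check_tools as a hypothetical helper and subset matching assumed for the non-empty case:

```python
def check_tools(case: dict, fired: list) -> bool:
    """Missing key: no assertion. Empty list: assert that no tools fired."""
    if "expected_tools" not in case:
        return True  # the case makes no claim about tools
    expected = case["expected_tools"]
    if not expected:
        return not fired  # abstention case: any tool call is a failure
    # Assumed semantics: every expected tool must appear among the fired ones.
    return all(name in fired for name in expected)
```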

Hardening:
- tools/lg_tools.py: clamp memory_recall(k) to [1, 20] and
  memory_list(limit) to [1, 200] so the agent can't request
  arbitrarily large slices of the KB.

Doc cleanup:
- docs/guides/subagents.md: LangGraphConfig snippet had a stale
  "echo" reference; replaced with the new memory-tool list.
- docs/tutorials/first-tool.md: WORKER_CONFIG example now appends
  git_sha alongside the bundled defaults instead of replacing
  them and dropping the memory tools.
- docs/reference/starter-tools.md: "adding your own" snippet now
  preserves the conditional _build_memory_tools(knowledge_store)
  extension.
- tests/test_config_io.py: starter-tool contract assertion now
  also covers web_search.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- evals/runner.py: collapse redundant nested teardown guard into a
  single `if "teardown" in case:` (SIM102); remove now-unused
  `setup_applied` flag
- knowledge/store.py: use `datetime.UTC` alias (Python 3.11+, UP017)
- tools/lg_tools.py: add `-> list` return annotation to
  `_build_memory_tools` (ANN202); replace explicit loop with list
  comprehension in `memory_recall` (PERF401)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

https://claude.ai/code/session_01148o8ppbuQwuZBsVGTQWwQ
Real bugs:
- evals/runner.py: setup is now inside the try block so a partial
  setup failure (e.g. step 2 of 3 errors) still triggers the
  finally teardown — rows from steps that did succeed no longer
  leak into the next case. Was flagged as a duplicate from round 2.
- knowledge/store.py: LIKE patterns now escape % and _ via
  ESCAPE '\' on every clause that takes user input
  (find_chunk_containing, delete_by_content, _search_like). A
  query for "100%" or "hello_world" no longer silently matches
  every row containing "100" or any single character between
  "hello" and "world".
- knowledge/store.py: FTS5 MATCH tokens are now double-quoted via
  _fts_quote() so user-supplied query terms can't smuggle FTS5
  operators (column filters, prefix wildcards, NEAR, AND/OR/NOT).
  Defence in depth — the [\w']+ tokenizer already filters most
  special chars.
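
The LIKE escaping described above can be sketched with the standard sqlite3 module. The chunks table name comes from the earlier commit; the content column, escape_like, and find_containing are assumptions for illustration.

```python
import sqlite3


def escape_like(term: str) -> str:
    """Escape the LIKE escape char first, then the two wildcards."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def find_containing(conn: sqlite3.Connection, term: str) -> list:
    if not term.strip():
        return []  # an empty term would become LIKE '%%' and match every row
    pattern = f"%{escape_like(term)}%"
    rows = conn.execute(
        "SELECT content FROM chunks WHERE content LIKE ? ESCAPE '\\' ORDER BY content",
        (pattern,),
    )
    return [row[0] for row in rows]
```

The backslash has to be escaped before the wildcards; otherwise the backslashes added for % and _ would themselves be doubled.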

Hardening:
- evals/runner.py: the fixed 0.3s asyncio.sleep waiting for the
  audit log to flush is gone. _await_audit_assertion now polls
  every 50ms up to a 2s deadline and returns as soon as the
  assertion passes — exits early on success, only burns the full
  deadline when the tool genuinely never fired.
- evals/runner.py: _run_auth_check accepts case["headers"] so cases
  can override the default bearer-only header set and exercise
  X-API-Key auth scenarios (or both auths together).
- knowledge/store.py: per-method exception handlers broadened from
  sqlite3.OperationalError to sqlite3.DatabaseError. Catches
  IntegrityError, ProgrammingError, and corruption variants too
  without crashing the agent loop. _has_fts5 (probe) and _connect
  (connection-time errors only) keep the narrower OperationalError.
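
The poll-until-deadline behaviour described for _await_audit_assertion might look like the sketch below. The 50ms interval and 2s deadline come from the commit message; the function name and signature are assumed.

```python
import asyncio
from typing import Callable


async def await_assertion(check: Callable[[], bool],
                          interval: float = 0.05,
                          timeout: float = 2.0) -> bool:
    """Poll check() until it passes or the deadline expires."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        if check():
            return True  # exit early as soon as the assertion holds
        if loop.time() >= deadline:
            return False  # the full deadline is spent only on real failures
        await asyncio.sleep(interval)
```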

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- evals/runner.py: use asyncio.get_running_loop() instead of the
  deprecated get_event_loop() inside the _await_audit_assertion coroutine
- evals/runner.py: prefix unused _entries return value with underscore
- evals/runner.py: use datetime.UTC alias (consistent with store.py),
  drop now-unused timezone import
- knowledge/store.py: broaden _get_db exception catch from
  OperationalError to DatabaseError so corrupt-DB errors are swallowed
  per the module's no-crash contract
- knowledge/store.py: replace log.error with log.exception in all three
  DatabaseError handlers (schema init, _get_db, add_chunk) so
  tracebacks appear in error logs

Co-Authored-By: Claude <claude@anthropic.com>

https://claude.ai/code/session_01YW5U6mtpLy4rzKmqd4trkH
…ness

feat: ship default knowledge store + eval harness for forks
Adds a default scheduler so agents can defer work to themselves —
"remind me tomorrow", recurring sweeps, deadline check-ins. Three
new tools land in get_all_tools() when a backend is wired up:
schedule_task, list_schedules, cancel_schedule.

Two backends ship behind a single SchedulerBackend protocol:

- LocalScheduler (default): sqlite + asyncio polling. Per-agent
  jobs.db at /sandbox/scheduler/<agent_name>/ with a
  ~/.protoagent/scheduler/<agent_name>/ fallback. Fires by POSTing
  message/send to the running agent's own /a2a endpoint, going
  through bearer + X-API-Key auth like a real caller (audit log +
  cost-v1 capture work the same). Cron expressions reschedule via
  croniter; ISO datetimes are one-shot. Missed-fire recovery:
  within 24h fires immediately, older fires roll forward without
  firing.

- WorkstaceanScheduler: HTTP adapter to a Workstacean install's
  POST /publish. Activated automatically when
  WORKSTACEAN_API_BASE and WORKSTACEAN_API_KEY env vars are set.
  Topic and job IDs are namespaced cron.<agent_name>.<job_id>
  so a single Workstacean can serve N protoAgent forks safely.
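
The LocalScheduler missed-fire rule above (fire immediately within 24h, otherwise roll forward without firing) reduces to a small decision function. This is a sketch with assumed names; the real implementation presumably computes the next cron occurrence with croniter, which is left out here.

```python
from datetime import datetime, timedelta

GRACE = timedelta(hours=24)


def missed_fire_action(next_fire: datetime, now: datetime) -> str:
    """Return "wait", "fire", or "roll_forward" for a job's stored next_fire."""
    if next_fire > now:
        return "wait"  # not due yet
    if now - next_fire <= GRACE:
        return "fire"  # missed recently: run it now
    return "roll_forward"  # too stale: skip to the next occurrence
```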

Multi-agent isolation is the headline architectural property —
spinning up gina-personal alongside gina-work on the same box (or
sharing one Workstacean) won't cross-fire scheduled prompts.
Verified with explicit tests in test_scheduler_local.py.

Wiring:
- scheduler/{__init__,interface,local,workstacean}.py — module
- tools/lg_tools.py — _build_scheduler_tools factory; get_all_tools
  takes a new optional scheduler= kwarg
- graph/agent.py — create_agent_graph and create_simple_agent
  accept scheduler=
- server.py — _build_scheduler() picks backend at boot,
  on_event("startup"/"shutdown") drives the polling task lifecycle,
  reload path reuses the running scheduler instance
- config/langgraph-config.yaml + graph/{config,subagents/config}.py
  — worker subagent gets the three new tools in its allowlist
- requirements.txt — croniter>=2.0

Tests: 48 new (test_scheduler_local.py covers add/list/cancel,
multi-agent isolation, reschedule-vs-delete, missed-fire recovery,
and an end-to-end fire path with httpx mocked; test_scheduler_workstacean.py
covers all the publish payload assertions, namespacing,
custom topic prefix, and HTTP error handling).

Docs: docs/guides/scheduler.md (Diataxis how-to with the firing
model, multi-agent story, env reference, and notes on the
Workstacean A2A-bridge gap), plus index/configuration/README/
TEMPLATE updates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- docs/guides/scheduler.md: replace jargon "multiple ginas" with "multiple agents"
- scheduler/__init__.py: sort __all__ lexicographically (RUF022)
- scheduler/local.py: log.error → log.exception in _init_db to preserve traceback (TRY400)
- scheduler/workstacean.py: correct stale module docstring that claimed list_jobs()
  issues a list command — it returns [] unconditionally
- server.py: add -> None return annotations to _scheduler_startup/_scheduler_shutdown (ANN202)
- tests: add match= to two bare pytest.raises(ValueError) calls (PT011)
- tools/lg_tools.py: wrap blocking scheduler calls in asyncio.to_thread() to avoid
  blocking the event loop under concurrent load; fix cancel_schedule error message to
  not conflate transport/DB failures with "no such job"

Co-Authored-By: claude-code <claude@anthropic.com>

https://claude.ai/code/session_01JmFYJSYRMRndZ43g3AYW2q
Real bugs:
- scheduler/local.py: _fire() now returns bool (True on 2xx, False
  on HTTP error or network exception). _tick() only reschedules /
  deletes when _fire() succeeds, so a transient failure leaves the
  job in place for the next tick to retry. Previously a one-shot
  job hit by a 5xx silently vanished.
- server.py: the API key env var name now uses AGENT_NAME_ENV.upper()
  to match the auth handler at L893. The previous code read
  agent_name() (which returns the wizard-set identity.name when set),
  so a wizard rename pointed the scheduler at <RENAMED>_API_KEY while
  the auth handler still expected <ENV>_API_KEY → self-invocation
  401'd silently after every wizard rename.
- server.py: reload path now constructs a scheduler when _scheduler
  is None (first-run case: server boots pre-setup, wizard finishes,
  drawer triggers reload — this is when we *first* construct the
  scheduler). Existing instances are still reused — drawer saves
  don't tear down the polling loop.
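
The _fire()/_tick() contract from the first bullet can be sketched as below. It is deliberately simplified to one-shot jobs held in a dict; tick and its arguments are illustrative and not the real LocalScheduler API.

```python
from typing import Callable


def tick(due_jobs: dict, fire: Callable[[str], bool]) -> list:
    """Fire each due job; remove it from the queue only when fire() succeeds."""
    completed = []
    for job_id, prompt in list(due_jobs.items()):
        if fire(prompt):
            del due_jobs[job_id]
            completed.append(job_id)
        # On failure the job stays in due_jobs so the next tick retries it.
    return completed
```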

Surface:
- tools/lg_tools.py: exported SCHEDULER_TOOL_NAMES and MEMORY_TOOL_NAMES
  as module constants.
- graph/config_io.py::list_available_tools: now exposes scheduler +
  memory tool names to the wizard's checkbox group even when the
  runtime hasn't yet constructed the underlying backends. Otherwise
  the wizard would hide tools that the runtime registers as soon as
  the user finishes setup.

Declined:
- scheduler/local.py L141-149: CodeRabbit asked to re-raise
  sqlite3.DatabaseError from _init_db. The store is intentionally
  fail-soft (matches knowledge/store.py + audit.py): _resolve_db_path
  already falls back to ~/.protoagent/scheduler/ when /sandbox is
  unwritable, and re-raising would crash boot in exactly the scenario
  the fallback is designed to handle. The graceful degradation
  contract is "scheduler tools return errors when storage is broken,
  agent keeps serving everything else".

Tests:
- tests/test_scheduler_local.py: new test_fire_failure_leaves_job_in_place
  regression guard + test_fire_returns_bool contract test.
- tests/test_config_io.py: list_available_tools assertions now check
  for memory + scheduler tools and no duplicates.

86 scheduler-scope tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…h knowledge/memory)

Scheduler had asymmetric opt-out — only env-based (SCHEDULER_DISABLED=1).
The knowledge and memory subsystems already exposed YAML toggles
(middleware.knowledge, middleware.memory) so forks could flip them
through the drawer or wizard. Scheduler now matches:

- LangGraphConfig.scheduler_enabled: bool = True (default-on)
- from_yaml() reads middleware.scheduler
- config_to_dict() emits it for the drawer round-trip
- config/langgraph-config.yaml has middleware.scheduler: true
- server.py::_build_scheduler honors the YAML toggle first, env second

All four middleware subsystems are now genuinely opt-out:

  middleware:
    knowledge: true       # was already so
    memory: true          # was already so
    scheduler: true       # NEW — was env-only
    audit: true

Drawer/wizard can flip any of them without restart (the existing
reload path already rebuilds on config change). The env opt-out
(SCHEDULER_DISABLED=1) stays as a runtime escape hatch for fleet
operators who can't edit YAML in the moment.
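
One plausible reading of "YAML toggle first, env second" is that either switch can disable the scheduler, with the YAML value checked first. The sketch below assumes that reading and a plain dict for the parsed config; scheduler_enabled is a hypothetical helper.

```python
import os


def scheduler_enabled(config: dict) -> bool:
    """Return False if YAML turns the scheduler off or the env escape hatch is set."""
    middleware = config.get("middleware") or {}
    # Default-on: a missing middleware.scheduler key counts as enabled.
    if not middleware.get("scheduler", True):
        return False
    # Assumed check: only the literal value "1" disables the scheduler.
    return os.environ.get("SCHEDULER_DISABLED", "") != "1"
```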

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bugs:
- scheduler/local.py: stop() now suppresses only asyncio.CancelledError
  (the expected outcome of cancelling our own task) and logs any
  other exception via log.exception. Previously every exception
  was silently swallowed, so a polling-loop crash during shutdown
  would vanish without a trace.
- server.py: reload path now honors the new middleware.scheduler
  toggle. Three states:
  - flipped OFF (was on)  → stop + drop the running scheduler;
    new graph builds with scheduler=None.
  - flipped ON (was off / first run) → construct + start.
  - unchanged              → reuse the running instance.
  Helpers _start_scheduler_async / _stop_scheduler_async fire
  start()/stop() onto the active loop without forcing the entire
  reload chain to become async.
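
The narrowed exception handling in stop() can be sketched as follows; stop_task and the logger name are assumptions, and the real method operates on the scheduler's own polling task.

```python
import asyncio
import logging

log = logging.getLogger("scheduler")


async def stop_task(task: asyncio.Task) -> None:
    """Cancel a background task; log anything other than the expected cancel."""
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass  # expected: we cancelled it ourselves
    except Exception:
        # A crash in the polling loop surfaces here instead of vanishing.
        log.exception("scheduler polling task failed during shutdown")
```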

Type / nits:
- server.py: added `-> "SchedulerBackend | None"` return type to
  _build_scheduler, with a TYPE_CHECKING import to avoid runtime
  cycles.
- tests/test_scheduler_local.py: raw-string regex for `|`
  alternation (test_malformed_raises); added match= to the two
  bare ValueError tests (test_empty_prompt_rejected,
  test_malformed_schedule_rejected) so they only pass for the
  intended error message.
- tests/test_config_io.py: assert list_schedules in names alongside
  schedule_task / cancel_schedule.
- docs/reference/configuration.md: clarified the scheduler opt-out
  description — middleware.scheduler is canonical, SCHEDULER_DISABLED
  is a runtime escape hatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bugs:
- scheduler/local.py::_fire(): metadata moved from
  params.message.metadata to params.metadata. The A2A handler
  reads custom metadata from params.metadata
  (a2a_handler.py L1244 — `msg_metadata = params.get("metadata")`),
  so the previous nesting silently dropped the
  scheduler_job_id / scheduler_kind breadcrumb. Observers now get
  it as intended.
- server.py reload path: scheduler swap is now planned before the
  graph rebuild and only committed after rebuild succeeds. A
  failed graph rebuild used to leave the scheduler torn down or a
  fresh one already started, leaving runtime state out of sync. The new
  ordering: build candidate, rebuild graph (rollback-safe on
  failure), commit graph + scheduler atomically.
- scheduler/local.py: _resolve_db_path now sanitizes agent_name
  via a new _safe_segment() helper. Strips path separators, ``..``,
  and absolute-path prefixes; falls back to "default" when nothing
  usable remains. Defence in depth — the value comes from operator-
  controlled env / YAML, but a typo or copy-paste shouldn't be able
  to put a sqlite file outside the configured scheduler dir.

Tests:
- tests/test_scheduler_local.py::test_cron_rescheduled_after_fire:
  pinned to a fixed fired_at timestamp so the assertion is exact
  (next_fire == "2026-04-29T09:00:00+00:00") instead of a
  "different from original" near-tautology that depends on
  datetime.now().

Docs:
- docs/reference/configuration.md: clarified that the scheduler's
  enable/disable lives in YAML (middleware.scheduler), while
  backend selection and runtime knobs are env-driven. Repositioned
  SCHEDULER_DISABLED as the runtime escape hatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat: pluggable scheduler (local sqlite + Workstacean adapter)
mabry1985 and others added 3 commits April 27, 2026 20:58
After PRs #155 (default KB store + memory tools) and #156 (default
scheduler), the docs claimed nine tools, missed scheduler tools
entirely in the reference, and skipped scheduler env vars. This pass
syncs every stale claim flagged by the audit.

Updates:

- docs/reference/starter-tools.md
  - Corrected count: nine → twelve
  - New tool sections: schedule_task, list_schedules, cancel_schedule
    (signatures, output formats, multi-agent isolation notes)
  - "adding your own" snippet now threads scheduler= through
    get_all_tools alongside knowledge_store=
  - Related links include the scheduler guide

- docs/reference/environment-variables.md
  - New "Knowledge store" section: KNOWLEDGE_DB_PATH override + the
    ~/.protoagent fallback
  - New "Audit log" section: AUDIT_PATH (used by evals/verify.py)
  - New "Scheduler" section: WORKSTACEAN_API_BASE/KEY/TOPIC_PREFIX,
    SCHEDULER_DB_DIR/INVOKE_URL/DISABLED, plus the protoLabs
    operators callout pointing at the ava node + secrets manager
    for the actual key

- docs/tutorials/first-agent.md
  - Wizard description now mentions all twelve tools and the four
    middleware toggles (added Scheduler alongside Audit/Memory/
    Knowledge)

- docs/tutorials/first-tool.md
  - "Where to go next" link copy: five → twelve

- docs/guides/fork-the-template.md
  - Tool list paragraph corrected to all twelve, with the
    binding-by-backend split called out

- docs/guides/customize-and-deploy.md
  - "Add domain tools" section now mentions memory + scheduler tool
    binding and the middleware.* toggles for opt-out

- README.md
  - Starter tools row now lists all twelve, grouped 4+5+3 with
    backend bindings shown
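
The KNOWLEDGE_DB_PATH override and ~/.protoagent fallback documented in
environment-variables.md above can be sketched roughly as follows. The
function name and the database file name are hypothetical; the commit only
names the env var and the fallback directory:

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical file name; the source only names the ~/.protoagent directory.
DEFAULT_DB_NAME = "knowledge.db"


def resolve_knowledge_db_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the knowledge store location: explicit override, else fallback."""
    env = os.environ if env is None else env
    override = env.get("KNOWLEDGE_DB_PATH", "").strip()
    if override:
        # An explicit override wins; expand "~" so shell-style paths work.
        return Path(override).expanduser()
    # No override set: fall back to the per-user ~/.protoagent directory.
    return Path.home() / ".protoagent" / DEFAULT_DB_NAME
```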

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uler

docs: sync after KB store + scheduler PRs (#155 / #156)
…line)

Scaffolds .beads/issues.jsonl + .automaker/settings.json + standard .gitignore
(narrowing any blanket ignore) via release-tools init-workspace-config. Any
remaining runner-rule errors need per-job migration to
namespace-profile-protolabs-linux.

@mabry1985 mabry1985 left a comment

QA Audit — PR #161 | chore: conform to workspace-config standard (.beads + .automaker)

VERDICT: PASS


CI Status

  • No CI checks configured for this PR (acceptable for config-only scaffold changes)

Diff Review

  • Adds .automaker/settings.json: a minimal {"version": 1} baseline conforming to the workspace-config standard.
  • Updates .gitignore: replaces the blanket /.automaker/ ignore with four targeted entries. Correct trade-off: runtime paths (features/, checkpoints/, trajectory/, and the beads.db file) remain ignored while the settings file can be committed.

Observations

  • LOW: No CI checks on PR. Acceptable for infrastructure/config-only changes; no behavior change to test.
  • LOW: release-tools#18 tracks runner-rule errors separately — file-level verification errors are cleared here.

Safety rail — Quinn cannot self-approve. This PR appears to be from the same deployment context as Quinn. Formal review is COMMENTED (PASS) rather than APPROVED. Please have a human reviewer confirm and merge manually.

Checks: 1
Passed: 1 (diff review)
Failed: 0
Gaps: 0

— Quinn, QA Engineer

@mabry1985

Submitted COMMENT review on protoLabsAI/protoAgent#161. The diff is clean — .automaker/settings.json baseline and targeted .gitignore entries — and the PR title matches the changes exactly. Formal state is COMMENTED rather than APPROVED per the self-review safety rail; a human reviewer should confirm and merge.

Migrate plain-Linux jobs to namespace-profile-protolabs-linux; annotate
jobs that genuinely require GitHub-hosted infra.

MIGRATED:
- docs.yml build (vitepress docs build + upload-pages-artifact)
- prepare-release.yml prepare (python version bump + gh PR ops)

ANNOTATED (allow-hosted-runner):
- docker-publish.yml build-and-push — docker buildx + registry push
- docs.yml deploy — GitHub Pages deploy needs hosted Pages environment
- release.yml release — docker buildx + registry push (+ OIDC attestation)
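
The runner rule behind the MIGRATED/ANNOTATED split can be approximated with
a small text scan. This is a hypothetical sketch, not how
verify-workspace-config is implemented; in particular it assumes the
annotation sits either on the runs-on line itself or on the line directly
above it, which the PR does not specify:

```python
import re
from typing import List

ORG_RUNNER = "namespace-profile-protolabs-linux"
ANNOTATION = "# workspace-config: allow-hosted-runner"
# Captures the runner label and an optional trailing comment.
_RUNS_ON = re.compile(r"^\s*runs-on:\s*(?P<value>[^#\n]+?)\s*(?P<comment>#.*)?$")


def find_runner_violations(workflow_text: str) -> List[str]:
    """List runs-on lines that are neither org-owned nor annotated."""
    violations = []
    lines = workflow_text.splitlines()
    for index, line in enumerate(lines):
        match = _RUNS_ON.match(line)
        if not match:
            continue
        runner = match.group("value").strip("'\" ")
        if runner == ORG_RUNNER:
            continue
        previous = lines[index - 1].strip() if index > 0 else ""
        comment = (match.group("comment") or "").strip()
        # Assumption: an annotation only counts when a reason follows it.
        annotated = any(
            text.startswith(ANNOTATION) and text[len(ANNOTATION):].strip()
            for text in (previous, comment)
        )
        if not annotated:
            violations.append(
                f"line {index + 1}: runs-on {runner} lacks an annotation"
            )
    return violations
```

An empty result corresponds to the "0 errors" outcome; each annotated hosted
job would be one of the recognized exceptions.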

Also finishes the .beads scaffold that the conform commit referenced but
never landed: narrow the blanket *.jsonl ignore so .beads/issues.jsonl is
committed (workspace-config requires the git-friendly export tracked).
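
Why the negation matters can be shown with a deliberately simplified model
of gitignore matching in which the last matching pattern decides. This is an
illustrative sketch only: it uses fnmatch, whose `*` also matches `/`, and
it ignores directory patterns and git's rule that a file cannot be
re-included when its parent directory is excluded:

```python
from fnmatch import fnmatch
from typing import Iterable


def is_ignored(path: str, patterns: Iterable[str]) -> bool:
    """Simplified gitignore check: the last matching pattern decides."""
    ignored = False
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        # Patterns without a slash match the file name at any depth;
        # patterns with a slash match the path from the repo root.
        target = path if "/" in pattern else path.rsplit("/", 1)[-1]
        if fnmatch(target, pattern.lstrip("/")):
            ignored = not negated
    return ignored
```

With only `*.jsonl`, `.beads/issues.jsonl` is swallowed; adding
`!.beads/issues.jsonl` after it re-includes that one file while other
`.jsonl` files stay ignored.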

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@mabry1985 mabry1985 left a comment

QA Audit — PR #161 | chore: conform to workspace-config standard (.beads + .automaker)

VERDICT: PASS (filed as COMMENT — GitHub rejected APPROVE as self-review; human approval required to merge)


CI Status

  • No checks reported (head ccd6656) — expected: chore touches .automaker/, .beads/, .gitignore, and workflow YAML; existing workflows are path-filtered or tag/push-triggered and don't fire on this diff.

Diff Review

  • Scaffolds baseline: new .automaker/settings.json ({"version":1}) and empty .beads/issues.jsonl placeholder.
  • .gitignore correctly narrowed: removes blanket .automaker/ ignore, adds targeted .automaker/{features,checkpoints,trajectory}/ + .beads/beads.db, preserves !.beads/issues.jsonl negation so the git-friendly export stays committable.
  • Workflow runner standardization: docs.yml build and prepare-release.yml prepare switch to namespace-profile-protolabs-linux; jobs that legitimately need GitHub-hosted runners (docker registry push in docker-publish.yml/release.yml, GitHub Pages deploy in docs.yml) keep ubuntu-latest with explicit # workspace-config: allow-hosted-runner annotations explaining why.
  • Remainder is quote-style normalization (single→double) — cosmetic, no behavior change.

Observations

  • BLOCKING (process): GitHub rejected automated APPROVE — PR author matches reviewer identity. Needs a human approver before merge.
  • LOW: !.beads/issues.jsonl negation references a blanket *.jsonl rule not visible in the diff context; if no such rule exists elsewhere it's harmless/defensive.
  • LOW: Runner switch assumes namespace-profile-protolabs-linux is provisioned for this repo — consistent with workspace-config standard but unverified from the diff alone.
  • LOW: No CodeRabbit findings; clawpatch skipped (protoAgent not in mounted set; diff is config-only).
  • Runner-rule errors deferred to release-tools#18 per PR description — appropriate scope split.

— Quinn, QA Engineer

@mabry1985 mabry1985 changed the base branch from main to dev May 25, 2026 08:17
@mabry1985 mabry1985 merged commit 9399a2f into dev May 25, 2026
@mabry1985 mabry1985 deleted the chore/conform-workspace-config branch May 25, 2026 09:02

2 participants