Promote dev → main — OWLETTE-WEB-3R fix + web/dashboard batch (43 commits)#9

Merged
dylanroscover merged 43 commits into main from dev on May 25, 2026

@dylanroscover

Summary

Promotes 43 commits from dev to main (prod / owlette.app), all currently live and verified on dev (dev.owlette.app). Headlined by the fix for Sentry OWLETTE-WEB-3R (dashboard permission-denied for non-superadmins), plus the API-key-creation fix, the cortex tier-3 tool-approval gate, the logs/metrics work, and CLI pre-publish hardening.

Full change list: the ## [Unreleased] section in docs/changelog.md / web/content/docs/changelog.mdx.

✅ Pre-merge deploy step — already DONE

Firestore composite indexes were deployed to prod ahead of this merge (additive, 18 → 23 indexes):

firebase deploy --only firestore:indexes --project prod   # Deploy complete, verified

They back the new logs filter combinations and the /api/users/deletions audit feed. No data migration. (Recorded here for the deploy log — no further action needed.)

Verification

  • Merge: conflict-free. git merge-tree origin/main dev is clean, re-verified 3× (latest against tip 3f35be3). dev is current, not stale.
  • CI green on dev: lint, typecheck, unit (2457 passed), production build, and Playwright e2e (289 passed) on the latest commits.
  • 3-agent merge review → MERGE WITH NOTES, no blockers: no secrets/debug/scratch in the delta, firestore.rules unchanged, no new required env var (RESEND_FROM_EMAIL pre-provisioned + has a fallback), and the CI workflow changes trigger no unintended publish/deploy on merge (all tag-gated).
  • OWLETTE-WEB-3R fix proven by a de-whitelisted non-superadmin e2e spec.

Notes

  • No VERSION bump — the version number tracks the agent, and this batch contains no agent changes.
  • OpenAPI route-coverage gap for /api/users/deletions is closed in this batch.

🤖 Generated with Claude Code

dylanroscover and others added 30 commits May 22, 2026 14:34
All transactional/alert emails sent via the shared FROM_EMAIL constant
showed the bare address (noreply@mail.owlette.app) in recipients' inboxes.
Compose the from header as "Owlette <addr>" so the friendly name appears,
while respecting an already-named RESEND_FROM_EMAIL value verbatim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
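
A minimal sketch of the from-header composition described in the commit above. The helper name and the regular expression are assumptions for illustration, not the repository's actual code; only the "Owlette <addr>" shape and the pass-through of an already-named value come from the commit message.

```typescript
// Hypothetical helper: the name and the detection regex are assumptions.
const DISPLAY_NAME = "Owlette";

function composeFromHeader(configured: string): string {
  // A value that already ends in "<local@domain>" carries its own display
  // name, so it is returned verbatim.
  if (/<[^<>\s]+@[^<>\s]+>\s*$/.test(configured)) {
    return configured;
  }
  // A bare address gets the friendly name prepended.
  return `${DISPLAY_NAME} <${configured.trim()}>`;
}
```

With this sketch, a bare `noreply@mail.owlette.app` becomes `Owlette <noreply@mail.owlette.app>`, while a value such as `Alerts <alerts@example.com>` is left untouched.
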
versions route: deterministic content-addressed versionId (drop client-stamped createdAt); tri-state expectedCurrentVersionId CAS (string / explicit null = expect-empty / absent, 400 otherwise); 3-way publish transaction (no-op / promote-existing-version / create) with all reads before writes.

screenshot command-status route: mint result.screenshot_url from the agent's result.storage_path so 'owlette machine screenshot' completes.

Surfaced by the pre-publish CLI review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
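
The tri-state expectedCurrentVersionId check in the versions-route commit above could look roughly like the following. The type names and function names are hypothetical, and treating an empty string as invalid is an assumption; the string / explicit null / absent split and the 400 for anything else come from the commit message.

```typescript
// Hypothetical types: the three accepted shapes of expectedCurrentVersionId.
type ExpectedHead =
  | { kind: "absent" } // field omitted: no CAS check requested
  | { kind: "expect-empty" } // explicit null: caller expects no current version
  | { kind: "expect"; versionId: string }; // caller expects this exact head

type ParseResult =
  | { ok: true; value: ExpectedHead }
  | { ok: false; status: 400 };

function parseExpectedHead(body: Record<string, unknown>): ParseResult {
  if (!("expectedCurrentVersionId" in body)) {
    return { ok: true, value: { kind: "absent" } };
  }
  const raw = body["expectedCurrentVersionId"];
  if (raw === null) {
    return { ok: true, value: { kind: "expect-empty" } };
  }
  // Assumption: an empty string is rejected along with non-string values.
  if (typeof raw === "string" && raw.length > 0) {
    return { ok: true, value: { kind: "expect", versionId: raw } };
  }
  return { ok: false, status: 400 };
}

// Compare-and-swap guard: does the caller's expectation match the stored head?
function headMatches(expected: ExpectedHead, currentHead: string | null): boolean {
  switch (expected.kind) {
    case "absent":
      return true;
    case "expect-empty":
      return currentHead === null;
    case "expect":
      return currentHead === expected.versionId;
  }
}
```

Keeping "absent" distinct from "explicit null" is what lets a client say "I expect this roost to have no versions yet" without that being confused with "I did not ask for a check".
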
Remove the non-functional 'key' commands (key management is session/dashboard-only). Fix machine screenshot, chat send, roost push (CAS + idempotency + deterministic manifest), rollback, deploy, process, listen, trigger, audit-log, quota.

Cross-cutting: consistent exit codes (2 = local usage error), idempotency-key safety + lost-response surfacing on mutating commands, request timeouts on non-streaming calls, shell-safe browser open, shared http/output helpers.

~3 critical + ~30 major issues fixed; 256 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tag-driven publish for @owlette/cli, @owlette/sdk (npm) and owlette-sdk (PyPI) via OIDC trusted publishing — no NPM_TOKEN. Dispatch inputs routed through env (no shell injection); real publishes require a tag matching the package version; dry-run default.

docs/api/distribution.md documents the flow and the npm first-publish bootstrap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a search box (expanding magnifying-glass button in the header) that
substring-matches across action, machine, process, level, and details.
Firestore has no full-text query, so search runs client-side over a
'search pool' — the full set of logs matching the active
date/level/machine/action filters, loaded on demand and capped at 2,000 —
falling back to on-screen rows until it lands. Header stats recount against
matches, infinite scroll pauses while searching, and a notice appears when
the scope exceeds the cap.

Also extracts shared buildLogsQuery/applyClientScope helpers (used by both
the live listener and the search-pool fetch) and fixes details-tooltip
clipping by wrapping long unbreakable strings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
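
The client-side search described in the commit above can be sketched as a substring filter over the loaded pool. The row shape, function name, and case-insensitive matching are assumptions; the searched fields and the 2,000 cap come from the commit message.

```typescript
// Hypothetical row shape; field names mirror the columns named in the commit.
interface LogRow {
  action: string;
  machine: string;
  process?: string;
  level: string;
  details?: string;
}

const SEARCH_POOL_CAP = 2000;

function searchLogs(
  pool: LogRow[],
  query: string,
): { matches: LogRow[]; capped: boolean } {
  // When the scope exceeds the cap, only the first 2,000 rows are searchable
  // and the caller can show a notice.
  const capped = pool.length > SEARCH_POOL_CAP;
  const scope = capped ? pool.slice(0, SEARCH_POOL_CAP) : pool;

  const needle = query.trim().toLowerCase();
  if (needle === "") {
    return { matches: scope, capped };
  }

  // Assumption: matching is case-insensitive.
  const matches = scope.filter((row) =>
    [row.action, row.machine, row.process, row.level, row.details].some(
      (field) => (field ?? "").toLowerCase().indexOf(needle) !== -1,
    ),
  );
  return { matches, capped };
}
```

Header stats would then be recounted from `matches` rather than from the on-screen rows.
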
Wraps the filters panel in a controlled Radix Collapsible reusing the
shared collapsible-down/up keyframes (globals.css) for a 200ms open /
180ms close, matching the log-row expand animation. Adds aria-expanded to
the toggle button.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lover

Outline/destructive buttons had no hover in dark mode: the resting
dark:bg-input/30 (and dark:bg-destructive/60) ties hover:bg-* on specificity
(both 0,2,0 under @custom-variant dark (&:is(.dark *))), and Tailwind emits
dark: after hover:, so the resting bg won and the hover bg never rendered.
Fixed by adding higher-specificity dark:hover:* (0,3,0), matching the
pattern ghost already used.

Standardizes the neutral hover on the secondary token to match the header
rollover (PageHeader hover:bg-secondary): outline + ghost now roll over to
bg-secondary everywhere; default (cyan) and destructive (red) keep their
color-based hovers. Removes the dead hover:bg-muted overrides on the logs
toolbar and makes clear-logs dark-hover-aware so it stays red.

Adds a Playwright regression test asserting outline toolbar buttons change
background on hover (would have failed in dark mode pre-fix). Amends
CLAUDE.md: ui/* are owned, customizable shadcn primitives — button.tsx is
the single source of truth for button styling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
These per-instance hover classes duplicated what button.tsx now centralizes
(outline/ghost roll over to bg-secondary). Removed hover:bg-muted/bg-secondary/
hover:text-foreground overrides on outline + ghost buttons across login,
register, passkey, the dashboard + demo toolbars, AddMachineButton, the
download button, and the installer upload dialog. Kept intentional non-neutral
hovers (brand cyan, white-icon hover) and left non-Button element hovers alone.

Some of these (the add-machine copy buttons, the installer dialog's
hover:bg-secondary! override) were dead in dark mode due to the prior cascade
bug; relying on the base now renders them correctly. No visual change in light
mode (muted == secondary in this theme); dark-mode hover is now correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a --card-sunken token and reworks the machine cards into a clear
surface hierarchy: the card/list container recedes (bg-card-sunken) while
content stays at bg-card and reads as raised. Metrics and processes are now
single enclosed panels with hairline dividers (divide-border/60) instead of
floating cards over dark gaps — the process tree connector is dropped (the
enclosure already groups them under the machine). Sections align at px-6 so
left edges and the expand chevrons line up.

Borders unified: subtle /30 panel borders, full borders softened (icon
buttons/controls -> /50, card outer -> /60). Row hover is now a subtle
bg-secondary/25 fill (::after) instead of a square ring; removed the dead
getUsageRingClass. List view gets the same sunken container + light rows.
Demo toolbar -> bg-card-sunken.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The dashboard restored the metrics detail panel from the persisted
activeGraphPanel on every load. A restored panel often has stale/empty graph
tabs, so it rendered empty yet still reserved its slide of height — the
'gap between the welcome header and machines'. It now opens only from an
in-session metric click (panelOpenedThisSession gate) and never auto-restores
on load. Also aligns the toolbar segmented control to bg-card-sunken to match
the cards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ection

Supersedes 0dc5237, which fixed the empty-slide 'gap' by disabling the
panel's auto-restore entirely — removing the persist-across-reloads feature.

Root cause (confirmed by two independent reviews): activeGraphPanel
(open-state) and graphTabs (content) are persisted independently and can
desync — a restored panel can resolve to zero renderable lines (deselect-all,
dropped metric ids, stale device ids) yet useSlidePanel still reserves the
panel's fixed height, leaving an empty slide above the grid.

Fix: restore the panel from prefs again, but only when its resolved selection
(deserializeTabs(graphTabs[machineId])) is non-empty; fall back to the
always-non-empty initialMetricToState default when there is no persisted
entry (legacy panels still restore); the display route has no graphTabs and is
always allowed; the machine-in-site guard is kept. As defense-in-depth,
MetricsDetailPanel now renders a 'no metrics selected' empty-state instead of a
blank fixed-height chart when zero lines are visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ign demo list surface

Review follow-ups: getUsageRingClass had zero call sites after the hover-fill
switch (dead export); the metrics enclosure comment still said 'prototype';
and the demo list view wrapper used bg-card instead of bg-card-sunken.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rework the detail-panel stat cards into short horizontal chips: metric
label on the left, an enclosed min/avg/max section floated to the right
(reordered from avg/max/min). Cards now size to content and wrap
(flex flex-wrap) instead of stretching across a 4-col grid, so a single
selected metric no longer spans the full width. Saves vertical space in
the graph view.

Also fix the metrics panel vanishing when every metric is toggled off:
the panel's open-state is now driven solely by activeGraphPanel, so
"clear all" empties the panel (showing the in-panel "no metrics selected"
state) rather than dismissing it. Only the X button closes it, and the
empty-but-open panel restores across reloads. Removes the now-unused
empty-selection collapse guard and the deserializeTabs import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…anagement

- GET /api/users/activity: Firebase Auth last-seen (lastRefreshTime, falling
  back to lastSignInTime), superadmin-gated, getUsers batched <=100 and keyed
  by record.uid (unordered-safe)
- GET /api/users/deletions: audit-sourced feed of self (USER_SELF_DELETE) and
  admin (USER_DELETE) deletions from global/audit_log/entries
- useUserManagement: UserData += deletedAt/deletedBy; activity fetch keyed on a
  stable uidKey (non-fatal); getUserCounts excludes deleted + adds deleted count
- admin/users page: "last seen" column, admin-deleted rows de-emphasized with a
  "deleted" badge + gated actions, "account deletions" panel
- firestore index: entries(capability ASC, timestamp DESC) for the deletions query

Self-delete remains a hard delete (no tombstone) per the 6-agent plan review;
deleted users surface via the audit feed, not a resurrectable user doc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
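
A sketch of the batching described for GET /api/users/activity above: uids are split into groups of at most 100, and results are keyed by uid rather than by position, since the lookup is not guaranteed to preserve order. The record shape, the injected `lookup` function, and all names here are assumptions standing in for the real Firebase Auth call.

```typescript
// Hypothetical record shape; only the two timestamp fields named in the commit.
interface AuthRecord {
  uid: string;
  lastRefreshTime?: string;
  lastSignInTime?: string;
}

// Stand-in for the real batched lookup (injected so the sketch is testable).
type Lookup = (uids: string[]) => Promise<AuthRecord[]>;

const MAX_BATCH = 100;

function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Last-seen prefers the refresh time and falls back to the sign-in time.
function lastSeenOf(record: AuthRecord): string | null {
  return record.lastRefreshTime ?? record.lastSignInTime ?? null;
}

async function loadLastSeen(
  uids: string[],
  lookup: Lookup,
): Promise<Map<string, string | null>> {
  const byUid = new Map<string, string | null>();
  for (const batch of chunk(uids, MAX_BATCH)) {
    const records = await lookup(batch);
    // Key by record.uid: the response order may differ from the request order.
    for (const record of records) {
      byUid.set(record.uid, lastSeenOf(record));
    }
  }
  return byUid;
}
```
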
…nt hook

The /api/users/activity fetch (superadmin-only) was living in
useUserManagement, but that hook is also used by ManageSitesDialog on
non-superadmin pages (roosts/dashboard/deployments/logs) — so the fetch
fired there and 403'd, and the browser logged that as a console error,
tripping the empty-roost-state e2e console-error guard. Moved the fetch to
the superadmin-only admin/users page (mirroring the deletions fetch); the
shared hook is back to user-list + mutations + deleted-aware counts only.

Also fix the user-mgmt e2e assertion to match the lowercased "you" pill.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Catch playwright-e2e failures before/right-after they hit an auto-deploying branch:
- /preflight command: lint + typecheck + unit + local e2e (the CI mirror) before pushing web changes
- post-push-e2e.mjs PostToolUse hook: after a dev/main push in e2e scope, reminds to watch the triggered CI run; never auto-fix-and-repush
- settings.json: register the hook next to post-push-installer
- CLAUDE.md: document the two-layer e2e verification flow

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ion-bump flow

Moves the agent installer build/upload steps and version-bump flow into the
build-system skill (out of CLAUDE.md, which now points here from the guardrails).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The cloud function moved metrics_history from daily (YYYY-MM-DD) to hourly (YYYY-MM-DD-HH) buckets and stopped writing the daily bucket, but the sparkline reader still queried only today's daily bucket — so the dashboard card background graphs went blank on prod (the detail-panel reader already read both shapes). Read the current + previous hour buckets plus today's daily fallback via a single documentId()-in query listener (one listener per machine, same as before), merge/dedupe by timestamp, keep the last 60 samples, and re-subscribe on the UTC hour boundary so an open tab doesn't freeze on a stale bucket. Centralize the daily/hourly bucket-id formatters + regexes in lib/metricsHistoryBuckets.ts (shared by both readers) so the writer/reader contract can't drift per-file again. Add a regression test covering the hourly-only path the e2e fixtures never exercised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
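
The bucket-id contract and merge step from the sparkline commit above can be sketched as follows. The function names and the sample shape are hypothetical (the real helpers live in lib/metricsHistoryBuckets.ts and may differ), and UTC-based ids are an assumption drawn from the "UTC hour boundary" wording.

```typescript
const pad = (n: number): string => (n < 10 ? "0" : "") + n;

// Daily bucket id: YYYY-MM-DD (assumed to be UTC-based).
function dailyBucketId(d: Date): string {
  return `${d.getUTCFullYear()}-${pad(d.getUTCMonth() + 1)}-${pad(d.getUTCDate())}`;
}

// Hourly bucket id: YYYY-MM-DD-HH.
function hourlyBucketId(d: Date): string {
  return `${dailyBucketId(d)}-${pad(d.getUTCHours())}`;
}

// Ids the reader asks for: current hour, previous hour, today's daily fallback.
function sparklineBucketIds(now: Date): string[] {
  const previousHour = new Date(now.getTime() - 60 * 60 * 1000);
  return [hourlyBucketId(now), hourlyBucketId(previousHour), dailyBucketId(now)];
}

// Hypothetical sample shape.
interface Sample {
  timestamp: number;
  cpu: number;
}

// Merge buckets, dedupe by timestamp (later buckets win), keep the newest `keep`.
function mergeSamples(buckets: Sample[][], keep: number = 60): Sample[] {
  const byTimestamp = new Map<number, Sample>();
  buckets.forEach((bucket) =>
    bucket.forEach((sample) => byTimestamp.set(sample.timestamp, sample)),
  );
  const merged: Sample[] = [];
  byTimestamp.forEach((sample) => merged.push(sample));
  merged.sort((a, b) => a.timestamp - b.timestamp);
  return merged.slice(-keep);
}
```

Note how the previous-hour id can fall on the prior calendar day just after midnight UTC, which is why it is derived from a shifted Date rather than by decrementing the hour digits.
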
…e gap

Make the dashboard + demo list view match the card view's container
surfaces, and fix a regression that pushed the machines list down.

- list container: border-border (full) -> /60, bg-card -> bg-card-sunken,
  rounded-lg -> rounded-xl, so its border + corners render identical to the
  shadcn Card. The dashboard renders its list view inline, so earlier
  MachineListView container edits never applied — that inline full-opacity
  border was the "brighter list border".
- add --card-header token (color-mix of --card-sunken + --background, a
  single :root definition that re-resolves per theme, so no .dark override
  to fall back to a light value) and apply it to the machine card header and
  the list table header so both read a gentle step darker than content.
- soften the list table-header row border from full opacity to /60.
- list process rows now sit in one raised bg-card enclosure on a sunken tray
  (mirrors the card view) instead of floating bordered cards; machine rows
  bg-card -> bg-card-sunken.
- drop the border on the expand/view-toggle pill (keeps its sunken fill).
- metrics detail panel no longer reserves an empty slide above the machines
  on load: an empty/stale persisted selection stays collapsed, while a live
  "clear all" still keeps an open panel open (panelOpenedThisSessionRef).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Grant MACHINE_REMOVE to SITE_ADMIN_CAPABILITIES. It is already site-scoped, so admins can remove machines only on sites they are assigned to; superadmins anywhere. Flip the two tests that encoded the old superadmin-only rule and update the removeMachine docstring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
clearLogs gains an index-free date-window path (timestamp range server-side + in-memory action/machine/level match, cursor-paginated); the no-date path is unchanged. DELETE /api/sites/{siteId}/logs accepts since/until; OpenAPI + unit tests cover it. Fixes the footgun where a date-filtered view still cleared all logs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Set color-scheme: light on :root and dark on .dark (and directly on native temporal inputs) so the browser renders date pickers, their calendar icon, selects, and scrollbars in the active theme instead of always-light.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add react-day-picker + shadcn Calendar and a reusable DatePicker you can type into (canonical yyyy-mm-dd, with tolerant parsing) or pick from a Calendar popover — replacing the unthemeable native <input type=date>. ConfirmDialog gains an optional children slot.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…igned-column table

Swap native date inputs in the clear dialog + custom-range filter for the themed DatePicker; the clear dialog now has its own from/to window wired to since/until. Rebuild rows as an aligned grid (level, time, event, machine, process, details) with a header, compact relative time (absolute on hover/expand), details as the truncating flex column, and a card-header/card-sunken/card surface cascade matching the dashboard.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drop the metrics + display panel cards to bg-card-sunken and put the chart/canvas/table content on bg-card so graphs and tables stand out for readability (matches the dashboard card-sunken body / card content cascade).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sibling of the screenshots harness: playwright.videos.config.ts + e2e/videos/ drive the dashboard at 1080p against the seeded demo fleet and record per-scene .webm via recordScene (fake cursor, narrate dwell, typewriter typing), reusing the screenshots fixtures. Includes the dashboard-tour example scene and npm run videos / videos:debug.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
13 dual-track episode scripts (GUI/installer focus; excludes API/CLI/SDK), the ElevenLabs voiceover generator (voiceover/generate.py), the pywinauto native-capture harness for installer/GUI/tray, production docs (README/outline/script-format), and the codex+claude accuracy review records. Workspace under dev/video-tutorials/.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
MachineListView was never rendered — the dashboard and demo both inline
their own list container and only import MachineRow, MachineTableHeader,
and MemoizedTableHeader from this file. That dead duplication is what hid
the list-vs-card border mismatch (container edits here never reached the
rendered dashboard). Drops the unused function + its props interface and
the now-unused imports (useMemo, Table, TableBody, unionIds, useDevicePrefs);
the exported row/header components and types are unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
GitHub force-migrates Node 20 actions to Node 24 on 2026-06-02 (flagged by
the e2e run's deprecation annotation). Bump across all workflows + the
composite action: actions/checkout@v4 -> v6, setup-node@v4 -> v6,
setup-java@v4 -> v5, actions/cache@v4 -> v5 (latest majors, all Node 24).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The aligned-column logs redesign (4f46eb8) showed the details preview in every row unconditionally. When a row is expanded the full details also render below, so the message text appeared twice — visually redundant, and ambiguous for tests asserting on the message. Restore the pre-redesign behaviour: the truncated preview (and its screenshot indicator) shows only while collapsed; the expanded section is the single source of the full text.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
dylanroscover and others added 13 commits May 25, 2026 00:52
Privileged tier-3 Cortex tools (run_powershell, execute_script, reboot, etc.)
now pause for explicit in-chat approval before running, via the AI SDK v6 native
approval API (needsApproval + addToolApprovalResponse + sendAutomaticallyWhen).
Tier 1/2 keep auto-running; autonomous Cortex (separate buildAutonomousTools) is
intentionally not gated.

Server:
- buildExecutableTools marks tier-3 tools needsApproval
- per-site flag sites/{siteId}/settings/cortex.requireTier3Approval (default on)
  via getCortexRequireTier3Approval + setCortexRequireTier3Approval action +
  admin-gated PATCH /api/sites/{siteId}/cortex-settings
- /api/cortex migrated to the UIMessage protocol + convertToModelMessages so the
  approval round-trip carries the pending tool call + decision back to resume;
  local Cortex is skipped when approval is required so the gate can fire
  (mirrored in cortexStream.server.ts)

Client/UI:
- approval card (approve/deny) in ToolCallCard/ChatWindow; output-error renders
  as a failed card instead of green success
- per-site admin approval toggle (CortexApprovalToggle) + read hook

Sidebar/UX (from a 10-agent review):
- hoist CortexChatView into a persistent app/cortex/layout.tsx so navigating
  /cortex <-> /cortex/{id} no longer remounts and resets sidebar/collapse state
  or drops the optimistic new-chat row; routing refs reworked accordingly
- persist sidebar open + per-category collapse to devicePrefs (useCortexSidebarPrefs)
- active conversation auto-expands its group + scrolls into view; collapsed group
  holding the active chat is flagged; collapse-all label/action desync fixed
- conversation rows made keyboard/SR accessible; button hover fixes; hexagon
  neuron loader with reduced-motion fallback

Tests added for tier-3 needsApproval mapping, the approval-flag default/fail-safe,
and the setCortexRequireTier3Approval action. Lint, tsc, full jest suite, and
production build all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The date-scoped "clear logs" built its since/until window from
new Date(d.getFullYear(), ...), i.e. the browser's local timezone, while
logs are displayed in the site/display timezone (formatSiteScopedTimestamp).
A cross-timezone admin clearing "May 25" on a remote site would over-delete
into the adjacent day and miss part of the intended day — a destructive
correctness bug in Owlette's core cross-TZ remote-management scenario.

Add zonedTimeToUtcMs (timeUtils) — calendar components in an IANA zone →
UTC ms via offset correction (DST- and half-hour-offset-correct, independent
of the browser TZ) — and resolve the clear bounds in the same timezone the
rows render in. Day-end stays 23:59:59.999 inclusive to match the server's
`timestamp <= until`. Adds a TZ-independent unit test for the helper.

Found by the dev→prod readiness review (P2, blocking).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
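
One common way to do the offset correction the commit above describes is to ask Intl.DateTimeFormat what wall-clock time a given instant shows in the target zone, and subtract. The sketch below is an illustration of that technique under assumed names and signatures; it is not the actual zonedTimeToUtcMs from timeUtils.

```typescript
// Offset (zone wall clock minus UTC) in ms at a given instant, for an IANA zone.
function offsetAtMs(utcMs: number, timeZone: string): number {
  const formatter = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour12: false,
    year: "numeric",
    month: "numeric",
    day: "numeric",
    hour: "numeric",
    minute: "numeric",
    second: "numeric",
  });
  const fields: Record<string, number> = {};
  formatter.formatToParts(new Date(utcMs)).forEach((part) => {
    if (part.type !== "literal") fields[part.type] = Number(part.value);
  });
  // Some engines print midnight as hour 24; normalise it to 0.
  const wallAsUtc = Date.UTC(
    fields["year"],
    fields["month"] - 1,
    fields["day"],
    fields["hour"] % 24,
    fields["minute"],
    fields["second"],
  );
  // The formatter has second resolution, so compare against whole seconds.
  return wallAsUtc - Math.floor(utcMs / 1000) * 1000;
}

// Hypothetical signature: calendar components in `timeZone` to UTC milliseconds.
function zonedToUtcMs(
  year: number, month: number, day: number,
  hour: number, minute: number, second: number, ms: number,
  timeZone: string,
): number {
  const wall = Date.UTC(year, month - 1, day, hour, minute, second, ms);
  // First pass: treat the wall time as if it were UTC to estimate the offset.
  const firstGuess = wall - offsetAtMs(wall, timeZone);
  // Second pass: re-read the offset at the estimated instant, which corrects
  // a first guess that landed on the other side of a DST transition.
  return wall - offsetAtMs(firstGuess, timeZone);
}

// Inclusive day window, matching the server's `timestamp <= until`.
function dayBounds(year: number, month: number, day: number, timeZone: string) {
  return {
    since: zonedToUtcMs(year, month, day, 0, 0, 0, 0, timeZone),
    until: zonedToUtcMs(year, month, day, 23, 59, 59, 999, timeZone),
  };
}
```

Because nothing here reads the browser's local timezone, the same bounds come out regardless of where the admin is sitting, which is the property the fix needs.
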
The versions POST returned `noop` when republishing bytes already at the
current head — but it returned *before* applying the supplied name/targets/
extractPath, contradicting the code's own "each deploy is an explicit
re-statement of intent." Re-deploying identical content to a different
machine set silently dropped the target change.

In the same-head branch, apply any *explicitly-provided* deploy fields via a
merge write to the roost doc (no versionCounter bump, no version-doc rewrite,
CAS/immutability preserved). Omitted fields are untouched, so a genuine no-op
still writes nothing — the existing no-op test is unchanged. Adds a
regression test for the restate-targets-at-head path.

Found by the dev→prod readiness review (P4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
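
The "explicitly-provided fields only" rule in the commit above can be illustrated with a small patch builder. The interface and function name are hypothetical; the three field names and the rule that omitted fields stay untouched come from the commit message.

```typescript
// Hypothetical shape of the deploy fields a publish request may restate.
interface DeployFields {
  name?: string;
  targets?: string[];
  extractPath?: string;
}

// Build a merge patch from only the fields the caller actually sent.
// Returns null when nothing was provided, so a genuine no-op writes nothing.
function explicitConfigPatch(body: DeployFields): DeployFields | null {
  const patch: DeployFields = {};
  if (body.name !== undefined) patch.name = body.name;
  if (body.targets !== undefined) patch.targets = body.targets;
  if (body.extractPath !== undefined) patch.extractPath = body.extractPath;
  return Object.keys(patch).length > 0 ? patch : null;
}
```

A null result maps to the unchanged no-op path, while a non-null patch would be applied as a merge write to the roost doc without touching the version counter.
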
… buttons

The keyboard-accessibility change in 2b9f992 put role="button" + tabIndex on
the conversation-row <div>, but that row contains the rename/delete <button>s
— interactive controls nested inside an interactive control, which axe flags
as a serious `nested-interactive` violation and failed the cortex a11y gate.

Restructure: the row is a plain container; its open-conversation control is a
real <button> (native keyboard + SR support, focus-visible ring) with the
rename/delete buttons as SIBLINGS, not nested. Keyboard access is preserved
and axe is satisfied. Visually unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
buildLogsQuery dropped orderBy('timestamp') whenever a non-date filter
(action/machine/level) was active, so Firestore returned a random
__name__-ordered slice of matching logs rather than the most recent
ones (log doc IDs are random UUIDs). Filtering by machine surfaced
arbitrary old events with mismatched warning/error counts — e.g. a
single-machine view showing more warnings/errors than the unfiltered
view.

Always order by timestamp desc and resolve every filter + date range
server-side, backed by four new level composite indexes (level,
action+level, machineId+level, action+machineId+level). Drop the
applyClientScope re-sort hack and refactor loadMore to reuse
buildLogsQuery so pagination can no longer drift from the initial query.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The account settings → api tab posted to /api/account/api-keys, which is
gated superadmin-only via the GLOBAL_SETTINGS_WRITE capability, so a
non-superadmin got a 403 ("superadmin access required") when creating a key.
Repoint the tab to the user-scoped /api/keys endpoints (create/list/revoke)
and add an explicit scope-preset selector (default publisher) so the key's
scope is visible instead of a hidden full-access grant. Error handling now
reads the RFC-7807 `detail` field these routes emit, and a name is trimmed
before the fallback so a whitespace-only name no longer 400s.

Also centralize the scope-preset key list + descriptions in apiKeyTypes.ts so
AccountSettingsDialog and CreateKeyDialog share one source of truth (removes
duplicated copy that had already drifted in wording).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The read/write throughput stack in the dashboard list view sat immediately after the usage/capacity text, so its horizontal position drifted with the capacity-string width and never lined up across machine rows. Anchor it to the cell's right edge with ml-auto; the r/w values stay left-aligned within their own container.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…creenshots

Three review-found gaps in the tier-3 approval / chat work:

- The per-site approval flag only gated local-Cortex *routing*; needsApproval
  was unconditional, so "approval off" still prompted on the server-side
  fallback and never consulted the flag at all in site-wide mode. Thread
  requireTier3Approval into buildExecutableTools (needsApproval = tier>=3 &&
  flag, default-true) and read it on every server-side/site-wide build site.
- /api/cortex converted UIMessages to ModelMessages before tools were built,
  dropping capture_screenshot's toModelOutput image projection on follow-up
  turns; build tools first, then convertToModelMessages(messages, { tools }).
- capture_screenshot's toModelOutput only read a top-level url, so site-wide
  aggregated output ({ machines: [...] }) never projected per-machine images —
  now each machine's screenshot is projected.

Autonomous Cortex (separate buildAutonomousTools) and tier-1/2 callers are
unaffected. Adds unit tests for the toggle-off gate and the screenshot
projection (single + site-wide + no-url fallback).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
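
The gate described in the commit above reduces to a small predicate. The function name and parameter types are assumptions; the rule itself (tier >= 3 and the per-site flag, defaulting to on) is taken from the commit message.

```typescript
// Hypothetical helper: decides whether a tool call pauses for approval.
function toolNeedsApproval(
  tier: number,
  requireTier3Approval: boolean | undefined,
): boolean {
  // Fail-safe default: a missing flag is treated as "approval required".
  const flag = requireTier3Approval ?? true;
  return tier >= 3 && flag;
}
```

Under this sketch, tier 1/2 tools never prompt, a tier-3 tool prompts when the flag is on or unset, and turning the flag off disables the prompt on every build path that consults it.
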
A same-version-at-head republish that restates name/targets/extractPath applies
them but returned outcome 'noop' and emitted no audit. Return configApplied and
emit roost_mutated (verb config_update) when config was actually written; a true
no-op still emits nothing.

Review-found follow-up: that config write skipped the expectedCurrentVersionId
CAS check (it returned before the optimistic-concurrency guard the promote/
create paths enforce). Now the config-only write also rejects a stale expected
head with 412. Adds tests for the config_update emit and the CAS rejection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The py-sdk publish workflow built, twine-checked, and published on tag without
ever running the pytest suite (sdks/python defines pytest dev deps + testpaths).
Add a Run tests step (pip install -e ".[dev]" + python -m pytest) after Setup
Python, before Build, so a failing suite blocks publish on every path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
useUserManagement opened an onSnapshot over the entire users collection
unconditionally. firestore.rules correctly restricts that collection to
superadmins, so every non-superadmin tripped permission-denied on page
load. The hook is mounted via ManageSitesDialog, which renders on the
dashboard, roosts, logs, and deployments pages, so the denial fired
across all four (Sentry OWLETTE-WEB-3R captured it on /dashboard).

Gate the listener behind a required `enabled` param and pass
Boolean(isSuperadmin) from the callers (ManageSitesDialog, admin/users).
Rules are unchanged - loosening them would leak every user's email,
role, and site assignments. Drop the e2e console-noise whitelist that
had been masking this exact error so it cannot silently regress.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
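
Stripped of React and Firestore, the `enabled` gate from the commit above amounts to never opening the listener when the caller is not allowed to read the collection. The types and function name below are hypothetical stand-ins for the hook and onSnapshot.

```typescript
type Unsubscribe = () => void;

// Stand-in for the real snapshot listener (injected so the sketch is testable).
type SubscribeUsers = (onUsers: (uids: string[]) => void) => Unsubscribe;

function subscribeUsersIfEnabled(
  enabled: boolean,
  subscribe: SubscribeUsers,
  onUsers: (uids: string[]) => void,
): Unsubscribe {
  // Not a superadmin: skip the listener entirely so the rules are never
  // tripped, and hand back a no-op cleanup.
  if (!enabled) {
    return () => {};
  }
  return subscribe(onUsers);
}
```

Making `enabled` a required parameter, as the commit does, forces every caller to decide explicitly instead of inheriting an unsafe default.
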
… from minting api keys

Three calibrated follow-ups from the OWLETTE-WEB-3R review (none are page-load
auto-fire; all are client-interaction or defense-in-depth):

- CreateSiteDialog: a client read of sites/{id} is denied both when the site
  does not exist (available) and when it is owned by another user (taken), so
  the dialog cannot distinguish them. Keep the optimistic-available behavior
  (POST /api/sites is authoritative and returns 409 on a real collision) but
  stop console.error-ing the expected permission-denied, which was Sentry noise
  on every check of a not-yet-existing id. No global existence-check endpoint is
  added on purpose: it would enable site-id enumeration the rules prevent.
- useCortex: permission-denied when opening a /cortex/{id} URL for a chat the
  user does not own is expected; surface as not_found without logging noise.
- POST /api/keys: reject role='agent' ID tokens via an opt-in rejectAgentTokens
  flag on requireSessionOrIdToken (default off, so other routes are unaffected).
  Agents authenticate for site/machine ops and must never mint user-scoped keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ed web changes

- openapi.yaml: add GET /api/users/deletions (superadmin audit feed: limit
  query param, deletions[] response, user:read / session-superadmin security),
  closing the route-coverage validator warning.
- changelog: add an [Unreleased] section to both docs/changelog.md and the
  fumadocs-rendered web/content/docs/changelog.mdx capturing the web/dashboard
  batch shipped to prod since 2.12.3. No VERSION bump — the version number
  tracks the agent and this batch has no agent changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| owlette | Ignored | Ignored | May 25, 2026 10:47pm |

@dylanroscover dylanroscover merged commit 72dfd29 into main May 25, 2026
8 checks passed