chore: bump the all-minor-patch group with 4 updates #68

Workflow file for this run

	name: Claude Code

	on:
	issue_comment:
	types: [created]
	issues:
	types: [opened, assigned]
	pull_request_review_comment:
	types: [created]

	jobs:
	claude:
	if: |
	(
	(github.event_name == 'issue_comment' || github.event_name == 'pull_request_review_comment') &&
	contains(github.event.comment.body, '@frontend-claude') &&
	contains(fromJson('["OWNER", "MEMBER", "COLLABORATOR"]'), github.event.comment.author_association)
	) ||
	(
	github.event_name == 'issues' &&
	(contains(github.event.issue.body, '@frontend-claude') || contains(github.event.issue.title, '@frontend-claude')) &&
	contains(fromJson('["OWNER", "MEMBER", "COLLABORATOR"]'), github.event.issue.author_association)
	)
	runs-on: ubuntu-latest
	env:
	GITHUB_TOKEN: ${{ secrets.PAT }}
	BLOB_READ_WRITE_TOKEN: ${{ secrets.CLAUDE_BLOB_READ_WRITE_TOKEN }}
	VERCEL_GIT_COMMIT_REF: claude/${{ github.ref_name }}
	permissions:
	contents: write
	pull-requests: write
	issues: write
	actions: read

	steps:
	- name: Harden runner
	uses: step-security/harden-runner@fa2e9d605c4eeb9fcad4c99c224cee0c6c7f3594 # v2.16.0
	with:
	egress-policy: audit

	- name: Checkout repository
	uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
	with:
	fetch-depth: 0
	token: ${{ secrets.PAT }}

	- name: Setup pnpm
	uses: pnpm/action-setup@a15d269cd4658e1107c09f1fabf4cbd7bd1f308a # v4.4.0

	- name: Setup Node.js
	uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0
	with:
	node-version: '24'
	cache: 'pnpm'

	- name: Cache Playwright browsers
	id: playwright-cache
	uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
	with:
	path: ~/.cache/ms-playwright
	key: playwright-${{ runner.os }}-${{ hashFiles('pnpm-lock.yaml') }}
	restore-keys: |
	playwright-${{ runner.os }}-

	- name: Install dependencies
	run: pnpm install --frozen-lockfile

	- name: Install Playwright browsers and dependencies
	if: steps.playwright-cache.outputs.cache-hit != 'true'
	run: npx -y playwright install --with-deps chromium

	- name: Install Playwright system dependencies
	if: steps.playwright-cache.outputs.cache-hit == 'true'
	run: npx -y playwright install-deps chromium

	- name: Cache Cypress binary
	id: cypress-cache
	uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
	with:
	path: ~/.cache/Cypress
	key: cypress-${{ runner.os }}-${{ hashFiles('pnpm-lock.yaml') }}
	restore-keys: |
	cypress-${{ runner.os }}-

	- name: Install Cypress binary
	if: steps.cypress-cache.outputs.cache-hit != 'true'
	run: pnpm --filter @semianalysisai/inferencex-app exec cypress install

	- name: Start dev server
	id: devserver
	continue-on-error: true
	run: |
	set -euo pipefail

	LOG=/tmp/next-dev.log
	echo "log=$LOG" >> "$GITHUB_OUTPUT"

	pnpm run dev -- --hostname 0.0.0.0 --port 3000 > "$LOG" 2>&1 &
	DEV_PID=$!
	echo "pid=$DEV_PID" >> "$GITHUB_OUTPUT"

	for i in {1..60}; do
	if curl -sSf http://localhost:3000 >/dev/null; then
	echo "Dev server is up"
	echo "up=true" >> "$GITHUB_OUTPUT"
	exit 0
	fi

	# If process died, stop waiting early
	if ! kill -0 "$DEV_PID" 2>/dev/null; then
	echo "Dev server process exited early"
	break
	fi

	sleep 2
	done

	echo "Dev server failed to start (best effort; continuing)."
	echo "up=false" >> "$GITHUB_OUTPUT"
	tail -n 200 "$LOG" || true

	# Avoid leaving a stuck process around holding the port
	kill "$DEV_PID" 2>/dev/null || true

	exit 0

	- name: Run Claude Code
	id: claude
	if: ${{ always() }}
	uses: anthropics/claude-code-action@cd77b50d2b0808657f8e6774085c8bf54484351c # v1.0.72
	env:
	GH_TOKEN: ${{ secrets.PAT }}
	GITHUB_TOKEN: ${{ secrets.PAT }}
	BASH_DEFAULT_TIMEOUT_MS: '1800000'
	BASH_MAX_TIMEOUT_MS: '3600000'

	DEV_SERVER_UP: ${{ steps.devserver.outputs.up }}
	DEV_SERVER_PID: ${{ steps.devserver.outputs.pid }}
	DEV_SERVER_LOG: ${{ steps.devserver.outputs.log }}
	with:
	anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
	github_token: ${{ secrets.PAT }}
	trigger_phrase: '@frontend-claude'
	track_progress: true
	allowed_bots: ''
	settings: |
	{"fastMode": true}

	additional_permissions: |
	actions: read

	claude_args: |
	--model ${{ contains(github.event.comment.body || github.event.issue.body || '', '@claude sonnet') && 'claude-sonnet-4-5-20250929' || contains(github.event.comment.body || github.event.issue.body || '', '@claude haiku') && 'claude-haiku-4-5-20251001' || 'claude-opus-4-6' }}
	--mcp-config '{"mcpServers":{"fetch":{"command":"npx","args":["-y","@anthropic-ai/mcp-server-fetch@latest"]},"playwright":{"command":"npx","args":["-y","@playwright/mcp@latest","--headless","--caps=vision"]}}}'
	--allowedTools "Write,Edit,Read,Glob,Grep,WebFetch,mcp__github__,mcp__github_inline_comment__create_inline_comment,mcp__github_ci__,mcp__fetch__,mcp__playwright__,Bash"

	prompt: |
	You are a Frontend agent for the InferenceX dashboard.

	You can use:
	- Playwright MCP server (name: "playwright") for DOM interactions, screenshots, and coordinate-based interactions (mouse wheel + drag) needed for D3 zoom/pan.
	- The app should run locally at http://localhost:3000.
	- For non-localhost URLs (documentation, external sites), you can use WebFetch.

	Whenever you push a commit to a PR, a Vercel deployment is triggered automatically.

	## Non-negotiables (Definition of Done)
	- Do not mark a task complete until you have verified it via Playwright MCP in a real browser session.
	- If any inference graph shows either:
	- "No data available"
	- "Please change the model, sequence, precision, date range or GPU"
	then the task is NOT complete. Continue debugging until real data points render.

	## Dev server status (best effort)
	A best-effort dev server start was attempted before you ran:
	- DEV_SERVER_UP=${DEV_SERVER_UP:-unknown}
	- DEV_SERVER_LOG=${DEV_SERVER_LOG:-/tmp/next-dev.log}
	- DEV_SERVER_PID=${DEV_SERVER_PID:-unknown}

	If DEV_SERVER_UP is not "true" (or http://localhost:3000 is unreachable):
	1) Inspect the log: `tail -n 200 "$DEV_SERVER_LOG"`
	2) Fix the underlying issue in the repo.
	3) Restart the dev server in the background, then re-check:
	`pnpm run dev -- --hostname 0.0.0.0 --port 3000 > /tmp/next-dev.log 2>&1 &`
	`curl -sSf http://localhost:3000 >/dev/null`
	Only then proceed with Playwright verification.

	## 0) Grounding checklist (DO THIS FIRST, ONCE)
	This prompt is a guide, not ground truth. Before coding:
	1. Verify the repo tree and key paths exist:
	- `ls -la`
	- `find packages/app/src packages/app/public packages/app/cypress -maxdepth 5 -type f | sort | sed -n '1,160p'`
	- If any referenced file does not exist, locate the real file via `rg`/`find` and follow the actual codebase.
	2. Identify the exact change surface:
	- Start at `packages/app/src/app/page.tsx` (what renders).
	- Then follow into chart contexts → hooks → chart UI components.
	3. If behavior differs between local and Vercel, prefer the Vercel preview as the final verification target.

	## 1) System overview (how components connect)
	InferenceX = DB-backed API + runtime chart rendering.

	This is a pnpm workspaces monorepo. The Next.js app lives at `packages/app/`.
	All app source paths are under `packages/app/src/`. Run commands from repo root (they delegate via workspace).

	Data pipeline: Neon PostgreSQL → API routes (`/api/v1/*`) → React Query hooks (`src/hooks/api/`) → Context providers → D3 charts.
	- DB layer: `packages/db/` (schema, migrations, ETL, queries)
	- API routes: `packages/app/src/app/api/v1/` (benchmarks, availability, workflow-info, reliability, evaluations, github-stars, invalidate)
	- React Query hooks: `packages/app/src/hooks/api/` (use-benchmarks, use-availability, use-workflow-info, etc.)
	- UI state lives in per-section Context providers (InferenceChartContext, EvaluationChartContext, ReliabilityChartContext), rendering in D3 components.

	Debug rule of thumb:
	- If the chart has no points → first check API response (browser Network tab or curl /api/v1/benchmarks) → then check filters in context → then check data transformations in hooks.

	## 2) Repo map (key paths + what they do)
	NOTE: Confirm these exist; if the repo differs, use the actual tree and treat this as a hint.
	All paths are under `packages/app/`. The `@/*` alias maps to `packages/app/src/`.

	```
	packages/app/src/app/
	├── page.tsx # Main page: tab router
	├── layout.tsx # Root layout: theme provider, global UI shell
	├── api/cron/route.ts # Cron endpoint: validates CRON_SECRET, triggers VERCEL_DEPLOY_HOOK_URL
	└── api/v1/ # API routes (all DB-backed)
	├── benchmarks/route.ts # Benchmark data by model+date
	├── benchmarks/history/route.ts # Historical benchmark trends
	├── availability/route.ts # Model availability dates
	├── workflow-info/route.ts # Workflow runs, changelogs, configs
	├── reliability/route.ts # Reliability data
	├── evaluations/route.ts # Evaluation data
	├── github-stars/route.ts # GitHub star count
	├── invalidate/route.ts # Cache invalidation (admin)
	└── server-log/route.ts # Client error logging

	packages/app/src/hooks/api/ # React Query hooks (data fetching layer)
	├── use-benchmarks.ts
	├── use-benchmark-history.ts
	├── use-availability.ts
	├── use-workflow-info.ts
	├── use-evaluations.ts
	├── use-reliability.ts
	└── use-github-stars.ts

	packages/app/src/components/
	├── page-content.tsx # Tab layout: VALID_TABS, desktop TabsTrigger + mobile Select
	├── header/
	│ ├── header.tsx # Top nav + layout
	│ └── GithubStars.tsx # GitHub stars widget
	├── inference/
	│ ├── InferenceChartContext.tsx # Source-of-truth state: filters, metric, visible GPUs, overlays
	│ ├── hooks/
	│ │ └── useChartData.ts # Transforms API data, applies filters
	│ └── ui/
	│ ├── ChartControls.tsx # Filters/selectors UI; user input → context state
	│ ├── ChartDisplay.tsx # Layout + error/fallback boundaries
	│ ├── ScatterGraph.tsx # Main D3 scatter plot; zoom/pan, axes, rooflines, tooltips
	│ └── GPUGraph.tsx # Alternate D3 view (GPU-focused)
	├── evaluation/
	│ └── EvaluationChartContext.tsx # Evaluation tab state
	├── reliability/
	│ └── ReliabilityChartContext.tsx # Reliability tab state
	└── ui/ # Design system wrappers (Radix + Tailwind)
	├── d3-chart-wrapper.tsx # Shared D3 container: SVG ref, resize, tooltip portal
	└── theme-provider.tsx # Theme toggling

	packages/app/src/hooks/
	├── useStickyTooltip.ts # Tooltip pin/dismiss state management
	└── useChartTooltipHandlers.ts # Mouse/touch → tooltip event wiring

	packages/app/src/lib/
	├── constants.ts # HARDWARE_CONFIG, GPU_COLOR_FAMILIES, MODEL_ORDER, etc.
	├── data-mappings.ts # MODEL_OPTIONS/SEQUENCE_OPTIONS/PRECISION_OPTIONS
	├── chart-utils.ts # Y_AXIS_METRICS, roofline calculations, metric math
	├── api.ts # Thin fetch wrapper for API routes
	├── api-cache.ts # Server-side API response caching (Vercel Blob)
	├── blob-cache.ts # Vercel Blob read/write for cache layer
	└── d3-chart/ # Shared D3 library
	├── chart-setup.ts # SVG skeleton, axes groups, defs, clip paths
	├── chart-update.ts # Data join, bindpoints, zoom setup
	├── watermark.ts # Chart watermark rendering
	└── layers/ # Rendering layers: points, bars, lines, rooflines, etc.

	packages/db/ # DB layer (@semianalysisai/inferencex-db)
	└── src/ # Schema, migrations, ETL, queries

	packages/constants/ # Shared constants (@semianalysisai/inferencex-constants)
	└── src/ # GPU keys, model mappings
	```

	Other high-signal repo files to consult:
	- `packages/app/package.json` (scripts)
	- `packages/app/next.config.*` (Next settings)
	- `packages/app/src/app/globals.css` (GPU CSS variables + global styles)
	- `packages/app/src/components/inference/inference-chart-config.json` (metric definitions, Pareto directions)

	## Reference docs
	The `docs/` directory contains detailed guides. Always consult these before making changes:
	- `docs/index.md`: index of all docs. ALWAYS read it first and follow any entries relevant to the task.

	## 3) Common tasks (where to change what)
	- Chart appearance / D3 behavior:
	- `packages/app/src/components/inference/ui/ScatterGraph.tsx`
	- `packages/app/src/components/inference/ui/GPUGraph.tsx`
	- Layout/error UI: `packages/app/src/components/inference/ui/ChartDisplay.tsx`
	- Shared D3 library: `packages/app/src/lib/d3-chart/`

	- Chart state & filters:
	- Add/change state: `packages/app/src/components/inference/InferenceChartContext.tsx`
	- Wire UI controls: `packages/app/src/components/inference/ui/ChartControls.tsx`
	- Apply filter logic / normalization: `packages/app/src/components/inference/hooks/useChartData.ts`

	- Add/modify a metric:
	- Register in `packages/app/src/lib/chart-utils.ts`: Y_AXIS_METRICS, roofline calculations
	- Add chart config: `packages/app/src/components/inference/inference-chart-config.json`
	- Expose/select metric in UI state: `InferenceChartContext.tsx`
	- Use it in charts: `ScatterGraph.tsx` / `GPUGraph.tsx`

	- Data pipeline changes:
	- DB schema/ETL: `packages/db/src/`
	- API routes: `packages/app/src/app/api/v1/`
	- React Query hooks: `packages/app/src/hooks/api/`

	- Add a new GPU:
	- `packages/app/src/lib/constants.ts`: HARDWARE_CONFIG + ordering + color family
	- `packages/app/src/app/globals.css`: add `--gpu-name` color variable

	- Add a new model:
	- `packages/app/src/lib/data-mappings.ts`: enum/options/order

	## 4) Playbooks & pitfalls (battle-tested)
	### A) Schema evolution
	When adding new metric fields:
	- Make new TS fields OPTIONAL in types.
	- Register in `chart-utils.ts` (Y_AXIS_METRICS, roofline calculations).
	- Add runtime computation fallback for historical data missing the field in `useChartData.ts`.
	- Watch for silent failures: checks like `metricKey in filteredData[0]` can cause "No data available".
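
	A minimal sketch of the optional-field pattern above. The `BenchmarkRow` type, the `tokensPerJoule` metric, and both helper names are hypothetical and do not come from the repo; adapt them to the real types before use.

	```typescript
	// Hypothetical row shape; the real types in the repo may differ.
	interface BenchmarkRow {
	  throughput: number; // tokens per second
	  powerWatts: number; // watts
	  tokensPerJoule?: number; // new metric, OPTIONAL so historical rows still type-check
	}

	// Return the stored value when present, otherwise compute it at runtime.
	function getTokensPerJoule(row: BenchmarkRow): number | undefined {
	  if (row.tokensPerJoule !== undefined) return row.tokensPerJoule;
	  if (row.powerWatts > 0) return row.throughput / row.powerWatts;
	  return undefined;
	}

	// Check every row instead of only the first one before reporting "no data".
	function hasMetric(rows: BenchmarkRow[]): boolean {
	  return rows.some((row) => getTokensPerJoule(row) !== undefined);
	}
	```

	Checking all rows avoids the silent failure where the first (historical) row lacks the field but later rows have it.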

	### B) Empty responses and error handling
	- `{}` is truthy; check `Object.keys(obj).length > 0`.
	- API routes may return empty arrays for dates with no data; handle gracefully.
	- React Query hooks handle loading/error states; check `isLoading`/`error` before accessing `data`.
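
	A small sketch of the emptiness check above. `hasEntries` is a hypothetical helper name, not an existing function in the repo.

	```typescript
	// True only when the value actually contains entries.
	// An empty object or empty array is truthy in JavaScript, so a plain `if (value)` is not enough.
	function hasEntries(value: unknown): boolean {
	  if (value === null || value === undefined) return false;
	  if (Array.isArray(value)) return value.length > 0;
	  if (typeof value === "object") return Object.keys(value).length > 0;
	  return false;
	}
	```

	A caller could guard rendering with `hasEntries(data)` after confirming the hook is no longer loading and reported no error.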

	### C) Axis stability & legend toggles in D3
	- Compute axis domains from VISIBLE (non-hidden) data only so axes rescale to fill the chart when GPUs are toggled off. Using all points leaves large blank areas.
	- Prefer opacity transitions for hiding points/lines (no DOM removal).
	- Preserve zoom transform across re-renders (save to ref, re-apply after setup).
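
	A sketch of the visible-only domain rule above, written without D3 so it can be checked in isolation. The `Point` shape, the `hiddenGpus` set, and `visibleDomain` are assumptions made for illustration; the real chart code may track visibility differently.

	```typescript
	// Hypothetical point shape used only for this sketch.
	interface Point {
	  gpu: string;
	  x: number;
	  y: number;
	}

	// Compute [min, max] for one axis from points whose GPU is not hidden.
	function visibleDomain(
	  points: Point[],
	  hiddenGpus: ReadonlySet<string>,
	  accessor: (p: Point) => number,
	): [number, number] | undefined {
	  let min = Infinity;
	  let max = -Infinity;
	  for (const p of points) {
	    if (hiddenGpus.has(p.gpu)) continue; // hidden series must not stretch the axis
	    const v = accessor(p);
	    if (v < min) min = v;
	    if (v > max) max = v;
	  }
	  // Nothing visible: let the caller keep the previous domain.
	  return min <= max ? [min, max] : undefined;
	}
	```

	The result can be passed to a scale's domain; returning `undefined` when every series is hidden lets the caller avoid collapsing the axis.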

	### D) Smooth zoom/pan performance in D3
	If zoom/pan is laggy:
	- Keep the zoom handler critical path cheap (point transforms only).
	- Throttle expensive recalcs (grid, rooflines, paths) with `requestAnimationFrame`.
	- Cache D3 selections outside the zoom handler.
	- Cancel pending rAF on unmount/cleanup.
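
	A sketch of the rAF throttling and cleanup steps above. `createFrameThrottle` is a hypothetical helper; the scheduler is injected so the logic can be tested outside a browser.

	```typescript
	type Schedule = (cb: () => void) => number;
	type Cancel = (handle: number) => void;

	// Coalesce many requests into at most one run of `work` per scheduled frame.
	function createFrameThrottle(work: () => void, schedule: Schedule, cancel: Cancel) {
	  let handle: number | null = null;
	  return {
	    // Call from the zoom handler; repeated calls within one frame are ignored.
	    request(): void {
	      if (handle !== null) return;
	      handle = schedule(() => {
	        handle = null;
	        work();
	      });
	    },
	    // Call on unmount/cleanup so no callback fires against a removed chart.
	    cancel(): void {
	      if (handle !== null) {
	        cancel(handle);
	        handle = null;
	      }
	    },
	  };
	}
	```

	In the browser, the scheduler arguments would be `(cb) => requestAnimationFrame(() => cb())` and `(h) => cancelAnimationFrame(h)`, with `work` redrawing the grid, rooflines, and paths.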

	### E) Overlay datasets (unofficial runs, comparisons)
	- Include overlay data in axis domain calculations so it remains visible.
	- Allow zoom-out below 1x if users need to see off-domain data.
	- Visually distinguish overlay (shape/dash/watermark/legend label).

	### F) Sticky/click-to-pin tooltips
	- Use ref + state to avoid stale closures and rerender cascades.
	- During zoom/pan, dismiss pinned tooltip via rAF to avoid jank.
	- While pinned: disable hover handlers; enable `pointer-events: auto` and `user-select: text`.

	### G) Verification of D3 zoom/pan
	Use coordinate-based Playwright MCP tools when needed:
	- wheel to zoom, drag to pan, click-by-xy when accessibility targets aren't available.
	Always confirm the chart responds and no console errors appear.

	## 5) Environment variables (relevant)
	- `GITHUB_TOKEN`: GitHub API access
	- `DATABASE_READONLY_URL`: Neon PostgreSQL connection (read-only, used by API routes)
	- `DATABASE_WRITE_URL`: Neon PostgreSQL connection (admin, used by ETL/migrations)
	- `BLOB_READ_WRITE_TOKEN`: Vercel Blob access for API response caching
	- `BLOB_CACHE_PREFIX`: Prefix for cached API responses in Vercel Blob
	- `CRON_SECRET`: secures /api/cron endpoint
	- `VERCEL_DEPLOY_HOOK_URL`: triggers rebuilds

	## 6) Testing Requirements (MANDATORY)
	When implementing new features or fixing bugs, you MUST write tests:
	1. New utility functions in `packages/app/src/lib/` or `packages/app/src/scripts/` → add colocated unit test (e.g., `packages/app/src/lib/<module>.test.ts`)
	2. New UI features → add E2E tests in `packages/app/cypress/e2e/<feature>.cy.ts`
	3. Bug fixes → add a regression test that would have caught the bug
	4. Run `pnpm test:unit` to verify unit tests pass before marking the task complete
	5. Follow existing test patterns in `packages/app/src/**/*.test.ts` (Vitest) and `packages/app/cypress/e2e/` (Cypress)

	### Pre-commit checklist (MANDATORY)
	Before every commit, Claude agents MUST:
	1. Start the dev server (if not already running):
	```bash
	pnpm run dev -- --hostname 0.0.0.0 --port 3000 &
	curl --retry 10 --retry-delay 2 --retry-connrefused -sSf http://localhost:3000 >/dev/null
	```
	2. Run unit tests:
	```bash
	pnpm run test:unit
	```
	3. Run Cypress E2E tests:
	```bash
	pnpm run test:e2e
	```
	4. If any tests fail, fix the issues before committing.
	5. Only after both unit and E2E tests pass, proceed with git add, git commit, and git push.

	A task is NOT complete if:
	- Unit tests fail (`pnpm run test:unit`)
	- Cypress E2E tests fail (`pnpm run test:e2e`)
	- New code was added without corresponding tests

	## Final instruction
	Do not end a task until you have verified via Playwright MCP that the feature works and charts render real data (not error placeholders).
