Skip to content

feat: scrape UI#2140

Open
adityathebe wants to merge 3 commits into
mainfrom
feat/scrape-ui
Open

feat: scrape UI#2140
adityathebe wants to merge 3 commits into
mainfrom
feat/scrape-ui

Conversation

@adityathebe
Copy link
Copy Markdown
Member

@adityathebe adityathebe commented Apr 20, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added interactive Scrape UI web interface for viewing and analyzing scrape results, featuring configuration explorer, relationship visualization, user/group/role management, access logs, and HTTP traffic capture.
    • New ui command to launch the browser-based UI from saved JSON results files.
  • Chores

    • Updated build system and Docker configuration to compile and integrate frontend components with the application.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 20, 2026

Warning

Rate limit exceeded

@adityathebe has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 33 minutes and 20 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 33 minutes and 20 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a4cd0411-20d7-4be1-994c-740ebfdeb3b2

📥 Commits

Reviewing files that changed from the base of the PR and between ac307f4 and e6f0988.

⛔ Files ignored due to path filters (3)
  • cmd/scrapeui/frontend/dist/scrapeui.css is excluded by !**/dist/**
  • cmd/scrapeui/frontend/dist/scrapeui.js is excluded by !**/dist/**
  • cmd/scrapeui/frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (12)
  • Makefile
  • cmd/run.go
  • cmd/scrapeui/assets.go
  • cmd/scrapeui/frontend/.gitignore
  • cmd/scrapeui/frontend/package.json
  • cmd/scrapeui/frontend/src/components/AliasList.tsx
  • cmd/scrapeui/frontend/src/components/DetailPanel.tsx
  • cmd/scrapeui/frontend/src/index.tsx
  • cmd/scrapeui/frontend/src/styles.css
  • cmd/scrapeui/safe_buffer.go
  • cmd/scrapeui/server.go
  • cmd/scrapeui/types.go

Walkthrough

This PR introduces a complete web-based Scrape UI for viewing and analyzing scrape results in real-time during execution or from saved JSON files. The implementation includes a Go HTTP server backend, a TypeScript/Preact frontend application with interactive components, and integrated build tooling for both backend and frontend.

Changes

Cohort / File(s) Summary
Build System & Package Configuration
.gitignore, Makefile, build/Dockerfile, cmd/scrapeui/frontend/package.json, cmd/scrapeui/frontend/tsconfig.json, cmd/scrapeui/frontend/vite.config.ts, cmd/scrapeui/frontend/.gitignore
Added npm-based frontend build pipeline integrated into main build system. Makefile includes new scrapeui-build target; Dockerfile extends with Node.js multi-stage build; Vite configured for IIFE bundle generation and API file serving; JSON files no longer globally ignored.
Go Backend Server
cmd/scrapeui/types.go, cmd/scrapeui/server.go, cmd/scrapeui/convert.go, cmd/scrapeui/assets.go
New scrapeui package providing HTTP server, type definitions, data transformation utilities, and embedded frontend assets. Server manages scraper progress, scrape results, streaming updates, and serves both SPA shell and REST/SSE APIs for real-time and static snapshot viewing.
CLI Integration
cmd/root.go, cmd/run.go, cmd/ui.go
Register UI command; extend run command with optional --ui and --ui-port flags for live scrape visualization; add standalone ui command for viewing saved JSON results via web interface.
Frontend Entry & Core Types
cmd/scrapeui/frontend/src/index.tsx, cmd/scrapeui/frontend/src/types.ts, cmd/scrapeui/frontend/src/globals.d.ts
Frontend application root and TypeScript type definitions for scraper progress, results, external entities, relationships, snapshots, and UI state. Build-time constants declared for version/commit injection.
Frontend App & Utilities
cmd/scrapeui/frontend/src/App.tsx, cmd/scrapeui/frontend/src/utils.ts
Main App component managing tab navigation, EventSource streaming, snapshot state, and derived UI data (counts, groupings, search). Utilities provide formatting, grouping, health/status mapping, entity lookups, and global search across all result sections.
Frontend Hooks
cmd/scrapeui/frontend/src/hooks/useRoute.ts, cmd/scrapeui/frontend/src/hooks/useSort.tsx
URL route parsing/building hook (useRoute) for tab-driven navigation; sorting hook (useSort) for column sorting with state management and visual indicators.
Frontend Components: Config & Scraper Views
cmd/scrapeui/frontend/src/components/ConfigTree.tsx, cmd/scrapeui/frontend/src/components/ConfigNode.tsx, cmd/scrapeui/frontend/src/components/DetailPanel.tsx, cmd/scrapeui/frontend/src/components/ScraperList.tsx
Components for displaying configuration hierarchy, individual config items with status/counts, detailed config information (changes, relationships, access, metadata), and scraper execution progress.
Frontend Components: Filter & Table Views
cmd/scrapeui/frontend/src/components/FilterBar.tsx, cmd/scrapeui/frontend/src/components/EntityTable.tsx, cmd/scrapeui/frontend/src/components/AccessTable.tsx, cmd/scrapeui/frontend/src/components/AccessLogTable.tsx
Reusable components for filtering by health/type, rendering sortable/searchable tables of entities (users/groups/roles) with membership tracking, config access entries, and access log entries with JSON expansion.
Frontend Components: Data Display & Visualization
cmd/scrapeui/frontend/src/components/HARPanel.tsx, cmd/scrapeui/frontend/src/components/SnapshotPanel.tsx, cmd/scrapeui/frontend/src/components/ScrapeConfigPanel.tsx, cmd/scrapeui/frontend/src/components/Summary.tsx, cmd/scrapeui/frontend/src/components/JsonView.tsx, cmd/scrapeui/frontend/src/components/AnsiHtml.tsx, cmd/scrapeui/frontend/src/components/SplitPane.tsx, cmd/scrapeui/frontend/src/components/AliasList.tsx
Components for rendering HTTP traffic (HAR), before/after snapshots with per-scraper/type/entity statistics, scrape configuration and log levels, summary counts and badges, nested JSON with expand/collapse, ANSI-colored terminal output, resizable split panels, and alias lists with copy actions.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Scraper as Config Scraper
    participant Server as Scrape UI Server
    participant Browser as Browser/Client
    
    User->>Scraper: config-db run --ui<br/>[starts scraper]
    Scraper->>Server: startScrapeUI()<br/>[creates HTTP server]
    Server->>Browser: listen & open URL<br/>[localhost:port]
    
    Browser->>Server: GET / [SPA shell]
    Server->>Browser: HTML + bundled JS
    Browser->>Server: GET /api/scrape<br/>[initial snapshot]
    Server->>Browser: JSON snapshot
    Browser->>Browser: initialize state
    
    Browser->>Server: EventSource /api/scrape/stream
    Server->>Server: activate streaming
    
    loop Per Scraper Execution
        Scraper->>Server: UpdateScraper(name, Running,<br/>results, summary)
        Server->>Server: merge results,<br/>update state
        Server->>Browser: SSE message
        Browser->>Browser: parse snapshot<br/>update UI
        Note over Browser: configs, counts,<br/>relationships refresh
    end
    
    Scraper->>Server: UpdateScraper(...,<br/>Complete/Error)
    Server->>Server: state change
    Server->>Browser: SSE: { done: true }
    Browser->>Browser: stop polling,<br/>show final state
    Browser->>Server: (optional) download<br/>results via /api/config/*
    User->>Browser: close window / SIGINT
Loading
sequenceDiagram
    actor User
    participant CLI as CLI<br/>(ui command)
    participant File as File System
    participant Server as Scrape UI Server
    participant Browser as Browser/Client
    
    User->>CLI: config-db ui results.json<br/>--ui-port 9001
    CLI->>File: read file
    File->>CLI: JSON contents
    CLI->>CLI: unmarshal &<br/>transform data
    CLI->>Server: NewStaticServer(snapshot)
    Server->>Server: initialize with<br/>static snapshot
    Server->>Server: listen on :9001
    CLI->>Browser: open URL<br/>localhost:9001
    
    Browser->>Server: GET /
    Server->>Browser: SPA shell HTML
    Browser->>Server: GET /api/scrape<br/>[snapshot]
    Server->>Browser: JSON snapshot<br/>(no streaming)
    Browser->>Browser: render UI<br/>with all data
    Browser->>Server: GET /api/config/{id}<br/>[optional details]
    Server->>Browser: config JSON
    User->>Browser: interact & explore
Loading

Possibly related PRs

  • Pr/slop #2117: Modifies the same scrapeui subsystem with complementary changes to server implementation, API endpoints, and frontend integration patterns.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: scrape UI' directly and clearly describes the main feature addition—a new scrape UI system with frontend and backend components.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/scrape-ui
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch feat/scrape-ui

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 20, 2026

Benchstat

Base: d03327aca1ad5a3ad29ed7282ab1c0f830cfceb4
Head: e6f098872ba5318b84f42472d47dcd6faf12e134

📊 2 minor regression(s) (all within 5% threshold)

Benchmark Base Head Change p-value
BenchSaveResultsUpdateUnchanged/N=1000-4 554.6m 571.3m +3.01% 0.002
BenchSaveResultsUpdateChanged/N=1000-4 909.4m 934.8m +2.80% 0.002
Full benchstat output
goos: linux
goarch: amd64
pkg: github.com/flanksource/config-db/bench
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                         │ bench-base.txt │          bench-head.txt           │
                                         │     sec/op     │   sec/op     vs base              │
BenchSaveResultsSeed/N=1000-4                 506.4m ± 4%   519.8m ± 6%       ~ (p=0.180 n=6)
BenchSaveResultsUpdateUnchanged/N=1000-4      554.6m ± 1%   571.3m ± 1%  +3.01% (p=0.002 n=6)
BenchSaveResultsUpdateChanged/N=1000-4        909.4m ± 1%   934.8m ± 3%  +2.80% (p=0.002 n=6)
geomean                                       634.5m        652.3m       +2.82%

                                         │ bench-base.txt │           bench-head.txt           │
                                         │      MB/s      │    MB/s     vs base                │
BenchSaveResultsSeed/N=1000-4                0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
BenchSaveResultsUpdateUnchanged/N=1000-4     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
BenchSaveResultsUpdateChanged/N=1000-4       0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                 ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                         │ bench-base.txt │           bench-head.txt           │
                                         │      B/op      │     B/op      vs base              │
BenchSaveResultsSeed/N=1000-4                36.32Mi ± 0%   36.29Mi ± 0%       ~ (p=0.093 n=6)
BenchSaveResultsUpdateUnchanged/N=1000-4     25.85Mi ± 0%   25.78Mi ± 1%       ~ (p=0.180 n=6)
BenchSaveResultsUpdateChanged/N=1000-4       75.09Mi ± 0%   75.12Mi ± 0%       ~ (p=0.589 n=6)
geomean                                      41.31Mi        41.27Mi       -0.11%

                                         │ bench-base.txt │          bench-head.txt           │
                                         │   allocs/op    │  allocs/op   vs base              │
BenchSaveResultsSeed/N=1000-4                 443.1k ± 0%   443.1k ± 0%       ~ (p=0.851 n=6)
BenchSaveResultsUpdateUnchanged/N=1000-4      315.1k ± 0%   315.1k ± 0%       ~ (p=0.288 n=6)
BenchSaveResultsUpdateChanged/N=1000-4        902.2k ± 0%   902.2k ± 0%       ~ (p=0.288 n=6)
geomean                                       501.3k        501.3k       +0.00%

@socket-security
Copy link
Copy Markdown

socket-security Bot commented Apr 20, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addednpm/​@​tailwindcss/​cli@​4.2.21001007798100
Addednpm/​iconify-icon@​3.0.29910010082100
Addednpm/​vite@​6.4.2911008399100
Addednpm/​tailwindcss@​4.2.21001008498100
Addednpm/​typescript@​5.9.31001009010090
Addednpm/​@​preact/​preset-vite@​2.10.59910010092100
Addednpm/​preact@​10.29.110010010095100

View full report

@socket-security
Copy link
Copy Markdown

socket-security Bot commented Apr 20, 2026

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action Severity Alert  (click "▶" to expand/collapse)
Warn High
Obfuscated code: npm entities is 91.0% likely obfuscated

Confidence: 0.91

Location: Package overview

From: cmd/scrapeui/frontend/package-lock.jsonnpm/@preact/preset-vite@2.10.5npm/entities@4.5.0

ℹ Read more on: This package | This alert | What is obfuscated code?

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Packages should not obfuscate their code. Consider not using packages with obfuscated code.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore npm/entities@4.5.0. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

@adityathebe adityathebe marked this pull request as ready for review April 21, 2026 03:31
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (12)
cmd/scrapeui/frontend/src/components/ConfigNode.tsx-31-32 (1)

31-32: ⚠️ Potential issue | 🟡 Minor

isNew fallback may over-mark unchanged items as "new".

isNew = item.Action === 'inserted' || (!item.Action && !!item.created_at) flags every config that just lacks an Action tag but has a created_at (nearly every config ever scraped). In runs that don't populate Action for all items (e.g., --no-save), the green "New" dot will appear on every node. If the intent is "new only when the scraper explicitly says so", the fallback should probably be compared against the current scrape's start time rather than merely "has a created_at".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/ConfigNode.tsx` around lines 31 - 32,
The isNew logic (variable isNew using item.Action and item.created_at) is
over-broad and marks many unchanged configs as "New"; instead make "new" only
when the scraper explicitly indicates insertion or when created_at is after the
current scrape start time: update the isNew calculation in ConfigNode (the isNew
assignment referencing item.Action and item.created_at) to compare
item.created_at against the current scrape run start timestamp (pass in or read
a scrapeStartTime/ currentRunStart prop/context) and only set isNew true when
item.Action === 'inserted' OR (no Action AND item.created_at exists AND new
Date(item.created_at) > scrapeStartTime). Ensure you add/propagate the
scrapeStartTime value into ConfigNode where it is rendered so the comparison can
be performed.
cmd/scrapeui/frontend/src/components/AnsiHtml.tsx-28-35 (1)

28-35: ⚠️ Potential issue | 🟡 Minor

ESC[m (empty params) is treated as a no-op but should reset styles.

Per ECMA-48 SGR, an empty parameter list (ESC[m) is equivalent to ESC[0m and must reset all attributes. Here, match[1].split(';').filter(Boolean) drops the empty string, so codes is empty and the styles stack is left unchanged. Any upstream tool emitting a bare \x1b[m reset will cause subsequent text to stay styled incorrectly.

🐛 Proposed fix
-    const codes = match[1].split(';').filter(Boolean);
-    for (const code of codes) {
-      if (code === '0' || code === '') {
-        styles = [];
-      } else if (ANSI_COLORS[code]) {
-        styles.push(ANSI_COLORS[code]);
-      }
-    }
+    const raw_codes = match[1];
+    const codes = raw_codes === '' ? [''] : raw_codes.split(';');
+    for (const code of codes) {
+      if (code === '0' || code === '') {
+        styles = [];
+      } else if (ANSI_COLORS[code]) {
+        styles.push(ANSI_COLORS[code]);
+      }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/AnsiHtml.tsx` around lines 28 - 35, The
code currently calls match[1].split(';').filter(Boolean) which removes an empty
parameter list so ESC[m is treated as a no-op; change the logic in AnsiHtml.tsx
(around the codes variable and the loop over codes) so that an empty params list
is treated as ['0'] (or otherwise handle when match[1] === '' to push a reset)
and then process codes normally; ensure the styles array is cleared when
encountering code '0' (as handled in the existing if (code === '0' || code ===
'') branch) and do not rely on filter(Boolean) to drop the empty reset token so
bare "\x1b[m" properly resets styles.
cmd/scrapeui/frontend/src/components/AccessLogTable.tsx-98-100 (1)

98-100: ⚠️ Potential issue | 🟡 Minor

Row expansion state sticks to the slot, not the entry.

AccessLogRow keeps local open state, but the list uses key={i}. After a sort or search-filter change, Preact reuses the same row instance for a different entry, so the expanded detail panel can appear under a row the user didn't click. Access logs have no single id field, but a composite key over stable fields is usually enough:

-          {sorted.map((e, i) => <AccessLogRow key={i} entry={e} lookups={lookups} />)}
+          {sorted.map((e, i) => (
+            <AccessLogRow
+              key={`${e.created_at ?? ''}|${e.external_user_id ?? ''}|${e.external_config_id ?? ''}|${i}`}
+              entry={e}
+              lookups={lookups}
+            />
+          ))}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/AccessLogTable.tsx` around lines 98 -
100, The row expansion state is sticky because rows are keyed by index
(key={i}), so Preact reuses component instances across different entries; change
the key to a stable composite identifier derived from the entry (instead of i)
or lift the open state up to the parent and key by that identifier. Concretely,
in the sorted.map call replace key={i} with a stable key built from immutable
fields of the entry (e.g., `${entry.timestamp}-${entry.source}-${entry.level}`
or JSON.stringify of a small subset) or implement a parent-level openState map
keyed by that composite id and pass down open/openToggle props to AccessLogRow
so expansion follows the entry, not the slot.
cmd/scrapeui/frontend/src/components/AccessTable.tsx-110-112 (1)

110-112: ⚠️ Potential issue | 🟡 Minor

Use entry.id as the React key instead of the array index.

AccessRow owns local open state. Keying rows by index means that when sort/search reorders sorted, the expanded/collapsed state sticks to the slot rather than to the entry — collapsing one entry and “expanding” a different one after a sort. entry.id is already unique and available.

-          {sorted.map((e, i) => <AccessRow key={i} entry={e} lookups={lookups} />)}
+          {sorted.map(e => <AccessRow key={e.id} entry={e} lookups={lookups} />)}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/AccessTable.tsx` around lines 110 - 112,
The rows are keyed by array index which breaks AccessRow's local open state when
sorted or filtered; change the map to use each entry's unique id (use entry.id
or e.id) as the React key instead of the index (key={i}) so the component state
remains tied to the entry; ensure you pass the same id property from sorted
entries when rendering AccessRow.
cmd/ui.go-72-97 (1)

72-97: ⚠️ Potential issue | 🟡 Minor

Server lifecycle: swallowed errors, race on startup, no graceful shutdown.

A few related concerns with the serve/teardown block:

  • go http.Serve(listener, srv.Handler()) silently drops any error via //nolint:errcheck. If Serve returns for a reason other than a clean close, the user sees no indication and the process keeps running idle until SIGINT.
  • time.Sleep(100 * time.Millisecond) before logging/opening the browser is a race — with a net.Listener already bound, the server is ready to Accept as soon as http.Serve starts, so the sleep is both unnecessary and not actually a guarantee.
  • On SIGINT/SIGTERM the code simply exits; the *http.Server is never Shutdown(), so in-flight requests (e.g. an SSE stream) are abruptly terminated and the listener isn't closed deterministically.
🛠️ Suggested shape
-		go http.Serve(listener, srv.Handler()) //nolint:errcheck
-
-		time.Sleep(100 * time.Millisecond)
-		logger.Infof("Scrape UI at %s", url)
-		openBrowser(url)
-
-		sig := make(chan os.Signal, 1)
-		signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
-		<-sig
+		httpSrv := &http.Server{
+			Handler:           srv.Handler(),
+			ReadHeaderTimeout: 10 * time.Second,
+		}
+		serveErr := make(chan error, 1)
+		go func() { serveErr <- httpSrv.Serve(listener) }()
+
+		logger.Infof("Scrape UI at %s", url)
+		openBrowser(url)
+
+		sig := make(chan os.Signal, 1)
+		signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
+		select {
+		case <-sig:
+		case err := <-serveErr:
+			if err != nil && err != http.ErrServerClosed {
+				logger.Fatalf("UI server error: %v", err)
+			}
+		}
+		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+		defer cancel()
+		_ = httpSrv.Shutdown(ctx)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/ui.go` around lines 72 - 97, The serve block currently launches the
server with go http.Serve(listener, srv.Handler()) ignoring Serve errors, uses a
fragile time.Sleep to assume readiness, and never calls Shutdown on
SIGINT/SIGTERM; fix by creating an http.Server (assign to a variable like
srvServer) with Handler set to srv.Handler(), start it in a goroutine and relay
any error into a channel, remove the time.Sleep and instead block until the
server goroutine signals readiness (or until Listen succeeded) before
logging/opening the browser, and on receiving the OS signal call
srvServer.Shutdown(ctx) with a timeout context and close the listener, then wait
for the Serve goroutine to return and log any non-nil error (other than
http.ErrServerClosed) so Serve failures are surfaced; reference
scrapeui.NewStaticServer, srv.Handler(), listener, http.Serve, openBrowser, and
the signal handling logic to locate and update the code.
cmd/scrapeui/frontend/vite.config.ts-88-115 (1)

88-115: ⚠️ Potential issue | 🟡 Minor

Wrap decodeURIComponent in try-catch to handle malformed percent-encoding.

Line 89: decodeURIComponent() throws URIError on invalid input and will crash the middleware. Wrap in try/catch:

Suggested fix
      server.middlewares.use('/api/config/', (req, res) => {
-        const id = decodeURIComponent((req.url || '').replace(/^\//, ''));
+        let id: string;
+        try {
+          id = decodeURIComponent((req.url || '').replace(/^\//, ''));
+        } catch {
+          res.writeHead(400);
+          res.end('bad request');
+          return;
+        }

The substring match on line 101 (c.source.includes(id)) matches production server behavior (verified in server.go), so no change needed there.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/vite.config.ts` around lines 88 - 115, The middleware
registered via server.middlewares.use that computes id using
decodeURIComponent((req.url || '').replace(/^\//, '')) must guard against
malformed percent-encoding: wrap the decodeURIComponent call in a try/catch (or
validate beforehand) inside that request handler, and on catch (URIError) return
a non-500 response (e.g., 400 Bad Request or 404) with a short message and end
the response so the process doesn't crash; update the code around the id
computation in that middleware callback to implement this handling.
cmd/scrapeui/frontend/src/components/EntityTable.tsx-66-83 (1)

66-83: ⚠️ Potential issue | 🟡 Minor

Several column headers are sortable in the UI but resolve to undefined.

useSort resolves the sort key via a dot-path on each Entity. The columns aliases, groups, and members have no corresponding field on Entity (groups/members live in the derived resolvedUserGroups map), so clicking these headers flips the indicator but leaves the ordering effectively random (all values compare as null/equal). Either skip cursor-pointer/onClick + SortIcon for these columns, or project the membership count / alias summary onto the entity before sorting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/EntityTable.tsx` around lines 66 - 83,
The column headers for aliases, groups, and members in EntityTable are clickable
but use useSort with keys that don't exist on Entity (columnsFor creates keys
'aliases', 'groups', 'members' while actual membership data lives in
resolvedUserGroups and aliases may be an array), causing undefined sorts; fix by
either making those columns non-sortable in columnsFor (remove cursor-pointer /
onClick / SortIcon for entries with key 'aliases'|'groups'|'members') or by
projecting sortable values onto each Entity before calling useSort (e.g.,
compute aliasSummary or membershipCount for each entity using
resolvedUserGroups/allGroups/allUsers and pass that derived field into the array
given to useSort), and update references to columnsFor and useSort accordingly
so the sort key resolves to a concrete property on the entities.
cmd/scrapeui/frontend/src/components/HARPanel.tsx-133-135 (1)

133-135: ⚠️ Potential issue | 🟡 Minor

Row keys based on index break expand state across re-sorts.

HARRow keeps its open flag in local state keyed by React identity. Using i from sorted.map as the key means that whenever the user re-sorts (via toggle) or the search filter changes row positions, the expanded row visually “moves” to whatever entry now occupies that index. Prefer a stable key derived from the entry (e.g., startedDateTime + url + method, or an explicit index on the original entries array).

♻️ Suggested change
-          {sorted.map((e, i) => <HARRow key={i} entry={e} />)}
+          {sorted.map((e, i) => (
+            <HARRow
+              key={`${e.startedDateTime}-${e.request.method}-${e.request.url}-${i}`}
+              entry={e}
+            />
+          ))}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/HARPanel.tsx` around lines 133 - 135,
The row key uses the loop index causing HARRow local `open` state to jump when
`sorted` order changes; update the map in the tbody to use a stable key derived
from the entry (for example concatenate `entry.startedDateTime`,
`entry.request.url` and `entry.request.method`, or use the original index from
the `entries` source) instead of `i`, so HARRow keeps identity across re-sorts;
locate the `sorted.map((e, i) => <HARRow key={i} entry={e} />)` call and replace
the key with a stable unique string derived from the `entry` fields or original
index.
cmd/scrapeui/frontend/src/components/DetailPanel.tsx-147-150 (1)

147-150: ⚠️ Potential issue | 🟡 Minor

Use exact matching with proper identifier fields instead of substring matching on source.

The source field is a descriptive label (e.g., "GitHub Dependabot", "OpenSSF Scorecard") or source path, not an identifier for the config. Filtering changes with ch.source?.includes(item.id) risks false-positive matches whenever item.id appears as a substring of another config's source (e.g., item.id="123" matching source="Config-1234").

Match against resolved?.config_id, external_id + config_type, or config_id instead. The codebase already demonstrates the correct pattern in the matchesConfig() function above, which uses exact field matching.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/DetailPanel.tsx` around lines 147 - 150,
itemChanges currently uses substring matching via changes.filter(ch =>
ch.source?.includes(item.id)), which can produce false positives; update the
useMemo to perform exact config matching instead—use the existing matchesConfig
function or check exact identifier fields on each change (resolved?.config_id,
external_id with config_type, or config_id) so you only include changes that
truly belong to the given item (refer to matchesConfig() for the correct
matching logic and replace the includes-based predicate in the changes.filter
call).
cmd/scrapeui/convert.go-39-51 (1)

39-51: ⚠️ Potential issue | 🟡 Minor

Drop incomplete relationships before publishing them.

Both builders can append relationships with an empty config or related external ID; Line 170 only skips when both sides are unresolved. Those entries inflate counts and can render blank relationship/parent metadata in the UI.

🐛 Proposed fix
 	addRel := func(rel UIRelationship) {
+		if rel.ConfigExternalID == "" || rel.RelatedExternalID == "" {
+			return
+		}
 		key := rel.ConfigExternalID + "|" + rel.RelatedExternalID + "|" + rel.Relation
 		if seen[key] {
 			return
@@
 		cfgRef := byUUID[r.ConfigID]
 		relRef := byUUID[r.RelatedID]
-		if cfgRef.externalID == "" && relRef.externalID == "" {
-			continue // can't resolve either side
+		if cfgRef.externalID == "" || relRef.externalID == "" {
+			continue // can't resolve both sides
 		}

Also applies to: 167-179

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/convert.go` around lines 39 - 51, The addRel function is
currently appending relationships that have an empty ConfigExternalID or
RelatedExternalID (and only skips when both are unresolved); change addRel (the
closure that takes UIRelationship, uses seen map, nameByExternalID, and appends
to out) to first drop/return immediately if rel.ConfigExternalID == "" ||
rel.RelatedExternalID == "" (i.e., require both external IDs present), then
build the seen key and skip duplicates as before; continue to fill missing
ConfigName/RelatedName from nameByExternalID and append to out only after these
validations so incomplete relationships are never published.
cmd/scrapeui/frontend/src/App.tsx-134-145 (1)

134-145: ⚠️ Potential issue | 🟡 Minor

Include config IDs in config search.

Config IDs/external IDs are used in routes and relationship/access references, but this filter only checks name/type/aliases/labels/tags/config body. Searching a known ID can show no config even when it exists; mirror the same field in globalSearch so tab badges stay consistent.

🔎 Proposed fix
       const lq = search.toLowerCase();
       items = items.filter(c =>
+        c.id?.toLowerCase().includes(lq) ||
         c.name?.toLowerCase().includes(lq) ||
         c.config_type?.toLowerCase().includes(lq) ||
         c.aliases?.some(a => a.toLowerCase().includes(lq)) ||
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/App.tsx` around lines 134 - 145, The search
predicate inside the useMemo for filtered (which filters allConfigs via
filterItems and then .filter using lq) is missing checks for config IDs/external
IDs; update the predicate to also match lq against the config's identifier
fields (e.g., c.id and c.external_id or c.externalId as applicable) so that
searching by ID will return the config and keep tab badges consistent with
globalSearch. Add these ID checks alongside the existing c.name, c.config_type,
aliases, labels, tags and JSON.stringify(c.config) checks in the items =
items.filter(...) callback.
cmd/scrapeui/frontend/src/App.tsx-109-132 (1)

109-132: ⚠️ Potential issue | 🟡 Minor

Use allConfigs when syncing route selection.

The tree includes synthetic orphaned configs, but the route sync only searches configs; selecting an orphaned item navigates to orphaned-* without populating the detail panel, and stale selection can remain when no match is found.

🐛 Proposed fix
-  // Sync selected config with URL route id (when on configs tab)
-  useEffect(() => {
-    if (tab !== 'configs') return;
-    if (!routeId) {
-      setSelected(null);
-      return;
-    }
-    if (selected?.id === routeId) return;
-    const match = configs.find(c => c.id === routeId);
-    if (match) setSelected(match);
-  }, [routeId, configs, tab]);
   const orphanedConfigs = useMemo(() => {
     return (snapshot?.issues || [])
       .filter(issue => issue.type === 'orphaned' && issue.change)
       .map((issue, i): ScrapeResult => ({
@@
 
   const allConfigs = useMemo(() => [...configs, ...orphanedConfigs], [configs, orphanedConfigs]);
+
+  // Sync selected config with URL route id (when on configs tab)
+  useEffect(() => {
+    if (tab !== 'configs') return;
+    if (!routeId) {
+      setSelected(null);
+      return;
+    }
+    setSelected(allConfigs.find(c => c.id === routeId) ?? null);
+  }, [routeId, allConfigs, tab]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/App.tsx` around lines 109 - 132, The route-sync
useEffect currently searches only configs and can miss synthetic items; update
the effect that syncs selection (the useEffect checking tab === 'configs' and
using configs.find) to search allConfigs instead (use allConfigs.find(...)) and
include allConfigs in the effect dependency array; also ensure when no match is
found you call setSelected(null) (instead of leaving stale selection) so
orphaned-*/synthetic items correctly populate or clear the detail panel.
🧹 Nitpick comments (9)
build/Dockerfile (1)

11-11: Pin the new Node builder image by digest.

The added node:20-bookworm tag is mutable, while nearby base images are digest-pinned. Pinning this stage keeps frontend builds reproducible and aligned with the existing Dockerfile posture.

🔒 Proposed direction
-FROM node:20-bookworm AS scrapeui-builder
+FROM node:20-bookworm@sha256:<resolved-digest> AS scrapeui-builder
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@build/Dockerfile` at line 11, The Dockerfile's builder stage uses the mutable
tag "node:20-bookworm" in the FROM line for the scrapeui-builder stage; replace
that tag with the digest-pinned image (e.g., change FROM node:20-bookworm AS
scrapeui-builder to FROM node@sha256:<appropriate-digest> AS scrapeui-builder)
so the scrapeui-builder stage is immutable and reproducible—ensure you keep the
"AS scrapeui-builder" alias and use the correct sha256 digest for the desired
Node 20 bookworm image.
Makefile (1)

48-50: Forcing npm ci on every make dev will noticeably slow the Go dev loop.

scrapeui-build runs npm ci && npm run build unconditionally, and dev now depends on it. Every make dev will reinstall node_modules from scratch and rebuild the UI even when only Go code changed. Consider either:

  • making scrapeui-build a file/marker target gated on package-lock.json/source mtimes, or
  • keeping scrapeui-build independent and letting dev skip it by default (e.g., dev: build-scrapeui-if-missing), with a separate dev-full that rebuilds the UI.

Also, npm ci wipes node_modules every invocation, which is especially painful for iterative Go work.

♻️ Example: gate on the `dist` artifact
 .PHONY: scrapeui-build
-scrapeui-build:
-	cd cmd/scrapeui/frontend && npm ci && npm run build
+scrapeui-build: cmd/scrapeui/frontend/dist/scrapeui.js
+
+cmd/scrapeui/frontend/dist/scrapeui.js: cmd/scrapeui/frontend/package-lock.json $(shell find cmd/scrapeui/frontend/src -type f 2>/dev/null)
+	cd cmd/scrapeui/frontend && npm ci && npm run build

Also applies to: 220-222

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 48 - 50, The scrapeui-build Makefile target currently
runs "npm ci && npm run build" unconditionally, which slows "make dev"; update
the Makefile so the scrapeui-build target is gated by an artifact or marker
(e.g., check for cmd/scrapeui/frontend/dist or a stamp file) or depend on
package-lock.json (or package.json) so it only rebuilds when frontend sources
change, and remove the unconditional dependency from the dev target (or create
dev-full that depends on scrapeui-build while dev depends on a lighter
build-scrapeui-if-missing); modify the scrapeui-build rule and dev dependency to
use the chosen gating mechanism and avoid running "npm ci" on every invocation.
cmd/scrapeui/frontend/src/components/ScraperList.tsx (1)

7-23: Exhaustive switches without default risk "not all code paths return a value" under strict TS.

Both statusIcon and statusColor are declared to return string but only return from case branches. If ScraperProgress['status'] ever gains a new variant (or is widened to string on the wire, e.g. an unexpected value from the SSE stream), these will return undefined at runtime while the type checker lulls you into false safety. A default arm keeps it future-proof.

♻️ Proposed change
 function statusIcon(status: ScraperProgress['status']): string {
   switch (status) {
     case 'pending': return 'codicon:circle-outline';
     case 'running': return 'svg-spinners:ring-resize';
     case 'complete': return 'codicon:pass-filled';
     case 'error': return 'codicon:error';
+    default: return 'codicon:circle-outline';
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/ScraperList.tsx` around lines 7 - 23,
The switch statements in statusIcon and statusColor are non-exhaustive and can
return undefined if ScraperProgress['status'] gains variants or an unexpected
value arrives; update both functions (statusIcon and statusColor) to include a
default branch that returns a sensible fallback (e.g., a neutral icon and color)
or throws/asserts for unknown values so the functions always return a string;
reference the functions statusIcon and statusColor and ensure the default
handles unexpected runtime values from the SSE stream to keep TypeScript strict
checks honest.
cmd/run.go (2)

173-175: Drop the commented-out SetLastScrapeSummary(prev) block.

Three-line commented block is dead code — either wire up "previous summary to UI" or remove it to keep the function readable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/run.go` around lines 173 - 175, Remove the three-line commented-out dead
code that references uiServer.SetLastScrapeSummary(prev): delete the commented
block containing "if uiServer != nil { uiServer.SetLastScrapeSummary(prev) }"
(or alternatively actually wire prev into uiServer via SetLastScrapeSummary if
that behavior is required); ensure no other references to the commented lines
remain in the surrounding function so the code stays clean and readable.

638-661: Use http.Server with timeouts and drop the 100 ms sleep.

  • http.Serve(listener, ...) uses the zero-value server with no Read/Write/Idle timeouts, which is a known slow-loris footgun. Even bound to localhost, prefer an explicit &http.Server{Handler: ..., ReadHeaderTimeout: ...} — also needed to cleanly shut it down (you have no handle to Close() otherwise).
  • time.Sleep(100 * time.Millisecond) on line 657 is unnecessary: net.Listen has already bound and the kernel will queue incoming connections before http.Serve's Accept loop runs. The sleep just delays the logger.Infof / openBrowser call.
  • http.Serve error is swallowed via //nolint:errcheck; at minimum log unexpected returns (ignoring http.ErrServerClosed).
♻️ Proposed change
-	go http.Serve(listener, srv.Handler()) //nolint:errcheck
-
-	time.Sleep(100 * time.Millisecond)
-	logger.Infof("Scrape UI at %s", url)
+	httpSrv := &http.Server{
+		Handler:           srv.Handler(),
+		ReadHeaderTimeout: 10 * time.Second,
+	}
+	go func() {
+		if err := httpSrv.Serve(listener); err != nil && !errors.Is(err, http.ErrServerClosed) {
+			logger.Warnf("scrape UI server exited: %v", err)
+		}
+	}()
+	logger.Infof("Scrape UI at %s", url)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/run.go` around lines 638 - 661, Replace the use of the zero-value
http.Serve and the 100ms sleep in startScrapeUI by creating an explicit
http.Server (set ReadHeaderTimeout/ReadTimeout/WriteTimeout/IdleTimeout) with
Handler set to srv.Handler(), start server.Serve(listener) in a goroutine that
logs any non-nil errors except http.ErrServerClosed, and keep the server
reference so callers can Close/Shutdown it later; remove the time.Sleep before
logger.Infof/openBrowser since net.Listen already bound the port; ensure you
still handle the listener fallback logic (uiPort -> "localhost:0") and obtain
port from listener.Addr() as before.
cmd/scrapeui/frontend/src/components/JsonView.tsx (1)

12-14: null and undefined are indistinguishable in the rendered output.

Both branches return "null". For debugging scrape data (where a missing field vs. explicit null can matter), consider rendering undefined as "undefined". Not a functional bug, just a readability nit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/JsonView.tsx` around lines 12 - 14, The
JsonView component currently renders both null and undefined as the same string;
update the conditional branch that checks `data === null || data === undefined`
so it distinguishes the two: keep the existing return for `data === null` (span
showing "null") and add a separate branch for `data === undefined` that returns
a visually distinct span (e.g., class "text-gray-400 italic") showing
"undefined"; locate this logic in the JsonView component and change the
conditionals around the `data` variable to return different text for null vs
undefined.
cmd/scrapeui/frontend/src/components/Summary.tsx (1)

4-21: Unused startedAt prop.

startedAt is declared in Props (line 7) but never destructured or used in Summary. Either consume it (e.g. to compute elapsed internally) or remove it from the interface and its call site to avoid dead surface area.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/Summary.tsx` around lines 4 - 21, The
Props interface declares startedAt but Summary never uses it; either remove
startedAt from Props and all call sites to eliminate dead API surface, or update
Summary to accept and destructure startedAt (e.g., function Summary({ counts,
saveSummary, done, elapsed, startedAt }: Props)) and compute elapsed internally
(for example set const computedElapsed = typeof elapsed === 'number' ? elapsed :
Date.now() - startedAt and use computedElapsed) and remove the separate elapsed
prop where appropriate; update callers to pass only the chosen data model
(either pass startedAt and stop passing elapsed, or stop passing startedAt if
you choose to remove it).
cmd/scrapeui/frontend/src/hooks/useRoute.ts (1)

58-71: Avoid side effects inside the state updater.

history.pushState is called from within the setRoute updater, which React can invoke more than once (e.g., under Strict Mode or concurrent rendering). The updater should be pure. Compute the merged route outside and perform the pushState beside the state set so it runs exactly once per navigate call.

♻️ Suggested refactor
   const navigate = useCallback((next: Partial<Route>) => {
-    setRoute(prev => {
-      const merged: Route = {
-        tab: next.tab ?? prev.tab,
-        id: 'id' in next ? next.id : prev.id,
-        q: 'q' in next ? next.q : prev.q,
-      };
-      const path = buildPath(merged);
-      if (location.pathname + location.search !== path) {
-        history.pushState(null, '', path);
-      }
-      return merged;
-    });
-  }, []);
+    setRoute(prev => {
+      const merged: Route = {
+        tab: next.tab ?? prev.tab,
+        id: 'id' in next ? next.id : prev.id,
+        q: 'q' in next ? next.q : prev.q,
+      };
+      const path = buildPath(merged);
+      if (location.pathname + location.search !== path) {
+        history.pushState(null, '', path);
+      }
+      return merged;
+    });
+  }, []);

(Or keep pure by computing merged with a ref to route and calling pushState + setRoute(merged) outside the updater.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/hooks/useRoute.ts` around lines 58 - 71, The
navigate function currently performs a side effect (history.pushState) inside
the setRoute state updater; move the merge logic out of the updater so the
updater remains pure: compute the merged Route object by reading current route
values (use the route state value or a ref), build the path with
buildPath(merged), call history.pushState(null, '', path) if the path differs
from location.pathname+location.search, then call setRoute(merged). Update
references to setRoute, navigate, buildPath, history.pushState and location
accordingly so pushState runs exactly once per navigate call and the state
updater only returns the merged route.
cmd/scrapeui/frontend/src/components/HARPanel.tsx (1)

108-119: Empty-state check should consider filtered results.

if (!entries || entries.length === 0) is evaluated before the filtered/sorted output is used, but the table still renders when entries is non-empty and filtered is empty (e.g., search yields no matches). Users see the header row with no body. Consider basing the empty-state on filtered.length (and using a different message for "no matches" vs "no traffic captured").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/frontend/src/components/HARPanel.tsx` around lines 108 - 119,
The empty-state currently checks entries instead of the filtered result, causing
the table header to render when a search yields no matches; update HARPanel to
base the empty-state on filtered.length (use the useMemo result) and show a
different message when entries is empty vs when filtered.length === 0 (e.g., "No
HTTP traffic captured" vs "No matches for your search"); ensure subsequent
usages of sorted/sort/toggle remain unchanged and that the early return uses the
filtered array produced in HARPanel.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/run.go`:
- Around line 234-269: The code calls printOutput(allResults, lastSummary,
lastSnapshotPair, harCollector, logBuf.String()) twice which causes duplicate
terminal output when uiServer is nil; remove the unconditional call before the
uiServer nil-check and rely on the existing conditional block that calls
printOutput only when uiServer == nil so terminal rendering is suppressed in UI
mode (i.e., delete the early printOutput invocation near the start of the UI
handling block that appears before you call uiServer.SetHAR, SetProperties, and
SetDone).
- Around line 271-276: The UI-mode signal handler currently listens with
signal.Notify and returns immediately on signal, bypassing the program shutdown
hooks (e.g., debug echo server, per-scraper contexts) and leaving the scrape UI
HTTP listener running; replace the direct signal handling so that on signal you
invoke the centralized shutdown flow (call shutdown.Shutdown() or use
shutdown.WaitForSignal() instead of returning) and ensure startScrapeUI
registers the HTTP listener with the shutdown hook chain so the listener is
closed during shutdown; update references around uiServer, signal.Notify,
startScrapeUI, and the per-scraper context cleanup to use the shutdown package’s
functions.

In `@cmd/scrapeui/assets.go`:
- Line 5: The build is failing because the //go:embed directive references
frontend/dist/scrapeui.js which does not exist; ensure that file is present
before Go builds by either (A) updating the build/CI/Make targets to run the
frontend build (cd cmd/scrapeui/frontend && npm ci && npm run build) so
frontend/dist/scrapeui.js is produced prior to invoking go build/test, or (B)
commit a minimal fallback file at cmd/scrapeui/frontend/dist/scrapeui.js (and
unignore that exact path) so the //go:embed target resolves during normal
builds; update the repository CI/Make steps accordingly and keep the //go:embed
line in assets.go unchanged.

In `@cmd/scrapeui/frontend/src/components/AliasList.tsx`:
- Around line 12-14: In AliasList.tsx update the copy button element to be
keyboard-visible and explicitly named: add an aria-label (e.g., aria-label="Copy
alias" or a descriptive name) to the button and extend its classes to include a
focus-visible state so keyboard focus makes it visible (for example add
focus-visible:opacity-100 and a visible focus ring class such as
focus-visible:outline-none focus-visible:ring-2); modify the button in the
AliasList component (the copy button element shown in the diff) to include these
attributes so keyboard users can both see and identify the control.

In `@cmd/scrapeui/frontend/src/components/DetailPanel.tsx`:
- Around line 185-195: Copy-link currently sets url.hash but the app's router
(parseRoute in useRoute.ts) reads location.pathname (and location.search for
query parameters), so change the URL construction in DetailPanel.tsx: instead of
setting url.hash, set url.pathname to the route the router expects for the
configs view (e.g., "/configs") and place the item id into the pathname or
search params that parseRoute recognizes (e.g., url.searchParams.set("id",
encodeURIComponent(item.id)) or embed into pathname if parseRoute expects
/configs/:id), then write that url.toString() to navigator.clipboard.writeText;
update the onClick block around navigator.clipboard.writeText so the copied link
points to a path the router will parse (refer to item.id, DetailPanel.tsx, and
useRoute/parseRoute to confirm exact route shape).

In `@cmd/scrapeui/server.go`:
- Around line 98-124: NewServer currently only stores the shortened ScraperName
for each ScraperProgress, and UpdateScraper matches on that display name causing
collisions; add a stable raw identifier field (e.g. RawName or ID) to the
ScraperProgress struct, populate it from the original name string in NewServer
while still setting Name = ScraperName(name) for display, and change
UpdateScraper to match scrapers[i].RawName (or ID) against the incoming name
before updating status/results so each scraper is uniquely identified by its
original key.
- Around line 215-238: NewStaticServer currently omits several fields from the
provided Snapshot when constructing the Server (so saved snapshots lose tabs
like issues, save_summary, snapshots, scrape_spec, properties/logLevel,
buildInfo, lastSummary). Update NewStaticServer to copy those fields from snap
into the returned Server (e.g. set Server.issues, Server.saveSummary,
Server.snapshots, Server.scrapeSpec, Server.properties/logLevel,
Server.buildInfo, Server.lastSummary) so the static server preserves all
snapshot data; keep existing logic for relationships, configMeta, counts,
startedAt and the created updated channel and done=true behavior.
- Around line 390-391: The HTML template in cmd/scrapeui/server.go currently
loads third-party scripts directly from "https://cdn.tailwindcss.com" and
"https://code.iconify.design", which is unsafe for the Scrape UI origin; replace
those external references by bundling the Tailwind CSS and Iconify JS into the
embedded frontend assets (use your existing embed.FS or asset-serving handler),
update the template to load the local asset paths instead of those URLs, and add
a restrictive Content-Security-Policy header and SRI integrity attributes for
the bundled files in the server response (update the template rendering and the
HTTP handler that serves the embedded assets to serve the bundled files and set
CSP headers).
- Around line 255-259: snapshot() reads s.logBuf without synchronizing against
concurrent writes from the logger (written via MultiWriter), causing a data
race; add a dedicated mutex (e.g., s.logMu sync.RWMutex) to protect s.logBuf,
acquire s.logMu.RLock() in Server.snapshot() when calling s.logBuf.String(), and
ensure all places that write to the buffer (the writer passed to MultiWriter /
the logger) acquire s.logMu.Lock() (or use a small locking io.Writer wrapper
that locks around Write and is used in MultiWriter) so all reads/writes of
s.logBuf are serialized.

---

Minor comments:
In `@cmd/scrapeui/convert.go`:
- Around line 39-51: The addRel function is currently appending relationships
that have an empty ConfigExternalID or RelatedExternalID (and only skips when
both are unresolved); change addRel (the closure that takes UIRelationship, uses
seen map, nameByExternalID, and appends to out) to first drop/return immediately
if rel.ConfigExternalID == "" || rel.RelatedExternalID == "" (i.e., require both
external IDs present), then build the seen key and skip duplicates as before;
continue to fill missing ConfigName/RelatedName from nameByExternalID and append
to out only after these validations so incomplete relationships are never
published.

In `@cmd/scrapeui/frontend/src/App.tsx`:
- Around line 134-145: The search predicate inside the useMemo for filtered
(which filters allConfigs via filterItems and then .filter using lq) is missing
checks for config IDs/external IDs; update the predicate to also match lq
against the config's identifier fields (e.g., c.id and c.external_id or
c.externalId as applicable) so that searching by ID will return the config and
keep tab badges consistent with globalSearch. Add these ID checks alongside the
existing c.name, c.config_type, aliases, labels, tags and
JSON.stringify(c.config) checks in the items = items.filter(...) callback.
- Around line 109-132: The route-sync useEffect currently searches only configs
and can miss synthetic items; update the effect that syncs selection (the
useEffect checking tab === 'configs' and using configs.find) to search
allConfigs instead (use allConfigs.find(...)) and include allConfigs in the
effect dependency array; also ensure when no match is found you call
setSelected(null) (instead of leaving stale selection) so orphaned-*/synthetic
items correctly populate or clear the detail panel.

In `@cmd/scrapeui/frontend/src/components/AccessLogTable.tsx`:
- Around line 98-100: The row expansion state is sticky because rows are keyed
by index (key={i}), so Preact reuses component instances across different
entries; change the key to a stable composite identifier derived from the entry
(instead of i) or lift the open state up to the parent and key by that
identifier. Concretely, in the sorted.map call replace key={i} with a stable key
built from immutable fields of the entry (e.g.,
`${entry.timestamp}-${entry.source}-${entry.level}` or JSON.stringify of a small
subset) or implement a parent-level openState map keyed by that composite id and
pass down open/openToggle props to AccessLogRow so expansion follows the entry,
not the slot.

In `@cmd/scrapeui/frontend/src/components/AccessTable.tsx`:
- Around line 110-112: The rows are keyed by array index which breaks
AccessRow's local open state when sorted or filtered; change the map to use each
entry's unique id (use entry.id or e.id) as the React key instead of the index
(key={i}) so the component state remains tied to the entry; ensure you pass the
same id property from sorted entries when rendering AccessRow.

In `@cmd/scrapeui/frontend/src/components/AnsiHtml.tsx`:
- Around line 28-35: The code currently calls
match[1].split(';').filter(Boolean) which removes an empty parameter list so
ESC[m is treated as a no-op; change the logic in AnsiHtml.tsx (around the codes
variable and the loop over codes) so that an empty params list is treated as
['0'] (or otherwise handle when match[1] === '' to push a reset) and then
process codes normally; ensure the styles array is cleared when encountering
code '0' (as handled in the existing if (code === '0' || code === '') branch)
and do not rely on filter(Boolean) to drop the empty reset token so bare
"\x1b[m" properly resets styles.

In `@cmd/scrapeui/frontend/src/components/ConfigNode.tsx`:
- Around line 31-32: The isNew logic (variable isNew using item.Action and
item.created_at) is over-broad and marks many unchanged configs as "New";
instead make "new" only when the scraper explicitly indicates insertion or when
created_at is after the current scrape start time: update the isNew calculation
in ConfigNode (the isNew assignment referencing item.Action and item.created_at)
to compare item.created_at against the current scrape run start timestamp (pass
in or read a scrapeStartTime/ currentRunStart prop/context) and only set isNew
true when item.Action === 'inserted' OR (no Action AND item.created_at exists
AND new Date(item.created_at) > scrapeStartTime). Ensure you add/propagate the
scrapeStartTime value into ConfigNode where it is rendered so the comparison can
be performed.

In `@cmd/scrapeui/frontend/src/components/DetailPanel.tsx`:
- Around line 147-150: itemChanges currently uses substring matching via
changes.filter(ch => ch.source?.includes(item.id)), which can produce false
positives; update the useMemo to perform exact config matching instead—use the
existing matchesConfig function or check exact identifier fields on each change
(resolved?.config_id, external_id with config_type, or config_id) so you only
include changes that truly belong to the given item (refer to matchesConfig()
for the correct matching logic and replace the includes-based predicate in the
changes.filter call).

In `@cmd/scrapeui/frontend/src/components/EntityTable.tsx`:
- Around line 66-83: The column headers for aliases, groups, and members in
EntityTable are clickable but use useSort with keys that don't exist on Entity
(columnsFor creates keys 'aliases', 'groups', 'members' while actual membership
data lives in resolvedUserGroups and aliases may be an array), causing undefined
sorts; fix by either making those columns non-sortable in columnsFor (remove
cursor-pointer / onClick / SortIcon for entries with key
'aliases'|'groups'|'members') or by projecting sortable values onto each Entity
before calling useSort (e.g., compute aliasSummary or membershipCount for each
entity using resolvedUserGroups/allGroups/allUsers and pass that derived field
into the array given to useSort), and update references to columnsFor and
useSort accordingly so the sort key resolves to a concrete property on the
entities.

In `@cmd/scrapeui/frontend/src/components/HARPanel.tsx`:
- Around line 133-135: The row key uses the loop index causing HARRow local
`open` state to jump when `sorted` order changes; update the map in the tbody to
use a stable key derived from the entry (for example concatenate
`entry.startedDateTime`, `entry.request.url` and `entry.request.method`, or use
the original index from the `entries` source) instead of `i`, so HARRow keeps
identity across re-sorts; locate the `sorted.map((e, i) => <HARRow key={i}
entry={e} />)` call and replace the key with a stable unique string derived from
the `entry` fields or original index.

In `@cmd/scrapeui/frontend/vite.config.ts`:
- Around line 88-115: The middleware registered via server.middlewares.use that
computes id using decodeURIComponent((req.url || '').replace(/^\//, '')) must
guard against malformed percent-encoding: wrap the decodeURIComponent call in a
try/catch (or validate beforehand) inside that request handler, and on catch
(URIError) return a non-500 response (e.g., 400 Bad Request or 404) with a short
message and end the response so the process doesn't crash; update the code
around the id computation in that middleware callback to implement this
handling.

In `@cmd/ui.go`:
- Around line 72-97: The serve block currently launches the server with go
http.Serve(listener, srv.Handler()) ignoring Serve errors, uses a fragile
time.Sleep to assume readiness, and never calls Shutdown on SIGINT/SIGTERM; fix
by creating an http.Server (assign to a variable like srvServer) with Handler
set to srv.Handler(), start it in a goroutine and relay any error into a
channel, remove the time.Sleep and instead block until the server goroutine
signals readiness (or until Listen succeeded) before logging/opening the
browser, and on receiving the OS signal call srvServer.Shutdown(ctx) with a
timeout context and close the listener, then wait for the Serve goroutine to
return and log any non-nil error (other than http.ErrServerClosed) so Serve
failures are surfaced; reference scrapeui.NewStaticServer, srv.Handler(),
listener, http.Serve, openBrowser, and the signal handling logic to locate and
update the code.

---

Nitpick comments:
In `@build/Dockerfile`:
- Line 11: The Dockerfile's builder stage uses the mutable tag
"node:20-bookworm" in the FROM line for the scrapeui-builder stage; replace that
tag with the digest-pinned image (e.g., change FROM node:20-bookworm AS
scrapeui-builder to FROM node@sha256:<appropriate-digest> AS scrapeui-builder)
so the scrapeui-builder stage is immutable and reproducible—ensure you keep the
"AS scrapeui-builder" alias and use the correct sha256 digest for the desired
Node 20 bookworm image.

In `@cmd/run.go`:
- Around line 173-175: Remove the three-line commented-out dead code that
references uiServer.SetLastScrapeSummary(prev): delete the commented block
containing "if uiServer != nil { uiServer.SetLastScrapeSummary(prev) }" (or
alternatively actually wire prev into uiServer via SetLastScrapeSummary if that
behavior is required); ensure no other references to the commented lines remain
in the surrounding function so the code stays clean and readable.
- Around line 638-661: Replace the use of the zero-value http.Serve and the
100ms sleep in startScrapeUI by creating an explicit http.Server (set
ReadHeaderTimeout/ReadTimeout/WriteTimeout/IdleTimeout) with Handler set to
srv.Handler(), start server.Serve(listener) in a goroutine that logs any non-nil
errors except http.ErrServerClosed, and keep the server reference so callers can
Close/Shutdown it later; remove the time.Sleep before logger.Infof/openBrowser
since net.Listen already bound the port; ensure you still handle the listener
fallback logic (uiPort -> "localhost:0") and obtain port from listener.Addr() as
before.

In `@cmd/scrapeui/frontend/src/components/HARPanel.tsx`:
- Around line 108-119: The empty-state currently checks entries instead of the
filtered result, causing the table header to render when a search yields no
matches; update HARPanel to base the empty-state on filtered.length (use the
useMemo result) and show a different message when entries is empty vs when
filtered.length === 0 (e.g., "No HTTP traffic captured" vs "No matches for your
search"); ensure subsequent usages of sorted/sort/toggle remain unchanged and
that the early return uses the filtered array produced in HARPanel.

In `@cmd/scrapeui/frontend/src/components/JsonView.tsx`:
- Around line 12-14: The JsonView component currently renders both null and
undefined as the same string; update the conditional branch that checks `data
=== null || data === undefined` so it distinguishes the two: keep the existing
return for `data === null` (span showing "null") and add a separate branch for
`data === undefined` that returns a visually distinct span (e.g., class
"text-gray-400 italic") showing "undefined"; locate this logic in the JsonView
component and change the conditionals around the `data` variable to return
different text for null vs undefined.

In `@cmd/scrapeui/frontend/src/components/ScraperList.tsx`:
- Around line 7-23: The switch statements in statusIcon and statusColor are
non-exhaustive and can return undefined if ScraperProgress['status'] gains
variants or an unexpected value arrives; update both functions (statusIcon and
statusColor) to include a default branch that returns a sensible fallback (e.g.,
a neutral icon and color) or throws/asserts for unknown values so the functions
always return a string; reference the functions statusIcon and statusColor and
ensure the default handles unexpected runtime values from the SSE stream to keep
TypeScript strict checks honest.

In `@cmd/scrapeui/frontend/src/components/Summary.tsx`:
- Around line 4-21: The Props interface declares startedAt but Summary never
uses it; either remove startedAt from Props and all call sites to eliminate dead
API surface, or update Summary to accept and destructure startedAt (e.g.,
function Summary({ counts, saveSummary, done, elapsed, startedAt }: Props)) and
compute elapsed internally (for example set const computedElapsed = typeof
elapsed === 'number' ? elapsed : Date.now() - startedAt and use computedElapsed)
and remove the separate elapsed prop where appropriate; update callers to pass
only the chosen data model (either pass startedAt and stop passing elapsed, or
stop passing startedAt if you choose to remove it).

In `@cmd/scrapeui/frontend/src/hooks/useRoute.ts`:
- Around line 58-71: The navigate function currently performs a side effect
(history.pushState) inside the setRoute state updater; move the merge logic out
of the updater so the updater remains pure: compute the merged Route object by
reading current route values (use the route state value or a ref), build the
path with buildPath(merged), call history.pushState(null, '', path) if the path
differs from location.pathname+location.search, then call setRoute(merged).
Update references to setRoute, navigate, buildPath, history.pushState and
location accordingly so pushState runs exactly once per navigate call and the
state updater only returns the merged route.

In `@Makefile`:
- Around line 48-50: The scrapeui-build Makefile target currently runs "npm ci
&& npm run build" unconditionally, which slows "make dev"; update the Makefile
so the scrapeui-build target is gated by an artifact or marker (e.g., check for
cmd/scrapeui/frontend/dist or a stamp file) or depend on package-lock.json (or
package.json) so it only rebuilds when frontend sources change, and remove the
unconditional dependency from the dev target (or create dev-full that depends on
scrapeui-build while dev depends on a lighter build-scrapeui-if-missing); modify
the scrapeui-build rule and dev dependency to use the chosen gating mechanism
and avoid running "npm ci" on every invocation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cf4a47af-c3ed-4670-bdc2-fe3538d6c67c

📥 Commits

Reviewing files that changed from the base of the PR and between d03327a and ac307f4.

⛔ Files ignored due to path filters (1)
  • cmd/scrapeui/frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (37)
  • .gitignore
  • Makefile
  • build/Dockerfile
  • cmd/root.go
  • cmd/run.go
  • cmd/scrapeui/assets.go
  • cmd/scrapeui/convert.go
  • cmd/scrapeui/frontend/.gitignore
  • cmd/scrapeui/frontend/package.json
  • cmd/scrapeui/frontend/src/App.tsx
  • cmd/scrapeui/frontend/src/components/AccessLogTable.tsx
  • cmd/scrapeui/frontend/src/components/AccessTable.tsx
  • cmd/scrapeui/frontend/src/components/AliasList.tsx
  • cmd/scrapeui/frontend/src/components/AnsiHtml.tsx
  • cmd/scrapeui/frontend/src/components/ConfigNode.tsx
  • cmd/scrapeui/frontend/src/components/ConfigTree.tsx
  • cmd/scrapeui/frontend/src/components/DetailPanel.tsx
  • cmd/scrapeui/frontend/src/components/EntityTable.tsx
  • cmd/scrapeui/frontend/src/components/FilterBar.tsx
  • cmd/scrapeui/frontend/src/components/HARPanel.tsx
  • cmd/scrapeui/frontend/src/components/JsonView.tsx
  • cmd/scrapeui/frontend/src/components/ScrapeConfigPanel.tsx
  • cmd/scrapeui/frontend/src/components/ScraperList.tsx
  • cmd/scrapeui/frontend/src/components/SnapshotPanel.tsx
  • cmd/scrapeui/frontend/src/components/SplitPane.tsx
  • cmd/scrapeui/frontend/src/components/Summary.tsx
  • cmd/scrapeui/frontend/src/globals.d.ts
  • cmd/scrapeui/frontend/src/hooks/useRoute.ts
  • cmd/scrapeui/frontend/src/hooks/useSort.tsx
  • cmd/scrapeui/frontend/src/index.tsx
  • cmd/scrapeui/frontend/src/types.ts
  • cmd/scrapeui/frontend/src/utils.ts
  • cmd/scrapeui/frontend/tsconfig.json
  • cmd/scrapeui/frontend/vite.config.ts
  • cmd/scrapeui/server.go
  • cmd/scrapeui/types.go
  • cmd/ui.go
💤 Files with no reviewable changes (1)
  • .gitignore

Comment thread cmd/run.go Outdated
Comment thread cmd/run.go Outdated
Comment thread cmd/scrapeui/assets.go

import _ "embed"

//go:embed frontend/dist/scrapeui.js
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Ensure the embedded bundle exists before all Go builds/tests.

The Go package will not compile unless frontend/dist/scrapeui.js exists, and the current checks are already failing with pattern frontend/dist/scrapeui.js: no matching files found. The Docker production path builds it, but the normal test/build path needs the same prerequisite or a tracked fallback asset.

🛠️ Possible fixes
-//go:embed frontend/dist/scrapeui.js
+// Keep CI/local Go builds green by ensuring this file is generated before
+// any `go test ./...` or `go build` that includes cmd/scrapeui.
+//go:embed frontend/dist/scrapeui.js
 var bundleJS string

Also update the relevant CI/Make targets to run the frontend build first, for example:

cd cmd/scrapeui/frontend
npm ci
npm run build

Alternatively, commit a minimal cmd/scrapeui/frontend/dist/scrapeui.js fallback and unignore that exact file.

🧰 Tools
🪛 GitHub Check: test

[failure] 5-5:
pattern frontend/dist/scrapeui.js: no matching files found

🪛 GitHub Check: test-prod

[failure] 5-5:
pattern frontend/dist/scrapeui.js: no matching files found

🪛 GitHub Check: test-slim

[failure] 5-5:
pattern frontend/dist/scrapeui.js: no matching files found

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/scrapeui/assets.go` at line 5, The build is failing because the
//go:embed directive references frontend/dist/scrapeui.js which does not exist;
ensure that file is present before Go builds by either (A) updating the
build/CI/Make targets to run the frontend build (cd cmd/scrapeui/frontend && npm
ci && npm run build) so frontend/dist/scrapeui.js is produced prior to invoking
go build/test, or (B) commit a minimal fallback file at
cmd/scrapeui/frontend/dist/scrapeui.js (and unignore that exact path) so the
//go:embed target resolves during normal builds; update the repository CI/Make
steps accordingly and keep the //go:embed line in assets.go unchanged.

Comment thread cmd/scrapeui/frontend/src/components/AliasList.tsx
Comment thread cmd/scrapeui/frontend/src/components/DetailPanel.tsx
Comment thread cmd/scrapeui/server.go Outdated
Comment thread cmd/scrapeui/server.go
Comment thread cmd/scrapeui/server.go
Comment thread cmd/scrapeui/server.go Outdated
Review feedback surfaced several regressions in the new scrape UI path: duplicate terminal output, copied links that did not route correctly, scraper status collisions by display name, and static snapshot views dropping saved fields. The run command also bypassed centralized shutdown hooks in UI mode, and reading logs from bytes.Buffer could race with concurrent writes.

This change fixes the UI/run flow by printing output only once, routing copied config links via pathname, preserving stable scraper identities, carrying full snapshot metadata in static mode, and invoking shutdown hooks when UI mode receives signals. It also wraps log buffering in a concurrency-safe writer and registers UI server shutdown hooks.

To keep build/lint green in environments that do not build frontend assets first, embed fallbacks are added for dist JS/CSS and make resources now guarantees those files exist before typecheck/golangci-lint. Accessibility improvements were also applied to icon-only copy controls.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant