Architecture & Thought Process

A short essay on why SecureBot is built the way it is, written for the Lyzr Builder Challenge submission.

🎯 Goal Triangulation

The challenge brief says: "We care about execution, creativity, product thinking, speed of shipping, and how you design agent workflows."

I optimised for all five simultaneously instead of picking one to maximise:

  • Execution: End-to-end working product (real LLM mode + real demo mode), not a sketch.
  • Creativity: Two-agent pipeline + tool-anchored LLM is a non-obvious composition.
  • Product thinking: DEMO_MODE=true gives a 60-second zero-setup experience.
  • Speed of shipping: Single Next.js codebase, no Python bridge, deployed in hours.
  • Agent workflow design: Used every GitAgent primitive: SDK, tools (declarative and programmatic), hooks (script and programmatic), skills, workflows, memory, compliance.

🧠 Why a Security Scanner?

Three reasons:

  1. Universally relatable. Every developer immediately gets why this matters.
  2. Demo-friendly. Vulnerabilities are visual: red badges, code diffs, before/after. Better demo than a chat agent.
  3. Showcases agent reasoning. Security needs both fast pattern matching and contextual judgment. Perfect for showing off the tool-anchored LLM pattern.

🏗️ The Two-Agent Pipeline

Scanner Agent ─emits─▶ findings (JSON blocks) ─user clicks─▶ Fixer Agent ─emits─▶ fix (JSON block)

Why two agents instead of one?

  • Separation of concerns. Scanner is read-only and breadth-first; Fixer is targeted and surgical. Different system prompts, different tool sets, different temperatures.
  • Composability. The Scanner's output is structured JSON — anything (UI, CLI, GitHub Action) can consume it. The Fixer can be invoked à la carte.
  • GitAgent-native pattern. The framework specifically supports sub-agents, delegation, and skill chaining. This shows I understood the framework, not just the SDK.
  • Trust gates. A human approves each fix individually. Required for a security tool.

This is wired together in agent/workflows/scan-and-fix.yaml — a real GitAgent skill-flow that could be triggered as @scan-and-fix from the CLI.
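The hand-off between the two agents, including the trust gate, can be sketched in a few lines. This is a minimal illustration, not SecureBot's actual code: the `Finding` and `Fix` field names, the `Fixer` type, and `fixApproved` are all assumptions made for the example.

```typescript
// Hypothetical shapes: field names are assumptions, not SecureBot's real schema.
type Severity = "critical" | "high" | "medium" | "low";

interface Finding {
  id: string;
  cwe: string; // e.g. "CWE-89"
  severity: Severity;
  file: string;
  line: number;
  message: string;
}

interface Fix {
  findingId: string;
  before: string;
  after: string;
}

// Stand-in for the Fixer agent; a real one would call the LLM.
type Fixer = (finding: Finding) => Promise<Fix>;

// Trust gate: the Fixer runs only for findings a human explicitly approved.
async function fixApproved(
  findings: Finding[],
  approvedIds: ReadonlySet<string>,
  fixer: Fixer,
): Promise<Fix[]> {
  const fixes: Fix[] = [];
  for (const finding of findings) {
    if (!approvedIds.has(finding.id)) continue; // unapproved findings are never touched
    fixes.push(await fixer(finding));
  }
  return fixes;
}
```

Because the Scanner's output is plain data, the same gate would work whether the approvals come from a UI click, a CLI prompt, or a review comment.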

🪝 The Tool-Anchored LLM Pattern

Pure LLM scanners hallucinate findings. Pure regex scanners cannot judge context. The compromise:

Regex anchors  ─▶  LLM reasoning around the anchors  ─▶  structured output

Concretely:

  1. scan_file ships 10 hand-crafted regex rules mapped to CWEs. They run in microseconds and catch the high-confidence cases deterministically.
  2. The LLM sees those findings + the file context and decides: Is this exploitable in this codebase, or a false positive? What's the right fix template?
  3. The LLM is forced to emit findings inside a fenced ```finding block with a strict JSON schema. The frontend parses these live.

This means: fast, cheap, reliable structured output without giving up the contextual smarts of an LLM.
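To make the first two steps concrete, here is a minimal sketch of regex anchors feeding an LLM prompt. The `Rule` and `Anchor` shapes, the two sample rules, and the `scanSource` and `buildPrompt` helpers are assumptions for illustration; they are not the ten rules that scan_file actually ships.

```typescript
// Hypothetical rule shape; the two rules are illustrative, not SecureBot's real rule set.
interface Rule {
  id: string;
  cwe: string;
  pattern: RegExp; // no "g" flag, so test() stays stateless
}

interface Anchor {
  ruleId: string;
  cwe: string;
  line: number; // 1-based
  snippet: string;
}

const RULES: Rule[] = [
  { id: "eval-call", cwe: "CWE-95", pattern: /\beval\s*\(/ },
  {
    id: "hardcoded-secret",
    cwe: "CWE-798",
    pattern: /(password|secret|api_?key)\s*[:=]\s*["'][^"']+["']/i,
  },
];

// Deterministic pass: every rule is tried against every line.
function scanSource(source: string, rules: Rule[] = RULES): Anchor[] {
  const anchors: Anchor[] = [];
  source.split("\n").forEach((text, index) => {
    for (const rule of rules) {
      if (rule.pattern.test(text)) {
        anchors.push({ ruleId: rule.id, cwe: rule.cwe, line: index + 1, snippet: text.trim() });
      }
    }
  });
  return anchors;
}

// The LLM gets the anchors plus the file, and is asked to judge each one.
function buildPrompt(source: string, anchors: Anchor[]): string {
  const list = anchors
    .map((a) => `- ${a.cwe} (${a.ruleId}) at line ${a.line}: ${a.snippet}`)
    .join("\n");
  return [
    "Static rules flagged these candidate issues:",
    list,
    "For each one, decide whether it is exploitable here or a false positive.",
    "File contents:",
    source,
  ].join("\n");
}
```

The regex pass decides where the model should look, and the model's job shrinks to confirming or rejecting each anchor, which is why the output stays structured.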

🛡️ Hooks As Defense in Depth

Two layers:

  1. Programmatic preToolUse hook (lib/hooks.ts) — runs in-process, blocks rm -rf, git push --force, writes outside the cwd, etc. Fastest path; no shell out.
  2. Script-based audit.sh hook (agent/hooks/audit.sh) — runs as a child process, appends to memory/audit.jsonl. Survives even if the programmatic hook is bypassed.

Both record every tool invocation. The script-based one demonstrates GitAgent's hook YAML config; the programmatic one demonstrates the SDK API. Showing both was deliberate.
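The in-process layer can be pictured as a pure function that inspects a tool call and returns a decision. This is a sketch under stated assumptions: `ToolCall`, `HookDecision`, and `preToolUse` as written here are hypothetical and may not match the GitAgent SDK's real hook signature or the contents of lib/hooks.ts.

```typescript
// Hypothetical types: the real SDK hook signature may differ.
interface ToolCall {
  tool: string; // e.g. "bash" or "write_file"
  args: { command?: string; path?: string };
}

type HookDecision = { allow: true } | { allow: false; reason: string };

const BLOCKED_COMMANDS: RegExp[] = [
  /\brm\s+-(?=\w*r)(?=\w*f)\w+/i, // rm with both -r and -f in one flag group
  /\bgit\s+push\b.*\s(--force|-f)(\s|$)/, // forced pushes
];

// Minimal POSIX path resolution; ignores symlinks and Windows paths.
function resolvePosix(cwd: string, target: string): string {
  const joined = target.startsWith("/") ? target : `${cwd}/${target}`;
  const parts: string[] = [];
  for (const segment of joined.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return "/" + parts.join("/");
}

function preToolUse(call: ToolCall, cwd: string): HookDecision {
  const command = call.args.command;
  if (command !== undefined && BLOCKED_COMMANDS.some((re) => re.test(command))) {
    return { allow: false, reason: `blocked command: ${command}` };
  }
  const target = call.args.path;
  if (target !== undefined) {
    const root = resolvePosix(cwd, ".");
    const resolved = resolvePosix(cwd, target);
    if (resolved !== root && !resolved.startsWith(root + "/")) {
      return { allow: false, reason: `path escapes working directory: ${target}` };
    }
  }
  return { allow: true };
}
```

Keeping the check free of I/O is what makes it the fast path; the audit trail is left to the script-based hook.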

🧬 Why "the agent IS a git repo" matters

GitAgent's central thesis is that an agent's identity (SOUL.md), constraints (RULES.md), abilities (skills/), and memory (memory/) should be version-controlled files. SecureBot embraces this fully:

  • agent/ is a complete, forkable agent repo. Anyone can clone just that directory, point GitAgent at it, and run SecureBot from the CLI.
  • MEMORY.md is committed — the agent's history is auditable.
  • A team could fork SecureBot and git diff the customisations: their RULES, their additional skills, their tool overrides.

This isn't decoration. It's the framework's value prop.

🎨 UX Decisions

  • Live streaming via SSE, not polling. Part of the experience is watching the agent think — tool calls flicker by, findings pop in. Polling would feel dead.
  • Two-pane layout. Agent stream on the left (the how), findings cards on the right (the what). Reviewers can show either pane in a screenshot.
  • Dark theme + GitHub palette. Reviewers from a security background expect this aesthetic.
  • Severity colours match GitHub's. Red/orange/yellow/blue. Familiar.
  • Demo mode is on the landing page. Lowest friction possible — reviewer doesn't need an API key to evaluate.
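Live streaming means the frontend sees the agent's text in arbitrary chunks, so a fenced finding block may arrive split across several messages. One way to handle that, sketched here with a hypothetical `FindingStream` class that is not taken from the SecureBot source, is to buffer the text and emit a finding only once its closing fence has arrived.

```typescript
// Hypothetical helper: buffers streamed text and returns each completed finding block as parsed JSON.
class FindingStream {
  private buffer = "";

  // `{3} matches a triple backtick without writing one literally in this example.
  private static readonly BLOCK = /`{3}finding[ \t]*\n([\s\S]*?)\n`{3}/;

  // Feed one chunk; get back every finding completed by that chunk.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const completed: unknown[] = [];
    let match: RegExpExecArray | null;
    while ((match = FindingStream.BLOCK.exec(this.buffer)) !== null) {
      try {
        completed.push(JSON.parse(match[1]));
      } catch {
        // Malformed JSON inside a block is skipped rather than crashing the stream.
      }
      // Drop everything up to the end of the consumed block.
      this.buffer = this.buffer.slice(match.index + match[0].length);
    }
    return completed;
  }
}
```

In a browser, `push` would typically be called from an `EventSource` message handler, with each returned object rendered as a new findings card.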

🚧 What I Skipped (and why)

  • Real local-repo mode. GitAgent supports cloning a repo, working on a branch, and pushing. I scaffolded for it (/api/fix accepts a token) but didn't fully wire it because the demo flow doesn't need a live commit to be impressive — the diff view is the punchline. Trade-off: speed of shipping > completeness.
  • OSV/GHSA live API. check_deps ships an embedded vuln list (8 high-impact CVEs across npm + pypi). Live API integration is a one-liner swap; the architecture is ready.
  • Voice mode. GitAgent has a beautiful voice UI; out of scope for a security demo.
  • Full repo cloning in /api/scan. The demo scans a single representative file. Production would walk the full tree — same code path, just iterated.

These are conscious deferrals, not gaps. The product thinking is "what's the smallest thing that demonstrates the maximum capability."
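The claim that a live advisory API is a small swap holds best if the lookup sits behind an interface. The sketch below shows one way that could look; `VulnSource`, `EmbeddedSource`, `compareVersions`, and the advisory shape are assumptions for illustration, not the actual check_deps implementation.

```typescript
// Hypothetical types: not the real check_deps schema.
type Ecosystem = "npm" | "pypi";

interface Advisory {
  id: string;
  ecosystem: Ecosystem;
  name: string;
  fixedIn: string; // first patched version, dotted numeric form
}

// Any source, embedded or live, would implement this one method.
interface VulnSource {
  lookup(ecosystem: Ecosystem, name: string, version: string): Promise<Advisory[]>;
}

// Numeric comparison of dotted versions; pre-release tags are not handled.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

class EmbeddedSource implements VulnSource {
  constructor(private readonly advisories: Advisory[]) {}

  async lookup(ecosystem: Ecosystem, name: string, version: string): Promise<Advisory[]> {
    // A package is affected when its version is below the first patched one.
    return this.advisories.filter(
      (a) =>
        a.ecosystem === ecosystem &&
        a.name === name &&
        compareVersions(version, a.fixedIn) < 0,
    );
  }
}
```

A live source would implement the same `lookup` method against a remote API, leaving callers unchanged.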

⏱ Time Budget (~3 hours)

| Block | Minutes |
| --- | --- |
| Researching GitAgent SDK + reading source | 25 |
| Designing the agent repo (SOUL, RULES, skills, workflows) | 25 |
| Custom tools + hooks (lib/tools/, lib/hooks.ts) | 30 |
| SDK wrapper + SSE streaming + demo mode | 35 |
| API routes | 10 |
| Frontend (layout, landing, scan, components) | 40 |
| README + ARCHITECTURE | 15 |

✅ Submission Checklist

  • GitHub repository
  • Architecture diagram (Mermaid in README)
  • Working demo (zero-config via DEMO_MODE=true)
  • Live mode with real GitAgent SDK + custom tools + hooks
  • Multi-agent workflow (Scanner → Fixer chained via workflow YAML)
  • Uses every major GitAgent primitive
  • 3-5 min demo video — to be recorded against the running app

🧪 Stretch Ideas (If More Time)

  • GitHub App webhook → auto-scan PRs → comment with findings
  • One-click "open PR with all fixes applied" button (GitAgent local-repo mode handles this)
  • Memory-driven learning: agent gets better at specific codebases over multiple scans
  • Plugin: publish SecureBot as a GitAgent plugin (gitagent plugin install)
  • Add a compose agent that picks scan strategies based on the detected stack (Node vs Django vs Go)