Analyze any local workspace or GitHub repository for cryptographic primitives, risky implementations, and secret handling. The backend orchestrates Semgrep rules, LanceDB-powered embeddings, and Groq-hosted LLM enrichment so you can surface issues and auto-generate human-readable reports with minimal setup.
- Scan zipped uploads, checked-out folders, or live GitHub repositories with streaming progress.
- AI-assisted enrichment that labels detected primitives, highlights flows, and drafts executive summaries.
- Searchable findings backed by vector embeddings so related matches stay grouped and navigable.
- A Next.js interface for uploads, diffing, and severity triage alongside an HTTP API for automation.
- be/: Express API, Semgrep integration, embedding/LanceDB pipeline, Dockerfile.
- frontend/: Next.js 14 UI with Monaco editor, SSE streaming, Plotly dashboards.
- data/, models/, JavaCrypto/, scripts/: sample payloads, heuristics, and helper scripts.
- Node.js 20+ and npm (both services are plain Node projects).
- Docker 24+ for containerized runs (optional but recommended for the backend).
- Git and Semgrep CLI are required when you use repository scanning outside the Docker container.
- A Groq API key (GROQ_API_KEY) if you want AI enrichment.
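The prerequisites above can be checked with a short script before you start either service. This is a minimal sketch: the helper names (have, node_major, check_prereqs) are invented for illustration, and Git, Semgrep, and Docker are treated as optional because the list only requires them for particular setups.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of the project.

# Succeeds when the named command is on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# Prints the major version from a string such as "v20.11.1".
node_major() {
  local v="${1#v}"
  echo "${v%%.*}"
}

check_prereqs() {
  local rc=0 tool major
  # Node.js and npm are needed by both services.
  for tool in node npm; do
    if have "$tool"; then echo "ok        $tool"; else echo "MISSING   $tool"; rc=1; fi
  done
  # Needed only for some setups (repository scanning outside Docker, container runs).
  for tool in git semgrep docker; do
    if have "$tool"; then echo "ok        $tool"; else echo "optional  $tool not found"; fi
  done
  # Enforce the Node.js 20+ requirement when node is present.
  if have node; then
    major="$(node_major "$(node --version)")"
    if [ "$major" -lt 20 ]; then
      echo "MISSING   Node.js 20+ (found $major)"
      rc=1
    fi
  fi
  return "$rc"
}

check_prereqs || echo "Install the missing tools before starting the services."
```

The script only inspects PATH and the Node.js version; it does not verify that GROQ_API_KEY is present in your .env file.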
- cd be
- cp .env.example .env and edit the values (GROQ_API_KEY, storage paths, etc.).
- npm install
- npm start
The service listens on http://localhost:5050 by default. It stores cached scan JSON in data/scans and LanceDB vectors under lancedb/. Override those locations in .env with CRYPTOSCOPE_DATA_DIR and LANCEDB_DIR if you prefer another volume.
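One way to apply those overrides without hand-editing is a small helper that sets a key in an env file, replacing any earlier value. This is a sketch under assumptions: set_env_var is a hypothetical helper, the /mnt/scans paths are placeholders, and only the two variable names come from the text above.

```shell
# Sketch only: set_env_var is an illustrative helper, not part of the project.
# Sets KEY=VALUE in an env file, replacing any existing line for that key.
set_env_var() {
  local file="$1" key="$2" value="$3"
  touch "$file"
  # Copy every line except an existing assignment for the key.
  # grep exits non-zero when nothing is left to copy, hence "|| true".
  grep -v "^${key}=" "$file" > "${file}.tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "${file}.tmp"
  mv "${file}.tmp" "$file"
}

# Example, run from the repository root (placeholder paths):
#   set_env_var be/.env CRYPTOSCOPE_DATA_DIR /mnt/scans/cryptoscope
#   set_env_var be/.env LANCEDB_DIR /mnt/scans/lancedb
```

Restart the backend after changing these values so the new locations take effect.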
- cd frontend
- cp .env.example .env.local (or .env) and ensure NEXT_PUBLIC_API_BASE points at the backend.
- npm install
- npm run dev
The development server runs on http://localhost:3000 and proxies API calls to BACKEND_URL (default http://localhost:5050). Build for production with npm run build followed by npm start.
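To confirm that both services are reachable on their default ports, you can poll them with curl. This is a rough sketch: wait_for_url is a hypothetical helper, and because this README does not document a health endpoint, it probes the base URLs and treats any HTTP response as a sign the server is up.

```shell
# Sketch only: wait_for_url is an illustrative helper, not part of the project.
# Polls a URL until it answers or the attempts run out.
# Any HTTP response counts as "up", since no health endpoint is assumed.
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-1}" i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -s -o /dev/null --max-time 2 "$url"; then
      echo "up: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no response from $url after $attempts attempts" >&2
  return 1
}

# Example, once both services have been started:
#   wait_for_url http://localhost:5050 && wait_for_url http://localhost:3000
```

The second and third arguments (attempt count and delay in seconds) are optional and default to 30 attempts one second apart.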
The backend ships with a production-ready Dockerfile. From the repository root:
docker build -t cryptoscope-be ./be
docker run --rm \
--env-file be/.env \
-p 5050:5050 \
-v $(pwd)/be/data:/var/data/cryptoscope \
-v $(pwd)/be/lancedb:/var/data/lancedb \
cryptoscope-be
- Mount the data volumes so cached results survive container restarts.
- When running in Docker you do not need Semgrep or Groq tooling installed on the host; provide the credentials via the .env file.
There is no dedicated Dockerfile for the frontend yet, but you can launch the UI with the stock Node image:
docker run --rm -it \
-p 3000:3000 \
-v $(pwd)/frontend:/app \
-w /app node:20-bookworm \
bash -lc "npm install && npm run dev"
Bind-mounting the source keeps hot reloading intact. For production you can replace npm run dev with npm run build && npm start and front it with an Nginx proxy.
- .gitignore excludes node_modules/, runtime data (be/data, be/lancedb), build artifacts (frontend/.next), and all .env* files so keys never leave your machine.
- Keep secrets in .env files and commit only the provided *.env.example templates.
- For large sample inputs or private corpora, place them under data/ or custom directories and add paths to .gitignore before running git add.
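Before running git add, you can ask Git which paths the ignore rules actually cover. The sketch below relies on the real git check-ignore command; unignored_paths is a hypothetical wrapper, and the paths in the example call are the ones named in the list above.

```shell
# Sketch only: unignored_paths is an illustrative helper, not part of the project.
# Prints each given path that git would NOT ignore.
# Outside a git repository every path is reported, because the check fails.
unignored_paths() {
  local p
  for p in "$@"; do
    # check-ignore exits 0 when an ignore rule matches the path.
    git check-ignore -q "$p" 2>/dev/null || echo "$p"
  done
}

# Run from the repository root; no output means everything listed is ignored.
unignored_paths be/.env frontend/.env.local be/data be/lancedb frontend/.next
```

Add any private corpus directory to the argument list to confirm it is covered as well. Git does not apply ignore rules to files that are already tracked, so a path that was committed earlier will still be printed.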
- Prep environment files with your Groq key and optional embedding overrides.
- Start the backend (npm start locally or the Docker container).
- Launch the frontend (npm run dev) and point it at the backend.
- Upload a ZIP, paste code, or enter a GitHub URL to kick off analysis.
- Review the enriched findings, search across embeddings, and export reports.
Happy scanning!