A fully offline, AI-powered static code analysis tool that finds real vulnerabilities in your source code — without ever sending a single line to the cloud.
LSA performs static analysis only — it reads and analyzes your source code, not running applications. It does not perform dynamic testing, network scanning, or runtime exploitation. Think of it as an AI-assisted code review, not a penetration test.
Security matters. But so does privacy.
Many organizations, security researchers, and independent developers are hesitant — or outright unable — to share their source code with cloud-based AI services. Whether it's proprietary business logic, client code under NDA, government projects, or simply a matter of principle, the risk of sending sensitive code to external servers is a dealbreaker.
Existing AI-powered code analysis tools almost always require uploading your code to third-party APIs. That creates a hard choice: get the benefits of AI-driven security analysis, or protect your intellectual property. You shouldn't have to pick one.
LSA was built to eliminate that tradeoff. It runs 100% locally on your machine using any local LLM server, meaning your code never leaves your environment. No cloud APIs. No telemetry. No data sharing. Just you, your code, and a local model doing the heavy lifting.
Beyond privacy, LSA is designed to find real vulnerabilities — not just pattern-match on known bad strings. It uses tree-sitter to parse your code into an AST, builds a call graph, traces actual data flow paths from user inputs to dangerous sinks, then sends structured context to a local LLM that reasons about real exploitability across the full OWASP Top 10.
- 100% Offline — Runs entirely on your machine. Your code never leaves your environment.
- Multi-Language Support — Python, JavaScript, TypeScript, PHP, Go, C#, Java, Ruby, C, and C++.
- Blast-Radius Taint Analysis — Traces source-to-sink data flow paths through the call graph. Only real taint paths are analyzed — not every function in the codebase.
- Real Code Understanding — Tree-sitter AST parsing extracts functions, call graphs, data flow, user inputs, and dangerous sinks.
- Full OWASP Top 10 Coverage — Injection, broken auth, IDOR, SSRF, insecure deserialization, misconfigurations, and more.
- Three-Stage Pipeline — Reconnaissance, Analysis, and Validation. Every finding is verified through a 4-gate validation process with confidence scoring to minimize false positives.
- Secret Detection — Regex-based scanning for 25+ secret types (AWS keys, Stripe, GitHub tokens, JWTs, database URLs, etc.) — no LLM needed, instant results.
- Scope & Diff Filtering — Scan only specific paths (
--scope "src/api") or only changed files (--diff HEAD~1). Essential for large codebases and CI/CD. - Framework Auto-Detection — Automatically detects Flask, Django, FastAPI, Express, Next.js, Laravel, and Rails from project files.
- Custom Rules — Add your own source/sink patterns via `.lsa-rules.json` without modifying the code.
- False Positive Suppression — Create a `.lsaignore` file to suppress known false positives by file, line, title, or finding ID.
- Parallel & Cached — Multi-threaded LLM calls with configurable workers. Content-hash caching skips unchanged files on re-scans.
- Smart Batching — Small packages are grouped into batches to reduce LLM round trips.
- Multiple Output Formats — Console, Markdown, JSON, and SARIF. Ready for human review or CI/CD integration. See example report.
- Professional Reports — Pentest-quality output with executive summary, OWASP coverage checklist, findings-by-file table, severity breakdown, confidence scores, exploit HTTP requests, and detailed remediation guidance.
- Python 3.10+
- uv package manager
- A local LLM server with an OpenAI-compatible API (see Supported LLM Servers)
```shell
# Clone the repository
git clone https://github.com/ab2pentest/local-security-agent.git
cd local-security-agent

# Install dependencies
uv sync
```

LSA works with any server that exposes an OpenAI-compatible `/v1/chat/completions` endpoint. Pick whichever you prefer:
```shell
# llama.cpp
llama-server -m /path/to/your-model.gguf

# Ollama
ollama serve
# Then: export LSA_SERVER_PORT=11434

# LM Studio — just start the server from the UI

# vLLM
vllm serve /path/to/model
```

LSA auto-detects the model from your server — no configuration needed.
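If you want to sanity-check what "OpenAI-compatible" means here, the sketch below builds the endpoint URL and a minimal chat-completions payload from the documented `LSA_SERVER_HOST`/`LSA_SERVER_PORT` variables. The model name is a placeholder (LSA auto-detects the real one), and this is an illustration of the request shape, not LSA's actual client code:

```python
import json
import os

# Endpoint is assembled from the documented LSA env vars (defaults shown).
host = os.environ.get("LSA_SERVER_HOST", "localhost")
port = os.environ.get("LSA_SERVER_PORT", "8080")
url = f"http://{host}:{port}/v1/chat/completions"

# Minimal OpenAI-style chat-completions payload; "local-model" is a placeholder.
payload = {
    "model": "local-model",
    "messages": [{"role": "user", "content": "Reply with the word: ready"}],
    "max_tokens": 8,
}

print(url)
print(json.dumps(payload, indent=2))
```

Any server that accepts a POST of this shape at that path should work with LSA.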
```shell
uv run lsa scan /path/to/your/codebase
```

```
Usage: lsa [OPTIONS] COMMAND [ARGS]...

Commands:
  scan    Scan a codebase for security vulnerabilities
  recon   Run reconnaissance stage only (no LLM needed)
  config  Show current configuration
```
```shell
# Scan the current directory
uv run lsa scan

# Scan a specific codebase
uv run lsa scan /path/to/codebase

# Full depth scan (all files, not just top 20)
uv run lsa scan /path/to/codebase --depth full

# Verbose mode — see what's happening at each stage
uv run lsa scan /path/to/codebase --verbose

# Specify framework for better analysis context
uv run lsa scan /path/to/codebase --framework flask

# Output as Markdown report
uv run lsa scan /path/to/codebase --format markdown --output report.md

# Output as JSON (great for CI/CD pipelines)
uv run lsa scan /path/to/codebase --format json --output report.json

# Detailed findings with evidence and remediation
uv run lsa scan /path/to/codebase --format markdown --detailed

# Use 8 parallel workers for faster scans
uv run lsa scan /path/to/codebase --workers 8

# Force re-scan (ignore cache)
uv run lsa scan /path/to/codebase --no-cache

# Scan only specific paths (scoped pentest)
uv run lsa scan ./my-app --scope "src/api,src/auth,src/controllers"

# Scan only files changed since main branch (great for PRs)
uv run lsa scan ./my-app --diff main

# Only show high and critical findings
uv run lsa scan ./my-app --min-severity high

# CI/CD pipeline: diff scan, high+ severity, SARIF output, fail on findings
uv run lsa scan ./my-app --diff main --min-severity high --format sarif --output results.sarif --exit-code

# Combine options
uv run lsa scan ./my-app --depth full --verbose --workers 8 --format json --output results.json
```

| Option | Short | Description |
|---|---|---|
| `--framework` | `-f` | Framework hint: flask, fastapi, django, express, unknown (auto-detected if not set) |
| `--output` | `-o` | Write report to file instead of stdout |
| `--format` | `-F` | Output format: console, markdown, json, sarif |
| `--detailed` | `-d` | Include full evidence and remediation details |
| `--backend` | `-b` | LLM backend: local |
| `--model` | `-m` | Model name or path |
| `--depth` | | Scan depth: quick (top 20 files) or full (all files) |
| `--verbose` | `-v` | Show detailed progress for each stage |
| `--workers` | `-w` | Number of parallel LLM workers (default: 4) |
| `--no-cache` | | Disable scan cache (re-analyze all files even if unchanged) |
| `--scope` | `-s` | Comma-separated list of paths to scan (e.g., `--scope "src/api,src/auth"`) |
| `--diff` | | Scan only files changed since a git ref (e.g., `--diff HEAD~1`, `--diff main`) |
| `--min-severity` | | Only report findings at or above this severity (critical, high, medium, low) |
| `--exit-code` | | Exit with code 1 if critical/high findings found (for CI/CD) |
Run just the parsing and risk-scoring stage — no LLM required:
```shell
# See what LSA finds in your codebase structure
uv run lsa recon /path/to/codebase

# With verbose output
uv run lsa recon /path/to/codebase --verbose
```

This is useful for quickly checking how LSA sees your codebase: entry points, function counts, risk scores, taint flows, and import relationships.
```shell
# Show current configuration
uv run lsa config
```

| Variable | Description | Default |
|---|---|---|
| `LSA_BACKEND` | LLM backend | local |
| `LSA_MODEL` | Model name (auto-detected if not set) | — |
| `LSA_SERVER_HOST` | LLM server host | localhost |
| `LSA_SERVER_PORT` | LLM server port | 8080 |
| `LSA_WORKERS` | Number of parallel LLM workers | 4 |
| `LSA_DEPTH` | Scan depth: quick or full | quick |
| `LSA_CACHE` | Enable scan cache (true/false) | true |
| `LSA_TIMEOUT_SECONDS` | Request timeout (seconds) | 180 |
You can also set these in a `.env` file in the project root. See `.env.example` for a template.
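For example, a `.env` pointing LSA at an Ollama server on its default port might look like this (all values are illustrative, not required settings):

```
# .env — example values; adjust to your LLM server
LSA_SERVER_HOST=localhost
LSA_SERVER_PORT=11434
LSA_WORKERS=8
LSA_DEPTH=full
LSA_CACHE=true
```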
Build and run LSA in a container — no Python or uv installation required on your host:
```shell
# Build the image
docker build -t lsa .

# Scan a local codebase (mount it to /target)
docker run --rm \
  -v $(pwd)/my-webapp:/target \
  -e LSA_SERVER_HOST=host.docker.internal \
  -e LSA_SERVER_PORT=8080 \
  lsa scan /target --depth full

# Save report to your host
docker run --rm \
  -v $(pwd)/my-webapp:/target \
  -v $(pwd)/reports:/reports \
  -e LSA_SERVER_HOST=host.docker.internal \
  lsa scan /target --format markdown --output /reports/report.md

# Quick scan with JSON output
docker run --rm \
  -v $(pwd)/my-webapp:/target \
  -e LSA_SERVER_HOST=host.docker.internal \
  lsa scan /target --format json

# SARIF output for GitHub Code Scanning
docker run --rm \
  -v $(pwd)/my-webapp:/target \
  -v $(pwd)/reports:/reports \
  -e LSA_SERVER_HOST=host.docker.internal \
  lsa scan /target --format sarif --output /reports/results.sarif

# Recon only (no LLM server needed)
docker run --rm \
  -v $(pwd)/my-webapp:/target \
  lsa recon /target
```

Important notes:

- `LSA_SERVER_HOST=host.docker.internal` lets the container reach your LLM server running on the host machine. This works on Docker Desktop (Mac/Windows). On Linux, use `--network host` instead: `docker run --rm --network host -v $(pwd)/my-webapp:/target lsa scan /target`
- The LLM server must be running on your host before starting the scan.
- Mount your codebase as read-only for extra safety: `-v $(pwd)/my-webapp:/target:ro`
Use the `--exit-code` flag to make LSA exit with code 1 when critical or high severity findings are detected — useful as a gate in CI/CD pipelines:
```shell
uv run lsa scan ./src --format json --output security-report.json --exit-code

# Exit code 0 = no critical/high findings
# Exit code 1 = critical or high severity issues found
```

Without `--exit-code`, the scan always exits with code 0 after completing the report.
LSA runs a three-stage pipeline with blast-radius taint analysis:
```
Codebase
    |
    v
+--------------------------------------+
|  Stage 1: Reconnaissance             |
|  Tree-sitter parses code into ASTs.  |
|  Extracts functions, calls, inputs,  |
|  sinks, and decorators.              |
|  Builds a call graph and traces      |
|  source-to-sink taint flow paths.    |
|  Scores each file by risk signals.   |
+------------------+-------------------+
                   |
                   v
+--------------------------------------+
|  Stage 2: Analysis                   |
|  Creates context packages from       |
|  taint flows (user input -> sink).   |
|  Batches small packages together.    |
|  Sends to local LLM in parallel      |
|  for OWASP vulnerability analysis.   |
|  Caches results by file content      |
|  hash for fast re-scans.             |
+------------------+-------------------+
                   |
                   v
+--------------------------------------+
|  Stage 3: Validation                 |
|  Each finding goes through 4 gates:  |
|  evidence, exploitability, impact,   |
|  and false-positive check.           |
|  Checks for existing mitigations     |
|  (sanitization, parameterized        |
|  queries, framework protections).    |
|  Assigns confidence score (0-100).   |
|  Only confirmed findings with no     |
|  mitigations make the final report.  |
+------------------+-------------------+
                   |
                   v
          Security Report
 (sorted by severity + confidence)
```
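The content-hash caching in Stage 2 can be sketched as follows. This is a simplified illustration, not LSA's actual implementation — the `analyze` function and its return value are hypothetical stand-ins for the real LLM analysis:

```python
import hashlib

# Toy cache: maps a file's content hash to its previous analysis result.
cache: dict[str, str] = {}

def analyze(path: str, content: str) -> str:
    """Re-analyze a file only if its content changed since the last scan."""
    digest = hashlib.sha256(content.encode()).hexdigest()
    if digest in cache:
        return cache[digest]          # cache hit: skip the expensive LLM call
    result = f"findings for {path}"   # stand-in for the real LLM analysis
    cache[digest] = result
    return result

first = analyze("app.py", "print('v1')")
second = analyze("app.py", "print('v1')")   # unchanged content -> cache hit
assert first is second
```

Because the key is a hash of the file's content, renaming or touching a file without changing it still hits the cache, while any real edit forces a fresh analysis.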
Unlike tools that analyze every function in isolation, LSA builds a call graph and traces actual data flow paths:
- Sources — Functions that receive user input (`$_GET`, `request.form`, `req.query`, etc.)
- Sinks — Functions with dangerous operations (`db.execute()`, `eval()`, `shell_exec()`, etc.)
- Taint flows — Paths through the call graph where user input can reach a dangerous sink
Only functions on a real taint path are sent to the LLM. This means:
- Fewer LLM calls — 65% fewer packages in real-world tests
- Better accuracy — The LLM sees the complete data flow chain, not isolated snippets
- Faster scans — Less time spent analyzing code that can't be exploited
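To illustrate the idea, here is a toy version of source-to-sink tracing over a call graph. The function names and the graph itself are made up for the example; LSA's real implementation works on tree-sitter ASTs:

```python
from collections import deque

# Toy call graph: caller -> callees. Only handler() receives user input.
call_graph = {
    "handler": ["sanitize", "build_query"],
    "build_query": ["db_execute"],
    "sanitize": [],
    "render_static": ["template"],   # never sees user input
    "template": [],
}
sources = {"handler"}       # receives user input (e.g. request.form)
sinks = {"db_execute"}      # dangerous operation (e.g. db.execute())

def taint_paths(graph, sources, sinks):
    """BFS from each source; collect every call path that reaches a sink."""
    paths = []
    for src in sources:
        queue = deque([[src]])
        while queue:
            path = queue.popleft()
            if path[-1] in sinks:
                paths.append(path)
                continue
            for callee in graph.get(path[-1], []):
                if callee not in path:   # avoid cycles
                    queue.append(path + [callee])
    return paths

print(taint_paths(call_graph, sources, sinks))
# → [['handler', 'build_query', 'db_execute']]
```

Only the `handler -> build_query -> db_execute` chain would be packaged for the LLM; `render_static` and `template` are skipped entirely because no tainted path reaches them.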
| Language | Extensions |
|---|---|
| Python | .py |
| JavaScript | .js, .jsx |
| TypeScript | .ts, .tsx |
| PHP | .php |
| Go | .go |
| C# | .cs |
| Java | .java |
| Ruby | .rb |
| C | .c, .h |
| C++ | .cpp, .cc, .cxx, .hpp |
LSA works with any server that exposes an OpenAI-compatible /v1/chat/completions endpoint:
| Server | Default Port | Setup |
|---|---|---|
| llama.cpp | 8080 | llama-server -m model.gguf |
| LM Studio | 1234 | Start server from the UI |
| Ollama | 11434 | ollama serve |
| vLLM | 8000 | vllm serve model |
| LocalAI | 8080 | local-ai run model |
| text-generation-webui | 5000 | Enable OpenAI extension |
Recommended models: Instruction-tuned models with 8K+ context and 7B+ parameters.
DVWA benchmark ground truth: 78 true vulnerabilities (verified by manual expert analysis).
| Model | Size | Time | Found | Confirmed | Accuracy | Recommendation |
|---|---|---|---|---|---|---|
| GPT-OSS 20B | 20B | 14 min | 137 | 85 | Best | Recommended — strict validation, low false positives |
| GLM 4.7 Flash | ~9B | 27 min | 118 | 20 | OK | Usable with thinking auto-disabled, but misses many |
| Qwen 3.5 9B | 9B | 35 min | 198 | 32 | OK | Usable with thinking auto-disabled, but slow |
| LFM2 24B | 24B | 11 min | 165 | 152 | Poor | Fastest, but ~50% false positives — confirms everything |
| Llama 3.1 8B | 8B | 11 min | 112 | 0 | N/A | Finds vulns but can't output valid validation JSON |
| Nemotron 4B | 4B | — | 0 | 0 | N/A | Context too small (4K), unusable |
LSA was also tested on real-world HackTheBox challenges with positive results:
- Conversor — found file upload path traversal, XXE, MD5 without salt, missing brute-force protection, CSRF
- Why Lambda — found arbitrary code execution via malicious Keras model upload, unauthenticated API endpoints
Minimum requirements: 8K context window, 7B+ parameters.
Thinking/reasoning models: LSA auto-detects thinking models (Qwen 3, GLM 4, DeepSeek-R1) and automatically disables their reasoning mode for faster, more predictable output. Without this, thinking models waste thousands of tokens on chain-of-thought and can be 4x slower or fail entirely due to timeouts.
What to avoid:
- Models under 7B — too weak for structured JSON security analysis
- Models with less than 8K context — our prompts + code packages require at least 8K tokens
- Models that don't follow JSON instructions well — findings will be dropped by the parser
If your server uses a non-default port, set `LSA_SERVER_PORT`:

```shell
export LSA_SERVER_PORT=11434  # for Ollama
export LSA_SERVER_PORT=1234   # for LM Studio
```

LSA checks for vulnerabilities across the OWASP Top 10 plus additional categories informed by 3K+ real-world HackerOne bug bounty reports:
OWASP Top 10: Injection (SQL, OS command, LDAP, template), Broken Access Control, Cryptographic Failures, Insecure Design, Security Misconfiguration, Vulnerable Components, Authentication Failures, Data Integrity Failures, Logging Failures, SSRF
Additional checks: XSS (reflected, stored, DOM-based), CSRF, File Upload, File Inclusion (LFI/RFI), Path Traversal, Insecure Deserialization, Hardcoded Credentials, Open Redirect, XXE
Logic and design flaws: Brute Force / Missing Rate Limiting, Weak Session IDs, CAPTCHA Bypass, Client-Side Only Security, Insufficient Session Expiration, Clickjacking, Information Disclosure, CRLF Injection, Privacy Violations, Unverified Password Change, Prompt Injection (LLM apps)
Secret detection (regex-based, no LLM): AWS keys, GitHub/GitLab tokens, Slack tokens/webhooks, Stripe keys, Google API keys, Twilio/SendGrid keys, Supabase keys, JWTs, database URLs, private keys, hardcoded passwords, and generic secrets/tokens
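As a concrete example of why no LLM is needed here: AWS access key IDs follow a fixed shape that a regex can catch directly. The pattern below is the widely used `AKIA` heuristic; LSA's exact rule set may differ:

```python
import re

# Common heuristic: AWS access key IDs are "AKIA" + 16 chars of [0-9A-Z].
AWS_KEY = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")

code = '''
aws_client = connect(
    access_key="AKIAIOSFODNN7EXAMPLE",   # the classic AWS docs example key
    secret="do-not-hardcode-me",
)
'''

for lineno, line in enumerate(code.splitlines(), start=1):
    for match in AWS_KEY.finditer(line):
        print(f"line {lineno}: possible AWS access key {match.group(1)}")
```

Scanning is a pure string match per line, which is why secret detection produces instant results even on large codebases.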
Add your own source and sink patterns by creating a `.lsa-rules.json` file in your project root:

```json
{
  "sources": ["custom_framework.get_input(", "my_request.params("],
  "sinks": [
    {"pattern": "unsafe_render(", "kind": "html_output"},
    {"pattern": "run_raw_query(", "kind": "sql"}
  ],
  "skip_dirs": ["generated", "migrations"],
  "skip_files": ["config.auto.js"]
}
```

See `examples/.lsa-rules.json.example` for a full template.
Create a `.lsaignore` file in your project root to suppress known false positives:

```
# Ignore by file
test_helpers.py

# Ignore by location
auth.py:42

# Ignore by title (substring match)
CSRF

# Ignore by finding ID from a previous scan
LSA-005
```

See `examples/.lsaignore.example` for more examples.
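Conceptually, the four rule kinds amount to a simple matcher like the sketch below. This is an illustration only — the finding dictionary keys and matching semantics are assumptions, not LSA's actual code:

```python
def is_suppressed(finding: dict, rules: list[str]) -> bool:
    """Apply .lsaignore-style rules: file, file:line, title substring, or ID."""
    for rule in rules:
        if rule == finding["file"]:                         # whole file
            return True
        if rule == f"{finding['file']}:{finding['line']}":  # exact location
            return True
        if rule in finding["title"]:                        # title substring
            return True
        if rule == finding["id"]:                           # finding ID
            return True
    return False

rules = ["test_helpers.py", "auth.py:42", "CSRF", "LSA-005"]
finding = {"file": "auth.py", "line": 42, "title": "SQL Injection", "id": "LSA-009"}
print(is_suppressed(finding, rules))  # → True (matches the auth.py:42 rule)
```

Note that title matching is a substring check, so a broad rule like `CSRF` suppresses every finding whose title contains that word — keep title rules specific.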
```shell
# Run tests
uv run pytest
# If uv run pytest fails on Windows, use:
# uv run python -m pytest

# Run a specific test
uv run pytest tests/test_extractor.py -v

# With coverage
uv run pytest --cov=local_security_agent

# Lint and format
uv run ruff check --fix .
uv run ruff format .
```

LSA is a security analysis aid, not a replacement for manual security review. While it leverages AST parsing, taint analysis, and LLM reasoning to find real vulnerabilities, no automated tool is 100% accurate. It may produce false positives or miss certain vulnerabilities depending on the codebase, the local model used, and the complexity of the code.
Always perform a manual review of any findings before taking action. Use LSA as a starting point to surface potential issues, then verify each one with your own expertise. Treat its output as leads to investigate, not as definitive verdicts.
Contributions are welcome and appreciated! Whether it's a bug fix, new language support, better detection rules, documentation improvements, or entirely new ideas — we'd love to see it.
If you're interested in contributing:
- Fork the repository
- Create a feature branch (`git checkout -b feature/your-feature`)
- Make your changes
- Run the tests (`uv run pytest`)
- Submit a pull request
If you're not sure where to start, feel free to open an issue to discuss your idea first.
This project is licensed under the Business Source License 1.1 (BSL 1.1).
- Allowed: Personal use, academic research, non-commercial security testing, internal security audits, open-source contributions.
- Not allowed: Commercial use, including selling, reselling, offering as a service (SaaS), or embedding in commercial products.
If you are interested in using LSA for commercial purposes, please contact us to obtain a commercial license.
See the LICENSE file for the full license text.