VulnScanner Architecture Documentation

Overview

VulnScanner is a modular vulnerability detection system for scanning source code repositories. It is written in Python 3.11+ and uses a plugin-based architecture, so new checks can be added without modifying the core engine.

System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                          CLI Interface                          │
│                    (scanner command + flags)                    │
└────────────────────────────────┬────────────────────────────────┘
                                 │
┌────────────────────────────────▼────────────────────────────────┐
│                      Orchestration Engine                       │
│    (Task scheduling, plugin management, results aggregation)    │
└────┬──────────┬──────────┬──────────┬──────────┬──────────┬─────┘
     │          │          │          │          │          │
 ┌───▼───┐  ┌───▼───┐  ┌───▼───┐  ┌───▼───┐  ┌───▼───┐  ┌───▼───┐
 │ Repo  │  │ Tech  │  │ SBOM  │  │  CVE  │  │ SAST  │  │Secrets│
 │ Clone │  │Detect │  │  Gen  │  │ Match │  │Engine │  │Scanner│
 └───┬───┘  └───┬───┘  └───┬───┘  └───┬───┘  └───┬───┘  └───┬───┘
     │          │          │          │          │          │
     └──────────┴──────────┴─────┬────┴──────────┴──────────┘
                                 │
                      ┌──────────▼──────────┐
                      │    Plugin System    │
                      │  (Dynamic loader)   │
                      └──────────┬──────────┘
                                 │
              ┌──────────────────┼──────────────────┐
              │                  │                  │
      ┌───────▼───────┐  ┌───────▼───────┐  ┌───────▼───────┐
      │    Custom     │  │   Community   │  │  Third-party  │
      │    Plugins    │  │    Plugins    │  │ Integrations  │
      └───────────────┘  └───────────────┘  └───────────────┘
                                 │
                      ┌──────────▼──────────┐
                      │  Data Persistence   │
                      │ (SQLite/PostgreSQL) │
                      └──────────┬──────────┘
                                 │
              ┌──────────────────┼──────────────────┐
              │                  │                  │
      ┌───────▼───────┐  ┌───────▼───────┐  ┌───────▼───────┐
      │     JSON      │  │     SARIF     │  │     HTML      │
      │    Report     │  │    Output     │  │    Report     │
      └───────────────┘  └───────────────┘  └───────────────┘

Core Components

1. CLI Interface (`cli/`)

Purpose: Entry point for user interaction
Technologies: Click framework for command parsing
Commands:
- scan: Main scanning command with configurable flags
- update-advisories: Update CVE/advisory databases
- init-db: Initialize local database
- list-plugins: Show available plugins

2. Orchestration Engine (`core/engine.py`)

Purpose: Coordinates scanning workflow
Responsibilities:
- Task scheduling and parallelization
- Plugin lifecycle management
- Result aggregation and deduplication
- Progress tracking and logging
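
The scheduling and aggregation responsibilities above might be sketched as follows. This is a minimal illustration, not the contents of `core/engine.py`: it assumes each module is a callable that returns a list of finding dicts, and `run_modules` along with the `rule_id`/`file`/`line` keys are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable

# Hypothetical shapes: a finding is a plain dict, a module is a callable.
Finding = dict
Module = Callable[[str], list[Finding]]


def run_modules(repo_path: str, modules: dict[str, Module], max_workers: int = 4) -> list[Finding]:
    """Run modules in parallel, then merge and deduplicate their findings."""
    collected: list[Finding] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn, repo_path): name for name, fn in modules.items()}
        for future in as_completed(futures):
            try:
                collected.extend(future.result())
            except Exception as exc:  # one failing module should not abort the scan
                print(f"module {futures[future]} failed: {exc}")

    # Deduplicate on (rule, file, line); the first occurrence wins.
    unique: dict[tuple, Finding] = {}
    for finding in collected:
        key = (finding["rule_id"], finding["file"], finding["line"])
        unique.setdefault(key, finding)
    return sorted(unique.values(), key=lambda f: (f["file"], f["line"], f["rule_id"]))
```

Sorting the merged list makes the output independent of which module happens to finish first.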

3. Repository Handler (`core/repo.py`)

Purpose: Repository intake and validation
Features:
- Git clone support (HTTPS/SSH)
- Archive extraction (.zip, .tar.gz)
- Path sanitization
- File filtering and exclusion
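
Path sanitization matters most during archive extraction, where a crafted entry name can point outside the destination directory. The sketch below shows one way to guard `.zip` intake using only the standard library; `safe_extract_zip` is a hypothetical helper, not a function known to exist in `core/repo.py`.

```python
import zipfile
from pathlib import Path


def safe_extract_zip(archive: str, dest: str) -> list[Path]:
    """Extract a .zip, refusing any member that would land outside dest."""
    dest_root = Path(dest).resolve()
    extracted: list[Path] = []
    with zipfile.ZipFile(archive) as zf:
        for member in zf.infolist():
            target = (dest_root / member.filename).resolve()
            # Reject entries such as "../../etc/passwd" or absolute paths.
            if not target.is_relative_to(dest_root):
                raise ValueError(f"unsafe path in archive: {member.filename}")
        for member in zf.infolist():  # only extract once every entry is validated
            extracted.append(Path(zf.extract(member, dest_root)))
    return extracted
```

Validating every entry before extracting any of them means a malicious archive leaves nothing on disk.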

4. Technology Detection (`modules/tech_detect.py`)

Purpose: Identify technologies, languages, frameworks
Methods:
- File extension analysis
- Package manifest detection
- Framework fingerprinting
- Build tool identification
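
As a rough illustration of the first two methods (file extension analysis and package manifest detection), the sketch below walks a repository and tallies what it finds. The lookup tables and the `detect_technologies` name are assumptions made for this example; a real detector would cover many more cases.

```python
from collections import Counter
from pathlib import Path

# Illustrative lookup tables, not the project's actual rule set.
MANIFESTS = {"package.json": "Node.js", "requirements.txt": "Python", "go.mod": "Go",
             "pom.xml": "Java", "Gemfile": "Ruby", "composer.json": "PHP"}
EXTENSIONS = {".py": "Python", ".js": "JavaScript", ".ts": "TypeScript", ".go": "Go",
              ".java": "Java", ".rb": "Ruby", ".php": "PHP"}


def detect_technologies(repo_path: str) -> dict:
    """Count source files per language and list ecosystems found via manifests."""
    languages = Counter()
    ecosystems = set()
    for path in Path(repo_path).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        if path.name in MANIFESTS:
            ecosystems.add(MANIFESTS[path.name])
        if path.suffix in EXTENSIONS:
            languages[EXTENSIONS[path.suffix]] += 1
    return {"languages": dict(languages), "ecosystems": sorted(ecosystems)}
```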

5. SBOM Generator (`modules/sbom.py`)

Purpose: Generate Software Bill of Materials
Supports:
- package.json (Node.js)
- requirements.txt/Pipfile (Python)
- go.mod (Go)
- pom.xml (Java)
- Gemfile (Ruby)
- composer.json (PHP)
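
To make the manifest-to-SBOM step concrete, here is a minimal sketch for one of the supported formats, `requirements.txt`. It only handles exact `==` pins, and the component dict layout and `parse_requirements` name are hypothetical; the package URL follows the `pkg:pypi/<name>@<version>` form.

```python
import re

# Matches "name==1.2.3" pins, optionally with extras like "name[extra]==1.2.3".
_PIN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?\s*==\s*([^\s;#]+)")


def parse_requirements(text: str) -> list[dict]:
    """Turn pinned requirements.txt lines into SBOM-style component entries."""
    components = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "-")):  # skip comments and pip options
            continue
        match = _PIN.match(line)
        if match:
            # purl convention for PyPI: lowercase, underscores become hyphens.
            name = match.group(1).lower().replace("_", "-")
            version = match.group(2)
            components.append({"name": name, "version": version, "ecosystem": "PyPI",
                               "purl": f"pkg:pypi/{name}@{version}"})
    return components
```

Unpinned requirements such as `numpy>=1.0` are skipped here because they do not identify a single installed version.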

6. CVE Matcher (`modules/cve_matcher.py`)

Purpose: Map dependencies to known vulnerabilities
Data Sources:
- NVD API
- GitHub Advisory Database
- OSV.dev
- Local cache with periodic updates
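
Once advisories are in the local cache, matching reduces to checking whether a component's version falls inside an affected range. The sketch below assumes a simplified advisory record with `introduced` and `fixed` fields and a naive numeric version parser; both are illustrative, and the advisory ID in the usage example is a placeholder rather than a real entry.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Naive numeric parse ("1.10.2" -> (1, 10, 2)); ignores pre-release tags."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def match_advisories(components: list[dict], advisories: list[dict]) -> list[dict]:
    """Return one finding per component version inside an affected range."""
    findings = []
    for comp in components:
        current = parse_version(comp["version"])
        for adv in advisories:
            if adv["package"] != comp["name"] or adv["ecosystem"] != comp["ecosystem"]:
                continue
            introduced = parse_version(adv.get("introduced", "0"))
            fixed = adv.get("fixed")  # None means no fixed release is known
            if current >= introduced and (fixed is None or current < parse_version(fixed)):
                findings.append({"id": adv["id"], "package": comp["name"],
                                 "version": comp["version"], "fixed_in": fixed})
    return findings


# Usage with placeholder data:
advisories = [{"id": "EXAMPLE-0001", "package": "libfoo", "ecosystem": "PyPI",
               "introduced": "1.0.0", "fixed": "1.10.0"}]
components = [{"name": "libfoo", "version": "1.4.2", "ecosystem": "PyPI"}]
print(match_advisories(components, advisories))
```

Comparing parsed tuples rather than strings is what keeps `1.4.2` below `1.10.0`; a plain string comparison would order them the other way.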

7. SAST Engine (`modules/sast/`)

Purpose: Static application security testing
Approach:
- AST-based analysis for supported languages
- Pattern matching for generic detection
- Rule engine for custom checks
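
For Python sources, AST-based analysis can be built on the standard `ast` module. The following sketch flags direct calls to `eval` and `exec`; the rule IDs, the `DANGEROUS_CALLS` set, and the finding layout are assumptions for illustration rather than the engine's actual rules.

```python
import ast

DANGEROUS_CALLS = {"eval", "exec"}  # illustrative rule set


class DangerousCallVisitor(ast.NodeVisitor):
    def __init__(self, filename: str) -> None:
        self.filename = filename
        self.findings: list[dict] = []

    def visit_Call(self, node: ast.Call) -> None:
        # Flag direct calls such as eval(x); attribute calls like obj.eval() are ignored.
        if isinstance(node.func, ast.Name) and node.func.id in DANGEROUS_CALLS:
            self.findings.append({"rule_id": f"py-dangerous-{node.func.id}",
                                  "file": self.filename, "line": node.lineno})
        self.generic_visit(node)  # keep walking into nested calls


def scan_python_source(source: str, filename: str = "<string>") -> list[dict]:
    visitor = DangerousCallVisitor(filename)
    visitor.visit(ast.parse(source, filename=filename))
    return visitor.findings
```

Because the check works on the syntax tree, a mention of `eval` in a comment or string does not trigger it, which is the main advantage over plain pattern matching.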

8. Secrets Scanner (`modules/secrets.py`)

Purpose: Detect hardcoded secrets and PII
Methods:
- Entropy analysis
- Regular expression patterns
- Contextual validation
- Confidence scoring
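
The methods above can be combined: a regular expression proposes candidates, and entropy helps decide how much to trust them. The sketch below is a minimal example under assumed patterns and an assumed 3.5-bit threshold; neither is taken from `modules/secrets.py`.

```python
import math
import re
from collections import Counter

# Illustrative patterns only; the AWS access key ID shape is "AKIA" + 16 characters.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic-assignment": re.compile(
        r"(?i)\b(?:password|secret|token|api_key)\b\s*[:=]\s*['\"]([^'\"]{8,})['\"]"),
}


def shannon_entropy(value: str) -> float:
    """Bits of entropy per character; random tokens score higher than words."""
    counts = Counter(value)
    return -sum((n / len(value)) * math.log2(n / len(value)) for n in counts.values())


def scan_line(line: str, lineno: int) -> list[dict]:
    findings = []
    for rule_id, pattern in PATTERNS.items():
        for match in pattern.finditer(line):
            # Use the captured value when the pattern has a group, else the whole match.
            candidate = match.group(match.lastindex or 0)
            entropy = shannon_entropy(candidate)
            # Hypothetical scoring: a regex hit is "medium", high entropy raises it.
            confidence = "high" if entropy >= 3.5 else "medium"
            findings.append({"rule_id": rule_id, "line": lineno,
                             "entropy": round(entropy, 2), "confidence": confidence})
    return findings
```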

9. Plugin System (`core/plugin_manager.py`)

Purpose: Enable extensibility
Features:
- Dynamic plugin loading
- Standard plugin interface
- Dependency injection
- Result schema validation
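
Dynamic loading can be done with the standard `importlib` machinery. The sketch below imports every `.py` file in a directory and collects concrete subclasses of a given base class; `load_plugins` and the `vs_plugin_` module-name prefix are hypothetical, and the real loader in `core/plugin_manager.py` may work differently.

```python
import importlib.util
import inspect
from pathlib import Path


def load_plugins(plugin_dir: str, base_class: type) -> list[type]:
    """Import every *.py file in plugin_dir and collect subclasses of base_class."""
    plugins: list[type] = []
    for path in sorted(Path(plugin_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(f"vs_plugin_{path.stem}", path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the plugin's top-level code
        for _, obj in inspect.getmembers(module, inspect.isclass):
            # Skip the interface itself and anything still abstract.
            if (issubclass(obj, base_class) and obj is not base_class
                    and not inspect.isabstract(obj)):
                plugins.append(obj)
    return plugins
```

Note that `exec_module` executes arbitrary code from the plugin file, so loading is only as safe as the directory it reads from.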

10. Data Layer (`core/database.py`)

Purpose: Persist scan results and cache
Backends:
- SQLite (default, local)
- PostgreSQL (optional, scalable)
Schema: Normalized tables for findings, scans, plugins
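
A normalized layout for findings, scans, and plugins might look like the sketch below. It uses the standard `sqlite3` module directly for brevity, whereas the project lists SQLAlchemy in its stack; the table and column names are assumptions made for this example.

```python
import sqlite3

# Hypothetical normalized schema: one row per scan, plugin, and finding.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scans (
    id INTEGER PRIMARY KEY, repo TEXT NOT NULL, started_at TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS plugins (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, version TEXT);
CREATE TABLE IF NOT EXISTS findings (
    id INTEGER PRIMARY KEY,
    scan_id INTEGER NOT NULL REFERENCES scans(id),
    plugin_id INTEGER REFERENCES plugins(id),
    rule_id TEXT NOT NULL, file TEXT NOT NULL, line INTEGER, severity TEXT NOT NULL,
    UNIQUE (scan_id, rule_id, file, line));
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def save_finding(conn: sqlite3.Connection, scan_id: int, finding: dict) -> None:
    # INSERT OR IGNORE lets the UNIQUE constraint absorb duplicate findings.
    conn.execute(
        "INSERT OR IGNORE INTO findings (scan_id, rule_id, file, line, severity) "
        "VALUES (?, ?, ?, ?, ?)",
        (scan_id, finding["rule_id"], finding["file"], finding["line"], finding["severity"]))
    conn.commit()
```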

11. Report Generator (`core/reporters/`)

Purpose: Generate various output formats
Formats:
- JSON (machine-readable)
- SARIF (IDE integration)
- HTML (human-readable)
- Markdown (documentation)
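
Of these formats, SARIF has the most rigid structure. The sketch below builds a minimal SARIF 2.1.0 log with one run; the input finding keys (`rule_id`, `message`, `file`, `line`, `level`) and the `to_sarif` name are assumptions for this example.

```python
import json


def to_sarif(findings: list[dict], tool_name: str = "VulnScanner") -> str:
    """Serialize findings as a minimal SARIF 2.1.0 log with a single run."""
    results = [
        {
            "ruleId": f["rule_id"],
            "level": f.get("level", "warning"),  # SARIF levels: none, note, warning, error
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["file"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        }
        for f in findings
    ]
    log = {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }
    return json.dumps(log, indent=2)
```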

Data Flow

1. Input Stage: User provides repository (path/URL/archive)
2. Validation: Repository validated and cloned/extracted
3. Discovery: Technology detection and SBOM generation
4. Analysis: Parallel execution of scanning modules
5. Plugin Execution: Custom plugins run with sandboxing
6. Aggregation: Results collected and deduplicated
7. Prioritization: Findings scored and ranked
8. Persistence: Results saved to database
9. Reporting: Output generated in requested formats
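
The document does not specify how findings are scored in the prioritization stage. One simple possibility, shown purely as an assumption, is to multiply a severity weight by a confidence weight and sort on the result; the weights and the `prioritize` name below are hypothetical.

```python
SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}  # assumed scale
CONFIDENCE_WEIGHT = {"high": 1.0, "medium": 0.6, "low": 0.3}  # assumed scale


def prioritize(findings: list[dict]) -> list[dict]:
    """Attach a score (severity weight x confidence weight) and sort descending."""
    scored = [
        {**f, "score": SEVERITY_WEIGHT[f["severity"]] * CONFIDENCE_WEIGHT[f["confidence"]]}
        for f in findings
    ]
    return sorted(scored, key=lambda f: f["score"], reverse=True)
```

Under these weights a high-severity, high-confidence finding outranks a critical one that is probably a false positive.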

Plugin Interface

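from abc import ABC, abstractmethod

# Finding and PluginInfo are the project's result and metadata types;
# their definitions are not shown in this document.


class PluginInterface(ABC):
    @abstractmethod
    def scan(self, repo_path: str, metadata: dict) -> list["Finding"]:
        """Execute plugin scan logic."""

    @abstractmethod
    def get_info(self) -> "PluginInfo":
        """Return plugin metadata."""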
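
To show what implementing this interface could look like, here is a small self-contained sketch of a custom plugin. The `Finding` and `PluginInfo` dataclasses are hypothetical stand-ins, since the real definitions are not shown here, and the interface is repeated so the example runs on its own.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Finding:  # hypothetical stand-in for the project's Finding type
    rule_id: str
    file: str
    line: int
    message: str


@dataclass
class PluginInfo:  # hypothetical stand-in for the project's PluginInfo type
    name: str
    version: str


class PluginInterface(ABC):
    @abstractmethod
    def scan(self, repo_path: str, metadata: dict) -> list[Finding]: ...

    @abstractmethod
    def get_info(self) -> PluginInfo: ...


class FixmeCommentPlugin(PluginInterface):
    """Example plugin: reports FIXME markers left in Python files."""

    def scan(self, repo_path: str, metadata: dict) -> list[Finding]:
        findings = []
        for path in sorted(Path(repo_path).rglob("*.py")):
            text = path.read_text(errors="ignore")
            for lineno, line in enumerate(text.splitlines(), 1):
                if "FIXME" in line:
                    findings.append(Finding("custom-fixme", str(path.relative_to(repo_path)),
                                            lineno, "FIXME marker found"))
        return findings

    def get_info(self) -> PluginInfo:
        return PluginInfo(name="fixme-comments", version="0.1.0")
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated, which gives the plugin manager a cheap first validity check.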

Security Considerations

Sandboxing: Plugins run in a restricted environment
No Exfiltration: All network calls are opt-in
Secret Redaction: Sensitive values masked in reports
Rate Limiting: Advisory API calls throttled
Input Validation: All inputs sanitized
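
Secret redaction can be as simple as masking all but a short prefix before a value reaches any report. The sketch below is one possible approach; the `match` key, the four-character prefix, and the function names are assumptions.

```python
def redact(value: str, visible: int = 4) -> str:
    """Mask a secret, keeping at most `visible` leading characters for triage."""
    if len(value) <= visible * 2:
        return "*" * len(value)  # too short to reveal anything safely
    return value[:visible] + "*" * (len(value) - visible)


def redact_finding(finding: dict) -> dict:
    """Return a copy of the finding that is safe to write into a report."""
    safe = dict(finding)
    if "match" in safe:
        safe["match"] = redact(safe["match"])
    return safe
```

Returning a copy keeps the unredacted value out of the report path without mutating the in-memory finding.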

Performance Optimizations

Parallel Processing: Multi-threaded scanning
Caching: Advisory data and intermediate results
Incremental Scanning: Delta analysis for repeat scans
Resource Limits: Configurable timeouts and memory limits
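
Incremental scanning needs a way to tell which files changed since the last run. One common approach, sketched here as an assumption about how delta analysis could work, is to store a content hash per file and rescan only the paths whose hash differs.

```python
import hashlib
from pathlib import Path


def file_digests(repo_path: str) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(repo_path)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file() and ".git" not in path.parts
    }


def changed_files(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Files that are new or whose contents differ from the previous scan."""
    return sorted(p for p, digest in current.items() if previous.get(p) != digest)
```

The digest map from one scan would be persisted and passed as `previous` on the next run.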

Configuration

Config File: YAML-based configuration (config.yaml)
Environment Variables: Override config values
CLI Flags: Runtime configuration
Plugin Config: Per-plugin settings
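
With several configuration sources, the order of precedence has to be defined somewhere. The sketch below assumes that CLI flags override environment variables, which override the config file, which overrides built-in defaults; that ordering, the `VULNSCANNER_` prefix, and the setting names are all assumptions rather than documented behavior.

```python
import os

DEFAULTS = {"threads": 4, "timeout": 300, "format": "json"}  # illustrative keys
ENV_PREFIX = "VULNSCANNER_"  # assumed prefix, not confirmed by the project


def resolve_config(file_cfg: dict, cli_flags: dict, environ=os.environ) -> dict:
    """Merge settings; later sources win: defaults < file < environment < CLI."""
    config = {**DEFAULTS, **file_cfg}
    for key, default in DEFAULTS.items():
        raw = environ.get(ENV_PREFIX + key.upper())
        if raw is not None:
            config[key] = type(default)(raw)  # coerce "8" -> 8 using the default's type
    # Flags the user did not pass arrive as None and must not override anything.
    config.update({k: v for k, v in cli_flags.items() if v is not None})
    return config
```

Here `file_cfg` stands for the already-parsed contents of `config.yaml`; parsing the YAML itself is left out to keep the example dependency-free.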

Deployment Options

Standalone: Direct Python execution
Docker: Containerized deployment
CI/CD Integration: GitHub Actions, GitLab CI
Cloud: AWS Lambda, Google Cloud Run compatible

Technology Stack

Language: Python 3.11+
CLI Framework: Click
AST Parsing: ast (Python), esprima (JavaScript)
Database: SQLAlchemy ORM
Testing: pytest, coverage
Linting: ruff, mypy
Containerization: Docker
