Vulfy is built with a modular, async-first architecture designed for performance, reliability, and extensibility.
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│    CLI Layer    │───▶│   Core Scanner   │───▶│   OSV.dev API   │
│  (clap-based)   │    │   (async Rust)   │    │   (HTTP/JSON)   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                      │
         │                      ▼
         │             ┌──────────────────┐
         │             │ Package Parsers  │
         │             │  (9 ecosystems)  │
         │             └──────────────────┘
         │                      │
         ▼                      ▼
┌─────────────────┐    ┌──────────────────┐
│   Automation    │    │    Reporters     │
│     System      │    │ (Table/JSON/CSV) │
└─────────────────┘    └──────────────────┘
         │
         ▼
┌─────────────────┐
│  Notifications  │
│ (Discord/Slack) │
└─────────────────┘
Responsibility: Command-line interface and argument parsing
- Built with clap for robust argument parsing
- Supports nested subcommands (scan packages, automation start)
- Handles configuration loading and validation
- Manages output formatting and file operations
Key Features:
- Type-safe argument parsing with clap derive macros
- Builder pattern for configuration construction
- Comprehensive error handling and user feedback
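The builder pattern mentioned above could look roughly like the following. This is a minimal sketch using only the standard library; the field names, the default values, and the ScanConfigBuilder type are assumptions made for illustration, not Vulfy's actual definitions.

```rust
use std::path::PathBuf;

/// Hypothetical scan configuration; the fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct ScanConfig {
    pub target: PathBuf,
    pub recursive: bool,
    pub max_concurrent_requests: usize,
}

/// Builder that starts from defaults and validates once, in `build`.
#[derive(Debug)]
pub struct ScanConfigBuilder {
    target: PathBuf,
    recursive: bool,
    max_concurrent_requests: usize,
}

impl ScanConfigBuilder {
    pub fn new(target: impl Into<PathBuf>) -> Self {
        Self {
            target: target.into(),
            recursive: true,            // assumed default
            max_concurrent_requests: 8, // assumed default
        }
    }

    pub fn recursive(mut self, yes: bool) -> Self {
        self.recursive = yes;
        self
    }

    pub fn max_concurrent_requests(mut self, n: usize) -> Self {
        self.max_concurrent_requests = n;
        self
    }

    /// Reject configurations that could never make progress.
    pub fn build(self) -> Result<ScanConfig, String> {
        if self.max_concurrent_requests == 0 {
            return Err("max_concurrent_requests must be at least 1".to_string());
        }
        Ok(ScanConfig {
            target: self.target,
            recursive: self.recursive,
            max_concurrent_requests: self.max_concurrent_requests,
        })
    }
}
```

Keeping validation in build means a ScanConfig value, once constructed, is always usable by the scanner.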
Responsibility: Orchestrates the scanning process
- Discovers package files across supported ecosystems
- Manages concurrent scanning with rate limiting
- Coordinates between parsers and vulnerability matching
- Handles recursive directory traversal
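As an illustration of package file discovery with recursive traversal, here is a sketch built on std::fs. The dependency list names walkdir for this job, so the hand-rolled loop below is a stand-in, and MANIFEST_NAMES and discover_manifests are hypothetical names covering only a few of the supported files.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// A few manifest names taken from the parser list; not the full set.
const MANIFEST_NAMES: &[&str] = &["package.json", "Cargo.lock", "requirements.txt", "go.mod"];

/// Collect paths under `root` whose file name matches a known manifest.
/// Symlinks are skipped, which also avoids directory cycles.
pub fn discover_manifests(root: &Path, recursive: bool) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            // `file_type` does not follow symlinks.
            let file_type = entry.file_type()?;

            if file_type.is_dir() {
                if recursive {
                    pending.push(path);
                }
            } else if file_type.is_file() {
                let name = entry.file_name();
                if MANIFEST_NAMES.iter().any(|m| name.to_str() == Some(*m)) {
                    found.push(path);
                }
            }
        }
    }

    // Sort so that results are stable across platforms and runs.
    found.sort();
    Ok(found)
}
```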
Architecture:
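use std::collections::HashMap;

pub struct Scanner {
    // One parser per ecosystem
    parsers: HashMap<Ecosystem, Box<dyn PackageParser>>,
    // Client for the OSV.dev vulnerability API
    api_client: OsvClient,
    config: ScanConfig,
}

Responsibility: Extract package information from manifest files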
Each ecosystem has a dedicated parser implementing the PackageParser trait:
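use std::path::Path;
use async_trait::async_trait;
// Assumption: async-trait is used; a native async fn is not usable via Box<dyn PackageParser>.
#[async_trait]
pub trait PackageParser: Send + Sync {
    async fn parse(&self, file_path: &Path) -> Result<Vec<Package>>;
    fn supported_files(&self) -> &[&str];
}
Parsers: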
- NPM (npm.rs): package.json, package-lock.json, yarn.lock, pnpm-lock.yaml
- Python (python.rs): requirements.txt, Pipfile.lock, poetry.lock, pyproject.toml
- Rust (rust.rs): Cargo.lock, Cargo.toml
- Java (java.rs): pom.xml, build.gradle, build.gradle.kts
- Go (go.rs): go.mod, go.sum, go.work
- Ruby (ruby.rs): Gemfile.lock, Gemfile, *.gemspec
- C++ (cpp.rs): vcpkg.json, CMakeLists.txt, conanfile.txt
- PHP (php.rs): composer.json, composer.lock
- C# (csharp.rs): *.csproj, packages.config, *.nuspec
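To make the parser contract concrete, the sketch below implements a simplified, synchronous variant of the trait shown above for requirements.txt. The Package fields, the parse_str method, and RequirementsTxtParser are assumptions for illustration; the real trait is async and takes a file path, and the real Python parser handles more than exact pins.

```rust
/// Simplified stand-in for the `Package` type (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct Package {
    pub name: String,
    pub version: String,
}

/// Synchronous simplification of the parser trait: it works on text that
/// has already been read instead of an async `parse(&Path)`.
pub trait PackageParser: Send + Sync {
    fn parse_str(&self, contents: &str) -> Vec<Package>;
    fn supported_files(&self) -> &[&str];
}

pub struct RequirementsTxtParser;

impl PackageParser for RequirementsTxtParser {
    fn parse_str(&self, contents: &str) -> Vec<Package> {
        contents
            .lines()
            // Drop trailing comments and surrounding whitespace.
            .map(|line| line.split('#').next().unwrap_or("").trim())
            .filter(|line| !line.is_empty())
            // Only exact pins (`name==version`) carry a concrete version.
            .filter_map(|line| line.split_once("=="))
            .map(|(name, version)| {
                // Cut off environment markers such as `; python_version >= "3.8"`.
                let version = version
                    .trim()
                    .split(|c: char| c == ';' || c.is_whitespace())
                    .next()
                    .unwrap_or("");
                Package {
                    name: name.trim().to_lowercase(),
                    version: version.to_string(),
                }
            })
            .collect()
    }

    fn supported_files(&self) -> &[&str] {
        &["requirements.txt"]
    }
}
```

Unpinned requirements such as flask>=2.0 are skipped here because they do not name a single installed version.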
Responsibility: Match packages against vulnerability database
- Queries OSV.dev API for vulnerability information
- Implements semantic version comparison using the semver crate
- Handles rate limiting and retry logic
- Filters vulnerabilities based on severity and policies
Key Features:
- Proper semantic version parsing and comparison
- Concurrent API requests with backoff
- CVSS severity parsing and normalization
- Comprehensive error handling
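CVSS severity normalization can be sketched as a mapping from a base score, or a free-form label, to one ordered enum. The score bands below follow the CVSS v3.x qualitative rating scale; the Severity enum and both function names are assumptions for illustration.

```rust
/// Ordered from least to most severe, so variants can be compared.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    None,
    Low,
    Medium,
    High,
    Critical,
}

/// Map a CVSS v3.x base score to its qualitative rating.
/// Returns `Option::None` for scores outside 0.0..=10.0 (including NaN).
pub fn severity_from_score(score: f64) -> Option<Severity> {
    if !(0.0..=10.0).contains(&score) {
        return None;
    }
    Some(if score == 0.0 {
        Severity::None
    } else if score < 4.0 {
        Severity::Low
    } else if score < 7.0 {
        Severity::Medium
    } else if score < 9.0 {
        Severity::High
    } else {
        Severity::Critical
    })
}

/// Normalize a textual label; "moderate" is treated as Medium.
pub fn severity_from_label(label: &str) -> Option<Severity> {
    match label.trim().to_ascii_lowercase().as_str() {
        "none" => Some(Severity::None),
        "low" => Some(Severity::Low),
        "medium" | "moderate" => Some(Severity::Medium),
        "high" => Some(Severity::High),
        "critical" => Some(Severity::Critical),
        _ => None,
    }
}
```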
Responsibility: Format and output scan results
Supports multiple output formats:
- Table: Beautiful ASCII tables with color coding
- JSON: Structured data for programmatic use
- CSV: Spreadsheet-compatible format
- SARIF: Static Analysis Results Interchange Format
- Summary: Condensed statistics only
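As one example of a reporter, the CSV format could be produced along these lines. The Finding struct and the column names are assumptions for illustration; the quoting rule (wrap fields containing commas, quotes, or line breaks, and double any embedded quotes) is the usual CSV convention.

```rust
/// Hypothetical flattened scan result; the fields are illustrative.
#[derive(Debug)]
pub struct Finding {
    pub package: String,
    pub version: String,
    pub vuln_id: String,
    pub summary: String,
}

/// Quote a field only when it contains a delimiter, quote, or line break.
fn csv_field(value: &str) -> String {
    if value.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", value.replace('"', "\"\""))
    } else {
        value.to_string()
    }
}

/// Render a header line followed by one row per finding.
pub fn to_csv(findings: &[Finding]) -> String {
    let mut out = String::from("package,version,vulnerability_id,summary\n");
    for f in findings {
        let row = [&f.package, &f.version, &f.vuln_id, &f.summary].map(|s| csv_field(s));
        out.push_str(&row.join(","));
        out.push('\n');
    }
    out
}
```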
Responsibility: Continuous monitoring and scheduling
Scheduler (scheduler.rs):
- Cron-based job scheduling using tokio-cron-scheduler
- Supports hourly, daily, weekly, and custom schedules
- Manages background task execution
Git Monitor (git_monitor.rs):
- Repository cloning and updates using git2
- Branch-specific monitoring
- Credential management for private repositories
Policy Engine (policy.rs):
- Advanced vulnerability filtering
- Regex-based pattern matching
- Severity and ecosystem targeting
- Custom notification rules
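A policy that combines severity and ecosystem targeting with pattern matching might be structured as below. This is only a sketch: the Policy and Vulnerability shapes are assumptions, and a case-insensitive keyword match stands in for the regex-based matching described above so that the example needs nothing beyond the standard library.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

/// Hypothetical vulnerability record; the fields are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct Vulnerability {
    pub id: String,
    pub ecosystem: String,
    pub severity: Severity,
    pub summary: String,
}

/// Hypothetical policy. Empty lists mean "no restriction".
pub struct Policy {
    pub min_severity: Severity,
    pub ecosystems: Vec<String>,
    /// Stand-in for regex patterns: plain substrings of the summary.
    pub summary_keywords: Vec<String>,
}

impl Policy {
    /// A vulnerability matches only if every configured criterion holds.
    pub fn matches(&self, v: &Vulnerability) -> bool {
        let severity_ok = v.severity >= self.min_severity;
        let ecosystem_ok = self.ecosystems.is_empty()
            || self.ecosystems.iter().any(|e| e.eq_ignore_ascii_case(&v.ecosystem));
        let summary = v.summary.to_lowercase();
        let keyword_ok = self.summary_keywords.is_empty()
            || self
                .summary_keywords
                .iter()
                .any(|k| summary.contains(k.to_lowercase().as_str()));
        severity_ok && ecosystem_ok && keyword_ok
    }
}

/// Keep only the vulnerabilities the policy selects.
pub fn apply_policy<'a>(policy: &Policy, vulns: &'a [Vulnerability]) -> Vec<&'a Vulnerability> {
    vulns.iter().filter(|v| policy.matches(v)).collect()
}
```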
Webhooks (webhooks.rs):
- Discord, Slack, and generic webhook support
- Rich notification formatting
- Retry logic and error handling
1. CLI parses arguments → ScanConfig
2. Scanner discovers package files
3. Parsers extract package information
4. Matcher queries OSV.dev API
5. Vulnerabilities are filtered and matched
6. Reporter formats and outputs results
1. Scheduler triggers scan job
2. Git Monitor clones/updates repositories
3. Scanner processes each repository
4. Policy Engine filters vulnerabilities
5. Webhooks send notifications
6. Results are stored/exported
- Built on tokio for maximum concurrency
- Non-blocking I/O operations throughout
- Efficient handling of multiple API requests
- Background task management for automation
- Each ecosystem has a dedicated parser
- Common interface via the PackageParser trait
- Easy to add new ecosystems
- Isolated parsing logic per ecosystem
- Uses the semver crate for proper version comparison
- Handles complex version ranges and constraints
- Fixes critical version comparison bugs caused by string-based comparison (for example, "1.10.0" sorting before "1.9.0")
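The difference between string ordering and numeric version ordering can be shown with a small standard-library sketch. Vulfy relies on the semver crate for this; the hand-written parser below is a simplified stand-in that ignores pre-release and build metadata, and parse_version, compare_versions, and is_affected are hypothetical names.

```rust
use std::cmp::Ordering;

/// Parse `major.minor.patch`, tolerating a leading `v`.
/// Pre-release and build suffixes are dropped, which real semver does not do.
pub fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let s = s.trim();
    let s = s.strip_prefix('v').unwrap_or(s);
    let core = s.split(|c: char| c == '-' || c == '+').next()?;
    let mut parts = core.split('.').map(|p| p.parse::<u64>().ok());
    let version = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None; // more than three numeric components
    }
    Some(version)
}

/// Compare numerically, component by component.
pub fn compare_versions(a: &str, b: &str) -> Option<Ordering> {
    Some(parse_version(a)?.cmp(&parse_version(b)?))
}

/// True when `version` lies in the half-open range [introduced, fixed).
pub fn is_affected(version: &str, introduced: &str, fixed: &str) -> Option<bool> {
    let v = parse_version(version)?;
    Some(v >= parse_version(introduced)? && v < parse_version(fixed)?)
}
```

Comparing the raw strings puts "1.10.0" before "1.9.0" because the character '1' sorts before '9'; comparing the parsed tuples gives the expected order.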
- Custom error types with thiserror
- Comprehensive error context
- Graceful degradation on parsing failures
- Detailed logging with tracing
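A sketch of how a custom error type and graceful degradation fit together: one bad manifest is recorded as an error while the remaining files are still processed. The Display and Error implementations are written by hand here, which is roughly what thiserror would derive; ScanError, parse_manifest, parse_all, and the name@version input format are assumptions for illustration.

```rust
use std::fmt;

/// Hypothetical error type carrying enough context to report the failure.
#[derive(Debug)]
pub enum ScanError {
    Parse { file: String, reason: String },
    Api { status: u16 },
}

impl fmt::Display for ScanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ScanError::Parse { file, reason } => write!(f, "failed to parse {}: {}", file, reason),
            ScanError::Api { status } => write!(f, "OSV API returned HTTP {}", status),
        }
    }
}

impl std::error::Error for ScanError {}

/// Toy parser: every non-empty line must look like `name@version`.
fn parse_manifest(file: &str, contents: &str) -> Result<Vec<String>, ScanError> {
    contents
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| {
            line.split_once('@')
                .map(|(name, version)| format!("{} {}", name.trim(), version.trim()))
                .ok_or_else(|| ScanError::Parse {
                    file: file.to_string(),
                    reason: format!("expected name@version, got {:?}", line),
                })
        })
        .collect()
}

/// Parse every file, collecting failures instead of stopping at the first one.
pub fn parse_all(inputs: &[(&str, &str)]) -> (Vec<String>, Vec<ScanError>) {
    let mut packages = Vec::new();
    let mut errors = Vec::new();
    for (file, contents) in inputs {
        match parse_manifest(file, contents) {
            Ok(found) => packages.extend(found),
            Err(err) => errors.push(err),
        }
    }
    (packages, errors)
}
```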
- TOML-based configuration files
- Environment variable overrides
- Validation at load time
- Type-safe configuration structs
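Environment variable overrides with validation at load time might be layered over a typed config struct as follows. The AppConfig fields, the default values, and the variable names VULFY_OSV_API_URL and VULFY_MAX_CONCURRENT_REQUESTS are all hypothetical; the lookup is passed in as a closure so the logic can be exercised without touching the real process environment.

```rust
/// Hypothetical typed configuration, as it might look after TOML parsing.
#[derive(Debug, Clone, PartialEq)]
pub struct AppConfig {
    pub osv_api_url: String,
    pub max_concurrent_requests: usize,
}

impl Default for AppConfig {
    fn default() -> Self {
        Self {
            osv_api_url: "https://api.osv.dev".to_string(),
            max_concurrent_requests: 8, // assumed default
        }
    }
}

/// Apply overrides from `lookup` and validate each value before accepting it.
pub fn apply_env_overrides<F>(mut cfg: AppConfig, lookup: F) -> Result<AppConfig, String>
where
    F: Fn(&str) -> Option<String>,
{
    if let Some(url) = lookup("VULFY_OSV_API_URL") {
        if !url.starts_with("https://") {
            return Err(format!("VULFY_OSV_API_URL must use https, got {:?}", url));
        }
        cfg.osv_api_url = url;
    }
    if let Some(raw) = lookup("VULFY_MAX_CONCURRENT_REQUESTS") {
        let n: usize = raw
            .trim()
            .parse()
            .map_err(|_| format!("VULFY_MAX_CONCURRENT_REQUESTS is not a number: {:?}", raw))?;
        if n == 0 {
            return Err("VULFY_MAX_CONCURRENT_REQUESTS must be at least 1".to_string());
        }
        cfg.max_concurrent_requests = n;
    }
    Ok(cfg)
}
```

In a real binary the lookup would typically be `|key| std::env::var(key).ok()`.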
- Parallel package file discovery
- Concurrent API requests with rate limiting
- Async I/O operations throughout
- Background automation tasks
- Streaming parsers for large files
- Lazy loading of package information
- Efficient data structures
- Proper resource cleanup
- Configurable concurrent request limits
- Exponential backoff on failures
- Request batching where possible
- Respectful API usage patterns
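Exponential backoff on failures can be sketched as a capped delay calculation plus a retry loop. This version is synchronous and takes the sleep function as a parameter, which is an assumption made to keep the example free of an async runtime; in Vulfy the waiting would presumably happen on tokio. The function names are hypothetical.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Run `op` until it succeeds or `max_attempts` calls have failed.
/// `op` is always called at least once, and `sleep` is called between attempts.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

Passing `std::thread::sleep` as the last argument gives a blocking retry; tests can pass a closure that only records the delays.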
- Environment variable support for tokens
- SSH key authentication for Git
- Secure credential storage
- No credentials in configuration files
- Comprehensive parsing validation
- Regex pattern validation
- URL validation for webhooks
- Path traversal protection
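Two of the checks above, webhook URL validation and path traversal protection, can be illustrated with the standard library alone. Both functions are hypothetical sketches: the URL check is deliberately shallow (scheme and non-empty host only), and a real implementation would more likely use a dedicated URL parser.

```rust
use std::path::{Component, Path, PathBuf};

/// Accept only https URLs with a non-empty host that contains no whitespace.
pub fn validate_webhook_url(url: &str) -> Result<(), String> {
    let rest = url
        .strip_prefix("https://")
        .ok_or_else(|| format!("webhook URL must start with https://, got {:?}", url))?;
    let host = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    if host.is_empty() || host.contains(char::is_whitespace) {
        return Err(format!("webhook URL has an invalid host: {:?}", url));
    }
    Ok(())
}

/// Join `relative` onto `root` only if it cannot escape `root`:
/// absolute paths and any `..` component are rejected.
pub fn safe_join(root: &Path, relative: &str) -> Option<PathBuf> {
    let rel = Path::new(relative);
    let stays_inside = rel
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
    if stays_inside {
        Some(root.join(rel))
    } else {
        None
    }
}
```

This lexical check does not resolve symlinks; code that must defend against those would also need to canonicalize the result and compare prefixes.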
- HTTPS-only API communication
- Certificate validation
- Timeout handling
- Secure webhook delivery
- Create new parser in src/scanner/
- Implement the PackageParser trait
- Register parser in scanner module
- Add ecosystem to types.rs
- Update CLI documentation
- Add format variant to the ReportFormat enum
- Implement formatting logic in reporter.rs
- Update CLI options
- Add tests and documentation
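The first steps for a new output format could look like this. Only the enum name ReportFormat comes from the text above; the variant names, the FromStr implementation, and file_extension are assumptions for illustration.

```rust
use std::str::FromStr;

/// Variants mirror the formats listed under the reporters; names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReportFormat {
    Table,
    Json,
    Csv,
    Sarif,
    Summary,
}

impl FromStr for ReportFormat {
    type Err = String;

    /// Parse the value of a hypothetical `--format` option, ignoring case.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "table" => Ok(ReportFormat::Table),
            "json" => Ok(ReportFormat::Json),
            "csv" => Ok(ReportFormat::Csv),
            "sarif" => Ok(ReportFormat::Sarif),
            "summary" => Ok(ReportFormat::Summary),
            other => Err(format!("unknown report format: {:?}", other)),
        }
    }
}

impl ReportFormat {
    /// No wildcard arm: adding a variant makes this match fail to compile
    /// until the new format is handled here as well.
    pub fn file_extension(self) -> &'static str {
        match self {
            ReportFormat::Table | ReportFormat::Summary => "txt",
            ReportFormat::Json => "json",
            ReportFormat::Csv => "csv",
            ReportFormat::Sarif => "sarif",
        }
    }
}
```

Exhaustive matches like the one in file_extension are what make the remaining steps hard to forget, since the compiler points at every place the new variant is not yet covered.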
- Add webhook type to the WebhookType enum
- Implement formatting in webhooks.rs
webhooks.rs - Add configuration options
- Update validation logic
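Per-channel formatting might be organized as one match over the webhook type. The enum name WebhookType comes from the text above, but its variants, build_payload, and the hand-written JSON escaping are assumptions for this sketch; with serde already among the dependencies, real code would more likely serialize a struct. Discord webhooks read the message from a content field and Slack incoming webhooks from a text field; the message key for generic webhooks is made up.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WebhookType {
    Discord,
    Slack,
    Generic,
}

/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the minimal JSON body each channel expects for a plain message.
pub fn build_payload(kind: WebhookType, message: &str) -> String {
    let text = json_escape(message);
    match kind {
        WebhookType::Discord => format!("{{\"content\":\"{}\"}}", text),
        WebhookType::Slack => format!("{{\"text\":\"{}\"}}", text),
        // Hypothetical shape for generic webhooks.
        WebhookType::Generic => format!("{{\"message\":\"{}\"}}", text),
    }
}
```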
- Parser validation with sample files
- Version comparison edge cases
- Configuration validation
- Error handling scenarios
- End-to-end scan workflows
- API integration testing
- Automation system testing
- Output format validation
- Large project scanning
- Concurrent request handling
- Memory usage validation
- API rate limiting compliance
- tokio: Async runtime and utilities
- reqwest: HTTP client for OSV.dev API
- serde: Serialization/deserialization
- clap: Command-line argument parsing
- semver: Semantic version parsing and comparison
- toml: TOML configuration parsing
- quick-xml: XML parsing for Maven files
- regex: Pattern matching for policies
- walkdir: Recursive directory traversal
- git2: Git operations
- tokio-cron-scheduler: Job scheduling
- chrono: Date/time handling
- uuid: Unique identifier generation
- Database backend for large-scale deployments
- Distributed scanning capabilities
- Caching layer for vulnerability data
- Horizontal scaling support
- Dynamic parser loading
- Custom notification handlers
- Third-party integrations
- Configuration extensions
- Local vulnerability database
- Incremental scanning
- Result caching
- Parallel repository processing
Next: Adding Ecosystems - Guide for supporting new package managers