Commit 8a0a084

Dumbris and claude committed
feat(security): add data flow security with agent hook integration (Spec 027)
Detect and prevent data exfiltration by tracking how data flows between internal tools (Read, databases) and external tools (WebFetch, Slack). Operates in two modes: proxy-only (universal, any agent) and full mode with agent hook integration for intercepting agent-internal tool calls.

Key components:
- Tool/server classifier with internal/external/hybrid/unknown categories
- Content hasher using SHA256 per-field extraction for flow matching
- Flow tracker with session-scoped origin recording and edge detection
- Policy evaluator with configurable actions (allow/warn/ask/deny)
- Session correlator linking agent hook sessions to MCP proxy sessions
- Hook CLI commands (install/uninstall/status/evaluate) for Claude Code
- POST /api/v1/hooks/evaluate REST endpoint
- Activity logging for hook_evaluation and flow_summary event types
- Web UI nudge system for hook installation when in proxy-only mode
- E2E tests for both proxy-only and hook-enhanced flow detection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 29cf86e commit 8a0a084

55 files changed

Lines changed: 9829 additions & 11 deletions

CLAUDE.md

Lines changed: 72 additions & 0 deletions
@@ -61,6 +61,17 @@ mcpproxy doctor # Run health checks

 See [docs/cli-management-commands.md](docs/cli-management-commands.md) for complete reference.

+### Hook Integration CLI (Spec 027)
+```bash
+mcpproxy hook install --agent claude-code # Install hooks (project scope)
+mcpproxy hook install --agent claude-code --scope user # Install hooks (user scope)
+mcpproxy hook uninstall --agent claude-code # Remove hooks
+mcpproxy hook status --agent claude-code # Check hook installation status
+mcpproxy hook evaluate --event PreToolUse # Evaluate tool call (reads JSON from stdin)
+```
+
+Hooks enable full data flow security by intercepting agent-internal tool calls (Read, Write, Bash, etc.) that the MCP proxy cannot see directly.
+
 ### Activity Log CLI
 ```bash
 mcpproxy activity list # List recent activity
@@ -105,6 +116,7 @@ See [docs/cli-output-formatting.md](docs/cli-output-formatting.md) for complete
 | `internal/storage/` | BBolt database |
 | `internal/management/` | Centralized server management |
 | `internal/oauth/` | OAuth 2.1 with PKCE |
+| `internal/security/flow/` | Data flow security: classification, tracking, policy |
 | `internal/logs/` | Structured logging with per-server files |

 See [docs/architecture.md](docs/architecture.md) for diagrams and details.
@@ -194,6 +206,7 @@ See [docs/configuration.md](docs/configuration.md) for complete reference.
 | `POST /api/v1/servers/{name}/enable` | Enable/disable server |
 | `POST /api/v1/servers/{name}/quarantine` | Quarantine/unquarantine server |
 | `GET /api/v1/tools` | Search tools across servers |
+| `POST /api/v1/hooks/evaluate` | Evaluate tool call for data flow security |
 | `GET /api/v1/activity` | List activity records with filtering |
 | `GET /api/v1/activity/{id}` | Get activity record details |
 | `GET /api/v1/activity/export` | Export activity records (JSON/CSV) |
@@ -379,6 +392,63 @@ mcpproxy activity export --sensitive-data --output audit.jsonl # Export for com

 See [docs/features/sensitive-data-detection.md](docs/features/sensitive-data-detection.md) for complete reference.

+## Data Flow Security (Spec 027)
+
+Detects data exfiltration patterns by tracking how data flows between internal tools (Read, databases) and external tools (WebFetch, Slack). Operates in two modes:
+
+- **Proxy-only mode**: Monitors MCP tool calls through the proxy (universal, any agent)
+- **Full mode**: Also intercepts agent-internal tools via hooks (requires hook installation)
+
+### Key Concepts
+
+- **Classification**: Tools/servers classified as internal, external, hybrid, or unknown
+- **Flow Types**: internal→internal (safe), internal→external (critical), external→internal, external→external
+- **Content Hashing**: SHA256 per-field hashing to detect data movement without storing content
+- **Session Correlation**: Links agent hook sessions to MCP proxy sessions via argument hash matching
+
+### Configuration
+
+```json
+{
+  "security": {
+    "flow_tracking": {
+      "enabled": true,
+      "session_timeout_minutes": 30,
+      "max_origins_per_session": 10000,
+      "hash_min_length": 20
+    },
+    "classification": {
+      "server_overrides": {
+        "my-private-slack": "internal"
+      }
+    },
+    "flow_policy": {
+      "internal_to_external": "ask",
+      "sensitive_data_external": "deny",
+      "suspicious_endpoints": ["pastebin.com", "webhook.site"]
+    },
+    "hooks": {
+      "enabled": true,
+      "fail_open": true,
+      "correlation_ttl_seconds": 5
+    }
+  }
+}
+```
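As a rough illustration of how the `flow_policy` settings in the configuration above could drive a decision, the sketch below decodes that object and maps a flow to one of the allow/warn/ask/deny actions. The Go type and function names (`FlowPolicy`, `parseFlowPolicy`, `decide`) and the precedence between the two policy keys are assumptions made for this example; the commit's actual evaluator in `internal/security/flow/policy.go` is not shown here and may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FlowPolicy mirrors the "flow_policy" JSON object. The Go names are
// hypothetical; only the JSON keys come from the configuration above.
type FlowPolicy struct {
	InternalToExternal    string   `json:"internal_to_external"`
	SensitiveDataExternal string   `json:"sensitive_data_external"`
	SuspiciousEndpoints   []string `json:"suspicious_endpoints"`
}

// parseFlowPolicy extracts security.flow_policy from a config document.
func parseFlowPolicy(data []byte) (FlowPolicy, error) {
	var root struct {
		Security struct {
			FlowPolicy FlowPolicy `json:"flow_policy"`
		} `json:"security"`
	}
	err := json.Unmarshal(data, &root)
	return root.Security.FlowPolicy, err
}

// flowType builds a label such as "internal_to_external".
func flowType(src, dst string) string {
	return src + "_to_" + dst
}

// decide returns an action for a flow. Assumed precedence: sensitive data
// reaching an external tool is checked first, then plain internal->external
// flows; everything else is allowed.
func decide(p FlowPolicy, src, dst string, sensitive bool) string {
	switch {
	case dst == "external" && sensitive && p.SensitiveDataExternal != "":
		return p.SensitiveDataExternal
	case flowType(src, dst) == "internal_to_external" && p.InternalToExternal != "":
		return p.InternalToExternal
	default:
		return "allow"
	}
}

func main() {
	raw := []byte(`{"security":{"flow_policy":{
		"internal_to_external":"ask","sensitive_data_external":"deny"}}}`)
	policy, err := parseFlowPolicy(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(decide(policy, "internal", "external", false)) // ask
	fmt.Println(decide(policy, "internal", "external", true))  // deny
	fmt.Println(decide(policy, "internal", "internal", false)) // allow
}
```

The sketch ignores `suspicious_endpoints` and the hybrid/unknown classifications; a fuller evaluator would need rules for those as well.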
+
+### Key Files
+
+| File | Purpose |
+|------|---------|
+| `internal/security/flow/classifier.go` | Server/tool classification (internal/external) |
+| `internal/security/flow/tracker.go` | Flow session and origin tracking |
+| `internal/security/flow/hasher.go` | Content hashing for flow detection |
+| `internal/security/flow/service.go` | Flow service orchestrator |
+| `internal/security/flow/correlator.go` | Session correlation (hook↔MCP) |
+| `internal/security/flow/policy.go` | Policy evaluation engine |
+| `internal/httpapi/hooks.go` | POST /api/v1/hooks/evaluate endpoint |
+| `cmd/mcpproxy/hook_cmd.go` | Hook CLI commands (install/uninstall/status/evaluate) |
+
 ### Exit Codes

 | Code | Meaning |
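The "Content Hashing" concept in the hunk above (SHA256 per-field hashing, with `hash_min_length` as a lower bound) can be sketched as follows. This is a minimal illustration, not the contents of `hasher.go`: the function names `hashFields` and `overlaps` are hypothetical, and the sketch only detects field values that are copied verbatim.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashFields walks a decoded JSON value and returns the SHA-256 hex digests
// of every string field that is at least minLen bytes long. Only digests are
// kept, so the original content is never stored.
func hashFields(v any, minLen int) map[string]struct{} {
	out := make(map[string]struct{})
	var walk func(any)
	walk = func(v any) {
		switch t := v.(type) {
		case string:
			if len(t) >= minLen {
				sum := sha256.Sum256([]byte(t))
				out[hex.EncodeToString(sum[:])] = struct{}{}
			}
		case map[string]any:
			for _, child := range t {
				walk(child)
			}
		case []any:
			for _, child := range t {
				walk(child)
			}
		}
	}
	walk(v)
	return out
}

// overlaps reports whether any outgoing hash matches a recorded origin hash.
func overlaps(origins, outgoing map[string]struct{}) bool {
	for h := range outgoing {
		if _, ok := origins[h]; ok {
			return true
		}
	}
	return false
}

func main() {
	secret := "db-password=correct-horse-battery-staple"

	// Result of an internal tool (for example Read): record hashes as origins.
	origins := hashFields(map[string]any{"content": secret}, 20)

	// Arguments of a later external tool call (for example WebFetch).
	outgoing := hashFields(map[string]any{
		"url":  "https://example.com",
		"body": secret,
	}, 20)

	fmt.Println(overlaps(origins, outgoing)) // true
}
```

Skipping short strings keeps common values such as `"true"` or short IDs from producing spurious matches, which is presumably the motivation for `hash_min_length`.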
@@ -471,6 +541,8 @@ See `docs/prerelease-builds.md` for download instructions.
 - BBolt database (`~/.mcpproxy/config.db`) - ActivityRecord model (024-expand-activity-log)
 - Go 1.24 (toolchain go1.24.10) + BBolt (storage), Chi router (HTTP), Zap (logging), regexp (stdlib), existing ActivityService (026-pii-detection)
 - BBolt database (`~/.mcpproxy/config.db`) - ActivityRecord.Metadata extension (026-pii-detection)
+- Go 1.24 (toolchain go1.24.10) + BBolt (storage), Chi router (HTTP), Zap (logging), mcp-go (MCP protocol), regexp (stdlib), crypto/sha256 (stdlib), existing `security.Detector` (027-data-flow-security)
+- BBolt database (`~/.mcpproxy/config.db`) - ActivityRecord.Metadata extension for hook_evaluation type. Flow sessions are in-memory only (not persisted). (027-data-flow-security)

 ## Recent Changes
 - 001-update-version-display: Added Go 1.24 (toolchain go1.24.10)

cmd/mcpproxy/activity_cmd.go

Lines changed: 49 additions & 1 deletion
@@ -42,6 +42,8 @@ var (
 	activityNoIcons bool // Disable emoji icons in output
 	activityDetectionType string // Spec 026: Filter by detection type (e.g., "aws_access_key")
 	activitySeverity string // Spec 026: Filter by severity level (critical, high, medium, low)
+	activityFlowType string // Spec 027: Filter by flow type (e.g., "internal_to_external")
+	activityRiskLevel string // Spec 027: Filter by risk level (e.g., "critical", "high")

 	// Show command flags
 	activityIncludeResponse bool
@@ -72,6 +74,8 @@ type ActivityFilter struct {
 	SensitiveData *bool // Spec 026: Filter by sensitive data detection
 	DetectionType string // Spec 026: Filter by detection type
 	Severity string // Spec 026: Filter by severity level
+	FlowType string // Spec 027: Filter by flow type
+	RiskLevel string // Spec 027: Filter by risk level
 }

 // Validate validates the filter options
@@ -81,6 +85,8 @@ func (f *ActivityFilter) Validate() error {
 		validTypes := []string{
 			"tool_call", "policy_decision", "quarantine_change", "server_change",
 			"system_start", "system_stop", "internal_tool_call", "config_change", // Spec 024: new types
+			"hook_evaluation", // Spec 027: hook evaluation events
+			"flow_summary", // Spec 027: flow session summaries
 		}
 		// Split by comma for multi-type support
 		types := strings.Split(f.Type, ",")
@@ -144,6 +150,36 @@ func (f *ActivityFilter) Validate() error {
 		}
 	}

+	// Validate flow_type (Spec 027)
+	if f.FlowType != "" {
+		validFlowTypes := []string{"internal_to_internal", "internal_to_external", "external_to_internal", "external_to_external"}
+		valid := false
+		for _, ft := range validFlowTypes {
+			if f.FlowType == ft {
+				valid = true
+				break
+			}
+		}
+		if !valid {
+			return fmt.Errorf("invalid flow-type '%s': must be one of %v", f.FlowType, validFlowTypes)
+		}
+	}
+
+	// Validate risk_level (Spec 027)
+	if f.RiskLevel != "" {
+		validRiskLevels := []string{"none", "low", "medium", "high", "critical"}
+		valid := false
+		for _, rl := range validRiskLevels {
+			if f.RiskLevel == rl {
+				valid = true
+				break
+			}
+		}
+		if !valid {
+			return fmt.Errorf("invalid risk-level '%s': must be one of %v", f.RiskLevel, validRiskLevels)
+		}
+	}
+
 	// Validate time formats
 	if f.StartTime != "" {
 		if _, err := time.Parse(time.RFC3339, f.StartTime); err != nil {
@@ -213,6 +249,13 @@ func (f *ActivityFilter) ToQueryParams() url.Values {
 	if f.Severity != "" {
 		q.Set("severity", f.Severity)
 	}
+	// Spec 027: Add data flow security filters
+	if f.FlowType != "" {
+		q.Set("flow_type", f.FlowType)
+	}
+	if f.RiskLevel != "" {
+		q.Set("risk_level", f.RiskLevel)
+	}
 	return q
 }

@@ -706,7 +749,7 @@ func init() {
 	activityCmd.AddCommand(activityExportCmd)

 	// List command flags
-	activityListCmd.Flags().StringVarP(&activityType, "type", "t", "", "Filter by type (comma-separated for multiple): tool_call, system_start, system_stop, internal_tool_call, config_change, policy_decision, quarantine_change, server_change")
+	activityListCmd.Flags().StringVarP(&activityType, "type", "t", "", "Filter by type (comma-separated for multiple): tool_call, system_start, system_stop, internal_tool_call, config_change, policy_decision, quarantine_change, server_change, hook_evaluation, flow_summary")
 	activityListCmd.Flags().StringVarP(&activityServer, "server", "s", "", "Filter by server name")
 	activityListCmd.Flags().StringVar(&activityTool, "tool", "", "Filter by tool name")
 	activityListCmd.Flags().StringVar(&activityStatus, "status", "", "Filter by status: success, error, blocked")
@@ -722,6 +765,9 @@ func init() {
 	activityListCmd.Flags().Bool("sensitive-data", false, "Filter to show only activities with sensitive data detected")
 	activityListCmd.Flags().StringVar(&activityDetectionType, "detection-type", "", "Filter by detection type (e.g., aws_access_key, stripe_key)")
 	activityListCmd.Flags().StringVar(&activitySeverity, "severity", "", "Filter by severity level: critical, high, medium, low")
+	// Spec 027: Data flow security filters
+	activityListCmd.Flags().StringVar(&activityFlowType, "flow-type", "", "Filter by data flow type: internal_to_internal, internal_to_external, external_to_internal, external_to_external")
+	activityListCmd.Flags().StringVar(&activityRiskLevel, "risk-level", "", "Filter by risk level (>= comparison): none, low, medium, high, critical")

 	// Watch command flags
 	activityWatchCmd.Flags().StringVarP(&activityType, "type", "t", "", "Filter by type (comma-separated): tool_call, system_start, system_stop, internal_tool_call, config_change, policy_decision, quarantine_change, server_change")
@@ -816,6 +862,8 @@ func runActivityList(cmd *cobra.Command, _ []string) error {
 		SensitiveData: sensitiveDataPtr,
 		DetectionType: activityDetectionType,
 		Severity: activitySeverity,
+		FlowType: activityFlowType,
+		RiskLevel: activityRiskLevel,
 	}

 	if err := filter.Validate(); err != nil {
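The `--risk-level` help text above describes a ">= comparison", while the client-side `Validate` only checks that the value is one of the five known levels. The ordering itself is presumably applied wherever the filter is evaluated, which is not part of the hunks shown. A minimal sketch of such an ordered comparison, using a hypothetical `riskAtLeast` helper and assuming the levels are ordered as listed in the flag help:

```go
package main

import "fmt"

// riskRank orders the levels as listed in the --risk-level flag help.
// Treating that list as an ascending order is an assumption of this sketch.
var riskRank = map[string]int{
	"none":     0,
	"low":      1,
	"medium":   2,
	"high":     3,
	"critical": 4,
}

// riskAtLeast reports whether level is at or above threshold.
// Unknown values on either side never match.
func riskAtLeast(level, threshold string) bool {
	l, okLevel := riskRank[level]
	t, okThreshold := riskRank[threshold]
	return okLevel && okThreshold && l >= t
}

func main() {
	fmt.Println(riskAtLeast("critical", "high")) // true
	fmt.Println(riskAtLeast("low", "high"))      // false
	fmt.Println(riskAtLeast("bogus", "none"))    // false
}
```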

cmd/mcpproxy/doctor_cmd.go

Lines changed: 11 additions & 0 deletions
@@ -417,6 +417,17 @@ func displaySecurityFeaturesStatus() {
 		fmt.Println(" ✗ Sensitive Data Detection: disabled")
 		fmt.Println(" Enable: set sensitive_data_detection.enabled = true in config")
 	}
+
+	// Data Flow Security status (Spec 027)
+	secCfg := cfg.GetSecurityConfig()
+	if secCfg.IsFlowTrackingEnabled() {
+		fmt.Println(" ✓ Data Flow Security: enabled")
+		fmt.Println(" Coverage: proxy_only (hooks not installed)")
+		fmt.Println(" Upgrade: mcpproxy hook install --agent claude-code")
+	} else {
+		fmt.Println(" ✗ Data Flow Security: disabled")
+		fmt.Println(" Enable: set security.flow_tracking.enabled = true in config")
+	}
 }

 // formatCategoryList formats a list of categories for display, truncating if too long.
