Detect plaintext credentials in MCP config files #44

Open
dfirdeferred wants to merge 1 commit into perplexityai:main from dfirdeferred:main
@dfirdeferred

Summary

  • Emits plaintext_credential findings for hardcoded API keys and tokens in MCP config env and headers blocks
  • Credentials are redacted to their identifying prefix (e.g. sk-ant-api***, ghp_***, AKIA***), so the actual secret never appears in the output
  • Each finding includes a remediation message: replace the hardcoded value with ${VAR_NAME} and set the secret in the shell environment or a secrets manager
  • Covers 21 well-known credential prefixes (Anthropic, OpenAI, GitHub, AWS, Google, Slack, Stripe, GitLab, HuggingFace, Replicate, SendGrid) plus a conservative heuristic for unknown
    formats
  • Findings use the same NDJSON finding record format as existing package_exposure findings, with no new record types or output formats
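
For illustration, prefix-based redaction could look roughly like the sketch below. This is not the code from the PR: `knownPrefixes` and `redactToPrefix` are hypothetical names, and the list holds only the three prefixes quoted above rather than the full set of 21.

```go
package main

import (
	"fmt"
	"strings"
)

// knownPrefixes is an illustrative subset (assumption): only the three
// prefixes quoted in the summary, not the 21 the PR covers.
var knownPrefixes = []string{"sk-ant-api", "ghp_", "AKIA"}

// redactToPrefix keeps the identifying prefix and replaces everything
// after it with "***". Values with no known prefix become "***", so no
// part of an unrecognized secret is echoed back.
func redactToPrefix(value string) string {
	for _, p := range knownPrefixes {
		if strings.HasPrefix(value, p) {
			return p + "***"
		}
	}
	return "***"
}

func main() {
	fmt.Println(redactToPrefix("ghp_exampleexampleexample")) // ghp_***
	fmt.Println(redactToPrefix("AKIAEXAMPLEEXAMPLE"))        // AKIA***
	fmt.Println(redactToPrefix("not-a-known-format"))        // ***
}
```

If the prefix list ever contains overlapping entries, checking the longest prefixes first keeps the redacted form as specific as possible.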

Motivation

MCP config files are a growing attack surface as developers configure AI tool integrations. It's common for users to paste API keys directly into env blocks instead of using ${VAR}
references, leaving plaintext credentials on disk. Since bumblebee already scans these files for package inventory, it's a natural place to surface this class of misconfiguration.

Changes

  • internal/model/model.go: new FindingTypePlaintextCredential constant
  • internal/scanner/scanner.go: wires EmitFinding callback to the MCP scanner
  • internal/ecosystem/mcp/mcp.go: credential detection, redaction, and remediation logic; parses headers field
  • internal/ecosystem/mcp/mcp_test.go: 7 new test functions

Test plan

  • Known-prefix detection (Anthropic, OpenAI, GitHub, AWS) with redaction and remediation
  • Heuristic detection (secret key name + high-entropy value)
  • Header credential detection (Authorization, x-api-key)
  • Env-var references (${VAR}) never produce findings
  • Claude dual-scope config (top-level + per-project)
  • Redaction function unit tests
  • All existing tests pass unmodified

  MCP server configurations commonly carry API tokens in their env and
  headers blocks. Best practice is to use environment variable references
  (${VAR_NAME}) so secrets are never written to disk. This change inspects
  env values and authentication headers during MCP config scanning and
  emits plaintext_credential findings when a value appears to be a
  hardcoded secret.
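
A minimal sketch of how an env-var reference might be told apart from a literal value is shown below. The pattern and the `isEnvReference` name are assumptions for illustration; in particular, allowing an optional scheme word such as `Bearer` before the reference is a guess at how header values could be handled, not something the PR states.

```go
package main

import (
	"fmt"
	"regexp"
)

// envRefPattern matches a value that consists solely of a ${NAME}
// reference, optionally preceded by one scheme word ("Bearer ${TOKEN}").
// The exact grammar is an assumption, not taken from the PR.
var envRefPattern = regexp.MustCompile(`^(?:[A-Za-z]+ )?\$\{[A-Za-z_][A-Za-z0-9_]*\}$`)

// isEnvReference reports whether value defers to the environment instead
// of carrying a literal secret.
func isEnvReference(value string) bool {
	return envRefPattern.MatchString(value)
}

func main() {
	fmt.Println(isEnvReference("${ANTHROPIC_API_KEY}"))  // true
	fmt.Println(isEnvReference("Bearer ${API_TOKEN}"))   // true
	fmt.Println(isEnvReference("ghp_exampleexample123")) // false
}
```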

  Findings are emitted in the same NDJSON record stream as package and
  exposure records so receivers can act on them without new parsing logic.
  Each finding includes the credential redacted to its identifying prefix
  (e.g. sk-ant-api***, ghp_***, AKIA***) and a remediation message
  telling the user to replace the value with an env-var reference.
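
To make the record stream concrete, the sketch below encodes one finding as a single NDJSON line. The `finding` struct and its JSON field names are hypothetical; the actual record schema lives in internal/model/model.go and is not reproduced in this description.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// finding is a hypothetical record shape; every field name here is an
// assumption made for illustration.
type finding struct {
	RecordType  string `json:"record_type"`
	FindingType string `json:"finding_type"`
	Key         string `json:"key"`
	Redacted    string `json:"redacted_value"`
	Remediation string `json:"remediation"`
}

// encodeFinding renders one finding as one NDJSON line. json.Encoder
// appends a newline after each value, which is what NDJSON requires.
func encodeFinding(f finding) (string, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(f); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	line, err := encodeFinding(finding{
		RecordType:  "finding",
		FindingType: "plaintext_credential",
		Key:         "GITHUB_TOKEN",
		Redacted:    "ghp_***",
		Remediation: "replace the hardcoded value with ${GITHUB_TOKEN}",
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(line)
}
```

Only the redacted form is ever placed in the record, so a receiver that logs or forwards the stream cannot leak the original secret.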

  Detection covers 21 well-known API-key prefixes (Anthropic, OpenAI,
  GitHub, AWS, Google, Slack, Stripe, GitLab, etc.) and applies a
  conservative heuristic for unknown formats: secret-suggesting key name
  paired with a long, high-entropy value. Env-var references, known
  non-secret keys, file paths, and URLs are excluded to minimize false
  positives.
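
The heuristic for unknown formats might be sketched as follows. The key-name hints, the 20-character minimum, and the 3.5 bits-per-character entropy threshold are all assumed values chosen for illustration; the PR does not state its actual thresholds, and the allowlist of known non-secret keys is omitted here.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Assumed tuning values; the real thresholds are not given in the PR.
const (
	minSecretLen  = 20
	minEntropyBit = 3.5
)

// secretHints are substrings that suggest a key holds a secret (assumption).
var secretHints = []string{"key", "token", "secret", "password"}

// shannonEntropy returns the entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	counts := map[rune]int{}
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	h := 0.0
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// looksLikeSecret applies the conservative rule: a secret-suggesting key
// name paired with a long, high-entropy value, after excluding env-var
// references, URLs, and file paths.
func looksLikeSecret(key, value string) bool {
	if strings.Contains(value, "${") || strings.Contains(value, "://") ||
		strings.HasPrefix(value, "/") || strings.HasPrefix(value, "./") ||
		strings.HasPrefix(value, "~") {
		return false
	}
	lowerKey := strings.ToLower(key)
	hinted := false
	for _, hint := range secretHints {
		if strings.Contains(lowerKey, hint) {
			hinted = true
			break
		}
	}
	return hinted && len(value) >= minSecretLen && shannonEntropy(value) >= minEntropyBit
}

func main() {
	fmt.Println(looksLikeSecret("API_TOKEN", "q7Zt2LmX9vB4kR8sW1yN6pD3")) // true
	fmt.Println(looksLikeSecret("LOG_LEVEL", "q7Zt2LmX9vB4kR8sW1yN6pD3")) // false
	fmt.Println(looksLikeSecret("API_TOKEN", "${API_TOKEN}"))             // false
}
```

Requiring both signals is what keeps the rule conservative: a suggestive key name alone flags short or repetitive placeholders, while entropy alone flags hashes and identifiers that are not secrets.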