Skip to content

DoS audit: 13 application-layer vulnerabilities in MCP server#218

Closed
RafaelPo wants to merge 1 commit into
mainfrom
audit-dos
Closed

DoS audit: 13 application-layer vulnerabilities in MCP server#218
RafaelPo wants to merge 1 commit into
mainfrom
audit-dos

Conversation

@RafaelPo

Copy link
Copy Markdown
Contributor

Summary

  • Comprehensive defensive DoS audit of the MCP server (everyrow-mcp/src/everyrow_mcp/)
  • Identified 13 vulnerabilities: 1 Critical, 4 High, 5 Medium, 3 Low
  • Covers Redis memory exhaustion, upload abuse, task queue flooding, connection exhaustion, rate limit bypass, and algorithmic complexity
  • Each finding includes file paths with line numbers, description, proof-of-concept exploit, severity rating, and recommended fix
  • Also documents 14 positive security controls already in place

Key findings

  • Critical: everyrow_progress 12s server-side sleep enables connection exhaustion (DOS-03)
  • High: Body size middleware only covers /api/uploads/, not the MCP endpoint (DOS-09)
  • High: No per-user task submission limit — single user can flood upstream API (DOS-05)
  • High: Result page cache key multiplication via unbounded offset/page_size (DOS-01)
  • High: 50 MB CSV cache per task with no aggregate cap across users (DOS-02)

Test plan

  • Review each finding for accuracy against current codebase
  • Prioritize fixes per the recommended order in the report
  • Implement mitigations starting with Critical/High findings

🤖 Generated with Claude Code

@RafaelPo

Copy link
Copy Markdown
Contributor Author

@claude code review

@github-actions

github-actions Bot commented Feb 25, 2026

Copy link
Copy Markdown
Contributor

Claude finished @RafaelPo's task in 2m 55s —— View job


Code Review: DoS Audit Report

  • Read CLAUDE.md and understand repo conventions
  • Read the audit report (DOS_AUDIT_REPORT.md)
  • Cross-reference findings against actual source code
  • Verify accuracy of file paths, line numbers, and code references
  • Provide review feedback

Overall Assessment

This is a well-structured, thorough defensive DoS audit. I cross-referenced every finding against the actual source code and the report is remarkably accurate — file paths, line numbers, code behavior descriptions, and vulnerability characterizations all check out. Severity ratings are reasonable.

Verification Summary

Finding Files/Lines Accurate? Behavior Accurate? Notes
DOS-01 (Page cache multiplication) Confirmed: raw offset used in cache key at result_store.py:192, not the clamped value from line 190
DOS-02 (CSV cache exhaustion) Confirmed: MAX_CSV_CACHE_CHARS = 50 * 1024 * 1024 at redis_store.py:210, CSV_CACHE_TTL = 3600 at line 28, no aggregate cap
DOS-03 (12s sleep) Confirmed: asyncio.sleep(redis_store.PROGRESS_POLL_DELAY) at tools.py:758, PROGRESS_POLL_DELAY = 12 at redis_store.py:25
DOS-04 (Pandas memory amplification) Confirmed: body = await request.body() at uploads.py:230, then pd.read_csv(BytesIO(body)) at line 291
DOS-05 (No per-user task limit) Confirmed: rate limit in middleware.py:43-91 is IP-only, no per-user task tracking
DOS-06 (Fixed-window burst) Confirmed: window_id = str(int(time.time()) // self._window_seconds) at middleware.py:73
DOS-07 (Inline data amplification) Confirmed: validate_data_size at models.py:148-175 only checks row count; create_model("DynamicInput", **fields) at tools.py:217-219 has no size guard
DOS-08 (OAuth registration flood) ⚠️ ⚠️ Minor: register_client at auth.py:227-235 doesn't directly call _check_rate_limit — enforcement depends on MCP SDK's registration endpoint invoking the rate limiter
DOS-09 (Body size limit scope) Confirmed: path_prefix="/api/uploads/" at middleware.py:109, no override in http_config.py:215-216
DOS-10 (No Redis maxmemory) Confirmed: no maxmemory config in redis_store.py:94-152 or config.py
DOS-11 (Upload rate limit bypass) Confirmed: rate key uses token_hash at uploads.py:269, not user ID or IP
DOS-12 (JSON serialization cost) Confirmed: per-row json.dumps() at result_store.py:68
DOS-13 (Slowloris) Confirmed: no request timeout in routes.py:76-126

Issues Found

1. DOS-08 rate limiting claim needs clarification — The report states OAuth registration has a rate limit of 10/min/IP. While config.py:58-65 defines registration_rate_limit = 10 and registration_rate_window = 60, the register_client method at auth.py:227-235 doesn't itself enforce this. The _check_rate_limit helper is called for "start" and "callback" auth actions but the enforcement path for client registration specifically depends on the MCP SDK's provider interface. This should be clarified — does the SDK enforce this, or is it a gap?

Fix this →

2. Consider adding a "Methodology" section — The report jumps straight into findings. A brief methodology section (tools used, scope boundaries, what was/wasn't tested) would strengthen it as an audit artifact.

3. Positive findings section is valuable — The 14 documented security controls already in place (TTLs, SSRF protection, HMAC-signed uploads, GETDEL for single-use tokens, etc.) all verified as accurate. This is useful context for prioritization.

Minor Suggestions

  • The recommended priority order at the end is well-reasoned. DOS-03 (Critical, cheapest to fix) and DOS-09 (High, simple config change) are correctly identified as top priorities.
  • For DOS-01, the fix suggestion to normalize cache keys using clamped_offset is good — the existing clamped = min(offset, len(all_records)) at result_store.py:190 already computes the right value, it just needs to be used in the cache key at line 192.
  • The PoC scripts are well-written and realistic, though note that several assume direct HTTP access to the MCP endpoint rather than going through the MCP protocol SDK, which would be the more realistic attack vector.

Verdict

Approve with minor comments. The report is high quality — accurate, well-organized, and actionable. The one factual clarification needed (DOS-08 rate limit enforcement path) is minor. This provides a solid roadmap for hardening the MCP server against application-layer DoS.


|

DOS-03: Remove the 12-second asyncio.sleep from everyrow_progress that
held connections open server-side. Polling cadence is now client-driven
via retry_after_seconds in the response text. Removes the connection
exhaustion vector identified in the audit.

DOS-09: Extend BodySizeLimitMiddleware to accept multiple path prefixes
and cover both /api/uploads/ and /mcp endpoints. Previously only upload
paths were protected, leaving the MCP endpoint with no body size limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@RafaelPo

Copy link
Copy Markdown
Contributor Author

Closing this PR — removing the server-side sleep without adding per-task poll throttling leaves the endpoint unprotected against rapid polling. The retry_after_seconds hint in the response text is advisory only; a client that ignores it faces no enforcement beyond the global IP rate limit (100 req/60s shared across all endpoints).

A proper fix needs server-side enforcement, e.g. a per-task cooldown key in Redis that rejects progress polls within N seconds of the last one for the same task_id.

@RafaelPo RafaelPo closed this Feb 25, 2026
@RafaelPo RafaelPo deleted the audit-dos branch February 25, 2026 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant