From 491bb4fb471d53a068858bfd76eb3c7593fab4d5 Mon Sep 17 00:00:00 2001
From: Sam Bull <git@sambull.org>
Date: Wed, 27 May 2026 01:18:18 +0100
Subject: [PATCH 1/3] Threat model chapter 4

---
 THREAT_MODEL.md | 128 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 122 insertions(+), 6 deletions(-)

diff --git a/THREAT_MODEL.md b/THREAT_MODEL.md
index 3b6648685d5..001861fed23 100644
--- a/THREAT_MODEL.md
+++ b/THREAT_MODEL.md
@@ -538,11 +538,127 @@ client-side, the writer adds masks to outgoing frames.
 - No formal CVE has been published against the WebSocket framing layer to
   date.
 
+---
+
+### 5.4. Multipart parsing & encoding
+
+**Scope.** Parsing of incoming `multipart/*` bodies (`MultipartReader`,
+`BodyPartReader`) and construction of outgoing multipart bodies
+(`MultipartWriter`, `FormData`). This includes boundary handling, per-part
+header parsing, `Content-Disposition` extraction, `Content-Transfer-Encoding`
+decoding, charset detection, and the `Request.post()` path that wraps
+`multipart/form-data`. Out of scope: the underlying compression codecs ([§5.5](#55-compression-codecs)),
+streams/payloads ([§5.6](#56-streams--payloads)), and the connection-level read/write timeouts that
+back-stop slow drips ([§5.7](#57-server-connection-lifecycle)).
+
+**Components covered.**
+
+- `aiohttp/multipart.py` — `MultipartReader`, `BodyPartReader`,
+  `MultipartWriter`, `parse_content_disposition`,
+  `content_disposition_filename`.
+- `aiohttp/formdata.py` — `FormData` builder.
+- `aiohttp/web_request.py` — `Request.post()`, `Request.multipart()`.
+- `aiohttp/payload.py` — multipart payload wrapping (overlap with [§5.6](#56-streams--payloads)).
+
+**Trust boundaries & data flow.**
+
+```mermaid
+flowchart LR
+  Wire([Untrusted body]) --> Reader[MultipartReader]
+  Reader --> Parts[BodyPartReader per part]
+  Parts -->|headers, name, filename| App([User handler])
+  Parts -->|streamed body| App
+  Reader -. nested .-> Reader
+
+  AppOut([User code]) --> FD[FormData] --> Writer[MultipartWriter]
+  Writer --> WireOut([Outbound body])
+```
+
+The reader's input is fully attacker-controlled on the server side and
+upstream-controlled on the client side. Per-part values surfaced to user
+code (`headers`, `name`, `filename`, body bytes) are all untrusted. The
+writer's input is trusted in the threat-model sense, but `FormData` is the
+boundary at which user-supplied strings can become wire bytes.
+
+**Assets at risk (chunk-specific).**
+
+- **Process memory** — number of parts, per-part body size, recursion depth
+  for nested multipart.
+- **Filesystem safety in user code** — `filename` round-trips from wire to
+  application without sanitisation.
+- **Outbound framing integrity** — boundary uniqueness, header validity in
+  `FormData.add_field`.
+
+**Threats (STRIDE).**
+
+| # | Component / Vector | STRIDE | Threat | Risk |
+| :--- | :--- | :--- | :--- | :--- |
+| 4.1 | Boundary parameter parsing | T | Malformed boundary parameter (oversized, missing, embedded CR/LF/non-bcharsnospace) could enable multipart parser confusion or smuggling. | Low |
+| 4.2 | Number of parts per body | D | No explicit cap. Each part allocates a `BodyPartReader`. Total bytes are bounded by `client_max_size` (default 1 MiB on the server), but ten-thousand 100-byte parts still fit and grow the live-object count. | Low |
+| 4.3 | Nested multipart recursion | D | `MultipartReader.next()` recurses into nested multiparts without a depth cap; deeply nested input can hit `RecursionError`. `Request.post()` short-circuits this by rejecting any nested multipart it sees, but the bare API does not. | Medium |
+| 4.4 | Per-part header block size | D | Bounded by `max_field_size=8190` and `max_headers=128` (default), plumbed in from the protocol since the Mar 2026 fix. | Low |
+| 4.5 | Per-part body size | D | Streamed; `client_max_size` enforced **during iteration** (Mar 2026 fix), not after buffering. | Low |
+| 4.6 | `Content-Disposition` filename | T / I | Filename is parsed (incl. RFC 2231 `filename*=`) and returned verbatim to the application. Path-traversal strings (`../`), absolute paths, NUL/CR/LF all reach user code unsanitised. | Medium |
+| 4.7 | `Content-Disposition` name | T | Field name returned verbatim. CR/LF in a part's `name` doesn't cross the framing boundary (the multipart parser delimits parts by boundary, not by CR/LF in field names), but downstream loggers / dashboards may misrender. | Low |
+| 4.8 | `_charset_` magic form field | T | An attacker can include a form field named `_charset_` whose value retroactively becomes the *default charset* used to decode subsequent text fields (`multipart.py:MultipartReader.next _charset_ form handling`). Surprising behaviour; documented in the standard. | Low |
+| 4.9 | `Content-Transfer-Encoding` (base64 / quoted-printable) | T / D | Decoders use `base64.b64decode` (chunk-aligned) and `binascii.a2b_qp`; both raise on malformed input. Bounded by per-part size limit. | Low |
+| 4.10 | `Content-Encoding` (gzip/deflate within a part) | D | Supported for non-form-data multiparts; decompression bounded by per-chunk `max_decompress_size` and per-part `client_max_size`. Same backend caveat as PMCE ([§5.5](#55-compression-codecs)) regarding `max_length` overshoot. | Low |
+| 4.11 | Slow-drip / fragmented parts | D | No multipart-level timeout; relies entirely on connection-level read timeouts ([§5.7](#57-server-connection-lifecycle)). A handler that doesn't set a request timeout can be held open indefinitely. | Medium |
+| 4.12 | `MultipartWriter` boundary collision | T | Outbound boundary defaults to `uuid.uuid4().hex` (cryptographically random); the writer does not escape occurrences of the boundary inside part bodies. Collision negligible for the default; real risk only if user supplies a short or predictable boundary. | Low |
+| 4.13 | `FormData.add_field` argument validation | T | `content_type` rejects CR / LF (Feb 2026 fix); `name` and `filename` are *not* validated and pass through to the wire if user code lets attacker-controlled strings reach `add_field`. | Medium |
+| 4.14 | `Request.post()` temp-file lifetime | I / D | Uploaded files are streamed to `tempfile.TemporaryFile()` and lifetime is bound to the `FileField` object. If user code drops the field reference without reading or closing, garbage collection eventually closes the temp file. | Low |
+
+**Mitigations.**
+
+| # | Threat | Existing | Recommended |
+| :--- | :--- | :--- | :--- |
+| 4.1 | Boundary parameter | 70-char cap; missing-boundary raises; HTTP header layer ([§5.1](#51-http1-parser)) catches CR/LF/NUL. | None. |
+| 4.2 | Many small parts | `client_max_size` caps total bytes. | Documented design decision: rely on `client_max_size` rather than introducing a `max_parts` knob. **User**: operators sensitive to live-object count should reduce `client_max_size`. |
+| 4.3 | Nested-multipart recursion | `Request.post()` rejects any nested multipart with `ValueError` ("To decode nested multipart you need to use custom reader") (`web_request.py:BaseRequest.post`). | **Open question.** Direct `MultipartReader` users get unlimited recursion. Consider adding a `max_nesting_depth` parameter (default e.g. 10) to fail cleanly before `RecursionError`. Tracked as a follow-up. |
+| 4.4 | Per-part headers bounded | `max_field_size` / `max_headers` plumbed since 5fe9dfb64 (Mar 2026). | None. |
+| 4.5 | Per-part body bounded | Per-iteration size check since 9cc4b917c (Mar 2026). | None. |
+| 4.6 | Filename path traversal | None; verbatim return is the documented behaviour. | **User**: sanitise `BodyPartReader.filename` before opening / saving / reflecting (strip path components, control bytes, known-bad sequences). **Open question** for aiohttp: ship a public `aiohttp.web.safe_filename()` helper and/or a clearer docstring warning; default behaviour stays as-is. |
+| 4.7 | Field-name pass-through | None. | **User**: `request.post()` keys come from attacker-controlled `name` parameters; do not feed them directly to filesystem / DB / template paths. |
+| 4.8 | `_charset_` magic field | Behaviour follows the HTML5 spec for multipart form charset selection. | Document the surprising behaviour: an attacker who can post any field can change how *subsequent* fields are decoded. **User**: if your form-handling code makes decisions based on string content, be aware that the decode charset is attacker-influenceable. |
+| 4.9 | CTE decoding | Hardened decoders; chunk-aligned base64. | None. |
+| 4.10 | Content-Encoding decompression | Per-chunk `max_decompress_size`, per-part `client_max_size` cap. | Inherits the [§5.5](#55-compression-codecs) caveat about backend `max_length` honouring. |
+| 4.11 | Slow-drip | Connection-level read timeouts in [§5.7](#57-server-connection-lifecycle). | Cross-reference in [§5.7](#57-server-connection-lifecycle). No multipart-layer change. |
+| 4.12 | Outbound boundary collision | UUID4-hex default; HMAC-grade entropy means collision with random body bytes is statistically negligible. | **User**: callers supplying their own boundary must use a random, sufficiently long value. |
+| 4.13 | `FormData` argument injection | `content_type` rejects `\r` / `\n` (`formdata.py:FormData.add_field`, dab9e879b, Feb 2026). | Extend the same check to `name` and `filename` parameters of `add_field`. Recommended hardening (Medium). |
+| 4.14 | Temp-file lifetime | `tempfile.TemporaryFile()` is unlinked at creation on POSIX; closing the FD (whether explicit or via GC) reclaims the disk. | **User**: explicitly close `FileField`s you don't intend to consume to release the temp file promptly under high load. |
+
+**Past advisories / hardening (recap).**
+
+- **GHSA-5m98-qgg9-wh84 (CVE-2024-30251)** (3.9.4) — infinite loop in multipart
+  `_read_chunk_from_length` on a malformed POST body.
+- **GHSA-jj3x-wxrx-4x23 (CVE-2025-69227)** (3.13.3) — `Request.post()` entered an infinite
+  loop on malformed bodies when `PYTHONOPTIMIZE` stripped `assert`
+  statements (asserts were doing real control-flow work in the
+  multipart reader). Fixed by replacing the assertions with explicit
+  checks. Code audit follow-up: any other `assert` used for control
+  flow should be identified and removed.
+- **GHSA-6jhg-hg63-jvvf (CVE-2025-69228)** (3.13.3) — DoS via large
+  `Request.post()` payloads.
+- **GHSA-2vrm-gr82-f7m5 (CVE-2026-34514)** (`dab9e879b`, 3.13.4, Feb 2026) —
+  `FormData.add_field` rejects CR/LF in `content_type` (header
+  injection on outbound multipart).
+- **GHSA-m5qp-6w8w-w647 (CVE-2026-34516)** (`5fe9dfb64`, 3.13.4, Mar
+  2026) — `MultipartReader` now honours `max_field_size` and
+  `max_headers` from the underlying protocol, plugging an unbounded
+  per-part header memory growth path.
+- **GHSA-3wq7-rqq7-wx6j (CVE-2026-34517)** (`9cc4b917c`, 3.13.4, Mar
+  2026) — `Request.post()` enforces `client_max_size` during iteration
+  rather than after buffering, plugging a memory-blow-up on large
+  form fields.
+
+These are all currently in place; this section assumes no regression.
+
 **Open questions.**
 
-1. Should server-side reader reject unmasked frames (and client-side reject
-   masked ones) per RFC 6455 §5.1? (Threat 3.1 — recommended.)
-2. Is the PRNG mask source (`random.getrandbits`) sufficient, or should it be
-   migrated to `secrets`/`os.urandom`? (Threat 3.2.)
-3. For long-lived WebSocket sessions, is there a use case for *forcing*
-   `no_context_takeover` defaults to limit memory growth? (Threat 3.10.)
+1. Should `MultipartReader` accept a `max_nesting_depth` parameter to fail
+   cleanly on deeply nested input (threat 4.3)?
+2. Should aiohttp expose an opt-in `safe_filename()` helper, or improve
+   filename docstrings, to nudge applications away from path-traversal
+   pitfalls (threat 4.6)?
+3. Should `FormData.add_field` extend its CR/LF rejection to `name` and
+   `filename` symmetrically with `content_type` (threat 4.13)?

From 1b3f31cd1813553c863610710b96813576fdf2e0 Mon Sep 17 00:00:00 2001
From: Sam Bull <git@sambull.org>
Date: Wed, 27 May 2026 02:22:21 +0100
Subject: [PATCH 2/3] Tweak

---
 THREAT_MODEL.md | 52 +++++++++++++++++++++++--------------------------
 1 file changed, 24 insertions(+), 28 deletions(-)

diff --git a/THREAT_MODEL.md b/THREAT_MODEL.md
index 001861fed23..a6936580571 100644
--- a/THREAT_MODEL.md
+++ b/THREAT_MODEL.md
@@ -593,20 +593,19 @@ boundary at which user-supplied strings can become wire bytes.
 
 | # | Component / Vector | STRIDE | Threat | Risk |
 | :--- | :--- | :--- | :--- | :--- |
-| 4.1 | Boundary parameter parsing | T | Malformed boundary parameter (oversized, missing, embedded CR/LF/non-bcharsnospace) could enable multipart parser confusion or smuggling. | Low |
-| 4.2 | Number of parts per body | D | No explicit cap. Each part allocates a `BodyPartReader`. Total bytes are bounded by `client_max_size` (default 1 MiB on the server), but ten-thousand 100-byte parts still fit and grow the live-object count. | Low |
+| 4.1 | Boundary parameter parsing | T | Malformed boundary parameter (oversized, missing, or containing bytes outside the RFC 2046 §5.1.1 safe set — digits, letters, and a small punctuation set) could enable multipart parser confusion or smuggling. | Low |
+| 4.2 | Number of parts per body | D | A peer submits a body packed with many tiny parts (e.g. ten thousand 100-byte parts inside a 1 MiB body). Each part allocates a `BodyPartReader` plus header dict, so the live-Python-object footprint is far larger than the on-wire byte count. `client_max_size` caps the wire bytes but not the per-part allocation amplification. | Low |
 | 4.3 | Nested multipart recursion | D | `MultipartReader.next()` recurses into nested multiparts without a depth cap; deeply nested input can hit `RecursionError`. `Request.post()` short-circuits this by rejecting any nested multipart it sees, but the bare API does not. | Medium |
-| 4.4 | Per-part header block size | D | Bounded by `max_field_size=8190` and `max_headers=128` (default), plumbed in from the protocol since the Mar 2026 fix. | Low |
-| 4.5 | Per-part body size | D | Streamed; `client_max_size` enforced **during iteration** (Mar 2026 fix), not after buffering. | Low |
+| 4.4 | Per-part header block size | D | A peer submits a part with an oversized header block (very long field values, or hundreds of headers per part) to drive memory growth at parse time, multiplied across many parts. | Low |
+| 4.5 | Per-part body size | D | A peer submits a single part with a body that grows arbitrarily large before any framing boundary — if size checking happens only after buffering the whole part, memory blows up before the cap fires. | Low |
 | 4.6 | `Content-Disposition` filename | T / I | Filename is parsed (incl. RFC 2231 `filename*=`) and returned verbatim to the application. Path-traversal strings (`../`), absolute paths, NUL/CR/LF all reach user code unsanitised. | Medium |
 | 4.7 | `Content-Disposition` name | T | Field name returned verbatim. CR/LF in a part's `name` doesn't cross the framing boundary (the multipart parser delimits parts by boundary, not by CR/LF in field names), but downstream loggers / dashboards may misrender. | Low |
 | 4.8 | `_charset_` magic form field | T | An attacker can include a form field named `_charset_` whose value retroactively becomes the *default charset* used to decode subsequent text fields (`multipart.py:MultipartReader.next _charset_ form handling`). Surprising behaviour; documented in the standard. | Low |
-| 4.9 | `Content-Transfer-Encoding` (base64 / quoted-printable) | T / D | Decoders use `base64.b64decode` (chunk-aligned) and `binascii.a2b_qp`; both raise on malformed input. Bounded by per-part size limit. | Low |
-| 4.10 | `Content-Encoding` (gzip/deflate within a part) | D | Supported for non-form-data multiparts; decompression bounded by per-chunk `max_decompress_size` and per-part `client_max_size`. Same backend caveat as PMCE ([§5.5](#55-compression-codecs)) regarding `max_length` overshoot. | Low |
-| 4.11 | Slow-drip / fragmented parts | D | No multipart-level timeout; relies entirely on connection-level read timeouts ([§5.7](#57-server-connection-lifecycle)). A handler that doesn't set a request timeout can be held open indefinitely. | Medium |
-| 4.12 | `MultipartWriter` boundary collision | T | Outbound boundary defaults to `uuid.uuid4().hex` (cryptographically random); the writer does not escape occurrences of the boundary inside part bodies. Collision negligible for the default; real risk only if user supplies a short or predictable boundary. | Low |
-| 4.13 | `FormData.add_field` argument validation | T | `content_type` rejects CR / LF (Feb 2026 fix); `name` and `filename` are *not* validated and pass through to the wire if user code lets attacker-controlled strings reach `add_field`. | Medium |
-| 4.14 | `Request.post()` temp-file lifetime | I / D | Uploaded files are streamed to `tempfile.TemporaryFile()` and lifetime is bound to the `FileField` object. If user code drops the field reference without reading or closing, garbage collection eventually closes the temp file. | Low |
+| 4.9 | `Content-Transfer-Encoding` (base64 / quoted-printable) | T / D | A peer sets `Content-Transfer-Encoding: base64` (or `quoted-printable`) and submits malformed encoded data: either the decoder raises and the part is dropped mid-stream, or a quartet that straddles a chunk boundary decodes incorrectly and produces silently corrupted bytes to the handler. | Low |
+| 4.10 | `Content-Encoding` (gzip/deflate within a part) | D | A peer submits a multipart part with `Content-Encoding: gzip` (or `deflate`) carrying a zip-bomb payload — a small compressed body that expands to a vastly larger decoded body, driving memory exhaustion in the receiving handler. | Low |
+| 4.11 | `MultipartWriter` boundary collision | T | If a part body contains a byte sequence matching the boundary string, the writer emits it as-is — a receiver parsing the multipart will see a fake part break early. Negligible against the default `uuid.uuid4().hex` boundary, but real if user code supplies a short or predictable one. | Low |
+| 4.12 | `FormData.add_field` argument injection | T | When user code opts into `FormData(quote_fields=False)`, attacker-controlled `name` or `filename` reach the outbound `Content-Disposition` value with only `\` / `"` escaped — CR/LF/NUL pass through, enabling injection of extra disposition parameters or a fake-part break-out. With the default `quote_fields=True` the values are percent-encoded and safe. `add_field` itself only validates `content_type` (Feb 2026 fix). | Low–Med |
+| 4.13 | `Request.post()` temp-file lifetime | I / D | Uploaded files are streamed to `tempfile.TemporaryFile()` and lifetime is bound to the `FileField` object. If user code drops the field reference without reading or closing, garbage collection eventually closes the temp file. | Low |
 
 **Mitigations.**
 
@@ -619,37 +618,34 @@ boundary at which user-supplied strings can become wire bytes.
 | 4.5 | Per-part body bounded | Per-iteration size check since 9cc4b917c (Mar 2026). | None. |
 | 4.6 | Filename path traversal | None; verbatim return is the documented behaviour. | **User**: sanitise `BodyPartReader.filename` before opening / saving / reflecting (strip path components, control bytes, known-bad sequences). **Open question** for aiohttp: ship a public `aiohttp.web.safe_filename()` helper and/or a clearer docstring warning; default behaviour stays as-is. |
 | 4.7 | Field-name pass-through | None. | **User**: `request.post()` keys come from attacker-controlled `name` parameters; do not feed them directly to filesystem / DB / template paths. |
-| 4.8 | `_charset_` magic field | Behaviour follows the HTML5 spec for multipart form charset selection. | Document the surprising behaviour: an attacker who can post any field can change how *subsequent* fields are decoded. **User**: if your form-handling code makes decisions based on string content, be aware that the decode charset is attacker-influenceable. |
-| 4.9 | CTE decoding | Hardened decoders; chunk-aligned base64. | None. |
+| 4.8 | `_charset_` magic field | Behaviour follows the HTML5 spec for multipart form charset selection. | **Document the surprising behaviour: an attacker who can post any field can change how subsequent fields are decoded.** **User**: if your form-handling code makes decisions based on string content, be aware that the decode charset is attacker-influenceable. |
+| 4.9 | CTE decoding | `BodyPartReader._decode_content_transfer` dispatches to `base64.b64decode` / `binascii.a2b_qp` / pass-through for `binary`/`8bit`/`7bit`, and raises `RuntimeError` on any other token. Both decoders raise on malformed input; decoded output is still subject to the per-part `client_max_size` cap. | **Audit whether `read_chunk` aligns reads on 4-byte boundaries for base64 parts — `base64.b64decode` accepts non-base64 bytes silently (with `validate=False` default), so a mid-quartet chunk split can lose data without raising. If this isn't already aligned, add the alignment or buffer the trailing bytes until the next chunk.** |
 | 4.10 | Content-Encoding decompression | Per-chunk `max_decompress_size`, per-part `client_max_size` cap. | Inherits the [§5.5](#55-compression-codecs) caveat about backend `max_length` honouring. |
-| 4.11 | Slow-drip | Connection-level read timeouts in [§5.7](#57-server-connection-lifecycle). | Cross-reference in [§5.7](#57-server-connection-lifecycle). No multipart-layer change. |
-| 4.12 | Outbound boundary collision | UUID4-hex default; HMAC-grade entropy means collision with random body bytes is statistically negligible. | **User**: callers supplying their own boundary must use a random, sufficiently long value. |
-| 4.13 | `FormData` argument injection | `content_type` rejects `\r` / `\n` (`formdata.py:FormData.add_field`, dab9e879b, Feb 2026). | Extend the same check to `name` and `filename` parameters of `add_field`. Recommended hardening (Medium). |
-| 4.14 | Temp-file lifetime | `tempfile.TemporaryFile()` is unlinked at creation on POSIX; closing the FD (whether explicit or via GC) reclaims the disk. | **User**: explicitly close `FileField`s you don't intend to consume to release the temp file promptly under high load. |
+| 4.11 | Outbound boundary collision | UUID4-hex default; HMAC-grade entropy means collision with random body bytes is statistically negligible. | **User**: callers supplying their own boundary must use a random, sufficiently long value. |
+| 4.12 | `FormData` argument injection | `content_type` rejects `\r` / `\n` at `add_field` (`formdata.py:FormData.add_field`, dab9e879b, Feb 2026). `name` / `filename` are percent-encoded by `content_disposition_header` (`helpers.py:content_disposition_header`) when `quote_fields=True` (default). | **Extend the CR/LF/NUL check to `name` and `filename` at `add_field` so the `quote_fields=False` opt-out path is also defended.** **User**: do not pass `quote_fields=False` when any of `name` / `filename` is attacker-influenced. |
+| 4.13 | Temp-file lifetime | `tempfile.TemporaryFile()` is unlinked at creation on POSIX; closing the FD (whether explicit or via GC) reclaims the disk. | **User**: explicitly close `FileField`s you don't intend to consume to release the temp file promptly under high load. |
 
 **Past advisories / hardening (recap).**
 
 - **GHSA-5m98-qgg9-wh84 (CVE-2024-30251)** (3.9.4) — infinite loop in multipart
   `_read_chunk_from_length` on a malformed POST body.
-- **GHSA-jj3x-wxrx-4x23 (CVE-2025-69227)** (3.13.3) — `Request.post()` entered an infinite
-  loop on malformed bodies when `PYTHONOPTIMIZE` stripped `assert`
+- **GHSA-jj3x-wxrx-4x23 (CVE-2025-69227)** (3.13.3) — `Request.post()` entered
+  an infinite loop on malformed bodies when `PYTHONOPTIMIZE` stripped `assert`
   statements (asserts were doing real control-flow work in the
   multipart reader). Fixed by replacing the assertions with explicit
   checks. Code audit follow-up: any other `assert` used for control
   flow should be identified and removed.
 - **GHSA-6jhg-hg63-jvvf (CVE-2025-69228)** (3.13.3) — DoS via large
   `Request.post()` payloads.
-- **GHSA-2vrm-gr82-f7m5 (CVE-2026-34514)** (`dab9e879b`, 3.13.4, Feb 2026) —
+- **GHSA-2vrm-gr82-f7m5 (CVE-2026-34514)** (3.13.4) —
   `FormData.add_field` rejects CR/LF in `content_type` (header
   injection on outbound multipart).
-- **GHSA-m5qp-6w8w-w647 (CVE-2026-34516)** (`5fe9dfb64`, 3.13.4, Mar
-  2026) — `MultipartReader` now honours `max_field_size` and
-  `max_headers` from the underlying protocol, plugging an unbounded
-  per-part header memory growth path.
-- **GHSA-3wq7-rqq7-wx6j (CVE-2026-34517)** (`9cc4b917c`, 3.13.4, Mar
-  2026) — `Request.post()` enforces `client_max_size` during iteration
-  rather than after buffering, plugging a memory-blow-up on large
-  form fields.
+- **GHSA-m5qp-6w8w-w647 (CVE-2026-34516)** (3.13.4) — `MultipartReader`
+  now honours `max_field_size` and `max_headers` from the underlying
+  protocol, plugging an unbounded per-part header memory growth path.
+- **GHSA-3wq7-rqq7-wx6j (CVE-2026-34517)** (3.13.4) — `Request.post()` enforces
+  `client_max_size` during iteration rather than after buffering,
+  plugging a memory-blow-up on large form fields.
 
 These are all currently in place; this section assumes no regression.
 
@@ -661,4 +657,4 @@ These are all currently in place; this section assumes no regression.
    filename docstrings, to nudge applications away from path-traversal
    pitfalls (threat 4.6)?
 3. Should `FormData.add_field` extend its CR/LF rejection to `name` and
-   `filename` symmetrically with `content_type` (threat 4.13)?
+   `filename` symmetrically with `content_type` (threat 4.12)?

From f2b5297451149a8fb1a24fcd62734724245be2f7 Mon Sep 17 00:00:00 2001
From: Sam Bull <git@sambull.org>
Date: Wed, 27 May 2026 03:18:33 +0100
Subject: [PATCH 3/3] Update

---
 THREAT_MODEL.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/THREAT_MODEL.md b/THREAT_MODEL.md
index a6936580571..9c92234b035 100644
--- a/THREAT_MODEL.md
+++ b/THREAT_MODEL.md
@@ -601,7 +601,7 @@ boundary at which user-supplied strings can become wire bytes.
 | 4.6 | `Content-Disposition` filename | T / I | Filename is parsed (incl. RFC 2231 `filename*=`) and returned verbatim to the application. Path-traversal strings (`../`), absolute paths, NUL/CR/LF all reach user code unsanitised. | Medium |
 | 4.7 | `Content-Disposition` name | T | Field name returned verbatim. CR/LF in a part's `name` doesn't cross the framing boundary (the multipart parser delimits parts by boundary, not by CR/LF in field names), but downstream loggers / dashboards may misrender. | Low |
 | 4.8 | `_charset_` magic form field | T | An attacker can include a form field named `_charset_` whose value retroactively becomes the *default charset* used to decode subsequent text fields (`multipart.py:MultipartReader.next _charset_ form handling`). Surprising behaviour; documented in the standard. | Low |
-| 4.9 | `Content-Transfer-Encoding` (base64 / quoted-printable) | T / D | A peer sets `Content-Transfer-Encoding: base64` (or `quoted-printable`) and submits malformed encoded data: either the decoder raises and the part is dropped mid-stream, or a quartet that straddles a chunk boundary decodes incorrectly and produces silently corrupted bytes to the handler. | Low |
+| 4.9 | `Content-Transfer-Encoding` (base64 / quoted-printable) | T / D | A peer sets `Content-Transfer-Encoding: base64` (or `quoted-printable`) and submits malformed encoded data — the decoder raises mid-stream and the partially-decoded part is surfaced to the handler with an exception. | Low |
 | 4.10 | `Content-Encoding` (gzip/deflate within a part) | D | A peer submits a multipart part with `Content-Encoding: gzip` (or `deflate`) carrying a zip-bomb payload — a small compressed body that expands to a vastly larger decoded body, driving memory exhaustion in the receiving handler. | Low |
 | 4.11 | `MultipartWriter` boundary collision | T | If a part body contains a byte sequence matching the boundary string, the writer emits it as-is — a receiver parsing the multipart will see a fake part break early. Negligible against the default `uuid.uuid4().hex` boundary, but real if user code supplies a short or predictable one. | Low |
 | 4.12 | `FormData.add_field` argument injection | T | When user code opts into `FormData(quote_fields=False)`, attacker-controlled `name` or `filename` reach the outbound `Content-Disposition` value with only `\` / `"` escaped — CR/LF/NUL pass through, enabling injection of extra disposition parameters or a fake-part break-out. With the default `quote_fields=True` the values are percent-encoded and safe. `add_field` itself only validates `content_type` (Feb 2026 fix). | Low–Med |
@@ -619,7 +619,7 @@ boundary at which user-supplied strings can become wire bytes.
 | 4.6 | Filename path traversal | None; verbatim return is the documented behaviour. | **User**: sanitise `BodyPartReader.filename` before opening / saving / reflecting (strip path components, control bytes, known-bad sequences). **Open question** for aiohttp: ship a public `aiohttp.web.safe_filename()` helper and/or a clearer docstring warning; default behaviour stays as-is. |
 | 4.7 | Field-name pass-through | None. | **User**: `request.post()` keys come from attacker-controlled `name` parameters; do not feed them directly to filesystem / DB / template paths. |
 | 4.8 | `_charset_` magic field | Behaviour follows the HTML5 spec for multipart form charset selection. | **Document the surprising behaviour: an attacker who can post any field can change how subsequent fields are decoded.** **User**: if your form-handling code makes decisions based on string content, be aware that the decode charset is attacker-influenceable. |
-| 4.9 | CTE decoding | `BodyPartReader._decode_content_transfer` dispatches to `base64.b64decode` / `binascii.a2b_qp` / pass-through for `binary`/`8bit`/`7bit`, and raises `RuntimeError` on any other token. Both decoders raise on malformed input; decoded output is still subject to the per-part `client_max_size` cap. | **Audit whether `read_chunk` aligns reads on 4-byte boundaries for base64 parts — `base64.b64decode` accepts non-base64 bytes silently (with `validate=False` default), so a mid-quartet chunk split can lose data without raising. If this isn't already aligned, add the alignment or buffer the trailing bytes until the next chunk.** |
+| 4.9 | CTE decoding | `BodyPartReader._decode_content_transfer` dispatches to `base64.b64decode` / `binascii.a2b_qp` / pass-through for `binary`/`8bit`/`7bit`, and raises `RuntimeError` on any other token. Both decoders raise on malformed input; decoded output is still subject to the per-part `client_max_size` cap. For base64, `BodyPartReader.read_chunk` extends each chunk read until its base64-significant character count is a multiple of 4, so mid-quartet splits cannot silently corrupt the decoded output. | None. |
 | 4.10 | Content-Encoding decompression | Per-chunk `max_decompress_size`, per-part `client_max_size` cap. | Inherits the [§5.5](#55-compression-codecs) caveat about backend `max_length` honouring. |
 | 4.11 | Outbound boundary collision | UUID4-hex default; HMAC-grade entropy means collision with random body bytes is statistically negligible. | **User**: callers supplying their own boundary must use a random, sufficiently long value. |
 | 4.12 | `FormData` argument injection | `content_type` rejects `\r` / `\n` at `add_field` (`formdata.py:FormData.add_field`, dab9e879b, Feb 2026). `name` / `filename` are percent-encoded by `content_disposition_header` (`helpers.py:content_disposition_header`) when `quote_fields=True` (default). | **Extend the CR/LF/NUL check to `name` and `filename` at `add_field` so the `quote_fields=False` opt-out path is also defended.** **User**: do not pass `quote_fields=False` when any of `name` / `filename` is attacker-influenced. |