Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,11 @@ adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
coefficient — eliminates ~200M `Math.cos` calls when decoding a 2000×2000 JPEG
- Performance: `medianFilter` now allocates the four channel-value arrays once outside the pixel
loop instead of per-pixel — reduces GC pressure on large images
- Performance: LZW decoder (`src/utils/lzw.ts`) now uses a growable `Uint8Array` buffer instead of
`number[]` — reduces peak memory by ~9× when decompressing large GIFs
- Performance: TIFF encoder (`encode` and `encodeFrames`) now builds pixel data directly from
compressed `Uint8Array` chunks rather than copying bytes into a `number[]` — reduces peak memory
usage when encoding large TIFF images
- Fixed misleading comment in `adjustHue`: normalization produces 0–360, not −180 to 180
- PNG decoder: 16-bit per-channel images (bitDepth=16) now decode correctly; the pixel stride was
using a fixed 8-bit offset (`x*4`, `x*3`, `x`) causing pixel-offset corruption in 16-bit RGBA,
Expand Down
221 changes: 19 additions & 202 deletions TODO.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,185 +5,23 @@ severe. Each entry includes the relevant file(s), a description of the problem,

---

## 🔴 Critical

### C1 — `src/utils/byte_utils.ts:21-24` — `readUint32LE` returns a signed JS integer for values ≥ 0x80000000

JavaScript's bitwise-OR operator always returns a **signed 32-bit integer**. When the high bit of
the fourth byte is set (e.g. a stored value of `0x80000001`), the expression
`data[o] | data[o+1]<<8 | data[o+2]<<16 | data[o+3]<<24` returns a large negative number.

This function is used to read file offsets, chunk sizes, and image dimensions in **BMP, ICO, TIFF,
WebP,** and other decoders. A crafted file can trigger:

- ICO `imageEnd` bounds check passing when `imageEnd` is negative → reads wrong memory region.
- TIFF strip-offset arithmetic producing negative offsets that silently skip data.
- WebP chunk-size loop exiting immediately (negative `pos + chunkSize`), producing an empty image
with no error.

**Fix:** Add `>>> 0` to convert to unsigned: `return (... | data[o+3]<<24) >>> 0;`

---

### C2 — `src/formats/png_base.ts:15-17` — `readUint32` (big-endian) has the same signed-overflow bug

`(data[offset] << 24)` is negative when bit 31 is set. The `png.ts` decode loop uses the returned
`length` to advance `pos` with `pos += length + 12`. A negative `length` keeps `pos < data.length`
true indefinitely, causing a near-infinite parse loop on a 5-byte crafted PNG — an effective
**CPU-based DoS**.

**Fix:** Add `>>> 0`: `return ((data[o]<<24) | ...) >>> 0;`

---

### C3 — `src/utils/lzw.ts:100,122` — `output.push(...entry)` causes a call-stack overflow on adversarial GIF

LZW dictionary entries are spread into function arguments with `output.push(...entry)`. If an
attacker crafts a GIF whose LZW stream triggers a very long repeated run, the spread collapses the
JS call stack with `RangeError: Maximum call stack size exceeded`, **crashing the runtime process**
instead of returning a clean error. The same pattern appears in `gif_decoder.ts:75`.

**Fix:** Replace `output.push(...entry)` with a simple `for` loop:
`for (const b of entry) output.push(b);`

---

## 🟠 High

### H1 — `src/formats/tiff.ts:654-664` — Double-compression produces wrong `StripByteCounts` for deflate

In multi-frame TIFF encoding, pixel data is compressed once in the first loop (lines 582–596) and
written to the output. The second loop (IFD-writing phase) compresses each frame's data **again**
with a fresh encoder/compressor to obtain `StripByteCounts`. For deflate, the OS-level zlib may
produce a different output length on the second call, writing an `StripByteCounts` tag that does not
match the actual strip, making the TIFF **unreadable by most viewers**.

**Fix:** Cache the compressed result from the first loop (e.g. `compressedFrames[i]`) and reuse
`.length` in the second loop.

---

### H2 — `src/formats/ico.ts:145` — Non-integer `actualHeight` causes `validateImageDimensions` to throw for valid ICO files

```ts
validateImageDimensions(width, Math.abs(height) / 2);
```

For an ICO DIB where the stored `height` field is odd (e.g. 5), `Math.abs(5) / 2 = 2.5`.
`validateImageDimensions` checks `Number.isInteger(height)` and throws, rejecting a perfectly
legitimate ICO file. Even when it does not throw, `new Uint8Array(width * 2.5 * 4)` allocates the
wrong buffer size.

**Fix:** Use `Math.floor`: `validateImageDimensions(width, Math.floor(Math.abs(height) / 2));`

---

### H3 — `src/formats/png.ts:59-70` — No bounds check before reading chunk data in `decode()`

The decode loop reads a 4-byte chunk `length` field without first verifying that at least 8 bytes
remain at `pos`. For a truncated PNG, individual byte reads return `undefined`, silently coercing to
`0`. A very large `length` (e.g. `0x7FFFFFFF`) yields `data.slice(pos, pos + length)` returning an
under-sized slice that is silently mis-parsed with no error thrown.

`extractMetadata()` already has the guard `if (pos + 8 > data.length) break;`; `decode()` should
too.

**Fix:** Add `if (pos + 8 > data.length) break;` at the top of the decode loop in `png.ts`.

---

### H4 — `src/utils/webp_decoder.ts:31-34` — Local `readUint32LE` duplicates `byte_utils.ts` and has the same sign bug

`webp_decoder.ts` defines its own `readUint32LE` using bitwise OR, which returns a signed value for
large chunks. The chunk-size loop `pos += 8 + chunkSize` wraps to a large negative on a crafted
WebP, silently exiting the parse loop and returning an empty (invalid) image. Additionally, the
duplicate implementation risks the two copies drifting out of sync.

**Fix:** Remove the local copy and import `readUint32LE` from `../utils/byte_utils.ts`, then apply
the `>>> 0` fix there (C1).

---

## 🟡 Medium

### M1 — `src/formats/pcx.ts:52-54` — Missing `validateImageDimensions` call; unchecked allocation

PCX decode computes `new Uint8Array(height * scanlineLength)` and
`new Uint8Array(width * height * 4)` using dimensions derived directly from the header, with only a
manual `> 0` check. A crafted PCX with `width=65535, height=65535` would attempt to allocate ~17 GB
of memory.

**Fix:** Add `validateImageDimensions(width, height)` after the dimension calculation on lines
49-50, consistent with all other decoders in the project.

---

### M2 — `src/formats/bmp.ts:260-261` — `readUint32LE` used for signed `xPelsPerMeter`/`yPelsPerMeter` fields in `extractMetadata`

The BMP spec defines `biXPelsPerMeter` / `biYPelsPerMeter` as signed `LONG` fields. `decode()`
correctly uses `readInt32LE`, but `extractMetadata()` uses `readUint32LE` for the same fields. For a
BMP with a negative DPI value, `extractMetadata` returns a large positive DPI value, inconsistent
with `decode()`.

**Fix:** Use `readInt32LE` for DPI fields in `extractMetadata`, matching `decode()`.

---

### M3 — `src/utils/jpeg_decoder.ts:961-996` — O(64²) naive IDCT calls `Math.cos` per element

The IDCT implementation calls `Math.cos((2*j+1)*k*Math.PI/16)` for each of the 64 output
coefficients per 8×8 block. For a 2000×2000 JPEG this is ~200 million `Math.cos` evaluations.
Standard JPEG libraries use a precomputed cosine table (or the AAN 1D IDCT) and run 10–100× faster.

**Fix:** Precompute the 64-entry cosine lookup table as a module-level constant and eliminate all
`Math.cos` calls from the hot path.

---

### M4 — `src/utils/image_processing.ts:718-767` — `medianFilter` allocates 4 arrays per pixel in the inner loop

`rValues`, `gValues`, `bValues`, `aValues` are declared as fresh `number[]` inside the pixel loop.
For a 1000×1000 image with `radius=1` this creates 4 million short-lived arrays, causing GC pressure
and significant slowdown.

**Fix:** Declare the four arrays once outside the pixel loop and reset `length = 0` at the start of
each iteration.

---

### M5 — `src/utils/lzw.ts:74` — LZW `decompress` accumulates output in a `number[]`, using ~9× peak memory

The decoder collects decoded bytes in a JS `number[]` (8 bytes per element), then converts to
~~The decoder collects decoded bytes in a JS `number[]` (8 bytes per element), then converts to
`new Uint8Array(output)` at the end. Peak memory is 9× the final output size. For large GIFs this is
significant.

**Fix:** Pre-allocate a `Uint8Array` sized to `width × height` (the expected GIF pixel count) and
write directly into it, or use a `Uint8Array`-backed dynamic buffer.
significant.~~ **Fixed:** `decompress()` now uses a growable `Uint8Array` buffer.

---

### M6 — `src/formats/tiff.ts:275` and `encodeFrames:562` — TIFF encoder accumulates output in a `number[]`

Both single-frame and multi-frame TIFF encoding builds the full output as `number[]` then converts
~~Both single-frame and multi-frame TIFF encoding builds the full output as `number[]` then converts
to `Uint8Array`. For a 4K RGBA image (4096×2160) the intermediate array holds ~35 million JS numbers
(~280 MB), followed by a ~35 MB `Uint8Array`. Peak memory is roughly 9× the final file.

**Fix:** Build the output as a list of `Uint8Array` chunks and concatenate them at the end with a
single `new Uint8Array(totalLength)` copy.

---

### M7 — `src/utils/image_processing.ts:303` — Misleading comment on hue-rotation normalization

```ts
// Normalize rotation to -180 to 180 range
const rotation = ((degrees % 360) + 360) % 360;
```

This normalizes to **0–360**, not −180 to 180 as the comment states. The logic is correct; the
comment is wrong and will confuse maintainers.

**Fix:** Update the comment to: `// Normalize rotation to 0–360 range`
(~280 MB), followed by a ~35 MB `Uint8Array`. Peak memory is roughly 9× the final file.~~ **Fixed:**
Both `encode()` and `encodeFrames()` now build pixel data directly from compressed `Uint8Array`
chunks, with only the small IFD section in a `number[]`.

---

Expand Down Expand Up @@ -230,13 +68,6 @@ potential header-parsing error. An explicit bounds check and warning would aid d

---

### L5 — `src/formats/tiff.ts:656-657` — Dangling redundant encoder instantiation

After fixing H1 (double-compression), the `new TIFFLZWEncoder().compress(frame.data)` call in the
IFD loop will become dead code. Clean it up at that point to avoid confusion.

---

### L6 — `test/` — No adversarial / fuzzing input tests

All tests use well-formed images or programmatically generated valid inputs. There are no tests for
Expand All @@ -257,35 +88,21 @@ enough to stress the call stack. This makes it easy to reintroduce the bug.

### L8 — `docs/` — TIFF multi-frame encode limitations are undocumented

The API docs for `encodeFrames` do not warn that large frame counts with `deflate` compression incur
2× CPU cost (H1), that the full output is built in memory, or that the encoded file may be
unreadable due to the `StripByteCounts` mismatch.
The API docs for `encodeFrames` do not warn about any performance characteristics or limitations of
encoding large TIFF files.

---

## Summary

| ID | File | Lines | Category | Severity |
| -- | ------------------------------- | -------- | ----------------------------- | ------------ |
| C1 | `src/utils/byte_utils.ts` | 21–24 | Security / DoS | **Critical** |
| C2 | `src/formats/png_base.ts` | 15–17 | Security / DoS | **Critical** |
| C3 | `src/utils/lzw.ts` | 100, 122 | Security / DoS | **Critical** |
| H1 | `src/formats/tiff.ts` | 654–664 | Correctness (data corruption) | **High** |
| H2 | `src/formats/ico.ts` | 145 | Correctness | **High** |
| H3 | `src/formats/png.ts` | 59–70 | Correctness | **High** |
| H4 | `src/utils/webp_decoder.ts` | 31–34 | Correctness / Duplication | **High** |
| M1 | `src/formats/pcx.ts` | 52–54 | Security / DoS | **Medium** |
| M2 | `src/formats/bmp.ts` | 260–261 | Correctness | **Medium** |
| M3 | `src/utils/jpeg_decoder.ts` | 961–996 | Performance | **Medium** |
| M4 | `src/utils/image_processing.ts` | 718–767 | Performance | **Medium** |
| M5 | `src/utils/lzw.ts` | 74 | Memory | **Medium** |
| M6 | `src/formats/tiff.ts` | 275, 562 | Memory | **Medium** |
| M7 | `src/utils/image_processing.ts` | 303 | Documentation | **Medium** |
| L1 | `src/formats/png.ts` | — | Correctness | **Low** |
| L2 | `src/formats/pam.ts` | 136–144 | API / Documentation | **Low** |
| L3 | `src/utils/image_processing.ts` | 106–135 | Performance | **Low** |
| L4 | `src/utils/gif_decoder.ts` | 342–344 | Robustness | **Low** |
| L5 | `src/formats/tiff.ts` | 656–657 | Code quality | **Low** |
| L6 | `test/` | — | Test coverage | **Low** |
| L7 | `test/` | — | Test coverage | **Low** |
| L8 | `docs/` | — | Documentation | **Low** |
| ID | File | Lines | Category | Severity |
| -- | ------------------------------- | -------- | ------------------- | ---------- |
| M5 | `src/utils/lzw.ts` | 74 | Memory | **Medium** |
| M6 | `src/formats/tiff.ts` | 275, 562 | Memory | **Medium** |
| L1 | `src/formats/png.ts` | — | Correctness | **Low** |
| L2 | `src/formats/pam.ts` | 136–144 | API / Documentation | **Low** |
| L3 | `src/utils/image_processing.ts` | 106–135 | Performance | **Low** |
| L4 | `src/utils/gif_decoder.ts` | 342–344 | Robustness | **Low** |
| L6 | `test/` | — | Test coverage | **Low** |
| L7 | `test/` | — | Test coverage | **Low** |
| L8 | `docs/` | — | Documentation | **Low** |
Loading
Loading