Skip to content

HFS+ B-tree fork grow-on-full (§4b A/B/C) + classic-HFS fsck_hfs fix#48

Merged
danifunker merged 3 commits into
mainfrom
hfsplus_tweaks
Jun 29, 2026
Merged

HFS+ B-tree fork grow-on-full (§4b A/B/C) + classic-HFS fsck_hfs fix#48
danifunker merged 3 commits into
mainfrom
hfsplus_tweaks

Conversation

@danifunker

Copy link
Copy Markdown
Owner

Implements the deferred §4b of docs/hfsplus_btree_growth_plan.md (HFS+ B-tree fork grow-on-full), and — because validating it against a real Mac surfaced them — fixes several latent on-disk-format bugs that only an HFS-capable Mac could catch. All acceptance testing was done against real fsck_hfs: modern macOS for HFS+/HFSX, and a Mac OS X 10.4 (Tiger) box for classic HFS.

Commit 1 — feat(hfsplus): §4b grow-on-full (A/B/C)

The catalog / extents-overflow / attributes B-trees now grow their backing fork on demand (grow_btree_fork) instead of returning DiskFull, so an under-sized or foreign catalog fills in place.

  • Phase A — contiguous tail growth (keeps the fork at one extent).
  • Phase B — spills the 9th+ extent into the extents-overflow B-tree, persisted by a new overflow-aware write_fork_data_with_overflow (the write-side mirror of read_fork_with_overflow).
  • Phase C — appends map nodes past the (node_size-256)*8 ≈ 30,720-node header-bitmap cap.
  • Phase 0rb-cli new --fs hfsplus [--case-sensitive] [--min-catalog BYTES] (GUI/CLI parity + drives the macOS recipe).

Two foundational blank-builder bugs fixed along the way (affected every HFS+ image we build from scratch — blanks and the defrag/clone path used by restore + HFS-expand):

  • Invalid BTH length — the B-tree header wrote BTHeaderRec as 128 bytes; Apple requires exactly 106 (record 1 at 120, node bitmap at 248). This also resolves the header-vs-map-node capacity discrepancy.
  • Empty-tree representation — an empty tree was depth-1 with an empty root leaf (Apple uses depth-0 / no leaf), and the bitmap didn't reserve the trailing alternate-VH block. btree_insert_full now bootstraps the root leaf on the first insert into a depth-0 tree, and returns a clean DiskFull (pre-split free_nodes >= depth+1 check) so the grow-retry can't half-apply a split and duplicate a record.

Commit 2 — fix(hfs): classic-HFS filStBlk/filRStBlk must be zero

build_file_record (and write_resource_fork) wrote the fork's first allocation block into filStBlk (offset 24) / filRStBlk (offset 34). Those are the File Manager's in-memory cache and must be 0 on disk — the real location is the extent record at offset 74/86. A non-zero value makes fsck_hfs report "Reserved fields in the catalog record have incorrect data" (E_CatalogFlagsNotZero). Pre-existing; real Mac OS ignored the on-disk value so files worked, and our own fsck never checked it. The reader uses the extent record, so zeroing is safe. Covers HFV too.

Validation

  • HFS+ grow A/B/C + defrag-clone output: fsck_hfs -f -n "appears to be OK" and mountable on macOS.
  • Classic HFS: confirmed on real Tiger 10.4.12 — fresh multi-file image is fsck_hfs-clean and mounts with all files intact (diffed catalog records against a Tiger-created reference to pin the root cause).
  • New tests: Phase A/B/C grow tests, the empty-tree/header fixes, and test_hfs_file_record_first_alloc_block_fields_are_zero. cargo test --lib green (2140), build --all-targets zero warnings, clippy -D warnings clean.

🤖 Generated with Claude Code

danifunker and others added 3 commits June 29, 2026 12:39
…OS fsck_hfs

Implements the deferred §4b of docs/hfsplus_btree_growth_plan.md: the catalog,
extents-overflow, and attributes B-trees now grow their backing fork when they
run out of nodes (grow_btree_fork) instead of returning DiskFull, so an
under-sized or foreign catalog fills in place. Phase A is contiguous tail
growth; Phase B spills the 9th+ extent into the extents-overflow B-tree via a
new overflow-aware write_fork_data_with_overflow; Phase C appends map nodes past
the (node_size-256)*8 header-bitmap cap.

Doing this against real macOS fsck_hfs surfaced two foundational blank-builder
bugs that our own checker missed and that affected every HFS+ image we build
from scratch (blanks and the defrag/clone path used by restore + HFS-expand):

- Invalid BTH length: the B-tree header laid out BTHeaderRec as 128 bytes;
  Apple requires exactly 106 (record 1 at 120, node bitmap at 248). This also
  resolves the header-vs-map-node capacity discrepancy — the bitmap now
  addresses (node_size-256)*8 nodes, matching map_nodes_required.
- Empty-tree representation: an empty tree was written depth-1 with an empty
  root leaf (Apple uses depth-0 / no leaf) and the bitmap didn't reserve the
  trailing block holding the alternate volume header. btree_insert_full now
  bootstraps the root leaf on the first insert into a depth-0 tree, and returns
  a clean DiskFull (pre-split free_nodes >= depth+1 check) so the grow-retry
  can't half-apply a split and duplicate a record.

Adds rb-cli new --fs hfsplus [--case-sensitive] [--min-catalog BYTES] (Phase 0)
for GUI/CLI parity and to drive the macOS validation recipe. Every grow path
plus the defrag-clone output is fsck_hfs "appears to be OK" and mountable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
build_file_record (and the in-place write_resource_fork update) wrote the fork's
first allocation block into filStBlk (record offset 24) and filRStBlk (offset
34). Those are the File Manager's in-memory "first allocation block" cache and
must be 0 on disk — the real fork location lives in the extent record at offset
74 (data) / 86 (resource). A non-zero on-disk value makes Apple fsck_hfs report
"Reserved fields in the catalog record have incorrect data"
(E_CatalogFlagsNotZero); the check is
`if (file->dataStartBlock || file->rsrcStartBlock || file->reserved)`.

Every classic-HFS file rusty-backup wrote tripped this. Real Mac OS ignores the
on-disk value (it recomputes from the extents) so files still worked, and our
own fsck never checked it, so it stayed hidden. Confirmed on real Mac OS X 10.4
(Tiger), which fully supports HFS Standard: a fresh multi-file image is now
fsck_hfs "appears to be OK" and mounts with all files intact. The reader uses
the extent record, never filStBlk, so zeroing is safe. Covers HFV too.

Regression test: test_hfs_file_record_first_alloc_block_fields_are_zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Classic-Mac HFS/HFS+ allow '/' in catalog names (it's the Finder's display
of the ':' separator), but rb-cli split every in-image path on '/', so a
name like "Oxyd b/w" was unaddressable — the harvest pipeline had to skip
such files.

Add a shared escape/colon-aware tokenizer and route every literal walker
through it, supporting two grammars:

  - slash (all filesystems): '\/' = a literal slash in a component, '\\' =
    a literal backslash.
  - colon (HFS/HFS+ only, gated on a new Filesystem::uses_colon_paths()):
    ':' as the separator, so '/' is plain data — ':System Folder:Oxyd b/w'.
    A colon path is always literal (never globs).

split_image_path()/split_image_parent() in cli::parse are the single source
of truth, replacing ~6 hand-rolled split('/') sites across ls/get/get-binhex/
put/put-binhex/put-macbinary/mkdir/rm/locate/api-hfs. Remote rb:// addressing
stays slash-only. Help text + a "Path grammar" section in cli-reference.md
document both grammars; unit tests cover the tokenizer and a round-trip
against a real HFS volume holding 'Oxyd b/w'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@danifunker danifunker merged commit 7502e2c into main Jun 29, 2026
14 checks passed
@danifunker danifunker deleted the hfsplus_tweaks branch June 29, 2026 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant