feat(optical): remote optical-disc ripping (desktop-driven, device-streamed) by danifunker · Pull Request #47 · danifunker/rusty-backup

danifunker · 2026-06-27T12:55:36Z

This branch carries two independent bodies of work that grew up together: the remote optical ripping feature it started as, and a substantial HFS / HFS+ catalog B-tree scaling + correctness effort that landed on top while validating against real Apple disks. Both are summarized below.

1. Remote optical-disc ripping (desktop-driven, device-streamed)

Drive a remote optical drive from the desktop app / CLI: the machine with the CD/DVD drive runs rb-cli serve and only issues SCSI reads, while the desktop pulls raw sectors over the LAN and does all the encoding (ISO / BIN-CUE assembly + CHD compression). A weak device (e.g. a MiSTer SuperStation One, ~800 MHz Cortex-A9) never gets taxed by compression.

Full design + a phase-by-phase tracker live in docs/remote_ripping.md (all [x] except the two hardware/GUI-runtime checks).

How it works

The rip pipeline touches the drive through one tiny seam — OpticalSource (read_toc / read_data_sectors / eject) — so swapping a LocalCdReader for a RemoteCdReader moves the reader over the wire while every encode step stays on the desktop, unchanged.
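To make the seam concrete, here is a minimal sketch of a three-operation trait and a rip loop that only sees `&dyn OpticalSource`. The method signatures, the `Track` struct, `FakeDisc`, and `rip_all` are assumptions for illustration; only the trait name and the three operation names come from this PR.

```rust
use std::cell::Cell;
use std::io;

/// Raw CD sector size mentioned in the PR description.
pub const RAW_SECTOR: usize = 2352;

/// Hypothetical TOC entry; the real TOC type comes from cd-da-reader.
#[derive(Debug, Clone, PartialEq)]
pub struct Track {
    pub number: u8,
    pub start_lba: u32,
    pub sectors: u32,
}

/// Assumed shape of the seam: the three operations named in the PR.
pub trait OpticalSource {
    fn read_toc(&self) -> io::Result<Vec<Track>>;
    fn read_data_sectors(&self, lba: u32, count: u32) -> io::Result<Vec<u8>>;
    fn eject(&self) -> io::Result<()>;
}

/// In-memory stand-in for a drive; a local or remote reader would
/// implement the same trait.
pub struct FakeDisc {
    pub tracks: Vec<Track>,
    pub ejected: Cell<bool>,
}

impl OpticalSource for FakeDisc {
    fn read_toc(&self) -> io::Result<Vec<Track>> {
        Ok(self.tracks.clone())
    }

    fn read_data_sectors(&self, lba: u32, count: u32) -> io::Result<Vec<u8>> {
        // Fill each sector with the low byte of its LBA so reads are checkable.
        let mut out = Vec::with_capacity(count as usize * RAW_SECTOR);
        for i in 0..count {
            out.extend(std::iter::repeat((lba + i) as u8).take(RAW_SECTOR));
        }
        Ok(out)
    }

    fn eject(&self) -> io::Result<()> {
        self.ejected.set(true);
        Ok(())
    }
}

/// The encode side depends only on the trait, so it cannot tell
/// whether the sectors came from a local drive or over the wire.
pub fn rip_all(src: &dyn OpticalSource) -> io::Result<Vec<u8>> {
    let mut image = Vec::new();
    for track in src.read_toc()? {
        image.extend(src.read_data_sectors(track.start_lba, track.sectors)?);
    }
    src.eject()?;
    Ok(image)
}
```

Because `rip_all` takes a trait object, swapping the reader is a change at the call site only, which is the property the PR relies on.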

machine with the drive (rb-cli serve)     your desktop (GUI / rb-cli)
─────────────────────────────────────     ──────────────────────────
cd-da-reader: READ TOC / READ CD     ──►   RemoteCdReader (proxies the 3 ops)
retry/backoff loop (next to drive)         write local .bin/.cue / .iso
ship raw 2352-byte sectors           ──►   CHD compress (libchdman)  ← stays here

Usage

On the box with the drive: rb-cli serve (needs optical+remote — the desktop release and rb-cli-mini both have them; the daemon must run elevated to open /dev/sr0).

From the desktop — CLI:

rb-cli optical drives --remote that-box:7341          # discover (prints rb:// args)
rb-cli optical rip --device rb://that-box:7341/dev/sr0 --output disc.cue --format bincue

From the desktop — GUI: Optical tab → Add remote daemon… (with an MRU quick-pick of past daemons) → the drive appears in the unified pulldown tagged [host:port] … → Start Rip.
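The `rb://host:port/dev/sr0` device argument can be split into its parts with a few string operations. The sketch below is an assumption about how such a parse might look, not the project's actual `resolve()`; the `DeviceArg` enum and `parse_device_arg` are hypothetical names.

```rust
/// Hypothetical result of parsing a `--device` argument.
#[derive(Debug, PartialEq)]
pub enum DeviceArg {
    /// A plain local path such as /dev/sr0.
    Local(String),
    /// An rb://host:port/<device path> reference to a remote daemon.
    Remote {
        host: String,
        port: u16,
        device_path: String,
    },
}

pub fn parse_device_arg(arg: &str) -> Result<DeviceArg, String> {
    // Anything without the rb:// scheme is treated as a local device path.
    let Some(rest) = arg.strip_prefix("rb://") else {
        return Ok(DeviceArg::Local(arg.to_string()));
    };

    // The authority ends at the first '/'; the remainder, slash included,
    // is the device path on the remote machine.
    let slash = rest
        .find('/')
        .ok_or_else(|| format!("missing device path in {arg:?}"))?;
    let (authority, device_path) = rest.split_at(slash);

    let (host, port_text) = authority
        .rsplit_once(':')
        .ok_or_else(|| format!("missing port in {arg:?}"))?;
    if host.is_empty() {
        return Err(format!("missing host in {arg:?}"));
    }
    let port: u16 = port_text
        .parse()
        .map_err(|e| format!("bad port {port_text:?}: {e}"))?;

    Ok(DeviceArg::Remote {
        host: host.to_string(),
        port,
        device_path: device_path.to_string(),
    })
}
```

This sketch requires an explicit port; whether the real CLI accepts a default port is not stated in the PR.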

What's inside

Daemon optical tier (CAP_FAMILY_O): ListOpticalDrives / OpenOptical / ReadToc / ReadOpticalSectors / EjectOptical / CloseOptical, reusing the existing control-frame + FileBegin/chunk framing. One session per daemon (cd-da-reader keeps a global handle), guarded process-wide.
RemoteCdReader proxying the seam; OpticalTarget { Local | Remote } dispatch; CLI rb:// device parse.
Unified picker (model::optical_devices) merging local + per-daemon drives; CLI optical drives --remote; GUI unified pulldown + worker-thread "Add remote daemon…" + MRU persisted to config.json.
Transfer speed + ETA in the CLI rip/convert progress line (e.g. 45% (315.0 MiB/700.0 MiB) - 28.4 MiB/s, ETA 13s). RateTracker lifted out of gui/progress.rs into model::rate_tracker so the CLI / mini build can use it; the GUI re-exports it unchanged. For a remote rip the rate is LAN throughput; for CHD it's the local encode rate.
No new container/file types; CHD encode reuses the existing path-based to_chd (temp BIN/CUE materialized locally, exactly like the existing rip-to-CHD flow).
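The one-session-per-daemon rule can be illustrated with a process-wide flag that is claimed on open and released on drop. This is a sketch of the general technique under assumed names (`OPTICAL_BUSY`, `OpticalSlot`); the PR only states that the guard is a process-global `AtomicBool` released on drop or disconnect.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Process-wide flag: true while some session owns the optical drive.
static OPTICAL_BUSY: AtomicBool = AtomicBool::new(false);

/// RAII token for the single optical session (hypothetical type).
pub struct OpticalSlot {
    _private: (),
}

impl OpticalSlot {
    /// Claims the slot, or returns None if another session holds it.
    pub fn try_acquire() -> Option<OpticalSlot> {
        OPTICAL_BUSY
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .ok()
            .map(|_| OpticalSlot { _private: () })
    }
}

impl Drop for OpticalSlot {
    /// Releasing in Drop covers both a clean close and an early error
    /// return, so a failed open cannot leave the daemon stuck "busy".
    fn drop(&mut self) {
        OPTICAL_BUSY.store(false, Ordering::Release);
    }
}
```

Tying the release to `Drop` is what lets a failed open free the slot without any explicit cleanup path.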

Tests

17 lib optical/picker/protocol unit tests + 3 loopback integration tests (tests/remote_optical.rs: handshake/ListOpticalDrives round-trip, OpenOptical error + busy-guard release, remote enumeration) + the MRU unit test + RateTracker unit tests. Builds clean across default(GUI) / optical+remote / optical-only / remote-only.

Not yet validated (needs hardware / a GUI run)

A real disc rip over the network — byte-good ISO/CHD with the device CPU idle (the loopback tests cover the wire path, not a physical drive).
The GUI at runtime — compile- and clippy-clean here, but egui wasn't driven.

Companion change already on main: opticaldiscs 0.4.5 + the cd-da-reader armv7 fix that made the MiSTer optical build possible (PR #46).

2. HFS / HFS+ catalog B-tree scaling + correctness

A chain of HFS/HFS+ fixes surfaced while validating rusty-backup against real Apple-formatted disks (a MacPack 1.8 GB HFS volume and a Mac OS 9.2.2 HFS+ install). Both now fsck with 0 errors, and large imports no longer corrupt or stall. Plan + tracker: docs/hfsplus_btree_growth_plan.md (P1–P5 complete) and the deferred-step writeup docs/todo_hfsplus_fork_growth.md.

Classic HFS catalog B-tree (the reported corruption)

insert_catalog_record now uses the shared incremental inserter instead of rebuilding the whole index after every leaf split. The per-split rebuild was O(n²) and leaked index nodes living past the header-bitmap window, exhausting free nodes mid-rebuild and corrupting the tree at ~7.4k records (disk full: no free B-tree nodes / IndexSiblingLinkBroken).
split_index_node maintains fLink/bLink sibling links; rebuild_index_nodes (delete / fsck-repair) frees index nodes across all bitmap segments, not just the header window.

B*-tree density (matching how a real Mac packs)

btree_try_rotate_leaf — before allocating on a full leaf, redistribute records with an adjacent sibling that shares the parent, patching the one separator key in place. Lifts random-insert occupancy ~69% → ~88%. Reused for index nodes too.
btree_split_index_with_insert — append-aware index split (greedy pack-left on a tail append, rebalance only a genuine middle insert), replacing the fixed 50/50 split. Sequential index occupancy ~45% → ~96%.
Applies to the incremental per-put path as well as bulk import: 20,000 shuffled multi-dir files via individual rb-cli put calls land fsck-clean with no IndexSiblingLinkBroken (85% leaf / 82% index occupancy), matching the bulk untar path.
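The effect of an append-aware split on sequential inserts can be reproduced with a toy model. The sketch below assumes fixed-size records, a flat chain of nodes holding `u32` keys, and hypothetical function names; it is not the project's `btree_split_index_with_insert`, only an illustration of pack-left on a tail append versus a fixed 50/50 split.

```rust
/// Fixed 50/50 split of a full node plus the incoming key.
pub fn split_half(node: &[u32], key: u32) -> (Vec<u32>, Vec<u32>) {
    let pos = node.partition_point(|&k| k < key);
    let mut all = node.to_vec();
    all.insert(pos, key);
    let right = all.split_off(all.len() / 2);
    (all, right)
}

/// Append-aware split: a tail append leaves the full node packed and
/// starts a fresh right node; a middle insert falls back to rebalancing.
pub fn split_append_aware(node: &[u32], key: u32) -> (Vec<u32>, Vec<u32>) {
    let pos = node.partition_point(|&k| k < key);
    if pos == node.len() {
        return (node.to_vec(), vec![key]);
    }
    split_half(node, key)
}

/// Inserts `key` into a sorted chain of nodes with capacity `cap`.
pub fn insert(nodes: &mut Vec<Vec<u32>>, cap: usize, key: u32, append_aware: bool) {
    if nodes.is_empty() {
        nodes.push(vec![key]);
        return;
    }
    // Target: the last node whose first key is <= key (or node 0).
    let idx = nodes.partition_point(|n| n[0] <= key).saturating_sub(1);
    if nodes[idx].len() < cap {
        let pos = nodes[idx].partition_point(|&k| k < key);
        nodes[idx].insert(pos, key);
        return;
    }
    let (left, right) = if append_aware {
        split_append_aware(&nodes[idx], key)
    } else {
        split_half(&nodes[idx], key)
    };
    nodes[idx] = left;
    nodes.insert(idx + 1, right);
}

/// Fraction of record slots in use across all nodes.
pub fn occupancy(nodes: &[Vec<u32>], cap: usize) -> f64 {
    let used: usize = nodes.iter().map(Vec::len).sum();
    used as f64 / (nodes.len() * cap) as f64
}
```

With capacity 4 and keys 0..16 inserted in order, the fixed split leaves every non-rightmost node half full, while the append-aware variant keeps them full. The toy model shows the direction of the effect only; the occupancy figures quoted above come from the real trees.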

Import speed (untar of tens of thousands of files)

Duplicate check descends the index (was an O(n) leaf walk → O(n²) import); bulk mode skips the per-file full-catalog snapshot; tar_import caches each directory's child names; ensure_catalog_initialized stops re-reading the extents fork on every create. Net: a 9,000-file untar went 2m25s → 1.6s.

Mac-faithful collation + fsck (validated against real disks)

Classic HFS catalog keys use the real Mac Roman collation order (hfs_charorder, confirmed against Apple's _RelString and hfsutils), not an ASCII-uppercase table — fixes KeysOutOfOrder on accented / curly-quote / nbsp names.
HFS+ names use Apple's exact TN1150 case-fold + canonical decomposition tables (ported from the Linux kernel) for both comparison and on-disk form, replacing char::to_lowercase + Rust NFD. Fixes underscore-vs-letter ordering and matches Mac OS for ß, Hangul, and the decomposition-excluded ranges. Drops the now-unused unicode-normalization dependency.
Null bytes in catalog names are a warning (UnusualCatalogName), not an error: valid on classic HFS and present on real disks.

HFS+ B-tree growth (P1–P5 of the growth plan)

The classic helpers were hardwired for the classic-HFS key shape, so an HFS+ catalog couldn't grow past a single index level without corrupting. Fixed across five phases:

P1 — BTreeKeyFormat descriptor (big_keys / variable_index_keys / max_key_len) threaded through every split/grow/rotate helper, so HFS+ variable-length 2-byte keys are handled correctly; classic output byte-identical.
P2 — blank HFS+ catalog is now volume-scaled (~0.5% of volume, like classic HFS) instead of a fixed 4 nodes, so the live insert path no longer exhausts it after ~24 files. New create_blank_hfsplus_sized for clone targets/tests.
P3 — extents-overflow B-tree now splits past depth 1 (a full leaf previously returned "split-on-overflow not yet implemented"); the attributes tree splits too.
P4 — regression test proving the streamed defrag builder produces a multi-level catalog that's fsck-clean and round-trips byte-for-byte.
P5 — all three HFS+ live inserters delegate to the shared btree_insert_full (B*-rotation density), deleting ~250 lines of duplicated split/grow code; shuffled inserts pack ~0.69 → ~0.84 occupancy.
§4b grow-on-full is intentionally deferred (classic HFS ships without it too); design + a fsck_hfs-clean-on-real-Mac validation recipe captured in docs/todo_hfsplus_fork_growth.md.
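The core of the P1 fix is that the key-length prefix is one byte on classic HFS and two bytes (big-endian) on HFS+. The sketch below shows a descriptor with the three fields named above and a length read that depends on `big_keys`. The method names, their signatures, and the choice to include the length prefix in `key_portion` are assumptions; the real `BTreeKeyFormat` may differ.

```rust
/// Descriptor with the three fields named in the PR (layout assumed).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct BTreeKeyFormat {
    pub big_keys: bool,
    pub variable_index_keys: bool,
    pub max_key_len: u16,
}

impl BTreeKeyFormat {
    /// Size of the key-length prefix: 2 bytes for big keys, else 1.
    pub fn len_prefix(&self) -> usize {
        if self.big_keys { 2 } else { 1 }
    }

    /// Key length stored at the start of a record, if the record is
    /// long enough to contain the prefix.
    pub fn key_len(&self, record: &[u8]) -> Option<usize> {
        if self.big_keys {
            let b = record.get(0..2)?;
            Some(u16::from_be_bytes([b[0], b[1]]) as usize)
        } else {
            record.first().map(|&b| b as usize)
        }
    }

    /// Length prefix plus key bytes (assumption: prefix is included).
    pub fn key_portion<'a>(&self, record: &'a [u8]) -> Option<&'a [u8]> {
        let end = self.len_prefix() + self.key_len(record)?;
        record.get(..end)
    }
}
```

Reading an HFS+ record with a classic-shaped descriptor returns only the high byte of the length, which is the misread described above.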

Tests

Regression tests throughout: 20k+ record imports (bulk and per-put), bulk mode, random/sequential packing density, the exact real-disk collation cases, null-byte names, the TN1150 fold/decompose tables, buffer-level multi-level growth for catalog/extents/attributes trees, a 64-extent fragmented-file extents-overflow round-trip, and the multi-level defrag clone.

🤖 Generated with Claude Code

Kicks off the desktop-driven / device-streamed remote ripping feature (docs/remote_ripping.md): the MiSTer streams raw CD sectors and the desktop does all encoding, so a weak armv7 device isn't taxed by CHD compression.
P1.1: the read seam, a pure local refactor with no behavior change:
- New src/optical/source.rs: `OpticalSource` trait (read_toc / read_data_sectors / eject) + `LocalCdReader` wrapping cd-da-reader.
- rip_iso / rip_bin_cue take `&dyn OpticalSource`; `run_rip` builds the source via `open_optical_source` and ejects through the trait. eject_disc moved into source.rs.
- RipConfig is unchanged (still device_path) so the GUI/CLI are untouched; the OpticalTarget switch lands in P1.7 with the remote dispatch.
Optical unit tests green (14/14); behavior byte-identical.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adds the daemon side of remote ripping: a desktop client can list a remote machine's optical drives, open one, read its TOC, stream raw sectors, and eject, while the daemon only issues SCSI reads.
- protocol.rs (P1.2/P1.3): CAP_FAMILY_O = 1<<2; six optical Request verbs (ListOpticalDrives / OpenOptical / ReadToc / ReadOpticalSectors / EjectOptical / CloseOptical) + OpticalOpened / Toc / OpticalDrives responses; serde-mirror DTOs (WireToc/WireTrack/WireSectorMode/WireRetryConfig/WireOpticalDrive) always compiled under `remote`, with cd-da-reader From conversions gated behind `optical`. Sector data reuses FileBegin + chunk stream. Round-trip + conversion tests.
- server.rs (P1.4): `optical_server` module, a per-connection OpticalState wrapping a LocalCdReader (reuses the P1.1 read/eject ops) plus a process-global AtomicBool busy guard (cd-da-reader keeps a global drive handle, so only one session per daemon; released on drop / disconnect). Dispatch arms for all six verbs; non-optical builds reply cleanly. Hello advertises CAP_FAMILY_O.
Builds clean in optical, remote-only, and default (GUI) configs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Completes the desktop side of remote ripping: `rb-cli optical rip --device rb://host:port/dev/sr0` now streams raw sectors from a remote daemon and does all encoding locally.
- client.rs / connection.rs (P1.5): RemoteSession + RemoteConnection optical methods (list_optical_drives / open_optical / read_toc / read_optical_sectors / eject_optical / close_optical). They return the Wire DTOs, so no `optical` gate on the client.
- source.rs (P1.6): RemoteCdReader (gated `remote`) implements OpticalSource over an Arc<Mutex<RemoteConnection>>; retry is sent at open, Drop frees the daemon's optical slot.
- rip.rs (P1.7): OpticalTarget { Local | Remote{conn,device_path} } with a manual Debug + resolve() that parses an rb:// device arg; RipConfig.device replaces device_path; open_optical_source branches Local/Remote.
- CLI + GUI updated to the new RipConfig.device (local call sites build OpticalTarget::Local); CLI --device help documents the rb:// form.
Builds clean across optical+remote / optical-only / default(GUI); 14 optical unit tests + protocol round-trip tests green. Hardware rip validation (P1.8) pending a real drive on a networked box.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tests/remote_optical.rs drives the client<->daemon optical path over a port-0 rb-cli serve listener without a physical drive:
- the optical-built daemon handles ListOpticalDrives (round-trips, doesn't reply "built without the optical feature");
- OpenOptical of a bogus device errors cleanly and releases the process-global busy guard, so a second open fails at open rather than reporting "busy".
This validates the wire path + the single-session guard. The byte-identical rip-a-real-disc validation still needs an optical drive on a networked box.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…3.3)
- model/optical_devices.rs (P3.1): RipDevice / DeviceLocation (Local | Remote{conn,label}) + list_local_rip_devices / append_remote_rip_devices / list_rip_devices merging local + per-daemon drives. picker_label / cli_device_arg / into_target helpers. Remote enumeration errors are swallowed, so an offline or non-optical daemon contributes nothing, which also capability-gates the picker without inspecting handshake bits.
- cli/verbs/optical.rs (P3.3): `optical drives --remote host:port` (repeatable) lists local + each daemon's drives, printing a feedable device arg (rb://host:port/dev/sr0 for remote rows). Routed through the picker core.
- tests/remote_optical.rs: remote_rip_device_enumeration_over_loopback validates the remote arm; optical_devices unit test covers local label/arg/target.
Builds clean across default(GUI) / optical+remote / optical-only; loopback + unit tests green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…2, P3.4)
The Optical tab's drive picker now lists local AND remote drives in one pulldown, so a desktop user can rip from a networked daemon's drive (e.g. the SuperStation) with all encoding staying local.
- optical_tab: unified `rip_devices: Vec<RipDevice>` (+ `remote_daemons`) replaces the local-only `Vec<OpticalDrive>`; the combo labels entries via picker_label (`[host:port] name (path)` for remote). An "Add remote daemon..." modal connects on a worker thread (ConnectStatus / poll_add_remote, so an unreachable host can't freeze egui) and unlocks the Physical-drive mode.
- Rip dispatch goes through RipDevice::to_target(); start_rip_to_chd / rip_to_chd_worker now take an OpticalTarget (CHD encode still local).
- Remote drives are rip-only here: get_browsable_path returns None for them (disc-info/browse open the device locally; remote browse is the Inspect tab).
- model::optical_devices: add to_target(&self) (borrowing) + is_remote().
- P3.4: capability gating falls out of append_remote_rip_devices (a non-optical daemon contributes no drives); eject is location-aware via OpticalSource::eject (eject checkbox gained a hover note).
Builds + clippy clean on default(GUI); GUI runtime behavior pending a user check.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Cross-cutting doc-sync for the remote-ripping feature (CLAUDE.md rule):
- README MiSTer build list: "Remote ripping off-device" bullet (run rb-daemon on the device, drive it from the desktop; device only does SCSI reads).
- docs/full_MiSTer_support_status.md: "Remote optical ripping" capability line.
- docs/remote_ripping.md: P2.1 done (rip_to_chd_worker takes OpticalTarget, so remote -> CHD encodes locally), P2.2 ~ (hardware), done-criteria all checked.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The "Add remote daemon" dialog now remembers daemon addresses across sessions, so a recurring drive (e.g. the SuperStation) is one click away.
- update.rs: UpdateConfig.recent_daemon_addrs (config.json) + remember_daemon() (dedup, newest-first, capped at 8). Unit-tested.
- optical_tab: on a successful connect, record the address (in-memory MRU + persisted); the dialog shows a "Recent:" quick-pick list; clicking an entry re-connects. It's a pick list, not auto-reconnect, so an offline daemon never blocks startup.
Builds + clippy clean (GUI + mini); MRU unit test green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
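The MRU policy described in this commit (dedup, newest-first, capped at 8) fits in a few lines. This is a sketch assuming the list is a plain `Vec<String>`; the free-function signature is hypothetical and the real `remember_daemon()` lives on the config type.

```rust
/// Cap stated in the commit message.
pub const MAX_RECENT: usize = 8;

/// Moves `addr` to the front of `recent`, removing any earlier copy
/// and dropping the oldest entries beyond the cap.
pub fn remember_daemon(recent: &mut Vec<String>, addr: &str) {
    if addr.is_empty() {
        return; // ignoring empty input is an assumption of this sketch
    }
    recent.retain(|existing| existing.as_str() != addr);
    recent.insert(0, addr.to_string());
    recent.truncate(MAX_RECENT);
}
```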

The GUI progress bar already showed rate + ETA; the CLI only printed percent. Lift the rolling-window estimator into a shared, non-GUI module so both surfaces use one implementation, and wire it into the CLI.
- model/rate_tracker.rs (new): RateTracker moved verbatim out of gui/progress.rs (record / rate_bytes_per_sec / eta_secs / suffix / reset) + its unit tests. Pure std, ungated, so it is available to the CLI/mini build.
- gui/progress.rs: re-exports RateTracker from the model layer (the backup / restore / inspect / export tabs that use `progress::RateTracker` are unchanged).
- cli/verbs/optical.rs: drain_rip + drain_convert sample the tracker each tick and append " - <rate>/s, ETA <eta>" to the progress line, e.g. ` progress: 45% (315.0 MiB/700.0 MiB) - 28.4 MiB/s, ETA 13s`. For a remote rip the rate reflects LAN throughput; for CHD it's the local encode rate.
Builds + clippy clean (GUI + mini); rate_tracker tests green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
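A rolling-window rate estimate and the progress-line suffix can be sketched with the standard library alone. The `RateWindow` type, its explicit-timestamp `record` signature, the eviction rule, and the truncating ETA are assumptions chosen to keep the example testable; the real `RateTracker` API is only known here by its method names.

```rust
use std::collections::VecDeque;
use std::time::Duration;

const MIB: f64 = 1024.0 * 1024.0;

/// Rolling-window throughput estimator (hypothetical stand-in).
pub struct RateWindow {
    window: Duration,
    /// (time since start, cumulative bytes done)
    samples: VecDeque<(Duration, u64)>,
}

impl RateWindow {
    pub fn new(window: Duration) -> Self {
        RateWindow { window, samples: VecDeque::new() }
    }

    pub fn record(&mut self, at: Duration, bytes_done: u64) {
        self.samples.push_back((at, bytes_done));
        // Evict samples older than the window, keeping at least two.
        while self.samples.len() > 2 {
            let oldest = self.samples[0].0;
            if at.saturating_sub(oldest) > self.window {
                self.samples.pop_front();
            } else {
                break;
            }
        }
    }

    /// Average bytes per second between the oldest and newest sample.
    pub fn rate_bytes_per_sec(&self) -> Option<f64> {
        let (t0, b0) = *self.samples.front()?;
        let (t1, b1) = *self.samples.back()?;
        let dt = t1.checked_sub(t0)?.as_secs_f64();
        if dt <= 0.0 {
            return None;
        }
        Some(b1.saturating_sub(b0) as f64 / dt)
    }
}

/// Formats a line shaped like the example in the commit message.
pub fn progress_line(done: u64, total: u64, rate: Option<f64>) -> String {
    let pct = if total == 0 { 0 } else { done.saturating_mul(100) / total };
    let mut line = format!(
        "{}% ({:.1} MiB/{:.1} MiB)",
        pct,
        done as f64 / MIB,
        total as f64 / MIB
    );
    if let Some(rate) = rate.filter(|r| *r > 0.0) {
        // Truncating the ETA is an assumption of this sketch.
        let eta = (total.saturating_sub(done) as f64 / rate) as u64;
        line.push_str(&format!(" - {:.1} MiB/s, ETA {}s", rate / MIB, eta));
    }
    line
}
```

Passing timestamps in explicitly, rather than calling `Instant::now()` inside `record`, keeps the estimator deterministic under test.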

…ation/fsck
Resolves PROMPT-hfs-catalog-btree-scaling.md plus a chain of HFS/HFS+ correctness issues surfaced while validating rusty-backup against real Apple-formatted disks (a MacPack 1.8 GB HFS volume and a Mac OS 9.2.2 HFS+ install). Both now fsck with 0 errors.
Catalog B-tree (the reported corruption):
- insert_catalog_record uses the shared incremental inserter instead of rebuilding the whole index after every leaf split. The per-split rebuild was O(n^2) and leaked index nodes living past the 2048-node header bitmap, exhausting free nodes mid-rebuild and corrupting the tree at ~7.4k records ("disk full: no free B-tree nodes" / IndexSiblingLinkBroken).
- split_index_node now maintains fLink/bLink sibling links.
- rebuild_index_nodes (delete / fsck-repair) frees index nodes across all bitmap segments, not just the header window.
Import speed (untar of tens of thousands of files):
- duplicate check descends the index (was an O(n) leaf walk -> O(n^2) import)
- bulk mode skips the per-file full-catalog snapshot
- tar_import caches each directory's child names (was list_directory/entry)
- ensure_catalog_initialized stops re-reading the extents fork every create
Net: a 9,000-file untar went 2m25s -> 1.6s.
B-tree leaf packing:
- append-aware split: dense pack-left for sequential inserts, balanced for random ones (random imports 1.6 -> 2.6 records/node, matching Mac OS; the default 2 GB catalog now holds ~48k random-order files, was ~28k).
Collation + fsck (validated against real disks):
- classic HFS catalog keys use the real Mac Roman collation order (hfs_charorder; confirmed against Apple OS/HFS/CMMAINT.a _RelString and hfsutils/Linux), not an ASCII-only uppercase table -- fixes KeysOutOfOrder on accented / curly-quote / nbsp names.
- HFS+ names use Apple's exact TN1150 case-fold + canonical decomposition tables (ported from the Linux kernel) for both comparison and the on-disk form, replacing char::to_lowercase + Rust NFD. Fixes underscore-vs-letter ordering on real HFS+ volumes and matches Mac OS for ß (1:1 fold), Hangul, and the decomposition-excluded ranges. Drops the now-unused unicode-normalization dependency.
- null bytes in catalog names are a warning (UnusualCatalogName), not an error: valid on classic HFS, and real disks carry them.
Adds regression tests throughout (20k+ record imports, bulk mode, random packing density, the exact real-disk collation cases, null-byte names, and the TN1150 fold/decompose tables).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The bulk-import scaling fix (PROMPT-hfs-catalog-btree-scaling) left the incremental, one-record-at-a-time `put` path packing the catalog B-tree far more loosely than a sequential build, so a catalog grown by many separate `rb-cli put` calls (MacAtrium's per-title art build) ran out of catalog nodes ("disk full: no free B-tree nodes") at a fraction of the file count and left IndexSiblingLinkBroken near the ceiling.
Root cause was two naive 50/50 splits, both of which freeze a node at ~half full when later inserts land elsewhere:
- Leaf splits left non-sequential (mid-leaf) inserts at the classic ~69% B-tree occupancy. On a 100M volume shuffled `put` died at ~2384 records vs ~3281 for a sequential build.
- The index split was append-blind: sequential separators (what every leaf split emits in key order) froze each non-rightmost index node at ~45%, doubling index nodes and adding a whole tree level; this hurt *every* workload, sequential included.
Fix, mirroring how a real Mac packs a B*-tree:
- `btree_try_rotate_leaf`: before allocating a node on a full leaf, redistribute records with an adjacent sibling that shares the parent and has room, updating the one parent separator key in place. Lifts random-insert occupancy ~69%→~88%. Reused for index nodes too (record-agnostic).
- `btree_split_index_with_insert`: append-aware index split (greedy pack-left on a tail append, rebalance only a genuine middle insert), replacing the fixed 50/50 `split_index_node`. Sequential index occupancy ~45%→~96%.
The rotation only patches a separator in place when the existing key is the same length as the normalized classic-HFS key; for a real variable-length HFS+ index it bails to the normal split, so HFS+ behaviour is unchanged.
Results (100M volume, was/now): flat shuffled put 2384→3107, nested-dir longer names 1893→2378, sequential 3281→3713, all fsck-clean.
End-to-end: 20,000 shuffled multi-dir files via individual `rb-cli put` calls land fsck-clean with no IndexSiblingLinkBroken (85% leaf / 82% index occupancy), matching the bulk `untar` path. Adds a 20k shuffled multi-dir regression test and tightens the random-insert density assertion.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…x keys (P1) The shared HFS B-tree split/grow/rotate helpers in hfs_common.rs were hardwired for the classic-HFS key shape: a 1-byte key-length prefix and the fixed 0x25 normalized index separator. HFS+ reuses those helpers verbatim, so the moment an HFS+ catalog leaf split and needed an index separator, the helper read the high byte of the 2-byte key length as the whole length and ran the key through the classic normalize -- producing a malformed 38-byte separator. Descent then misrouted (parentID read at the wrong offset), records landed in the wrong leaf, and fsck reported "leaves must be strictly ascending". An HFS+ catalog therefore could not grow past a single index level without corrupting. Introduce BTreeKeyFormat -- a small Copy descriptor (big_keys, variable_index_keys, max_key_len) derived from the BTHeaderRec attributes + maxKeyLength -- and thread &BTreeKeyFormat through btree_split_leaf_with_insert, btree_split_index_with_insert, btree_insert_into_index, btree_grow_root, btree_try_rotate_leaf, btree_update_index_separator, and btree_insert_full. Separator extraction now reads the key portion via kf.key_portion(); index records are built via kf.make_index_key(), which keeps the classic fixed-0x25 key for CLASSIC_CATALOG (byte-identical to the old normalize_catalog_index_key path) and stores the child's variable-length 2-byte key verbatim for the HFS+ catalog/attributes trees. Callers pass the matching constant: classic HFS -> CLASSIC_CATALOG, the HFS+ catalog insert -> HFSPLUS_CATALOG, the attributes insert -> HFSPLUS_ATTRIBUTES, and the streamed defrag builders -> the same per-tree constants. The B*-rotation stays effectively classic-only on HFS+ for now: btree_update_index_separator only patches a separator in place when the new key matches the old length, otherwise it abandons the rotation and splits -- so a variable-key index is never left with a stale separator (full density rotation is P5). 
This is the §4a "key-format descriptor" step of docs/hfsplus_btree_growth_plan.md. Verified at the catalog-buffer level: btree_insert_full with HFSPLUS_CATALOG grows a shuffled multi-parent catalog to depth >= 3 with strictly-ascending leaves and every record findable by descent. The volume-level fsck gate lands in P2, once blank catalogs are sized to hold enough records to reach depth >= 3 (a blank catalog is only 4 nodes today). Classic HFS output is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…sands of records (P2 §4c) A blank HFS+ volume reserved a fixed 4-node catalog B-tree regardless of volume size, so the live insert path exhausted it after ~24 files ("disk full: no free B-tree nodes") -- the second defect in the growth plan's probe. Classic HFS avoids this by pre-sizing the catalog to ~0.5% of the volume (default_btree_sizes / create_blank_hfs_sized) and has no grow-on-full path at all; mirror that. build_blank_hfsplus_front now sizes the catalog from the volume via default_hfsplus_catalog_bytes (~0.5%), clamped to whole nodes in [4, header-bitmap capacity] so the blank still needs no dedicated map nodes (~30,544 nodes / 117 MiB at node_size 4096). The extents-overflow tree keeps its 4-node default (its own scaling is P3). create_blank_hfsplus is unchanged in signature (auto-sizes); a new create_blank_hfsplus_sized lets clone targets and tests pin a larger catalog into a modest image, and the streamed write_blank_hfsplus_into auto-sizes too. Verified end-to-end: 20k files in shuffled key order across 50 directories insert into a 64 MiB volume with a 16 MiB catalog, the catalog grows to depth >= 3, and hfsplus_fsck reports zero errors -- the volume-level gate P1 deferred, now passing because the variable-length index keys (P1) and the sized catalog (this commit) work together. Updated test_create_blank_hfsplus_32mib for the new (larger, volume-scaled) reserved-block layout. 4b (grow the fork when free_nodes hits 0) is deferred: classic HFS ships without it, the blank auto-sizes, and the clone path over-sizes its target, so live growth is only needed for a foreign under-sized catalog -- a pre-existing classic-HFS limitation. Rationale recorded in docs/hfsplus_btree_growth_plan.md §4b. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
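The sizing rule in this commit (about 0.5% of the volume, in whole nodes, clamped between 4 nodes and the header-bitmap capacity) can be written as one expression. The function name, rounding down to whole nodes, and taking the cap as a parameter are assumptions of this sketch; the 30,544-node cap at node_size 4096 is the figure quoted above.

```rust
/// Number of catalog nodes to reserve for a blank volume.
///
/// `max_nodes` is the header-bitmap capacity for the chosen node size
/// and must be at least 4, otherwise `clamp` panics.
pub fn default_catalog_nodes(volume_bytes: u64, node_size: u64, max_nodes: u64) -> u64 {
    // ~0.5% of the volume, expressed as 1/200.
    let target_bytes = volume_bytes / 200;
    // Round down to whole nodes (assumed), then clamp to [4, max_nodes].
    (target_bytes / node_size).clamp(4, max_nodes)
}
```

Under these assumptions a 64 MiB volume with 4096-byte nodes gets 81 catalog nodes, a 1 MiB volume is held at the 4-node floor, and a 1 TiB volume is capped at 30,544 nodes.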

… key-format descriptor (P3)
insert_extents_overflow_record was pinned to a single leaf level: on a full leaf it returned InvalidData ("split-on-overflow not yet implemented") because the shared B-tree growth helpers used to force the classic 1-byte / fixed-0x25 index key shape onto HFS+ records. P1's BTreeKeyFormat fixed that, so wire the extents path through the full split/grow machinery (btree_split_leaf_with_insert / btree_grow_root / btree_insert_into_index) with BTreeKeyFormat::HFSPLUS_EXTENTS -- 2-byte big keys with fixed 10-byte index separators -- mirroring insert_catalog_record / insert_xattr_record. The attributes path already routed through HFSPLUS_ATTRIBUTES in P1, so it splits too.
Tests:
- buffer-level: the extents-overflow tree (fixed index keys) and the attributes tree (variable index keys) each grow to depth >= 2 under shuffled inserts, with strictly-ascending leaves and every record findable by root-to-leaf descent.
- real-path integration: a 520-block maximally-fragmented file generates 64 extents-overflow records, splitting that B-tree past one leaf via the real insert_extents_overflow_record, and reads back byte-for-byte through the resulting multi-level tree.
§4a/P3 of docs/hfsplus_btree_growth_plan.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…(P4) The streamed defrag builder constructs its target catalog with hfs_common::btree_insert_full + BTreeKeyFormat::HFSPLUS_CATALOG (wired in P1), so it inherited the variable-length index-key fix automatically. Add a round-trip test that proves it end-to-end: a 64 MiB source with 300 files across 10 dirs has a multi-level catalog (depth >= 2), and after stream_defragmented_hfsplus the target's defrag-built catalog is itself multi-level, fsck-clean, and round-trips byte-for-byte. Before P1 a defrag-built catalog that exceeded one leaf level would have carried the same malformed classic-shaped separators as the live path. §P4 of docs/hfsplus_btree_growth_plan.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…tation density (P5) The HFS+ live insert methods (insert_catalog_record, insert_xattr_record, insert_extents_overflow_record) each carried a hand-copied find -> insert -> split -> grow dance that, unlike classic HFS's insert_catalog_record, never tried a sibling rotation before splitting. A plain split leaves a randomly-inserted (per-put) leaf at the classic ~69% B-tree occupancy, so a catalog grown one record at a time used ~1.4x the leaves of a packed tree. Delegate all three to the shared hfs_common::btree_insert_full (threading the matching BTreeKeyFormat) -- the exact path classic HFS already uses, which attempts a B*-style rotation into an adjacent sibling sharing the leaf's parent before allocating a new node. btree_try_rotate_leaf's separator update is already key-format-aware (P1): it patches the parent separator in place when the new and old separators are the same length -- always true for the fixed-length extents keys and for same-length catalog/attribute names -- and otherwise abandons the rotation and splits, so a variable-key index is never left with a stale separator. Result: shuffled multi-dir catalog inserts now pack to ~0.84 leaf occupancy (was ~0.69), measured by test_hfsplus_catalog_shuffled_inserts_pack_densely. The change also deletes ~250 lines of duplicated split/grow code, leaving one tested insert implementation shared by classic HFS, HFS+, and the defrag builder. §4d/P5 of docs/hfsplus_btree_growth_plan.md. With this, P1-P5 are complete; only the optional §4b grow-on-full (deferred, matching classic HFS) remains. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

All five phases are implemented and tested; only the optional §4b grow-on-full is intentionally deferred (classic HFS has no grow path either). Flip the top status line from "in progress" to complete. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ation recipe The one deferred step of the HFS+ B-tree growth plan: growing a catalog / extents-overflow / attributes fork when it runs out of nodes (today btree_alloc_node just returns DiskFull). Because this writes new on-disk-mutation machinery in the node-bitmap / map-node area that historically produced IndexSiblingLinkBroken corruption, its acceptance bar is fsck_hfs-clean and mountable on a real Mac (HFS+/HFSX is still fully supported there) -- not just our own hfsplus_fsck. The doc covers: the single allocation chokepoint and why growth must live in the HfsPlusFilesystem insert methods; a phased design (Phase A contiguous tail growth within the header-bitmap node cap -- the high-value 90% case; Phase B >8-extent overflow spill + write-path overflow; Phase C map-node appending past the cap); risks/gotchas (write-path overflow, the two conflicting bitmap-capacity formulas, journaling, atomicity, clump alignment); in-repo tests; and a concrete macOS validation recipe. It also flags the real CLI gaps -- there is no `new --fs hfsplus` and `put` copies a host file (not stdin) -- and adds a Phase 0 prerequisite to expose HFS+ creation with a --min-catalog knob so the grow path can be driven from rb-cli for the Mac recipe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

danifunker and others added 19 commits June 27, 2026 07:19

docs(hfsplus): cross-link the deferred §4b note to the fork-growth plan

558f17c

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

danifunker merged commit e7f0424 into main Jun 29, 2026
14 checks passed

danifunker deleted the remote-optical-ripping branch June 29, 2026 02:22

danifunker merged 19 commits into main from remote-optical-ripping
