106 changes: 106 additions & 0 deletions Payload_img_design.md
@@ -0,0 +1,106 @@
# live-bootstrap

This repository uses [`README.rst`](./README.rst) as the canonical main documentation.

## Kernel-bootstrap raw `external.img`

`external.img` is a raw container disk used in kernel-bootstrap mode when
`--external-sources` is set and `--repo` is unset.

### Why not put everything in the initial image?

In kernel-bootstrap mode, the first boot image is consumed by very early
runtime code before the system reaches the normal bash-based build stage.
That early stage has tight assumptions about memory layout and file table usage.

When too many distfiles are packed into the initial image, those assumptions can
be exceeded, which leads to unstable handoff behavior (for example, failures
around the Fiwix transition in QEMU or on bare metal).

So the design is intentionally split:

- Initial image: only what is required to reach `improve: import_payload`
- `external.img`: the remaining distfiles

This is not a patch-style workaround. It is a two-phase transport design that
keeps early boot deterministic and moves bulk data import to a stage where the
runtime is robust enough to process it safely.

### Why import from an external image and copy into the main filesystem?

Because the bootstrap still expects distfiles to end up under the normal local
path (`/external/distfiles`) for later steps. `external.img` is used as a
transport medium only.

The flow is:

1. Boot minimal initial image.
2. Reach `improve: import_payload`.
3. Detect the external container disk by magic (`LBPAYLD1`) across detected block devices.
4. Copy payload files into `/external/distfiles`.
5. Continue the build exactly as if files had been present locally all along.
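
The detection in step 3 can be sketched in shell. This is a minimal illustration, not the importer's actual code: the function names `has_payload_magic` and `find_payload_disk` are hypothetical, and the list of candidate devices is assumed to be supplied by the caller.

```sh
# Hypothetical helpers; the real importer may probe devices differently.

# Succeed if the first 8 bytes of "$1" are the ASCII magic LBPAYLD1.
# Comparing hex avoids passing raw binary data through the shell.
has_payload_magic() {
    magic_hex="$(dd if="$1" bs=8 count=1 2>/dev/null | od -An -v -tx1 | tr -d ' \n')"
    [ "${magic_hex}" = "4c425041594c4431" ]
}

# Print the first candidate that carries the magic; fail if none does.
find_payload_disk() {
    for dev in "$@"; do
        if has_payload_magic "${dev}"; then
            printf '%s\n' "${dev}"
            return 0
        fi
    done
    return 1
}
```

Called with every block device the system has detected, `find_payload_disk` prints the one holding the container, so nothing depends on a fixed device name.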

### Format

- Magic: `LBPAYLD1` (8 bytes)
- Then: little-endian `u64` file count
- Repeated entries, one per file:
  - little-endian `u64` name length
  - little-endian `u64` file size
  - file name string, encoded as UTF-8 bytes (no terminator)
  - file bytes

`name length` is the number of bytes in the UTF-8 encoded file name (not the number of Unicode code points).
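
As a quick check of the byte-versus-code-point distinction, the sketch below counts the encoded bytes of a name with `wc -c`, the same approach `make-payload.sh` takes. The helper name `name_len_bytes` is hypothetical.

```sh
# Hypothetical helper: number of bytes in the encoded file name.
name_len_bytes() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# "é" is one code point but two UTF-8 bytes (0xC3 0xA9),
# so the name "é.tar" has a name length of 6, not 5.
name_len_bytes "$(printf '\303\251.tar')"   # prints 6
```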

The importer probes detected block devices and selects the one with magic `LBPAYLD1`.
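
To make the layout concrete, here is a minimal reader sketch that walks a container and prints each entry's name and size. It is an illustration under stated assumptions, not the importer used at runtime: the function names `read_u64le` and `list_payload` are hypothetical, it relies on `od` and `dd`, and it assumes every value fits in the shell's signed 64-bit arithmetic.

```sh
# Hypothetical helpers for inspecting an LBPAYLD1 container.

# Print the little-endian u64 stored at byte offset "$2" of file "$1".
read_u64le() {
    hex=""
    # od emits the 8 bytes in file order; prepending each one reverses them.
    for b in $(od -An -v -tx1 -j "$2" -N 8 "$1"); do
        hex="${b}${hex}"
    done
    [ "${#hex}" -eq 16 ] || return 1   # short read: truncated image
    echo "$((0x${hex}))"
}

# Print "<name> <size>" for every entry in the container "$1".
list_payload() {
    img="$1"
    magic="$(dd if="${img}" bs=8 count=1 2>/dev/null | od -An -v -tx1 | tr -d ' \n')"
    [ "${magic}" = "4c425041594c4431" ] || { echo "bad magic" >&2; return 1; }
    count="$(read_u64le "${img}" 8)" || return 1
    off=16   # first entry starts after the magic (8) and the count (8)
    i=0
    while [ "${i}" -lt "${count}" ]; do
        name_len="$(read_u64le "${img}" "${off}")" || return 1
        size="$(read_u64le "${img}" "$((off + 8))")" || return 1
        name="$(dd if="${img}" bs=1 skip="$((off + 16))" count="${name_len}" 2>/dev/null)"
        printf '%s %s\n' "${name}" "${size}"
        # Skip the two length fields, the name, and the file bytes.
        off=$((off + 16 + name_len + size))
        i=$((i + 1))
    done
}
```

Running `list_payload external.img` after building an image is a cheap way to confirm that the file count and per-file sizes match the input list.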

### Manual creation without Python

Prepare `external.list` as:

```text
<archive-name> <absolute-path-to-archive>
```

Then:

```sh
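cat > make-payload.sh <<'SH'
#!/bin/sh
set -e
out="${1:-external.img}"
list="${2:-external.list}"

# Write "$1" to stdout as an 8-byte little-endian unsigned integer.
write_u64le() {
    v="$1"
    printf '%016x' "$v" | sed -E 's/(..)(..)(..)(..)(..)(..)(..)(..)/\8\7\6\5\4\3\2\1/' | xxd -r -p
}

# Count only non-blank lines so the header matches the entries written below
# (the loop skips blank lines).
count="$(awk 'NF { n++ } END { print n + 0 }' "${list}")"
: > "${out}"
printf 'LBPAYLD1' >> "${out}"
write_u64le "${count}" >> "${out}"

# The "|| [ -n ... ]" clause also handles a last line without a trailing newline.
while read -r name path || [ -n "${name}" ]; do
    [ -n "${name}" ] || continue
    # Fail with a clear message if a listed file is missing.
    [ -f "${path}" ] || { echo "missing file: ${path}" >&2; exit 1; }
    size="$(wc -c < "${path}" | tr -d ' ')"
    name_len="$(printf '%s' "${name}" | wc -c | tr -d ' ')"
    write_u64le "${name_len}" >> "${out}"
    write_u64le "${size}" >> "${out}"
    printf '%s' "${name}" >> "${out}"
    cat "${path}" >> "${out}"
done < "${list}"
SH
chmod +x make-payload.sh
./make-payload.sh external.img external.list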
```

Attach `external.img` as an extra raw disk in QEMU, or as the second disk on bare metal.
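
A QEMU invocation along these lines attaches both images as raw disks. Treat it as a sketch: the `qemu-system-x86_64` binary name and the helper name `print_qemu_cmd` are assumptions, the image paths are placeholders, and the memory, NIC, and machine options are the ones suggested in `README.rst`. The helper only prints the command so it can be reviewed before running.

```sh
# Hypothetical helper: print a QEMU command line with both disks attached.
print_qemu_cmd() {
    main_img="$1"      # initial builder image (placeholder path), boots first
    payload_img="$2"   # raw container, for example external.img
    printf '%s' "qemu-system-x86_64 -m 4G"
    # printf reuses the format for each argument, yielding one -drive per image.
    printf ' -drive file=%s,format=raw' "${main_img}" "${payload_img}"
    printf ' %s\n' "-nic user,model=e1000 -machine kernel-irqchip=split"
}
```

Because the importer finds the container by its magic, the order of the two `-drive` options only matters for which disk the machine boots from.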

### When it is used

- Used in kernel-bootstrap with `--external-sources` and without `--repo`.
- Not used with `--repo` (that path still uses an ext filesystem disk).
- Without `--external-sources` and without `--repo`, there is no second disk:
the initial image only includes distfiles needed before `improve: get_network`,
and later distfiles are downloaded from mirrors.
82 changes: 75 additions & 7 deletions README.rst
@@ -63,17 +63,85 @@ Without using Python:

* *Only* copy distfiles listed in ``sources`` files for ``build:`` steps
manifested before ``improve: get_network`` into this disk.
* Optionally (if you don't do this, distfiles will be downloaded over the network):

* On the second image, create an MSDOS partition table and one ext3
partition.
* Copy ``distfiles/`` into this disk.
* Run QEMU, with 4+G RAM, optionally SMP (multicore), both drives (in the
order introduced above), a NIC with model E1000
* In kernel-bootstrap mode with ``--external-sources`` (and no ``--repo``),
use the second image as ``external.img``.
``external.img`` is a raw container (not a filesystem) used to carry the
distfiles that are not needed before ``improve: import_payload``.
In other words, the first image only carries the minimal set needed to
reach the importer; the rest of the distfiles live in ``external.img``.

* Header magic: ``LBPAYLD1`` (8 bytes).
* Then: little-endian ``u64`` file count.
* Repeated for each file: little-endian ``u64`` name length,
little-endian ``u64`` file size, UTF-8 encoded file name bytes
(no terminator), raw file bytes.
* ``name length`` is the number of UTF-8 bytes (not Unicode code points).

* With ``--repo``, the second disk remains an ext3 distfiles/repo disk.
* Without ``--external-sources`` and without ``--repo``, no second disk is
used: the initial image includes only pre-network distfiles, and later
distfiles are downloaded from configured mirrors after networking starts.
* Run QEMU, with 4+G RAM, optionally SMP (multicore), both drives (main
builder image plus external image, when a second image is used), a NIC with model E1000
(``-nic user,model=e1000``), and ``-machine kernel-irqchip=split``.
c. **Bare metal:** Follow the same steps as QEMU, but the disks need to be
two different *physical* disks, and boot from the first disk.

Manual raw ``external.img`` preparation
---------------------------------------

The following script creates a raw ``external.img`` from a manually prepared
file list. This is equivalent to what ``rootfs.py`` does for kernel-bootstrap
with ``--external-sources`` (and no ``--repo``).

1. Prepare an ``external.list`` with one file per line, formatted as:
``<archive-name> <absolute-path-to-archive>``.
2. Run:

::

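   cat > make-payload.sh <<'EOF'
   #!/bin/sh
   set -e
   out="${1:-external.img}"
   list="${2:-external.list}"

   # Write "$1" to stdout as an 8-byte little-endian unsigned integer.
   write_u64le() {
       v="$1"
       printf '%016x' "$v" | sed -E 's/(..)(..)(..)(..)(..)(..)(..)(..)/\8\7\6\5\4\3\2\1/' | xxd -r -p
   }

   # Count only non-blank lines so the header matches the entries written below
   # (the loop skips blank lines).
   count="$(awk 'NF { n++ } END { print n + 0 }' "${list}")"
   : > "${out}"
   printf 'LBPAYLD1' >> "${out}"
   write_u64le "${count}" >> "${out}"

   # The "|| [ -n ... ]" clause also handles a last line without a trailing newline.
   while read -r name path || [ -n "${name}" ]; do
       [ -n "${name}" ] || continue
       # Fail with a clear message if a listed file is missing.
       [ -f "${path}" ] || { echo "missing file: ${path}" >&2; exit 1; }
       size="$(wc -c < "${path}" | tr -d ' ')"
       name_len="$(printf '%s' "${name}" | wc -c | tr -d ' ')"
       write_u64le "${name_len}" >> "${out}"
       write_u64le "${size}" >> "${out}"
       printf '%s' "${name}" >> "${out}"
       cat "${path}" >> "${out}"
   done < "${list}"
   EOF
   chmod +x make-payload.sh
   ./make-payload.sh external.img external.list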

3. Attach ``external.img`` as an additional raw disk when booting in QEMU, or
as the second physical disk on bare metal.

Notes:

* ``external.img`` raw container mode is used with ``--external-sources`` (and
no ``--repo``).
* Without ``--external-sources`` and without ``--repo``, there is no second
image. The initial image only includes distfiles needed before
``improve: get_network``; later distfiles are downloaded from mirrors.
* The runtime importer identifies the correct disk by checking the magic
``LBPAYLD1`` on each detected block device, not by assuming a device name.

Mirrors
-------
