11 changes: 11 additions & 0 deletions .gitignore
@@ -31,6 +31,17 @@
/libgap*.dll*
/libgap*.so*

# emscripten / wasm build outputs (etc/emscripten/build.sh and friends)
/native-build/
/extern/emscripten/
/web-example/
/packages.tar.gz
/gap.html
/gap.js
/gap.wasm
/gap.worker.js
/gap-fs.json

/bin/gap*.sh
/bin/*-*/
# Compiling the `xgap` package creates the file `xgap.sh`
36 changes: 36 additions & 0 deletions etc/emscripten/Dockerfile
@@ -0,0 +1,36 @@
# Build environment for compiling GAP to WebAssembly via Emscripten.
#
# Pinned to emsdk 3.1.23: GAP relies on subtle memory/stack behaviour
# (notably the ASYNCIFY interaction with GASMAN) that has broken on newer
# emscripten releases.

FROM emscripten/emsdk:3.1.23

RUN apt-get update && apt-get install -y --no-install-recommends \
        autoconf \
        automake \
        libtool \
        make \
        python3 \
        ca-certificates \
        curl \
        bison \
        byacc \
        m4 \
    && rm -rf /var/lib/apt/lists/*

# Allow running as a non-root host UID without breaking the emscripten cache.
RUN mkdir -p /emsdk/upstream/emscripten/cache \
    && chmod -R 0777 /emsdk/upstream/emscripten/cache

# Cache the GAP package distribution tarball so container runs don't
# re-download it from GitHub each time. The URL pins to the "latest"
# release on the PackageDistro side, so this snapshot drifts as upstream
# tags new releases — rebuild the image (--no-cache) to refresh.
ARG GAP_PACKAGES_URL=https://github.com/gap-system/PackageDistro/releases/download/latest/packages.tar.gz
RUN curl --fail --location --silent --show-error \
        --output /opt/gap-packages.tar.gz \
        "$GAP_PACKAGES_URL" \
    && chmod 0644 /opt/gap-packages.tar.gz

WORKDIR /gap
122 changes: 109 additions & 13 deletions etc/emscripten/README.md
@@ -1,26 +1,122 @@
Code to allow building gap to WASM using Emscripten.
# GAP in the browser

Files:
Build GAP as a WebAssembly module and serve it as a self-contained website.
The terminal interface uses [xterm-pty](https://github.com/mame/xterm-pty),
so the resulting page behaves like a normal GAP REPL.

- `build.sh`: Run as `etc/emscripten/build.sh` from a fresh copy of GAP.
## Quick start

- `web-template`: Uses 'xterm-pty' to create a "nice" interface to the Wasm GAP.
From a fresh GAP checkout, with either Docker or Podman installed:

- `build_startup_manifest.js`: Run it in the web root directory to build `startup_manifest.json` that contains resources to preload.
```sh
etc/emscripten/build-in-docker.sh
cd web-example
../etc/emscripten/serve.py
```

Then open <http://localhost:8080/>. The first build takes 10–30 minutes; the
docker image, the GAP package distribution, and the GMP/zlib builds are all
cached for subsequent runs.

To pick up newer GAP packages from upstream, force a fresh image build:

```sh
docker build --no-cache -t gap-emscripten-build:3.1.23 etc/emscripten/
```

On Apple Silicon (and other non-amd64 hosts), the build runs `linux/amd64`
under emulation, since `emscripten/emsdk:3.1.23` is amd64-only on Docker
Hub. `build-in-docker.sh` pins the platform explicitly so the layer cache
holds across runs.

The output directory `web-example/` is fully self-contained — copy it to any
static host (see "Hosting" below for the headers it needs).

## Building without Docker

If you already have emsdk 3.1.23 sourced in your shell, you can run the
underlying build directly:

```sh
etc/emscripten/build.sh
etc/emscripten/assemble-website.sh
```

emsdk 3.1.23 is the only version we test against. GAP relies on subtle
ASYNCIFY/GASMAN interactions that have broken on newer emsdk releases.

See 'run-web-demo.sh' as an example on how to set up a working website.
## Hosting

Note that this demo uses xterm-pty, a library which provides a terminal interface
for emscripten-compiled programs. This uses a javascript feature called
"SharedArrayBuffer", which requires some headers are returned by the server:
The xterm-pty terminal uses `SharedArrayBuffer`, which browsers only allow
when the page is served with these two headers:

```
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```

For more details, see [this article](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer).
`serve.py` is a 20-line stdlib-only Python server that adds them. For
GitHub Pages and other static hosts that don't let you set headers,
`web-template/coi-serviceworker.js` is included as a workaround (it
re-fetches resources through a service worker that adds the headers).
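
As an illustration of what such a server involves, here is a minimal sketch using only the Python standard library. It is not the actual `serve.py`; the names `IsolatedHandler` and `make_server` are hypothetical, and binding to `127.0.0.1` on port 8080 is an assumption based on the URL in the quick start.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# The two headers browsers require before they expose SharedArrayBuffer.
ISOLATION_HEADERS = {
    "Cross-Origin-Opener-Policy": "same-origin",
    "Cross-Origin-Embedder-Policy": "require-corp",
}


class IsolatedHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory, adding COOP/COEP headers."""

    def end_headers(self):
        # end_headers() runs for every response, including errors,
        # so the headers are attached regardless of status code.
        for name, value in ISOLATION_HEADERS.items():
            self.send_header(name, value)
        super().end_headers()


def make_server(port=8080):
    """Create (but do not start) a server bound to localhost."""
    return ThreadingHTTPServer(("127.0.0.1", port), IsolatedHandler)
```

To try it, run `make_server().serve_forever()` from inside `web-example/`. Overriding `end_headers` keeps the sketch short because every response path in `SimpleHTTPRequestHandler` passes through it.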

## Files

| File | Role |
| ---- | ---- |
| `build-in-docker.sh` | One-stop entry point. Builds the image, runs `build.sh` inside, then `assemble-website.sh`. |
| `Dockerfile` | Pinned `emscripten/emsdk:3.1.23` with autotools, python3, bison/byacc/m4, and a baked-in copy of the GAP package distribution tarball at `/opt/gap-packages.tar.gz`. |
| `build.sh` | Configures and builds GMP, zlib, and GAP itself for wasm. |
| `assemble-website.sh` | Copies the build outputs and data directories (`pkg`, `lib`, `grp`, …) into `web-example/`. |
| `generate_gap_fs_json.py` | Reads file paths on stdin, writes `gap-fs.json` (the manifest of every file in the virtual FS). |
| `startup_manifest.json` | List of files to fetch eagerly at startup, captured from a real GAP run. Anything not in this list is fetched lazily on first read. See "Updating the startup manifest" below for how to refresh it. |
| `serve.py` | Local server that adds the COOP/COEP headers. |
| `web-template/` | Static UI: `index.html`, the worker scripts, the FS init shim, and the COOP/COEP service worker for hosts where you can't set headers. |

## Updating the startup manifest

`startup_manifest.json` lists files (relative to the GAP root) that the FS
init shim downloads up front instead of lazily. The current list was
captured from a real GAP run reaching its prompt, so it includes both
the core library bootstrap (`lib/init.g`, `lib/read*.g`, …) and any
default-loaded packages. Entries that no longer exist in the build are
silently ignored, so it is safe to leave stale entries in place; it is
also safe to leave the list empty (every file becomes lazy).

The manifest is a startup-time optimisation, not a correctness mechanism:
a wrong list never breaks the build, it only makes startup slower (files
GAP needs but the manifest omits get fetched lazily, one round-trip each)
or wastes bandwidth (files in the manifest that GAP doesn't actually
read are downloaded anyway). So it's worth refreshing when something
changes the set of files read at startup — most importantly when the
default loaded packages change, but also after large library reshuffles.
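
To judge whether a refresh is worthwhile, the two failure modes described above can be measured by comparing the checked-in manifest with a captured list of fetched paths. The following is a rough sketch, not part of the build: `compare_manifest` is a hypothetical helper, and it assumes both inputs are flat lists of GAP-root-relative paths, with the fetched list captured while the served manifest was empty.

```python
def compare_manifest(manifest, fetched):
    """Return (missing, unused) path lists.

    missing: read by GAP at startup but absent from the manifest
             (each costs a lazy round-trip).
    unused:  listed in the manifest but never read
             (downloaded eagerly for nothing).
    """
    manifest_set = set(manifest)
    fetched_set = set(fetched)
    missing = sorted(fetched_set - manifest_set)
    unused = sorted(manifest_set - fetched_set)
    return missing, unused
```

If both lists come back short, the existing manifest is probably still good enough.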

To regenerate it after such changes:

1. Build the website (`build-in-docker.sh`) and serve it (`serve.py`).
2. **Empty the served manifest before capturing.** Replace
   `web-example/startup_manifest.json` with `[]` (or delete it).
   Otherwise the existing entries are eagerly pre-fetched at startup,
   appear in `fetchedUrls`, and you'll just round-trip the old list.
   Editing the served file is enough; no rebuild is needed.
3. Open the page and wait for the GAP prompt to appear. Every file that
   GAP actually reads now goes through the lazy `XHR` path and gets
   captured.
4. Open devtools and read the captured URLs from the page's JS console:
   `window.fetchedUrls` is an array of every unique URL the worker
   requested. Chrome/Firefox provide a `copy()` console helper:
   `copy(JSON.stringify(fetchedUrls))` puts the JSON on your clipboard.
5. Strip non-GAP-FS entries (`gap.js`, `gap.wasm`, `gap-fs.json`, the
   xterm CDN URLs) and write the result to
   `etc/emscripten/startup_manifest.json` (so it's checked in and gets
   picked up by the next `assemble-website.sh`). A `jq` filter that
   keeps just GAP filesystem paths:

   ```sh
   jq '[.[] | select(test("^(pkg|lib|grp|tst|doc|hpcgap|dev|benchmark)/"))]' \
     fetched-urls.json > etc/emscripten/startup_manifest.json
   ```
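
Where `jq` is not available, the same filter could be sketched in Python. This is an illustrative equivalent rather than a script shipped with the build; `filter_manifest` and `rewrite_manifest` are hypothetical names, and the input is assumed to be the JSON array copied from `fetchedUrls`.

```python
import json
import re

# Same pattern as the jq filter: keep only paths under GAP data directories.
GAP_FS_PATH = re.compile(r"^(pkg|lib|grp|tst|doc|hpcgap|dev|benchmark)/")


def filter_manifest(urls):
    """Keep entries that look like GAP filesystem paths, preserving order."""
    return [url for url in urls if GAP_FS_PATH.search(url)]


def rewrite_manifest(in_path, out_path):
    """Read a JSON array of URLs, filter it, and write the result."""
    with open(in_path, encoding="utf-8") as src:
        urls = json.load(src)
    with open(out_path, "w", encoding="utf-8") as dst:
        json.dump(filter_manifest(urls), dst, indent=2)
        dst.write("\n")
```

For example, `rewrite_manifest("fetched-urls.json", "etc/emscripten/startup_manifest.json")` mirrors the `jq` command above.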

The file "coi-serviceworker.js" works around this problem on GitHub Pages. This won't
work locally, so "server.rb" is a simple Ruby script that starts a web server
which returns the required headers.
The bookkeeping lives in `web-template/gap-fs.js` (wraps `fetch` and
`XMLHttpRequest.open` to report URLs to the main thread) and
`web-template/index.html` (accumulates them onto `window.fetchedUrls`).
46 changes: 46 additions & 0 deletions etc/emscripten/assemble-website.sh
@@ -0,0 +1,46 @@
#!/usr/bin/env bash
#
# Assemble a self-contained website from a completed wasm build.
# Outputs ./web-example/ relative to the GAP source root.
#
# Run after etc/emscripten/build.sh (or have build-in-docker.sh call it).

set -euo pipefail

SCRIPT_DIR=$(cd "$(dirname "$0")" && pwd)
ROOT_DIR=$(cd "$SCRIPT_DIR/../.." && pwd)
OUT_DIR="$ROOT_DIR/web-example"

cd "$ROOT_DIR"

for f in gap.js gap.wasm gap.worker.js gap-fs.json; do
    if [[ ! -f $f ]]; then
        echo "Error: missing build output '$f'. Run etc/emscripten/build.sh first." >&2
        exit 1
    fi
done

# Always start from a clean output directory. Merging into a previous
# web-example/ confuses cp when a tree has changed shape between runs
# (e.g. pkg/X switching between a symlink and a real directory).
rm -rf "$OUT_DIR"
mkdir -p "$OUT_DIR"

cp "$SCRIPT_DIR"/web-template/* "$OUT_DIR"/
cp "$SCRIPT_DIR"/startup_manifest.json "$OUT_DIR"/
cp gap.js gap.wasm gap.worker.js gap-fs.json "$OUT_DIR"/
cp LICENSE COPYRIGHT "$OUT_DIR"/

# Data directories. These are referenced by gap-fs.json and either eagerly
# loaded (if listed in startup_manifest.json) or lazily fetched on first
# read by Emscripten's createLazyFile.
#
# -L follows symlinks so that user setups where pkg/X is a symlink into
# a separate git/X checkout produce a self-contained output tree (the
# user often wants to copy web-example/ to another machine or static
# host where those symlinks would dangle).
for d in pkg lib grp tst doc hpcgap dev benchmark; do
    cp -RL "$d" "$OUT_DIR"/
done

echo "Assembled website at $OUT_DIR"
80 changes: 80 additions & 0 deletions etc/emscripten/build-in-docker.sh
@@ -0,0 +1,80 @@
#!/usr/bin/env bash
#
# One-stop shop: build GAP for the web inside a pinned container.
#
# Usage (run from the GAP source tree):
# etc/emscripten/build-in-docker.sh
#
# Output: ./web-example/ with a self-contained website. Copy it anywhere
# and serve with COOP/COEP headers (etc/emscripten/serve.py is one option).

set -euo pipefail

SCRIPT_DIR=$(cd "$(dirname "$0")" && pwd)
ROOT_DIR=$(cd "$SCRIPT_DIR/../.." && pwd)

RUNTIME="${CONTAINER_RUNTIME:-}"
if [[ -z "$RUNTIME" ]]; then
    if command -v podman >/dev/null 2>&1; then
        RUNTIME=podman
    elif command -v docker >/dev/null 2>&1; then
        RUNTIME=docker
    else
        echo "Error: neither podman nor docker found in PATH." >&2
        echo "Install one, or set CONTAINER_RUNTIME explicitly." >&2
        exit 1
    fi
fi

IMAGE_TAG="gap-emscripten-build:3.1.23"

# Pin the platform. emscripten/emsdk:3.1.23 is amd64-only on Docker Hub,
# so on Apple Silicon (or any non-amd64 host) the runtime would otherwise
# renegotiate the platform on every build — that mismatch invalidates the
# FROM layer's cache and cascades through the whole image, defeating the
# layer cache even with --layers.
PLATFORM="linux/amd64"

echo ">> Using container runtime: $RUNTIME (platform: $PLATFORM)"
echo ">> Building image $IMAGE_TAG (cached after first run)"

# --layers is the podman/buildah flag for "use the layer cache"; older
# podman versions default it to false. Docker has caching on by default
# and rejects the flag, so only pass it for podman.
declare -a BUILD_ARGS=(--platform "$PLATFORM")
if [[ "$RUNTIME" != "docker" ]]; then
    BUILD_ARGS+=(--layers)
fi
"$RUNTIME" build "${BUILD_ARGS[@]}" -t "$IMAGE_TAG" -f "$SCRIPT_DIR/Dockerfile" "$SCRIPT_DIR"

# Run as the host user where possible so build outputs are not root-owned.
declare -a USER_ARGS=()
if [[ "$RUNTIME" == "docker" ]]; then
    USER_ARGS=(--user "$(id -u):$(id -g)" -e HOME=/tmp)
else
    # Rootless podman maps host UID to container root by default.
    USER_ARGS=(--userns=keep-id)
fi

echo ">> Building GAP inside container"
"$RUNTIME" run --platform "$PLATFORM" --rm \
    -v "$ROOT_DIR:/gap" \
    -w /gap \
    "${USER_ARGS[@]}" \
    "$IMAGE_TAG" \
    bash etc/emscripten/build.sh

echo ">> Assembling website"
bash "$SCRIPT_DIR/assemble-website.sh"

cat <<EOF

Build complete.
Website: $ROOT_DIR/web-example/
Serve: cd web-example && ../etc/emscripten/serve.py
Browse: http://localhost:8080/

You can copy web-example/ to any static host that returns the headers
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
EOF