diff --git a/.dockerignore b/.dockerignore
new file mode 100644
index 0000000..9064e5a
--- /dev/null
+++ b/.dockerignore
@@ -0,0 +1,12 @@
+# Host build outputs — never send macOS/other host artifacts into Linux builds.
+/build/
+build/
+**/build/
+.cache/
+**/.cache/
+.e2e-work/
+examples/.e2e-work/
+
+# Editor / OS noise
+**/.DS_Store
+**/*~
diff --git a/.github/workflows/headscale-e2e.yml b/.github/workflows/headscale-e2e.yml
index 78bed2b..f308c7f 100644
--- a/.github/workflows/headscale-e2e.yml
+++ b/.github/workflows/headscale-e2e.yml
@@ -1,4 +1,4 @@
-# QuackTail e2e over Headscale — one job, Headscale service + concurrent DuckDB workers.
+# QuackTail e2e over Headscale — manual only; uses GitHub release binaries (no source build).
 name: Headscale QuackTail e2e
 
 on:
@@ -22,7 +22,7 @@ env:
 
 jobs:
   quacktail-e2e:
-    name: QuackTail e2e (Headscale + server + client)
+    name: QuackTail e2e (release binary + Headscale)
     runs-on: ubuntu-latest
     timeout-minutes: 30
     permissions:
diff --git a/.github/workflows/headscale-integration.yml b/.github/workflows/headscale-integration.yml
index e426e66..aa3c7c3 100644
--- a/.github/workflows/headscale-integration.yml
+++ b/.github/workflows/headscale-integration.yml
@@ -12,6 +12,8 @@ on:
       - 'src/**'
       - 'cmake/**'
       - 'third_party/libtailscale/**'
+      - 'examples/Dockerfile'
+      - 'scripts/e2e/**'
 
 env:
   HEADSCALE_HOST: headscale
@@ -38,7 +40,7 @@ jobs:
       - name: Install build dependencies
         run: |
           sudo apt-get update
-          sudo apt-get install -y build-essential cmake ninja-build ccache curl
+          sudo apt-get install -y build-essential cmake ninja-build patch ccache curl
 
       - name: Start Headscale
         run: |
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 8de0b79..cadc899 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -14,7 +14,7 @@ set(CMAKE_CXX_STANDARD_REQUIRED ON)
 
 include_directories(src/include)
 
-set(EXTENSION_SOURCES src/quackscale_extension.cpp src/tailscale_bridge.cpp src/tailscale_forwarder.cpp src/tailscale_log_capture.cpp)
+set(EXTENSION_SOURCES src/quackscale_extension.cpp src/attach_ducklake.cpp src/tailscale_bridge.cpp src/tailscale_forwarder.cpp src/tailscale_log_capture.cpp)
 
 if(QUACKSCALE_WITH_TAILSCALE)
     include(${CMAKE_CURRENT_SOURCE_DIR}/cmake/Libtailscale.cmake)
diff --git a/README.md b/README.md
index f65d3b8..20f5506 100644
--- a/README.md
+++ b/README.md
@@ -1,77 +1,125 @@
 # QuackScale
-DuckDB community extension that joins a [Tailscale](https://tailscale.com) tailnet and exposes the [Quack](https://duckdb.org/docs/current/quack/overview) remote protocol on tailnet addresses — so DuckDB peers can `ATTACH` and query each other over easily and securely.
+**QuackScale** embeds a [Tailscale](https://tailscale.com) / WireGuard client ([libtailscale](https://github.com/tailscale/libtailscale)) inside DuckDB so a process can join a private tailnet and reach peers over encrypted mesh networking — without a separate VPN sidecar, tunnel daemon, or public ingress.
-**QuackTail** = DuckDB + `quack` (core) + `quackscale` (this extension) on the same tailnet.
-
-QuackScale does **not** replace the core `quack` extension. Load both:
+Combined with DuckDB’s [Quack](https://duckdb.org/docs/current/quack/overview) HTTP protocol, you get **QuackTail**: SQL engines that discover each other on `100.x.x.x` / MagicDNS, authenticate callers, and run `ATTACH`, `quack_query`, and DuckLake workloads across the mesh.
 ```sql
-LOAD quack; -- HTTP server, quack_serve, ATTACH quack:...
-LOAD quackscale; -- tailscale_up, quack_uri, quack_token, ...
+LOAD quack; -- HTTP server, ATTACH, quack_query
+LOAD quackscale; -- tailnet join, dial, forward, serve — all from SQL
 ```
-## Documentation
+QuackScale does **not** replace `quack` or `ducklake`. It provides the **network layer** Quack needs on a tailnet.
-| Doc | Contents |
-|-----|----------|
-| [docs/PLAN.md](docs/PLAN.md) | Architecture, roadmap, risks |
-| [docs/AUTHENTICATION.md](docs/AUTHENTICATION.md) | **Tailscale** — auth keys, browser login, `TS_AUTHKEY` |
-| [docs/HEADSCALE.md](docs/HEADSCALE.md) | **Headscale** — self-hosted control plane (`control_url`, preauth keys) |
-| [docs/QUACK_AUTH.md](docs/QUACK_AUTH.md) | **Quack** — shared tokens, env provisioning, overriding `quack_authentication_function` |
+| Goal | Start here |
+|------|------------|
+| Design a deployment (patterns, DuckLake, demos) | [docs/GUIDE.md](docs/GUIDE.md) |
+| Tailnet login, Headscale, Quack tokens | [docs/AUTHENTICATION.md](docs/AUTHENTICATION.md) |
+| Two-node proof (Docker Compose) | [examples/README.md](examples/README.md) |
+| Build from source, CI, roadmap | [docs/DEVELOPMENT.md](docs/DEVELOPMENT.md) |
+| Full doc index | [docs/README.md](docs/README.md) |
-## Authentication (two layers)
+---
-QuackTail uses **two separate** credential systems. Both are required in production unless you deliberately relax Quack auth on a locked-down tailnet.
+## Why embedded Tailscale in DuckDB?
-| Layer | Question | Provisioned via | QuackScale / Quack |
-|-------|----------|-----------------|-------------------|
-| **Tailscale** | Is this process on our tailnet? | `TS_AUTHKEY`, `CALL tailscale_up`, or browser login | `tailscale_*` SQL — [AUTHENTICATION.md](docs/AUTHENTICATION.md) |
-| **Headscale** (optional) | Same, self-hosted control server | `control_url` + Headscale preauth key | Same SQL — [HEADSCALE.md](docs/HEADSCALE.md) |
-| **Quack** | May this caller run SQL on this server? | Shared env token, DuckDB secrets, or custom auth macro | `quack_token()`, `quack_serve(token => ...)`, `CREATE SECRET` — see [QUACK_AUTH.md](docs/QUACK_AUTH.md) |
+Traditional setups expose DuckDB/Quack on localhost or bind a public IP and add TLS, firewalls, and VPN appliances around it. QuackScale flips that model: **each DuckDB process carries its own tailnet identity** and speaks WireGuard to peers that your control plane already trusts.
-**Do not** copy the random `auth_token` column from each `CALL quack_serve` by hand. For a fleet of servers and clients, use a **network-wide shared token** (or allowlist) as described in [Quack security — Overriding authentication](https://duckdb.org/docs/current/quack/security#overriding-authentication).
+| Benefit | What it means for you |
+|---------|------------------------|
+| **WireGuard encryption** | Traffic between tailnet nodes is encrypted end-to-end ([Noise](https://tailscale.com/blog/how-tailscale-works) / WireGuard). Quack HTTP rides inside that mesh — not cleartext on the public internet. |
+| **No public listen by default** | Nodes get tailnet IPs (`100.64.0.0/10`). Quack binds loopback; `tailscale_serve_local` exposes **9494** only on the mesh. Nothing needs a world-routable address. |
+| **Identity-based access** | Tailscale or [Headscale](https://github.com/juanfont/headscale) ACLs decide **which nodes** may open TCP to a peer. Quack tokens decide **which callers** may run SQL — [defense in depth](docs/AUTHENTICATION.md). |
+| **No sidecar VPN** | libtailscale (tsnet) runs in-process. One binary, one lifecycle — ideal for containers, batch jobs, and edge nodes that should not run `tailscaled` separately. |
+| **NAT traversal** | Mesh connectivity works across NATs and regions (direct paths or DERP relays). DuckDB nodes on laptops, cloud VMs, and on-prem can mesh without manual port forwarding. |
+| **Self-hosted or SaaS control plane** | Same SQL API for [Tailscale](https://tailscale.com) and [Headscale](https://headscale.net/) — set `control_url` and a preauth key. |
+| **Manage the tailnet from SQL** | Join, status, ping, forward, serve, and teardown are **`CALL` table functions** — scriptable in migrations, init SQL, and orchestration hooks. |
-```sh
-# Same value on every QuackTail server and client (K8s secret, systemd, etc.)
-export QUACK_TAILNET_TOKEN='your-shared-secret-at-least-4-chars'
-export TS_AUTHKEY='tskey-auth-...' # Tailscale — separate secret
-```
+QuackScale handles **reachability and transport**. You still configure [Quack application auth](docs/AUTHENTICATION.md) (`QUACK_TAILNET_TOKEN`, secrets, allowlists) for who may execute SQL.
-## Prerequisites
+---
-- C++17 toolchain, `cmake`, `make` (or `ninja` + `ccache`)
-- **Go 1.25+** with CGO (for libtailscale; CMake bootstraps Go 1.25.5 automatically if the host toolchain is older)
-- DuckDB with core **`quack`** extension (e.g. v1.5.3+)
-- Git submodules: `duckdb`, `extension-ci-tools`, `third_party/libtailscale`
+## How QuackTail fits together
-```sh
-git clone --recurse-submodules https://github.com/quackscience/duckdb_tailscale.git
-cd duckdb_tailscale
-git submodule update --init --recursive # if you cloned without --recurse-submodules
+```text
+  Server (long-lived)                    Client (job / laptop)
+  ───────────────────                    ─────────────────────
+  CALL tailscale_up(...)                 CALL tailscale_up(...)
+  CALL quack_serve(127.0.0.1:9494)       CALL tailscale_quack_forward(host => …)
+  CALL tailscale_serve_local(:9494)            │
+         │                                     ▼
+         │       WireGuard mesh          quack:127.0.0.1:19494
+         └◄──────── tailscale_dial ────────────┘
+                                         ATTACH / quack_query / attach_ducklake
 ```
-## Build
+**`tailscale_quack_forward`** is required when the client uses embedded tsnet: Quack speaks normal HTTP/TCP, which kernel routing does not send over the tailnet by itself. The forwarder listens on loopback and dials peers via `tailscale_dial`.
-```sh
-make
-# faster rebuilds: GEN=ninja make
-```
+End-to-end recipes and DuckLake patterns: **[docs/GUIDE.md](docs/GUIDE.md)**.
+
+---
+
+## SQL API (`LOAD quackscale`)
+
+Use **`CALL`** for table functions (same style as `CALL quack_serve`). Parameters for `tailscale_up` / `tailscale_login`: `hostname`, `authkey` (or `TS_AUTHKEY` env), `control_url`, `state_dir`, `ephemeral`, `loopback_proxy`.
+
+### Tailnet lifecycle
+
+| Command | Purpose |
+|---------|---------|
+| [`CALL tailscale_up(...)`](docs/AUTHENTICATION.md#tailnet-login-tailscale-saas) | Join the tailnet (blocking). Server automation and CI. |
+| [`CALL tailscale_login(...)`](docs/AUTHENTICATION.md#developer-laptop) | Non-blocking join; returns `login_url` for browser auth. |
+| [`CALL tailscale_login_status()`](docs/AUTHENTICATION.md#developer-laptop) | Poll login state (`starting` / `needs_login` / `up` / `error`). |
+| [`CALL tailscale_status()`](docs/GUIDE.md#observability) | Linked?, running, hostname, tailnet IPs. |
+| [`CALL tailscale_down()`](docs/GUIDE.md#standard-client-connection-recipe) | Stop forwarder and close tsnet. **Required** for one-shot clients or the process hangs. |
+
+### Connectivity on the mesh
-Artifacts:
+| Command | Purpose |
+|---------|---------|
+| [`CALL tailscale_serve_local(port => 9494)`](docs/GUIDE.md#use-case-1--remote-duckdb-hub-pattern-a) | Tailscale Serve: tailnet TCP **→** `127.0.0.1:9494`. Run after local `quack_serve`. |
+| [`CALL tailscale_ping(host => 'peer', port => 9494)`](docs/GUIDE.md#observability) | TCP dial to a peer over tsnet — readiness before Quack `ATTACH`. |
+| [`CALL tailscale_quack_forward(host => 'peer', port => 9494)`](docs/GUIDE.md#standard-client-connection-recipe) | Listen on loopback; dial peer for each Quack HTTP connection. Returns `quack_uri`. **Preferred client path.** |
+| [`CALL tailscale_quack_proxy()`](docs/DEVELOPMENT.md) | Legacy SOCKS proxy + `ALL_PROXY` — deprecated; use `tailscale_quack_forward`. |
+| [`CALL tailscale_proxy_status()`](docs/DEVELOPMENT.md) | Legacy SOCKS status. |
-- `./build/release/duckdb` — shell with extension preloaded
-- `./build/release/extension/quackscale/quackscale.duckdb_extension` — loadable binary
+### Quack on tailnet (helpers; `LOAD quack` required for serve/attach)
-Disable Tailscale embedding (stub build, no Go):
+| Function | Purpose |
+|----------|---------|
+| `quack_uri()` | This node’s client-facing `quack::9494` (MagicDNS or tailnet IP). |
+| `quack_token()` | Shared Quack secret from `QUACK_TAILNET_TOKEN` / `QUACK_TOKEN` env. |
+| [`CALL quack_discover(port => 9494)`](docs/GUIDE.md#finding-peers) | All `quack:` URIs this node advertises on the tailnet. |
+
+Core Quack (`LOAD quack`): `quack_serve`, `quack_stop`, `ATTACH`, `quack_query`, etc.
+
+### Remote DuckLake
+
+| Command | Purpose |
+|---------|---------|
+| [`CALL attach_ducklake(uri, ...)`](docs/GUIDE.md#use-case-2--ducklake-on-the-server-patterns-b--b) | Local views over a remote DuckLake catalog when Parquet lives on the server. |
+
+---
+
+## Authentication (two layers)
+
+| Layer | Question | Details |
+|-------|----------|---------|
+| **Tailnet** | Is this machine on our mesh? | [docs/AUTHENTICATION.md — Tailnet login](docs/AUTHENTICATION.md#tailnet-login-tailscale-saas) |
+| **Quack** | May this caller run SQL? | [docs/AUTHENTICATION.md — Quack tokens](docs/AUTHENTICATION.md#quack-http-tokens) |
 ```sh
-make CMAKE_VARS="-DQUACKSCALE_WITH_TAILSCALE=OFF"
+export TS_AUTHKEY='tskey-auth-...' # or Headscale preauth key
+export QUACK_TAILNET_TOKEN='shared-quack-secret' # same on servers and clients
 ```
-## Quick start — QuackTail server
+Do **not** copy the random `auth_token` from each `CALL quack_serve`. Use a fleet-wide shared token or [Quack allowlist](https://duckdb.org/docs/current/quack/security#overriding-authentication).
+
+---
+
+## Quick start
-Set env vars **before** starting DuckDB (see [authentication](#authentication-two-layers)):
+### Server
 ```sh
 export TS_AUTHKEY='tskey-auth-...'
@@ -83,13 +131,11 @@ export QUACK_TAILNET_TOKEN='your-shared-quack-token'
 LOAD quack;
 LOAD quackscale;
 
--- 1) Join tailnet
 CALL tailscale_up(
     hostname => 'my-duckdb-node',
     state_dir => '~/.local/share/duckdb/quackscale'
 );
 
--- 2) Quack on loopback; Tailscale Serve exposes port 9494 on the tailnet
 CALL quack_serve(
     'quack:127.0.0.1:9494',
     allow_other_hostname => true,
@@ -97,97 +143,55 @@ CALL quack_serve(
 );
 CALL tailscale_serve_local(port => 9494);
 
--- 3) See what clients should connect to
-CALL quack_discover();
+FROM quack_discover();
 ```
-For **local-only** (no tailnet), the [Quack docs](https://duckdb.org/docs/current/quack/overview) use `CALL quack_serve('quack:localhost', token => ...)` and `ATTACH 'quack:localhost' AS remote (TYPE quack)` with `SCOPE 'quack:localhost'` — plain HTTP is automatic for local URIs.
+Long-lived servers: persistent `state_dir`, **no** `tailscale_down()`. Headscale: add `control_url` and preauth key — [docs/AUTHENTICATION.md](docs/AUTHENTICATION.md).
-## Quick start — QuackTail client
-
-Same `QUACK_TAILNET_TOKEN` on the client machine:
+### Client
 ```sql
+LOAD quackscale;
 LOAD quack;
+CALL tailscale_up(hostname => 'my-client', state_dir => '…', …);
+CALL tailscale_quack_forward(host => 'my-duckdb-node', port => 9494, local_port => 19494);
+
 CREATE SECRET (
     TYPE quack,
     TOKEN 'your-shared-quack-token',
-    SCOPE 'quack:my-duckdb-node:9494'
-);
-
-ATTACH 'quack:my-duckdb-node:9494' AS remote (
-    TYPE quack,
-    DISABLE_SSL true
+    SCOPE 'quack:127.0.0.1:19494'
 );
+ATTACH 'quack:127.0.0.1:19494' AS remote (TYPE quack, DISABLE_SSL true);
 FROM remote.query('SELECT 42');
-```
-
-Use the hostname from `tailscale_up(hostname => ...)` and Quack’s default port **9494**. Details and multi-token setups: [docs/QUACK_AUTH.md](docs/QUACK_AUTH.md).
-
-### Tailscale login (first-time / laptop)
-
-| Scenario | Command |
-|----------|---------|
-| Server / automation | `export TS_AUTHKEY=...` then `CALL tailscale_up(...)` |
-| Interactive browser | `CALL tailscale_login(...)` → open `login_url` → `CALL tailscale_login_status()` until `status = 'up'` |
-| Repeat visits | Reuse `state_dir` — usually no browser |
-
-See [docs/AUTHENTICATION.md](docs/AUTHENTICATION.md).
-
-### Headscale (self-hosted tailnet)
-
-[Headscale](https://github.com/juanfont/headscale) is API-compatible with Tailscale’s control server — no extra QuackScale APIs:
-```sql
-CALL tailscale_up(
-    hostname => 'my-duckdb-node',
-    control_url => 'https://headscale.example.com',
-    authkey => '',
-    state_dir => '~/.local/share/duckdb/quackscale'
-);
+DETACH remote;
+CALL tailscale_down();
 ```
-Example: [examples/headscale_quacktail.sql](examples/headscale_quacktail.sql). CI runs [`.github/workflows/headscale-integration.yml`](.github/workflows/headscale-integration.yml).
+Full client recipe (probe, DuckLake, compose markers): **[docs/GUIDE.md](docs/GUIDE.md)**.
-### Quack auth modes (pick one)
+---
-| Mode | When | How |
-|------|------|-----|
-| **Shared env token** | Default for QuackTail fleets | `QUACK_TAILNET_TOKEN` + `quack_token()` on serve; matching `CREATE SECRET` or `TOKEN` on clients |
-| **Multi-token allowlist** | Teams, rotation, multiple clients | `SET GLOBAL quack_authentication_function = '...'` + token table — [Quack docs](https://duckdb.org/docs/current/quack/security#example-multi-token-table) |
-| **Developer mode** | Lab tailnet only | Auth macro always `true` — [Quack docs](https://duckdb.org/docs/current/quack/security#example-developer-mode-always-allow) |
-
-Full walkthrough: [docs/QUACK_AUTH.md](docs/QUACK_AUTH.md).
+## Build
-## SQL reference
+**Prerequisites:** C++17, cmake, make or ninja, Go 1.25+ (CGO; CMake bootstraps Go 1.25.5 if needed), DuckDB with core **`quack`**, git submodules (`duckdb`, `extension-ci-tools`, `third_party/libtailscale`).
-Load with `LOAD quackscale;`. Use **`CALL`** for table functions (same style as `CALL quack_serve`), not `SELECT` / `FROM`.
+```sh
+git clone --recurse-submodules https://github.com/quackscience/duckdb-quackscale.git
+cd duckdb-quackscale
+GEN=ninja make release
+```
-### Tailscale (`quackscale` extension)
+- `./build/release/duckdb` — shell with extension
+- `./build/release/extension/quackscale/quackscale.duckdb_extension` — loadable binary
-| Command | Description |
-|---------|-------------|
-| `CALL tailscale_up(...)` | Join tailnet; params: `hostname`, `state_dir`, `control_url`, `ephemeral`, `authkey` / `TS_AUTHKEY` |
-| `CALL tailscale_login(...)` | Non-blocking join; returns `login_url` for browser auth |
-| `CALL tailscale_login_status()` | Poll login (`starting` / `needs_login` / `up` / `error`) |
-| `CALL tailscale_status()` | libtailscale linked?, running, hostname, tailnet IPs |
-| `CALL tailscale_quack_forward(host => 'peer', port => 9494)` | Localhost TCP → `tailscale_dial` (preferred for Quack ATTACH; no ALL_PROXY) |
-| `CALL tailscale_quack_proxy()` | Legacy SOCKS + ALL_PROXY |
-| `CALL tailscale_proxy_status()` | Legacy SOCKS status |
+Stub build without Tailscale: `make CMAKE_VARS="-DQUACKSCALE_WITH_TAILSCALE=OFF"`.
-### Quack on tailnet (helpers; requires core `quack` for `quack_serve`)
+Docker images (source build + verify): **[examples/README.md](examples/README.md)**.
-| Function | Description |
-|----------|-------------|
-| `quack_uri()` | Client-facing `quack::9494` for discovery/ATTACH |
-| `CALL tailscale_serve_local(port => 9494)` | Tailscale Serve: tailnet TCP → `127.0.0.1:9494` (run after local `quack_serve`) |
-| `CALL tailscale_ping(host => 'peer', port => 9494)` | tsnet TCP dial to peer (readiness check before Quack ATTACH) |
-| `quack_token()` | Shared Quack token from `QUACK_TAILNET_TOKEN` / `QUACK_TOKEN` env |
-| `CALL quack_discover()` | All `quack:` URIs this node advertises (`magicdns` / `tailnet_ip`) |
-
-Core Quack (`LOAD quack`): `quack_serve`, `quack_stop`, `ATTACH`, `quack_query`, etc.
+---
 ## Tests
@@ -195,20 +199,12 @@ Core Quack (`LOAD quack`): `quack_serve`, `quack_stop`, `ATTACH`, `quack_query`,
 make test
 ```
 
-SQL unit tests do not require a live tailnet or `QUACK_TAILNET_TOKEN`. See [test/README.md](test/README.md).
-
-### Integration (Headscale + QuackTail)
-
-- **Docker Compose demo:** [examples/README.md](examples/README.md) — two-node cluster with `tailscale_quack_forward` + Quack `ATTACH`
-- **CI e2e:** [`.github/workflows/headscale-e2e.yml`](.github/workflows/headscale-e2e.yml) (`workflow_dispatch`, release binary `v1.0.2` by default)
-- **Host helper:** `scripts/local_remote_headscale_test.sh` — join a running compose stack from host DuckDB
+Unit tests need no live tailnet. **E2e (manual):** [`.github/workflows/headscale-e2e.yml`](.github/workflows/headscale-e2e.yml) — release binary, `workflow_dispatch` only. **Local full demo:** [examples/README.md](examples/README.md). **PR smoke:** [headscale-integration.yml](.github/workflows/headscale-integration.yml).
 
-Details: [test/e2e/README.md](test/e2e/README.md).
-
-## Based on
-
-[duckdb/extension-template](https://github.com/duckdb/extension-template)
+---
 
 ## License
 
 MIT (extension template). libtailscale is [BSD-3-Clause](https://github.com/tailscale/libtailscale/blob/main/LICENSE).
+
+Based on [duckdb/extension-template](https://github.com/duckdb/extension-template).
diff --git a/docs/AUTHENTICATION.md b/docs/AUTHENTICATION.md
index 75bcdd1..7c55716 100644
--- a/docs/AUTHENTICATION.md
+++ b/docs/AUTHENTICATION.md
@@ -1,163 +1,225 @@
-# Tailscale authentication (QuackScale)
+# Authentication
-This document covers **only Tailscale** — getting a DuckDB process onto your tailnet.
+QuackTail uses **two independent credential layers**. Both matter in production unless you deliberately relax Quack auth on a locked-down tailnet.
-For **Quack HTTP tokens** (shared secrets between QuackTail servers and clients), see **[QUACK_AUTH.md](QUACK_AUTH.md)**. You need both layers in production.
+| Layer | Question | Configure with |
+|-------|----------|----------------|
+| **Tailnet** | Is this process on our mesh? | `TS_AUTHKEY`, Headscale preauth key, or browser login → `CALL tailscale_up` |
+| **Quack** | May this caller run SQL over HTTP? | `QUACK_TAILNET_TOKEN`, `CREATE SECRET`, or custom auth macro |
-| Doc | Topic |
-|-----|--------|
-| [QUACK_AUTH.md](QUACK_AUTH.md) | Shared `QUACK_TAILNET_TOKEN`, `quack_token()`, `CREATE SECRET`, overriding `quack_authentication_function` |
-| [PLAN.md](PLAN.md) | Architecture and roadmap |
-| [../README.md](../README.md) | Quick start and SQL reference |
-
-## How it fits QuackTail
-
-```
-  Client                                    Server
-  │                                         │
-  │  ① Tailscale (wire)                     │  CALL tailscale_up
-  │     TS_AUTHKEY / login                  │  → node on tailnet
-  │                                         │
-  │  ② Quack HTTP :9494                     │  CALL quack_serve(..., token => quack_token())
-  │     QUACK_TAILNET_TOKEN                 │  → SQL API on tailnet IP
-  └─────────────────────────────────────────┘
-```
-
-Tailscale ACLs control **who can open TCP to port 9494**. Quack tokens control **who may run SQL** once connected. See [Quack security](https://duckdb.org/docs/current/quack/security).
+Tailnet ACLs control **who can open TCP to port 9494**. Quack tokens control **who may execute SQL** once connected. See [Quack security](https://duckdb.org/docs/current/quack/security).
 ---
-QuackScale embeds [libtailscale](https://github.com/tailscale/libtailscale) (Go **tsnet**). Joining a tailnet matches other embedded Tailscale apps: **auth keys**, **environment variables**, **persisted state**, or **interactive browser login**.
+## Tailnet login (Tailscale SaaS)
-## How tsnet authenticates
+QuackScale embeds [libtailscale](https://github.com/tailscale/libtailscale) (tsnet). Joining matches other embedded Tailscale apps.
 | Mode | How | Best for |
 |------|-----|----------|
 | **Auth key** | `authkey` in `CALL tailscale_up`, or `TS_AUTHKEY` env | Servers, CI, automation |
-| **Persisted state** | `state_dir` — keys on disk after first login | Laptops, repeat use |
-| **Interactive login** | Login URL in logs; open in browser | First-time dev setup |
-| **Headscale** | `control_url` → your [Headscale](https://github.com/juanfont/headscale) URL + Headscale preauth key | Self-hosted tailnet (Tailscale-compatible) |
-| **Test control** | `control_url` → [tstestcontrol](https://github.com/tailscale/libtailscale/tree/main/tstestcontrol) | libtailscale unit tests |
+| **Persisted state** | `state_dir` on disk after first login | Laptops, repeat use |
+| **Browser login** | `CALL tailscale_login` → open `login_url` | First-time dev setup |
+
+### Production server
+
+```sh
+export TS_AUTHKEY='tskey-auth-...'
+```
-The libtailscale C API exposes `tailscale_set_authkey`, `tailscale_set_dir`, `tailscale_set_control_url`, `tailscale_set_logfd`, and `tailscale_up`. There is no C API that returns a login URL directly — tsnet prints `https://login.tailscale.com/a/…` on the **log stream** (see [libtailscale Python README](https://github.com/tailscale/libtailscale/blob/main/python/README.md)).
+```sql
+LOAD quackscale;
-Reference: [tsnet.Server · Tailscale Docs](https://tailscale.com/kb/1522/tsnet-server).
+CALL tailscale_up(
+    hostname => 'analytics-hub',
+    state_dir => '/var/lib/duckdb/tailscale'
+);
+```
-## Loopback forward (Quack HTTP over the tailnet)
+Do not commit auth keys in SQL — use env or your secret store.
-Embedded tsnet can dial peers (`tailscale_ping`), but **Quack uses normal HTTP/TCP**. Kernel sockets cannot reach tailnet IPs without help.
+### Developer laptop
-The native libtailscale path ([tsnetctest](https://github.com/tailscale/libtailscale/blob/main/tsnetctest/tsnetctest.go)) uses `tailscale_dial`. QuackScale exposes that for Quack via a **localhost TCP forwarder** — no SOCKS, no `ALL_PROXY`:
+`CALL tailscale_up()` **blocks** until login completes. For a non-blocking flow:
 ```sql
-CALL tailscale_up(hostname => 'my-client', authkey => '...', state_dir => '/var/lib/duckdb/ts');
-CALL tailscale_quack_forward(host => 'peer-hostname', port => 9494, local_port => 19494);
--- quack_uri => quack:127.0.0.1:19494
-
-CREATE SECRET (TYPE quack, TOKEN '...', SCOPE 'quack:127.0.0.1:19494');
-ATTACH 'quack:127.0.0.1:19494' AS remote (TYPE quack, DISABLE_SSL true);
+CALL tailscale_login(
+    hostname => 'my-laptop',
+    state_dir => '~/.local/share/duckdb/quackscale'
+);
+CALL tailscale_login_status(); -- poll until status = 'up'
 ```
-`tailscale_quack_forward` listens on `127.0.0.1:local_port` and dials `host:port` over tsnet for each Quack HTTP connection.
+Open `login_url` in a browser. Reuse `state_dir` on later runs.
+
+### Environment variables (tailnet)
+
+| Variable | Effect |
+|----------|--------|
+| `TS_AUTHKEY` | Auth key if not passed in `CALL tailscale_up` |
+| `TSNET_FORCE_LOGIN` | Force browser login even when an auth key is set (rare) |
-Legacy: `CALL tailscale_quack_proxy()` (SOCKS + `ALL_PROXY`) remains but is deprecated.
+---
-## Recommended patterns
+## Headscale (self-hosted control plane)
-### Production / servers — auth key
+[Headscale](https://github.com/juanfont/headscale) implements the Tailscale control server API. QuackScale uses the same parameters as `tailscale up --login-server`:
-Create a [reusable or ephemeral auth key](https://tailscale.com/kb/1085/auth-keys), then:
+| Tailscale CLI | QuackScale |
+|---------------|------------|
+| `--login-server https://hs.example.com` | `control_url => 'https://hs.example.com'` |
+| `--authkey …` | `authkey => '…'` or `TS_AUTHKEY` |
+| `--hostname` | `hostname => '…'` |
+| state directory | `state_dir => '…'` |
+
+Create Headscale preauth keys with `headscale preauthkeys create` (not the Tailscale admin UI).
 ```sh
-export TS_AUTHKEY='tskey-auth-...'
+headscale users create quackscale
+headscale preauthkeys create --user 1 --reusable --expiration 168h
 ```
 ```sql
-LOAD quackscale;
-
 CALL tailscale_up(
-    hostname => 'analytics-duck-1',
-    state_dir => '/var/lib/duckdb/tailscale'
+    hostname => 'duckdb-node-a',
+    control_url => 'https://headscale.example.com',
+    authkey => '',
+    state_dir => '/var/lib/duckdb/headscale-state'
 );
 ```
-Or pass the key in SQL: `CALL tailscale_up(authkey => 'tskey-auth-...', ...)`.
+**Compose demo:** control URL `http://headscale:8080`, preauth key written to `/work/authkey`. See [examples/README.md](../examples/README.md).
-Do not commit auth keys in SQL files — use env or your orchestrator’s secret store.
+**Notes:** Production `server_url` should be HTTPS. MagicDNS is optional; `quack_uri()` prefers MagicDNS when available, else tailnet IP.
-### Developer laptop — browser login
+---
-`CALL tailscale_up()` **blocks** until login completes. For a non-blocking flow:
+## Quack HTTP tokens
+
+After a node is on the tailnet, Quack still requires application-level auth.
+
+### Default Quack behavior (why you override it)
+
+`CALL quack_serve(...)` generates a **random** token unless you pass `token => '...'`. That is fine for local experiments; **fleets need a shared token or allowlist**.
+
+QuackScale provides `quack_token()` to read a shared secret from the environment on the **server**. Clients use the same value via `CREATE SECRET` or `TOKEN`.
+
+### Environment variables (Quack)
+
+Set on **both** servers and clients:
+
+| Variable | Role |
+|----------|------|
+| `QUACK_TAILNET_TOKEN` | **Preferred** — shared token (≥ 4 characters) |
+| `QUACK_TOKEN` | Fallback if `QUACK_TAILNET_TOKEN` is unset |
+
+Keep **`TS_AUTHKEY`** separate from Quack tokens.
+
+---
+
+## Quack auth modes
+
+### Mode 1 — Single shared token (recommended)
+
+**Server:**
 ```sql
+LOAD quack;
 LOAD quackscale;
-CALL tailscale_login(
-    hostname => 'my-laptop-duckdb',
-    state_dir => '~/.local/share/duckdb/quackscale'
+CALL tailscale_up(hostname => 'warehouse-a', state_dir => '…');
+
+CALL quack_serve(
+    'quack:127.0.0.1:9494',
+    allow_other_hostname => true,
+    token => quack_token()
 );
--- Returns status, login_url, message
+CALL tailscale_serve_local(port => 9494);
+```
-CALL tailscale_login_status(); -- poll until status = 'up'
+**Client** (after `tailscale_quack_forward` — see [GUIDE.md](GUIDE.md)):
+
+```sql
+LOAD quack;
+
+CREATE SECRET (
+    TYPE quack,
+    TOKEN 'your-shared-quack-secret',
+    SCOPE 'quack:127.0.0.1:19494'
+);
+
+ATTACH 'quack:127.0.0.1:19494' AS remote (TYPE quack, DISABLE_SSL true);
 ```
-Open `login_url` in a browser and approve the device. tsnet may also print the same URL on DuckDB stderr.
+`SCOPE` must match how the client reaches the server. With the forwarder, that is `quack:127.0.0.1:`.
+
+**Stateless queries:**
-After the first login, reuse `state_dir`; later `CALL tailscale_up()` usually needs no browser.
+```sql
+FROM quack_query(
+    'quack:127.0.0.1:19494',
+    'SELECT 42',
+    token => 'your-shared-quack-secret',
+    disable_ssl => true
+);
+```
-### Self-hosted — Headscale
+### Mode 2 — Token allowlist (rotation / teams)
-[Headscale](https://github.com/juanfont/headscale) implements the Tailscale control server API. QuackScale uses the same knobs as the Tailscale CLI:
+Use Quack’s [multi-token table](https://duckdb.org/docs/current/quack/security#example-multi-token-table):
 ```sql
-CALL tailscale_up(
-    hostname => 'my-node',
-    control_url => 'https://headscale.example.com',
-    authkey => '',
-    state_dir => '/var/lib/duckdb/headscale-state'
+CREATE TABLE quacktail_tokens (auth_token VARCHAR PRIMARY KEY, label VARCHAR);
+INSERT INTO quacktail_tokens VALUES ('primary-2026', 'analytics');
+
+CREATE MACRO quacktail_check_token(sid, client_token, server_token) AS (
+    EXISTS (SELECT 1 FROM quacktail_tokens WHERE auth_token = client_token)
 );
+SET GLOBAL quack_authentication_function = 'quacktail_check_token';
 ```
-Create keys with `headscale preauthkeys create`. Full walkthrough: **[HEADSCALE.md](HEADSCALE.md)** and [examples/headscale_quacktail.sql](../examples/headscale_quacktail.sql).
+Validate **`client_token`** (what the caller sent), not `server_token`.
-### CI / tests
+### Mode 3 — Developer mode (lab only)
-| Workflow | Control plane |
-|----------|----------------|
-| [headscale-integration.yml](../.github/workflows/headscale-integration.yml) | Docker Headscale + `CALL tailscale_up` |
-| [headscale-e2e.yml](../.github/workflows/headscale-e2e.yml) | Two-node QuackTail e2e (linux, manual dispatch) |
-| [libtailscale-integration.yml](../.github/workflows/libtailscale-integration.yml) | libtailscale `tstestcontrol` (`go test`) |
+```sql
+CREATE MACRO quacktail_dev_auth(sid, client_token, server_token) AS true;
+SET GLOBAL quack_authentication_function = 'quacktail_dev_auth';
+```
-## SQL surface (Tailscale only)
+**Not for production.** See [Quack developer mode](https://duckdb.org/docs/current/quack/security#example-developer-mode-always-allow).
-Invoke with **`CALL`**, like Quack:
+---
-| Command | Purpose |
-|---------|---------|
-| `CALL tailscale_up(...)` | Blocking join; `authkey` or `TS_AUTHKEY`; optional `state_dir`, `control_url`, `ephemeral` |
-| `CALL tailscale_login(...)` | Background join; returns `login_url` |
-| `CALL tailscale_login_status()` | Poll `status`, `login_url`, tailnet IPs |
-| `CALL tailscale_status()` | Linked?, running, hostname, IPs |
+## End-to-end checklist
-## Environment variables
+**Each long-lived server**
-| Variable | Effect |
-|----------|--------|
-| `TS_AUTHKEY` | Tailscale auth key if not passed in `CALL tailscale_up` |
-| `TSNET_FORCE_LOGIN` | Force interactive login even if an auth key is set (rare) |
+1. `export TS_AUTHKEY` (or Headscale preauth key) and `export QUACK_TAILNET_TOKEN`
+2. `LOAD quack; LOAD quackscale;`
+3. `CALL tailscale_up(...)` with persistent `state_dir`
+4. Optional: `SET GLOBAL quack_authentication_function` (Modes 2–3)
+5. `CALL quack_serve(..., token => quack_token()); CALL tailscale_serve_local(port => 9494);`
+6. Do **not** call `tailscale_down()` on steady-state servers
+
+**Each one-shot client**
-**Quack tokens are separate:** `QUACK_TAILNET_TOKEN` / `QUACK_TOKEN` — see [QUACK_AUTH.md](QUACK_AUTH.md).
+1. Same `QUACK_TAILNET_TOKEN` available for secrets / `quack_query`
+2. `LOAD quackscale; CALL tailscale_up(...); CALL tailscale_quack_forward(...);`
+3. `LOAD quack; CREATE SECRET ...;` then query / attach
+4. `DETACH remote; SELECT 'done'; CALL tailscale_down();` — required or the process hangs
+
+---
-## Security notes
+## Security
-- Treat `TS_AUTHKEY` like any infrastructure secret.
-- Tailnet [ACLs](https://tailscale.com/kb/1018/acls) should restrict who can reach peer TCP **9494** (Quack).
-- QuackScale advertises `quack:` URIs; it does not replace Quack’s application-level auth.
+- Rotate `QUACK_TAILNET_TOKEN` like an API key; update servers and clients together +- Restrict tailnet ACLs to who may reach peer TCP **9494** +- `allow_other_hostname => true` is for tailnet binds — do not expose raw Quack on the public internet without TLS in front ([Quack exposure model](https://duckdb.org/docs/current/quack/security#exposure-model)) -## Related reading +## References -- [QUACK_AUTH.md](QUACK_AUTH.md) — Quack / QuackTail application tokens -- [HEADSCALE.md](HEADSCALE.md) — self-hosted Headscale -- [libtailscale](https://github.com/tailscale/libtailscale) -- [Headscale](https://github.com/juanfont/headscale) +- [Quack security](https://duckdb.org/docs/current/quack/security) +- [Quack overview — Authentication](https://duckdb.org/docs/current/quack/overview#authentication) - [Tailscale auth keys](https://tailscale.com/kb/1085/auth-keys) +- [Headscale docs](https://headscale.net/) diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md new file mode 100644 index 0000000..82d5a47 --- /dev/null +++ b/docs/DEVELOPMENT.md @@ -0,0 +1,111 @@ +# Development + +This document is for **extension contributors** — building QuackScale, updating DuckDB, and CI. Integrators should read [GUIDE.md](GUIDE.md) and [AUTHENTICATION.md](AUTHENTICATION.md). + +## What QuackScale is + +QuackScale is a DuckDB community extension embedding [libtailscale](https://github.com/tailscale/libtailscale) so a DuckDB process can join a tailnet and reach the [Quack](https://duckdb.org/docs/current/quack/overview) HTTP protocol on tailnet addresses. + +QuackScale does **not** reimplement Quack. It provides tailnet lifecycle SQL, a localhost forwarder for Quack clients, and helpers such as `attach_ducklake`. + +```text +DuckDB + quackscale + libtailscale + → tailscale_up, tailscale_quack_forward, quack_uri, attach_ducklake +DuckDB + quack (core) + → quack_serve, ATTACH, quack_query +``` + +## Build + +Prerequisites: C++17, cmake, ninja or make, Go 1.25+ (CGO), git submodules. 
+ +```sh +git clone --recurse-submodules https://github.com/quackscience/duckdb-quackscale.git +cd duckdb-quackscale +GEN=ninja make release +``` + +Artifacts: + +- `build/release/duckdb` +- `build/release/extension/quackscale/quackscale.duckdb_extension` + +Disable libtailscale (stub build): + +```sh +make CMAKE_VARS="-DQUACKSCALE_WITH_TAILSCALE=OFF" +``` + +Docker Compose images build from source by default — see [examples/Dockerfile](../examples/Dockerfile) and `.dockerignore`. + +## Repository layout + +```text +cmake/Libtailscale.cmake Go c-archive build + Go 1.25.5 bootstrap +third_party/libtailscale/ git submodule +src/ C++ extension (bridge, forwarder, attach_ducklake) +scripts/e2e/ Compose entrypoint, bootstrap, verify-image +examples/ Docker Compose two-node demo +duckdb/ DuckDB submodule +extension-ci-tools/ Extension build makefile submodule +``` + +## libtailscale integration + +- Built with `go build -buildmode=c-archive` → `libtailscale.a` +- C API: `tailscale_up`, `tailscale_dial`, `tailscale_close`, etc. +- CMake option `QUACKSCALE_WITH_TAILSCALE` (default ON) +- Ubuntu Docker builder needs `build-essential` and `patch` for the libtailscale patch step + +## Updating DuckDB + +When bumping the DuckDB target: + +1. Update `./duckdb` submodule to the latest stable tag +2. Update `./extension-ci-tools` to the branch matching that DuckDB version (e.g. `v1.5.3`) +3. Update `duckdb_version` in [MainDistributionPipeline.yml](../.github/workflows/MainDistributionPipeline.yml) +4. 
Rebuild — the DuckDB C++ API is not stable; fix compile breaks using [release notes](https://github.com/duckdb/duckdb/releases) and core extension patches + +## CI workflows + +| Workflow | Trigger | Purpose | +|----------|---------|---------| +| [headscale-e2e.yml](../.github/workflows/headscale-e2e.yml) | **Manual only** | Release-binary two-node e2e (no source build) | +| [headscale-integration.yml](../.github/workflows/headscale-integration.yml) | PR | Source build + Headscale smoke | +| [Release.yml](../.github/workflows/Release.yml) | Release published | Build linux release tarball | +| [libtailscale-integration.yml](../.github/workflows/libtailscale-integration.yml) | PR | libtailscale `go test` | +| [MainDistributionPipeline.yml](../.github/workflows/MainDistributionPipeline.yml) | PR | Extension distribution CI | + +**E2e never runs on push/PR** and never compiles DuckDB in CI — use `workflow_dispatch` on `headscale-e2e` with a release tag. Full DuckLake compose demo is local dev only (`scripts/ci_compose_e2e.sh`). + +## Roadmap (selected) + +| Item | Status | +|------|--------| +| `tailscale_up`, `tailscale_quack_forward`, `tailscale_down` | Done | +| `attach_ducklake` (Tier 2 remote lake views) | Done | +| Headscale + Compose e2e | Done | +| `ATTACH … TYPE quacktail_lake` (Tier 3 native catalog) | Planned | +| `ducklake_discover()` enriched discovery | Planned | +| `quackscale_serve()` one-call server bootstrap | Planned | +| Community extension descriptor publish | Planned | + +## Risks + +| Risk | Mitigation | +|------|------------| +| Large binary (Go runtime) | Document size; `QUACKSCALE_WITH_TAILSCALE=OFF` stub | +| Quack API churn | Pin DuckDB; integration tests against pinned quack | +| Secrets in SQL | Env / orchestrator secrets — see [AUTHENTICATION.md](AUTHENTICATION.md) | + +## Tests + +```sh +make test +``` + +SQL unit tests do not require a live tailnet. 
E2e: [test/e2e/README.md](../test/e2e/README.md), [examples/README.md](../examples/README.md). + +## License + +MIT (extension template). libtailscale is [BSD-3-Clause](https://github.com/tailscale/libtailscale/blob/main/LICENSE). diff --git a/docs/DUCKLAKE_TAILNET.md b/docs/DUCKLAKE_TAILNET.md deleted file mode 100644 index 6a516bf..0000000 --- a/docs/DUCKLAKE_TAILNET.md +++ /dev/null @@ -1,38 +0,0 @@ -# DuckLake over QuackTail (planned) - -Goal: serve DuckLake (or SQLite / Postgres-backed catalogs) on a node via **Quack**, reachable only on the **Headscale tailnet**, with **discovery** similar to `quack_discover()`. - -## Target architecture - -```text -┌─────────────────────┐ tailnet ┌─────────────────────┐ -│ quacktail-client │ ◄──────────────► │ quacktail-server │ -│ ATTACH quack:… │ │ quack_serve │ -│ quack_discover() │ │ ATTACH ducklake:… │ -└─────────────────────┘ │ (lake catalog) │ - └─────────────────────┘ -``` - -1. **Server** joins tailnet, runs `quack_serve`, attaches local DuckLake (or other catalog) as the served DuckDB session catalog. -2. **Client** joins tailnet, `quack_discover()` finds `quack::9494`, `ATTACH`es, queries `remote..`. -3. **Discovery extension** (future QuackScale work): advertise DuckLake URIs alongside Quack URIs, e.g. columns `listen_uri`, `catalog_type` (`quack`, `ducklake`, `sqlite`), `attach_hint`. - -## Constraints (today) - -- **Quack streaming-scan limit** — one remote read or write per SQL statement on an attached catalog; see [QUACK_STREAMING.md](QUACK_STREAMING.md). DuckLake workloads often use separate statements or server-side execution, so parallelism is less blocked than multi-scan single statements on ATTACH. -- **Nested catalogs** — Quack ATTACH exposes the server's session catalogs; deep names like `remote.lake.schema.table` may need `quack_query()` until Quack nested-catalog support lands ([duckdb#22605](https://github.com/duckdb/duckdb/issues/22605)). - -## Demo recipe (next step after compose e2e) - -1. 
Server bootstrap SQL: `INSTALL ducklake; LOAD ducklake; ATTACH 'ducklake:…' AS lake …;` then `quack_serve`. -2. Client: same compose flow as today; `SELECT * FROM remote.lake.main.my_table LIMIT 5`. -3. CI: extend `scripts/ci_headscale_e2e.sh` with optional DuckLake profile (Postgres or local metadata). - -## QuackScale changes (not in core `quack`) - -| Piece | Owner | Notes | -|-------|--------|------| -| Tailnet join, `quack_uri`, `quack_discover` | quackscale | Done | -| Compose / Headscale demo | quackscale | Done | -| `ducklake_discover()` or enriched `quack_discover` | quackscale | TBD — metadata from server whoami / config | -| Quack multi-scan planner | duckdb-quack | Upstream | diff --git a/docs/GUIDE.md b/docs/GUIDE.md new file mode 100644 index 0000000..a55cff6 --- /dev/null +++ b/docs/GUIDE.md @@ -0,0 +1,382 @@ +# QuackTail integration guide + +QuackTail combines: + +1. **Tailscale or Headscale** — private mesh between nodes +2. **Quack** — DuckDB’s HTTP protocol (`quack:` URIs, port **9494**) +3. **QuackScale** — joins DuckDB to the tailnet and forwards Quack across it +4. **DuckLake** (optional) — lakehouse catalog + Parquet on a QuackTail node + +QuackScale does **not** replace Quack or DuckLake. It makes them reachable on MagicDNS / `100.x.x.x` without exposing the public internet. + +Credentials: [AUTHENTICATION.md](AUTHENTICATION.md). Build and SQL reference: [README.md](../README.md). 
+ +--- + +## Mental model + +```text +┌─────────────────────────────────────────────────────────────────┐ +│ quacktail-server (long-lived) │ +│ tailscale_up → quack_serve(127.0.0.1:9494) → tailscale_serve_local +│ optional: ATTACH ducklake:… AS lake (local or s3:// Parquet) │ +└───────────────────────────────┬─────────────────────────────────┘ + │ tailscale_dial (encrypted) +┌───────────────────────────────▼─────────────────────────────────┐ +│ quacktail-client (job, laptop, container) │ +│ tailscale_up → tailscale_quack_forward → quack:127.0.0.1:19494 │ +│ quack_query / attach_ducklake / ATTACH quack AS remote │ +│ tailscale_down() at end of one-shot sessions │ +└─────────────────────────────────────────────────────────────────┘ +``` + +**Why `tailscale_quack_forward`?** Quack uses normal HTTP/TCP. Embedded tsnet does not route kernel TCP to tailnet IPs. The forwarder listens on loopback and dials the peer via `tailscale_dial`. + +**Why `tailscale_down`?** `tailscale_up` and the forwarder start background threads. One-shot DuckDB processes **hang after SQL finishes** unless tsnet is shut down. + +--- + +## Choose a pattern + +```text +Remote DuckDB tables (CRUD, dashboards)? + └─► Pattern A: ATTACH 'quack:…' AS remote + +Lakehouse tables (Parquet, server owns all files)? + └─► Pattern B+: CALL attach_ducklake(...) then SELECT FROM lake.* + +Lakehouse (shared Parquet — S3, NFS, identical mount on each reader)? + └─► Pattern C: ATTACH 'ducklake:quack:…' AS lake (DATA_PATH 's3://…') + +Both operational tables + lake on one node? 
+ └─► Pattern D: Lake queries first, then ATTACH quack AS remote (separate statements) +``` + +| Pattern | Client SQL | Parquet location | Best for | +|---------|------------|------------------|----------| +| **A — Quack attach** | `ATTACH 'quack:…' AS remote` | Server DuckDB / memory | Shared tables, multi-writer Quack | +| **B — quack_query** | `quack_query(uri, 'SELECT … FROM lake.t')` | Server-only | Fallback when `attach_ducklake` unavailable | +| **B+ — attach_ducklake** | `CALL attach_ducklake(...)` then `SELECT … FROM lake.t` | Server-only | **Preferred** for server-owned lakes | +| **C — ducklake:quack** | `ATTACH 'ducklake:quack:…' (DATA_PATH '…')` | Shared store / mount | Many readers with object-store access | +| **D — Hybrid** | B/B+ then A in same session | Mixed | Ops tables + lake on one tailnet node | + +**Common mistake:** `SELECT * FROM remote.lake.inventory` — plain Quack attach exposes the **primary catalog only**, not nested DuckLake databases. Use B, B+, or C. + +--- + +## Standard client connection recipe + +This sequence is what the [Compose demo](../examples/README.md) proves: + +```sql +LOAD quackscale; + +CALL tailscale_up( + hostname => 'my-client', + control_url => 'http://headscale:8080', -- omit for Tailscale SaaS + authkey => '…', + state_dir => '/tmp/client-tailscale', + ephemeral => true +); + +CALL tailscale_quack_forward( + host => 'quacktail-server', + port => 9494, + local_port => 19494 +); +CALL tailscale_ping(host => 'quacktail-server', port => 9494); -- optional + +LOAD quack; +CREATE SECRET ( + TYPE quack, + TOKEN 'your-shared-token', + SCOPE 'quack:127.0.0.1:19494' +); + +FROM quack_query( + 'quack:127.0.0.1:19494', + 'SELECT 1 AS probe', + token => 'your-shared-token', + disable_ssl => true +); + +-- Pattern B+, A, C, or D statements here … + +DETACH remote; -- if Pattern A used +SELECT 'CLIENT_DEMO_DONE' AS status; -- before tailscale_down (compose watchdog) +CALL tailscale_down(); +``` + +--- + +## Use case 1 — Remote 
DuckDB hub (Pattern A) + +**Story:** A central DuckDB node serves live tables to analysts and services on the tailnet. + +### Server (long-lived) + +```sql +LOAD quack; +LOAD quackscale; + +CALL tailscale_up(hostname => 'analytics-hub', state_dir => '/var/lib/quacktail/hub', …); + +CREATE TABLE IF NOT EXISTS events (id INTEGER, payload VARCHAR, ts TIMESTAMP); + +CALL quack_serve( + 'quack:127.0.0.1:9494', + allow_other_hostname => true, + token => quack_token() +); +CALL tailscale_serve_local(port => 9494); + +FROM quack_discover(); +``` + +Run under systemd, Kubernetes, or the `quacktail-server` container. **Do not** call `tailscale_down()`. + +### Client + +```sql +LOAD quack; +LOAD quackscale; + +CALL tailscale_up(…); +CALL tailscale_quack_forward(host => 'analytics-hub', port => 9494, local_port => 19494); + +CREATE SECRET (TYPE quack, TOKEN '…', SCOPE 'quack:127.0.0.1:19494'); + +ATTACH 'quack:127.0.0.1:19494' AS remote (TYPE quack); + +SELECT * FROM remote.events ORDER BY ts DESC LIMIT 10; + +DETACH remote; +CALL tailscale_down(); +``` + +--- + +## Use case 2 — DuckLake on the server (Patterns B / B+) + +**Story:** One node holds the DuckLake catalog and Parquet; tailnet clients query without copying files. 
+ +### Server + +```sql +LOAD quack; +LOAD ducklake; +LOAD quackscale; + +CALL tailscale_up(hostname => 'lake-server', …); + +ATTACH 'ducklake:/data/lake/metadata/warehouse.ducklake' AS lake ( + DATA_PATH '/data/lake/parquet/' +); +-- Or: DATA_PATH 's3://bucket/prefix/' with httpfs secrets on the server + +CREATE TABLE IF NOT EXISTS inventory (item_id INTEGER, quantity INTEGER); +INSERT INTO inventory VALUES (101, 50); + +CALL quack_serve('quack:127.0.0.1:9494', allow_other_hostname => true, token => quack_token()); +CALL tailscale_serve_local(port => 9494); +``` + +### Client — `attach_ducklake` (preferred) + +Creates local **views** that delegate to the server via `quack_query`: + +```sql +LOAD quack; +LOAD quackscale; + +CALL tailscale_up(…); +CALL tailscale_quack_forward(host => 'lake-server', port => 9494, local_port => 19494); + +CREATE SECRET (TYPE quack, TOKEN '…', SCOPE 'quack:127.0.0.1:19494'); + +CALL attach_ducklake( + 'quack:127.0.0.1:19494', + remote_catalog => 'lake', + alias => 'lake', + token => '…', + disable_ssl => true +); + +SELECT * FROM lake.inventory ORDER BY item_id; +``` + +**How it works:** discovers tables on the server with `quack_query` → `duckdb_tables()`, then `CREATE VIEW lake.t AS FROM quack_query(..., 'SELECT * FROM lake.t', ...)`. + +**Limits:** read-only through views; no predicate pushdown; re-run after server schema changes. + +### Client — raw `quack_query` (fallback) + +```sql +FROM quack_query( + 'quack:127.0.0.1:19494', + 'SELECT * FROM lake.inventory ORDER BY item_id', + token => '…', + disable_ssl => true +); +``` + +Run lake SQL **before** `ATTACH 'quack:…' AS remote` in the same session. + +**Why not `ATTACH 'ducklake:quack:…'` here?** That pattern needs client-side `DATA_PATH` to resolve Parquet. When files live only on the server volume, reads hang or return empty. Use B/B+ instead. 
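
Since the `attach_ducklake` views are described above as having no predicate pushdown, a filter applied to `lake.inventory` on the client presumably runs only after the rows have crossed the tailnet. One hedged workaround is to put the filter inside the SQL string passed to `quack_query`, so the server evaluates it. The sketch reuses the forwarded URI, the `inventory` table, and the elided token from the examples above; the `quantity < 10` threshold and the `low_stock` view name are made up for illustration, and a live forwarder is assumed.

```sql
-- Filter applied on the client: the view fetches rows through quack_query first.
SELECT * FROM lake.inventory WHERE quantity < 10;

-- Filter applied on the server: only matching rows come back over the tailnet.
FROM quack_query(
    'quack:127.0.0.1:19494',
    'SELECT item_id, quantity FROM lake.inventory WHERE quantity < 10',
    token => '…',
    disable_ssl => true
);

-- Optional: keep the server-side filter behind a local view (hypothetical name).
CREATE OR REPLACE VIEW low_stock AS
FROM quack_query(
    'quack:127.0.0.1:19494',
    'SELECT item_id, quantity FROM lake.inventory WHERE quantity < 10',
    token => '…',
    disable_ssl => true
);
```

Like the generated views, `low_stock` is read-only and has to be recreated if the server-side schema changes.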
+ +--- + +## Use case 3 — Shared Parquet (Pattern C) + +**Story:** Catalog metadata flows over Quack; every reader reads Parquet from a **shared** path ([DuckDB 1.5.3 pattern](https://duckdb.org/2026/05/20/announcing-duckdb-153.html)). + +### Server (catalog only) + +```sql +LOAD quack; +LOAD quackscale; + +CALL tailscale_up(…); +CALL quack_serve('quack:127.0.0.1:9494', allow_other_hostname => true, token => quack_token()); +CALL tailscale_serve_local(port => 9494); +``` + +### Client + +```sql +LOAD ducklake; +LOAD quack; +LOAD quackscale; + +CALL tailscale_up(…); +CALL tailscale_quack_forward(host => 'lake-catalog', port => 9494, local_port => 19494); + +CREATE SECRET (TYPE quack, TOKEN '…', SCOPE 'quack:127.0.0.1:19494'); + +ATTACH 'ducklake:quack:127.0.0.1:19494' AS warehouse ( + DATA_PATH 's3://my-bucket/lake/parquet/' +); + +SELECT * FROM warehouse.inventory; +CALL tailscale_down(); +``` + +**Requirements:** `DATA_PATH` reachable from **each client**; configure `httpfs` / cloud secrets on clients for `s3://`. + +--- + +## Use case 4 — Hybrid hub (Pattern D) + +Same tailnet node serves operational Quack tables **and** a DuckLake catalog: + +1. Lake reads: `attach_ducklake` or `quack_query` +2. Operational tables: `ATTACH 'quack:…' AS remote` +3. **One remote Quack read/write per SQL statement** (see limitations below) +4. 
End one-shot clients with `DETACH remote; CALL tailscale_down()` + +--- + +## Finding peers + +| Method | Status | Notes | +|--------|--------|-------| +| **`tailscale_quack_forward(host => '…')`** | Use today | Returns `quack_uri` for a known hostname | +| **`FROM quack_discover()`** on server | Use today | URIs this node advertises | +| **Config / DNS** | Use today | Stable Headscale hostnames in Helm, compose | +| **`quack_query(…, 'FROM quack_discover()')`** | Avoid | Can deadlock on the server | + +**Fleet pattern:** deploy nodes with stable hostnames (`analytics-hub`, `lake-server`); clients call `tailscale_quack_forward(host => 'analytics-hub', …)`. + +**Multiple servers:** use different local ports (`19494`, `19495`) or sequential sessions with `tailscale_down()` between peers. + +**Multiple lakes on one server:** attach each with a distinct alias; clients query with fully qualified names in `quack_query` or `attach_ducklake`. + +--- + +## Production notes + +### Object storage + +| Role | Approach | +|------|----------| +| Server owns lake | `DATA_PATH 's3://…'` in server `ATTACH ducklake`; clients use Pattern B+ | +| Readers with credentials | Pattern C — each client `ATTACH 'ducklake:quack:…' (DATA_PATH 's3://…')` | + +### Lifecycle + +| Deployment | `state_dir` | `tailscale_down` | +|------------|-------------|-------------------| +| Long-lived server | Persistent volume | Never on steady state | +| Cron / CI / compose client | Ephemeral OK | Always at end | + +DuckLake metadata: file (`*.ducklake`), Postgres, or DuckDB — see [DuckLake attach](https://duckdb.org/docs/stable/duckdb/attach). 
+ +### Observability + +- Server: `CALL tailscale_status()`, `/work/server.log` in compose +- Client: `/work/client.out` in compose +- Readiness: `CALL tailscale_ping(host => 'peer', port => 9494)` before heavy queries + +--- + +## Limitations and workarounds + +| Issue | Workaround | +|-------|------------| +| `remote.lake.table` does not exist | Use `attach_ducklake`, `quack_query`, or `ducklake:quack:` | +| Client hangs after SQL completes | Emit done marker, then `CALL tailscale_down()` | +| Kernel TCP to `100.x:9494` fails from tsnet client | Use `tailscale_quack_forward` | +| `quack_query` + `ATTACH remote` stalls | Run lake queries **before** attach; separate statements | +| `quack_query(…, quack_discover())` hangs | Discover locally or use known hostname | + +### Quack “multiple streaming scans” + +This is a **core `quack` extension** limit, not QuackScale. A single SQL statement cannot perform more than one streaming read (or read + write) on the same attached Quack catalog. + +Fails: + +```sql +INSERT INTO remote.t SELECT 1 WHERE NOT EXISTS (SELECT 1 FROM remote.t); +``` + +Works (separate statements): + +```sql +INSERT INTO remote.t VALUES (1, 'x') ON CONFLICT DO NOTHING; +SELECT * FROM remote.t; +``` + +Upstream: [duckdb/duckdb#22605](https://github.com/duckdb/duckdb/issues/22605). Split statements or use `quack_query` for one-off remote SQL. 
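
When the conditional logic of the failing statement is still needed, one possible approach is to stage the remote read in a local temporary table, so that each statement touches the attached catalog only once. This is a sketch under assumptions rather than a tested recipe: it assumes `remote.t` has columns `(id INTEGER, name VARCHAR)`, and the `t_ids` staging table is a hypothetical name.

```sql
-- Statement 1: a single streaming read from the attached Quack catalog,
-- materialized into a local temporary table.
CREATE OR REPLACE TEMP TABLE t_ids AS
SELECT id FROM remote.t;

-- Statement 2: a single remote write; the NOT EXISTS check reads only the local copy.
INSERT INTO remote.t
SELECT 1, 'x'
WHERE NOT EXISTS (SELECT 1 FROM t_ids WHERE id = 1);

-- Clean up the staging table.
DROP TABLE t_ids;
```

The check and the insert are no longer atomic: another writer could add the same row between the two statements, so this fits single-writer jobs better than concurrent ones.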
+ +--- + +## Runnable demos + +| Demo | Command | +|------|---------| +| **Two-node cluster + DuckLake** | [examples/README.md](../examples/README.md) | +| **DuckLake compose details** | [examples/ducklake/README.md](../examples/ducklake/README.md) | +| **Host DuckDB → compose stack** | `scripts/local_remote_headscale_test.sh` | +| **Network-only probe** | `docker compose --profile debug run --rm tailscale-probe` | + +```bash +git submodule update --init --recursive +cd examples +docker compose build quacktail-server quacktail-client +docker compose run --rm --entrypoint /usr/local/bin/quacktail-verify-image.sh quacktail-client +docker compose up -d --force-recreate headscale quacktail-server +docker compose --profile test run --rm quacktail-client +``` + +Expect `LAKE_PASSED`, `PASSED`, and `✓ Demo passed`. + +--- + +## Further reading + +| Resource | Topic | +|----------|--------| +| [AUTHENTICATION.md](AUTHENTICATION.md) | Tailnet + Quack credentials | +| [DEVELOPMENT.md](DEVELOPMENT.md) | Extension architecture and roadmap | +| [Quack overview](https://duckdb.org/docs/current/quack/overview) | Upstream Quack protocol | +| [DuckLake docs](https://duckdb.org/docs/stable/duckdb/ducklake) | Catalog, Parquet, attach | diff --git a/docs/HEADSCALE.md b/docs/HEADSCALE.md deleted file mode 100644 index 95a94f1..0000000 --- a/docs/HEADSCALE.md +++ /dev/null @@ -1,90 +0,0 @@ -# Headscale (self-hosted control plane) - -[Headscale](https://github.com/juanfont/headscale) is an open-source, self-hosted implementation of the **Tailscale control server**. Tailscale clients (and embedded **tsnet** / [libtailscale](https://github.com/tailscale/libtailscale)) treat it like Tailscale SaaS: you point at a custom **login server** URL and register with a **preauth key** or browser flow. - -QuackScale adds **no Headscale-specific code**. 
The integration curve is effectively **zero**: use `control_url` on `CALL tailscale_up` / `CALL tailscale_login` the same way you would pass `tailscale up --login-server`. - -| Topic | Doc | -|-------|-----| -| Tailscale SaaS auth keys, browser login | [AUTHENTICATION.md](AUTHENTICATION.md) | -| Quack HTTP tokens on the tailnet | [QUACK_AUTH.md](QUACK_AUTH.md) | -| Example SQL | [../examples/headscale_quacktail.sql](../examples/headscale_quacktail.sql) | - -## Mapping: Tailscale CLI → QuackScale - -| Tailscale CLI | QuackScale | -|---------------|------------| -| `tailscale up --login-server https://hs.example.com` | `control_url => 'https://hs.example.com'` | -| `--authkey tskey-...` or Headscale preauth key | `authkey => '...'` or `TS_AUTHKEY` | -| `--hostname` | `hostname => '...'` | -| state under `~/.local/share/tailscale` | `state_dir => '...'` | - -Headscale preauth keys are created with `headscale preauthkeys create` (not the Tailscale admin UI). See [Headscale — Getting started](https://headscale.net/stable/usage/getting-started/). - -## Minimal server setup - -1. Install and configure Headscale ([releases](https://github.com/juanfont/headscale/releases), [docs](https://headscale.net/)). -2. Set `server_url` in `config.yaml` to the URL **clients** use (e.g. `https://headscale.my.net`). -3. Create a user and reusable key: - -```sh -headscale users create quackscale -headscale preauthkeys create --user 1 --reusable --expiration 168h -``` - -4. 
On each DuckDB host: - -```sh -export HEADSCALE_URL='https://headscale.my.net' -export HEADSCALE_PREAUTH_KEY='' -export QUACK_TAILNET_TOKEN='' -./build/release/duckdb -``` - -```sql -LOAD quack; -LOAD quackscale; - -CALL tailscale_up( - hostname => 'duckdb-node-a', - control_url => getenv('HEADSCALE_URL'), - authkey => getenv('HEADSCALE_PREAUTH_KEY'), - state_dir => '/var/lib/duckdb/headscale-state' -); - -CALL quack_serve(quack_uri(), allow_other_hostname => true, token => quack_token()); -``` - -## Docker (lab / CI) - -Uses the [official Headscale container](https://headscale.net/stable/setup/install/container/) (`docker.io/headscale/headscale:0.28.0`) — no custom images. Started after checkout on Docker network `quacktail-ci` with hostname alias **`headscale`**. Control URL: **`http://headscale:8080`**. - -```sh -export HEADSCALE_CI_ROOT=$PWD -source scripts/lib/headscale_ci.sh -headscale_ci_start /tmp/headscale-data -./scripts/ci_headscale_smoke.sh # after make release -headscale_ci_stop -``` - -## CI in this repository - -| Workflow | What it validates | -|----------|-------------------| -| [libtailscale-integration.yml](../.github/workflows/libtailscale-integration.yml) | libtailscale `go test` (tstestcontrol) | -| [headscale-integration.yml](../.github/workflows/headscale-integration.yml) | Headscale + `CALL tailscale_up` smoke | -| [headscale-e2e.yml](../.github/workflows/headscale-e2e.yml) | Two-node QuackTail e2e (manual; uses [release](../.github/workflows/Release.yml) binary) | -| [Release.yml](../.github/workflows/Release.yml) | Build linux `duckdb` + quackscale on **Release published** | - -## Notes - -- **DERP / NAT**: Headscale can use public Tailscale DERP relays (`derp.urls` in config) or your own; mesh connectivity depends on your network, not QuackScale. -- **TLS**: Production `server_url` should be `https://…`; lab CI uses plain `http://127.0.0.1:8080`. 
-- **MagicDNS**: Optional; `quack_uri()` prefers MagicDNS when Headscale provides it, else tailnet IP. -- Headscale is **not** affiliated with Tailscale Inc.; QuackScale links both projects as compatible stacks for QuackTail. - -## Related - -- [Headscale repository](https://github.com/juanfont/headscale) -- [Tailscale tsnet](https://tailscale.com/kb/1522/tsnet-server) -- [AUTHENTICATION.md](AUTHENTICATION.md) diff --git a/docs/PLAN.md b/docs/PLAN.md deleted file mode 100644 index 95fbaaf..0000000 --- a/docs/PLAN.md +++ /dev/null @@ -1,150 +0,0 @@ -# QuackScale — Research & Implementation Plan - -QuackScale is a DuckDB **community extension** that embeds [libtailscale](https://github.com/tailscale/libtailscale) so a DuckDB process can join a tailnet and expose (or reach) the [Quack](https://duckdb.org/docs/current/quack/overview) remote protocol on tailnet addresses instead of only localhost. - -## Problem - -[Quack](https://duckdb.org/docs/current/quack/overview) turns DuckDB into an HTTP server (`quack_serve`) so other DuckDB clients can `ATTACH` or run `quack_query` remotely. By default, `quack_serve` only binds **localhost** unless `allow_other_hostname => true`, and production setups typically use a TLS reverse proxy. - -For private teams, binding Quack on a **Tailscale IP** gives: - -- Encrypted tailnet transport without exposing the service on the public internet -- Stable reachability via MagicDNS / tailnet IPs -- No custom wire protocol — Quack stays HTTP + `application/duckdb` - -QuackScale does **not** reimplement Quack. It brings the process onto the tailnet; Quack remains the core `quack` extension. 
- -## Architecture - -```mermaid -flowchart LR - subgraph process [DuckDB + QuackScale] - QS[tailscale_up / status] - LT[libtailscale / tsnet] - Q[quack extension] - QS --> LT - Q --> HTTP[HTTP :9494] - LT --> TSIP[tailnet IP] - HTTP --> TSIP - end - Client[DuckDB client] -->|quack:100.x.x.x:9494| TSIP -``` - -### Component roles - -| Component | Role | -|-----------|------| -| **libtailscale** | Userspace Tailscale (tsnet): auth, tailnet IP, listen/dial on tailnet | -| **QuackScale** | C++ extension: lifecycle SQL API, tailnet IP → `quack:` URI helpers | -| **Quack (core)** | HTTP server, serialization, attach/catalog — unchanged | - -### Target user flow (phase 1 — manual compose) - -```sql -LOAD quack; -LOAD quackscale; - --- Join tailnet (authkey via env/secret in production) -CALL tailscale_up(authkey => 'tskey-auth-...', hostname => 'analytics-duck-1'); - --- Advertise Quack on the tailnet (default port 9494) -CALL quack_discover(); - --- Expose Quack on tailnet with shared token (env: QUACK_TAILNET_TOKEN) -CALL quack_serve( - quack_uri(), - allow_other_hostname => true, - token => quack_token() -); - --- Remote client: same token via CREATE SECRET or TOKEN (see QUACK_AUTH.md) -ATTACH 'quack:analytics-duck-1:9494' AS remote (TYPE quack, DISABLE_SSL true); -``` - -Phase 2 may add `quackscale_serve()` that chains up + `quack_serve` in one call (needs stable inter-extension hooks or documented SQL orchestration). - -## Authentication (two layers) - -| Layer | Doc | -|-------|-----| -| **Tailscale** (node on tailnet) | [AUTHENTICATION.md](AUTHENTICATION.md) — `TS_AUTHKEY`, `CALL tailscale_up`, browser login | -| **Quack** (SQL over HTTP) | [QUACK_AUTH.md](QUACK_AUTH.md) — `QUACK_TAILNET_TOKEN`, `quack_token()`, `CREATE SECRET`, `quack_authentication_function` | - -QuackTail fleets should use a **shared Quack token** (or allowlist), not per-server random `auth_token` values from `quack_serve`. 
See [Quack — Overriding authentication](https://duckdb.org/docs/current/quack/security#overriding-authentication). - -## libtailscale integration notes - -- Built with `go build -buildmode=c-archive` → `libtailscale.a` + generated header (see [libtailscale README](https://github.com/tailscale/libtailscale)). -- C API: `tailscale_new`, `tailscale_up`, `tailscale_getips`, `tailscale_listen` / `tailscale_dial`, etc. ([`tailscale.h`](https://github.com/tailscale/libtailscale/blob/main/tailscale.h)). -- **Build requirement**: Go toolchain + CGO; CMake option `QUACKSCALE_WITH_TAILSCALE` (default ON). -- **CI implication**: extension distribution jobs must install Go; first bootstrapped CI may only run tests that do not call `tailscale_up` without credentials. - -### Risks - -| Risk | Mitigation | -|------|------------| -| Large binary (Go runtime) | Document size; optional `QUACKSCALE_WITH_TAILSCALE=OFF` stub build | -| macOS min OS (libtailscale Makefile targets 15.0 for some Swift paths) | CMake sets `MACOSX_DEPLOYMENT_TARGET=11.0` for archive build; validate in CI | -| Quack API churn (beta) | Pin DuckDB version; integration tests against pinned `quack` | -| Auth secrets in SQL | `TS_AUTHKEY` + `QUACK_TAILNET_TOKEN` via env/secrets; see [QUACK_AUTH.md](QUACK_AUTH.md) | - -## Quack protocol recap (relevant bits) - -From the [Quack overview](https://duckdb.org/docs/current/quack/overview): - -- HTTP(S), default port **9494**, URI scheme `quack:host:port` -- Server: `CALL quack_serve('quack:...', allow_other_hostname => true, token => '...')` -- Client: `ATTACH 'quack:host' AS db (TOKEN '...', DISABLE_SSL true)` for tailnet HTTP without public TLS -- Token auth via secrets or explicit `TOKEN` - -QuackScale’s job is **advertising** reachable **`quack::9494`** endpoints on the tailnet after `tailscale_up` — MagicDNS hostname first (for discovery), plus each tailnet IP. 
- -## SQL API (bootstrapped) - -| Function | Purpose | -|----------|---------| -| `CALL tailscale_status()` | Whether libtailscale is linked, running, hostname, tailnet IPs | -| `CALL tailscale_up(...)` | Join tailnet; named params: `hostname`, `authkey`, `control_url`, `state_dir`, `ephemeral` | -| `CALL quack_discover(port => 9494)` | All tailnet `quack:` URIs clients can use (default port **9494**) | -| `quack_uri()` | Scalar for `CALL quack_serve(quack_uri(), ...)` (hostname if set, else first IP; port **9494**) | -| `quack_token()` | Scalar — shared token from `QUACK_TAILNET_TOKEN` / `QUACK_TOKEN` env | - -Planned: - -- `tailscale_down()` — `tailscale_close` -- `quack_serve_on_tailnet(port, ...)` — orchestrate Quack when `quack` is loaded -- Settings: default port, auto-load `quack`, state directory -- [x] Headscale CI smoke (`scripts/ci_headscale_smoke.sh`, `.github/workflows/headscale-integration.yml`) -- [x] Two-node QuackTail e2e over Headscale ([`scripts/ci_headscale_e2e.sh`](../scripts/ci_headscale_e2e.sh), [`.github/workflows/headscale-e2e.yml`](../.github/workflows/headscale-e2e.yml)) - -## Repository layout - -``` -duckdb_tailscale/ -├── cmake/Libtailscale.cmake # Go c-archive build -├── third_party/libtailscale/ # git submodule -├── src/ -│ ├── quackscale_extension.cpp -│ └── tailscale_bridge.cpp -├── docs/PLAN.md -├── test/sql/quackscale.test -└── duckdb/ + extension-ci-tools/ # submodules -``` - -## Community extension checklist - -- [x] Fork/bootstrap [extension-template](https://github.com/duckdb/extension-template) -- [x] Rename to `quackscale` -- [x] libtailscale submodule + CMake -- [ ] Green `make` / `make test` locally -- [ ] Add Go to GitHub Actions (custom step or workflow env) -- [ ] PR to [community-extensions](https://github.com/duckdb/community-extensions) descriptor -- [ ] README: install `INSTALL quackscale FROM community` + Quack dependency docs -- [ ] Security section: tailnet ACLs, tokens, no Funnel unless explicit - -## Phased 
delivery - -1. **Bootstrap (current)** — template, libtailscale link, status/up/uri SQL, plan doc -2. **Quack glue** — docs + example script; optional `quackscale_serve` wrapper -3. **CI hardening** — Go in matrix, optional e2e with test auth key -4. **Community release** — descriptor, versioning aligned with DuckDB 1.5.x + Quack beta diff --git a/docs/QUACK_AUTH.md b/docs/QUACK_AUTH.md deleted file mode 100644 index fd4cec4..0000000 --- a/docs/QUACK_AUTH.md +++ /dev/null @@ -1,258 +0,0 @@ -# Quack authentication on a tailnet (QuackTail) - -This document covers **only Quack** — HTTP application tokens after your node is already on the tailnet. - -For **Tailscale node login** (`TS_AUTHKEY`, browser URL, `state_dir`), see **[AUTHENTICATION.md](AUTHENTICATION.md)**. - -| Doc | Topic | -|-----|--------| -| [AUTHENTICATION.md](AUTHENTICATION.md) | Tailscale — `tailscale_up`, `TS_AUTHKEY`, browser login | -| [PLAN.md](PLAN.md) | Architecture and roadmap | -| [../README.md](../README.md) | End-to-end quick start | - ---- - -## Goal: semi-automatic QuackTail peers - -You want DuckDB servers and clients on the **same tailnet** to: - -1. Find each other via `quack:hostname:9494` (QuackScale + MagicDNS). -2. Authenticate to Quack **without** copying a new random `auth_token` from every `CALL quack_serve`. - -That is supported today using Quack’s built-in **`token =>`** parameter and **[Overriding authentication](https://duckdb.org/docs/current/quack/security#overriding-authentication)** — no QuackScale changes to Quack’s wire protocol. 
- -## Two layers (do not merge them) - -| Layer | Proves | You configure | -|-------|--------|----------------| -| **Tailscale** | Machine is on the tailnet | `TS_AUTHKEY`, `CALL tailscale_up`, ACLs — [AUTHENTICATION.md](AUTHENTICATION.md) | -| **Quack** | Caller may use this DuckDB session over HTTP | `QUACK_TAILNET_TOKEN`, `CREATE SECRET`, or custom `quack_authentication_function` | - -A host on your tailnet is **not** automatically trusted for SQL. Tailscale is necessary but not sufficient. - -## What Quack does by default (and why you override it) - -From [Quack security — Default configuration](https://duckdb.org/docs/current/quack/security#default-configuration): - -1. `CALL quack_serve(...)` **generates a random token** and returns it in the `auth_token` column — unless you pass `token => '...'`. -2. The default hook `quack_check_token` requires **client token == server token** for that listener. - -That default is fine for a single local experiment. For a **fleet** of QuackTail nodes, you want: - -- The **same** token on every server and client (env / secret manager), **or** -- A **shared allowlist** of valid tokens (SQL table + custom macro). - -QuackScale’s `quack_token()` only helps read a shared token from the environment on the **server** side. Clients still use `CREATE SECRET` or `TOKEN` with the same value. - -## Environment variables (Quack layer) - -Set on **both** servers and clients (container env, systemd, K8s `Secret`, etc.): - -| Variable | Role | -|----------|------| -| `QUACK_TAILNET_TOKEN` | **Preferred** — shared Quack auth token (≥ 4 characters) | -| `QUACK_TOKEN` | Fallback alias if `QUACK_TAILNET_TOKEN` is unset | - -Keep **`TS_AUTHKEY`** separate — it is Tailscale-only ([AUTHENTICATION.md](AUTHENTICATION.md)). - -Example provisioning: - -```sh -export TS_AUTHKEY='tskey-auth-...' 
-export QUACK_TAILNET_TOKEN='your-shared-quack-secret' -duckdb -``` - ---- - -## Mode 1 — Single shared token (recommended for QuackTail) - -One secret, same everywhere. Matches default `quack_check_token` when server and client use the same string. - -### Server - -```sql -LOAD quack; -LOAD quackscale; - -CALL tailscale_up( - hostname => 'warehouse-a', - state_dir => '/var/lib/duckdb/tailscale' -); - -CALL quack_serve( - quack_uri(), - allow_other_hostname => true, - token => quack_token() -- reads QUACK_TAILNET_TOKEN / QUACK_TOKEN -); - -CALL quack_discover(); -``` - -`quack_token()` fails fast if the env var is missing or shorter than 4 characters (Quack’s minimum). - -Without the helper, pass the literal explicitly: - -```sql -CALL quack_serve(quack_uri(), allow_other_hostname => true, token => 'same-secret-as-clients'); -``` - -### Client — semi-automatic with `CREATE SECRET` - -Create the secret once per client process (inject the token from your shell/env when launching DuckDB — do not hardcode in shared SQL files): - -```sql -LOAD quack; - -CREATE SECRET ( - TYPE quack, - TOKEN 'your-shared-quack-secret', - SCOPE 'quack:warehouse-a:9494' -); - -ATTACH 'quack:warehouse-a:9494' AS warehouse ( - TYPE quack, - DISABLE_SSL true -); - -FROM warehouse.query('SELECT 42'); -``` - -- **`SCOPE`** must match how clients reach the server (`quack::9494`). Use the hostname from `CALL tailscale_up(hostname => 'warehouse-a')`. -- After the secret exists, `ATTACH` needs no `TOKEN` clause — Quack picks it up automatically ([overview — Authentication](https://duckdb.org/docs/current/quack/overview#authentication)). 
- -### Client — explicit token per attach - -```sql -ATTACH 'quack:warehouse-a:9494' AS warehouse ( - TYPE quack, - TOKEN 'your-shared-quack-secret', - DISABLE_SSL true -); -``` - -### Client — stateless `quack_query` - -```sql -FROM quack_query( - 'quack:warehouse-a:9494', - 'SELECT 42', - token => 'your-shared-quack-secret', - disable_ssl => true -); -``` - ---- - -## Mode 2 — Shared allowlist (multiple tokens / rotation) - -When several tokens should work across the fleet (teams, rotation, read-only clients), use Quack’s **[multi-token table](https://duckdb.org/docs/current/quack/security#example-multi-token-table)** pattern. - -Run once per DuckDB database (or in your bootstrap migration): - -```sql -CREATE TABLE quacktail_tokens ( - auth_token VARCHAR PRIMARY KEY, - label VARCHAR -); - -INSERT INTO quacktail_tokens VALUES - ('primary-team-token-2026', 'analytics'), - ('readonly-team-token-2026', 'readonly'); - -CREATE MACRO quacktail_check_token(sid, client_token, server_token) AS ( - EXISTS (SELECT 1 FROM quacktail_tokens WHERE auth_token = client_token) -); - -SET GLOBAL quack_authentication_function = 'quacktail_check_token'; -``` - -Important behavior ([Overriding authentication](https://duckdb.org/docs/current/quack/security#overriding-authentication)): - -| Argument | Meaning | -|----------|---------| -| `client_token` | What the client sent (`TOKEN`, secret, or `quack_query`) — **validate this** | -| `server_token` | From `quack_serve(token => ...)` — you may **ignore** it when using a table | - -Any token listed in `quacktail_tokens` is accepted on **every** server that uses this macro (unless you add per-server logic). - -Start the server with a token that satisfies Quack’s length check (still pass `token =>`): - -```sql -CALL quack_serve(quack_uri(), allow_other_hostname => true, token => quack_token()); -``` - -Clients use **their** token from the allowlist (via `CREATE SECRET` or `TOKEN`). 
- -Populate `quacktail_tokens` from your deployment tool at startup (INSERT from env). The auth callback runs in a **fresh transient connection** — it cannot read your shell env by itself. - ---- - -## Mode 3 — Developer mode (tailnet is the only gate) - -Only on isolated lab tailnets. Admits **all** Quack clients with no token check: - -```sql -CREATE MACRO quacktail_dev_auth(sid, client_token, server_token) AS true; -SET GLOBAL quack_authentication_function = 'quacktail_dev_auth'; -``` - -See [developer mode](https://duckdb.org/docs/current/quack/security#example-developer-mode-always-allow). **Not for production.** - ---- - -## End-to-end checklist - -**On each server** - -1. `export TS_AUTHKEY` and `export QUACK_TAILNET_TOKEN` -2. `LOAD quack; LOAD quackscale;` -3. `CALL tailscale_up(hostname => '...', state_dir => '...');` -4. (Optional) `SET GLOBAL quack_authentication_function` if using Mode 2 or 3 -5. `CALL quack_serve(quack_uri(), allow_other_hostname => true, token => quack_token());` - -**On each client** - -1. Same `QUACK_TAILNET_TOKEN` available when creating secrets or attaching -2. `LOAD quack;` -3. `CREATE SECRET (TYPE quack, TOKEN '...', SCOPE 'quack::9494');` -4. `ATTACH 'quack::9494' AS ... 
(TYPE quack, DISABLE_SSL true);` - -**Network** - -- Tailscale ACL: allow tagged nodes → TCP 9494 on peers -- Quack default port: **9494** ([overview](https://duckdb.org/docs/current/quack/overview)) - ---- - -## Comparison: default vs QuackTail - -| | Default Quack | QuackTail (Mode 1) | -|---|---------------|---------------------| -| Server token | Random per `quack_serve` | Fixed via `QUACK_TAILNET_TOKEN` / `quack_token()` | -| Client setup | Copy `auth_token` from server output | Same env secret or `CREATE SECRET` | -| Discovery | Manual URI | `CALL quack_discover()`, `quack_uri()` | -| Transport | Often localhost | Tailscale tailnet + `allow_other_hostname => true` | - ---- - -## What QuackScale does not do (yet) - -- Does not call `quack_serve` or install auth macros automatically — compose SQL after `CALL tailscale_up` -- Does not sync tokens over Tailscale — use env, K8s, Vault, etc. -- Planned: `quacktail_serve()` helper chaining tailnet up + shared token + `quack_serve` - ---- - -## Security notes - -- Rotate `QUACK_TAILNET_TOKEN` like an API key; update servers, client secrets, and `quacktail_tokens` together -- Use [Tailscale ACLs](https://tailscale.com/kb/1018/acls) to limit who reaches port 9494 -- `allow_other_hostname => true` is for tailnet binds — do not expose raw Quack to the public internet without a TLS reverse proxy ([Quack security — Exposure model](https://duckdb.org/docs/current/quack/security#exposure-model)) - -## References - -- [Quack security](https://duckdb.org/docs/current/quack/security) — overriding authentication & authorization -- [Quack overview — Authentication](https://duckdb.org/docs/current/quack/overview#authentication) -- [DuckDB secrets manager](https://duckdb.org/docs/current/configuration/secrets_manager) -- [Tailscale authentication (QuackScale)](AUTHENTICATION.md) diff --git a/docs/QUACK_STREAMING.md b/docs/QUACK_STREAMING.md deleted file mode 100644 index a5c988a..0000000 --- a/docs/QUACK_STREAMING.md +++ /dev/null 
@@ -1,65 +0,0 @@ -# Quack “Multiple streaming scans” limitation - -This is **not** a QuackScale (`quackscale`) limitation. It comes from the **core `quack` extension** shipped with DuckDB. - -## Source code - -[`duckdb-quack/src/storage/quack_optimizer.cpp`](https://github.com/duckdb/duckdb-quack/blob/main/src/storage/quack_optimizer.cpp) - -Before executing a query, `QuackOptimizer` walks the plan and counts, per Quack connection: - -- **streaming scans** — reads from attached Quack tables (`LogicalGet` on Quack scans) -- **writes** — `INSERT` / `CREATE TABLE AS` targeting a Quack catalog - -If `scans + inserts > 1` **within the same query**, it throws: - -```text -Not implemented Error: Multiple streaming scans or streaming scans + CTAS / insert in the same query are not currently supported -``` - -## What triggers it - -Any **single SQL statement** that both: - -1. reads from an attached Quack catalog (`remote.table`, `FROM remote…`, subqueries), and -2. writes to the same attached Quack catalog (`INSERT INTO remote…`, CTAS into `remote`) - -Examples that fail: - -```sql --- INSERT + correlated read in one statement -INSERT INTO remote.t -SELECT 1, 'x' -WHERE NOT EXISTS (SELECT 1 FROM remote.t WHERE id = 1); - --- Multiple remote reads in one statement (e.g. SHOW TABLES on nested Quack catalogs) -SHOW TABLES; -``` - -Examples that work (separate statements, one remote op each): - -```sql -ATTACH 'quack:host:9494' AS remote (TYPE quack, DISABLE_SSL true); - -INSERT INTO remote.t VALUES (1, 'x') -ON CONFLICT DO NOTHING; - -SELECT * FROM remote.t; -``` - -`CALL tailscale_up()` is a **local** QuackScale table function — it is **not** a Quack streaming scan and is not the cause of this error. - -## Upstream status - -- Reported in [duckdb/duckdb#22605](https://github.com/duckdb/duckdb/issues/22605) (remote catalog / `SHOW TABLES`). 
-- A community PR to lift the restriction ([duckdb/duckdb-quack#126](https://github.com/duckdb/duckdb-quack/pull/126)) was **not merged** as of May 2026 — maintainers want smaller, incremental changes. - -QuackScale cannot patch this inside `quackscale`; fixes belong in **`duckdb-quack`** (or query shape/workarounds in client SQL). - -## Demo / DuckLake guidance - -For attached remote writes: - -- Prefer plain `INSERT INTO remote.t VALUES (…)` or `ON CONFLICT DO NOTHING` for idempotency. -- Avoid `INSERT … SELECT … WHERE NOT EXISTS (SELECT … FROM remote.t)` in one statement. -- Split read and write into **separate SQL statements**, or use `quack_query(uri, '…')` for one-off remote SQL when ATTACH + DML in one plan is awkward. diff --git a/docs/README.md b/docs/README.md index c4be666..381aa4d 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,20 +1,36 @@ -# QuackScale documentation +# QuackTail documentation + +QuackTail is **DuckDB + Quack + QuackScale** on a private tailnet (Tailscale or Headscale). These docs are for **integrators** — operators wiring servers, clients, tokens, and lake catalogs — not for extension C++ development. ## Start here -1. **[../README.md](../README.md)** — build, quick start, SQL reference -2. **[AUTHENTICATION.md](AUTHENTICATION.md)** — Tailscale (`TS_AUTHKEY`, `tailscale_up`, browser login) -3. **[HEADSCALE.md](HEADSCALE.md)** — self-hosted [Headscale](https://github.com/juanfont/headscale) (`control_url`, preauth keys) -4. **[QUACK_AUTH.md](QUACK_AUTH.md)** — Quack tokens for QuackTail (`QUACK_TAILNET_TOKEN`, shared secrets, auth macros) -5. **[PLAN.md](PLAN.md)** — architecture, API roadmap, risks -6. 
**[../examples/README.md](../examples/README.md)** — Docker Compose two-node Headscale demo +| Document | Read when you need to… | +|----------|-------------------------| +| **[GUIDE.md](GUIDE.md)** | Pick a pattern, run use cases, connect clients, query DuckLake, avoid known pitfalls | +| **[AUTHENTICATION.md](AUTHENTICATION.md)** | Configure tailnet login, Headscale, and Quack HTTP tokens | +| **[../examples/README.md](../examples/README.md)** | Run the two-node Docker Compose demo | +| **[../README.md](../README.md)** | Build the extension from source or look up the SQL command reference | + +## Extension developers + +| Document | Contents | +|----------|----------| +| **[DEVELOPMENT.md](DEVELOPMENT.md)** | Architecture, roadmap, updating DuckDB submodules, CI | + +## Quick orientation + +```text +Tailscale / Headscale → Is this machine on our mesh? +Quack token → May this caller run SQL over HTTP? +tailscale_quack_forward → Route Quack from embedded tsnet to 127.0.0.1 +quack_serve + serve_local → Expose DuckDB on the tailnet (:9494) +``` -## QuackTail authentication at a glance +Load both extensions in every session: -| Step | Layer | Action | -|------|--------|--------| -| 1 | Tailscale | `export TS_AUTHKEY=...` → `CALL tailscale_up(hostname => 'node-a', ...)` | -| 2 | Quack (server) | `export QUACK_TAILNET_TOKEN=...` → `CALL quack_serve(..., token => quack_token())` + `tailscale_serve_local` | -| 3 | Quack (client) | `CALL tailscale_quack_forward(...)` → `CREATE SECRET` → `ATTACH 'quack:127.0.0.1:19494'` | +```sql +LOAD quack; -- HTTP server, ATTACH, quack_query +LOAD quackscale; -- tailscale_up, forwarder, attach_ducklake, … +``` -Do **not** rely on the random `auth_token` column from default `quack_serve`. Use a **shared** token or [override `quack_authentication_function`](https://duckdb.org/docs/current/quack/security#overriding-authentication). +Do **not** copy the random `auth_token` from each `CALL quack_serve`. 
Use a **shared** fleet token — see [AUTHENTICATION.md](AUTHENTICATION.md). diff --git a/docs/UPDATING.md b/docs/UPDATING.md deleted file mode 100644 index a3ac73e..0000000 --- a/docs/UPDATING.md +++ /dev/null @@ -1,23 +0,0 @@ -# Extension updating -When cloning this template, the target version of DuckDB should be the latest stable release of DuckDB. However, there -will inevitably come a time when a new DuckDB is released and the extension repository needs updating. This process goes -as follows: - -- Bump submodules - - `./duckdb` should be set to latest tagged release - - `./extension-ci-tools` should be set to updated branch corresponding to latest DuckDB release. So if you're building for DuckDB `v1.1.0` there will be a branch in `extension-ci-tools` named `v1.1.0` to which you should check out. -- Bump versions in `./github/workflows` - - `duckdb_version` input in `duckdb-stable-build` job in `MainDistributionPipeline.yml` should be set to latest tagged release - - `duckdb_version` input in `duckdb-stable-deploy` job in `MainDistributionPipeline.yml` should be set to latest tagged release - - the reusable workflow `duckdb/extension-ci-tools/.github/workflows/_extension_distribution.yml` for the `duckdb-stable-build` job should be set to latest tagged release - -# API changes -DuckDB extensions built with this extension template are built against the internal C++ API of DuckDB. This API is not guaranteed to be stable. -What this means for extension development is that when updating your extensions DuckDB target version using the above steps, you may run into the fact that your extension no longer builds properly. - -Currently, DuckDB does not (yet) provide a specific change log for these API changes, but it is generally not too hard to figure out what has changed. 
- -For figuring out how and why the C++ API changed, we recommend using the following resources: -- DuckDB's [Release Notes](https://github.com/duckdb/duckdb/releases) -- DuckDB's history of [Core extension patches](https://github.com/duckdb/duckdb/commits/main/.github/patches/extensions) -- The git history of the relevant C++ Header file of the API that has changed \ No newline at end of file diff --git a/examples/.env.example b/examples/.env.example index dad1ac1..d54c5ab 100644 --- a/examples/.env.example +++ b/examples/.env.example @@ -6,7 +6,12 @@ SERVER_HOST=quacktail-server QUACK_PORT=9494 QUACK_FORWARD_LOCAL_PORT=19494 BUILD_FROM_SOURCE=1 +# Do not set BUILD_FROM_SOURCE=0 for DuckLake demo — v1.0.2 release lacks attach_ducklake. QUACKTAIL_RELEASE_TAG=v1.0.2 +QUACKTAIL_ENABLE_DUCKLAKE=1 +QUACKTAIL_REQUIRE_ATTACH_DUCKLAKE=1 +QUACKTAIL_LAKE_NAME=lake +QUACKTAIL_LAKE_DATA_PATH=/var/lib/ducklake/data # Headscale preauth key (generated at bootstrap — NOT the Quack token above): # docker compose exec -T quacktail-server cat /work/authkey diff --git a/examples/Dockerfile b/examples/Dockerfile index a2f10ab..95ca291 100644 --- a/examples/Dockerfile +++ b/examples/Dockerfile @@ -3,6 +3,7 @@ # # BUILD_FROM_SOURCE=1 (default in compose): build DuckDB + quackscale from this repo. # BUILD_FROM_SOURCE=0: pull pinned GitHub release binary (QUACKTAIL_RELEASE_TAG, default v1.0.2). +# Release v1.0.2 lacks attach_ducklake/tailscale_down — DuckLake demo requires source build. ARG BUILD_FROM_SOURCE=1 ARG GITHUB_REPO=quackscience/duckdb-quackscale @@ -16,24 +17,25 @@ ARG GITHUB_REPO ARG QUACKTAIL_RELEASE_TAG ENV DEBIAN_FRONTEND=noninteractive -# stage-1 always COPY --from=builder /out/ — create it even when downloading a release. 
RUN mkdir -p /out RUN if [ "$BUILD_FROM_SOURCE" = "1" ]; then \ apt-get update \ - && apt-get install -y --no-install-recommends bash ca-certificates curl git golang-go \ - cmake ninja-build g++ make python3 \ + && apt-get install -y --no-install-recommends bash ca-certificates curl git \ + build-essential cmake ninja-build patch python3 \ && rm -rf /var/lib/apt/lists/*; \ fi WORKDIR /src COPY . /src/ +RUN chmod +x /src/scripts/e2e/docker-build-quackscale.sh + RUN if [ "$BUILD_FROM_SOURCE" = "1" ]; then \ - git submodule update --init --recursive \ - && GEN=ninja make release \ - && install -m755 build/release/duckdb /out/duckdb \ - && cp -a build/release/extension/quackscale /out/quackscale-ext 2>/dev/null || true; \ + /src/scripts/e2e/docker-build-quackscale.sh /out; \ + else \ + echo "build_from_source=0" > /out/build-info \ + && echo "release_tag=${QUACKTAIL_RELEASE_TAG}" >> /out/build-info; \ fi FROM ubuntu:24.04 @@ -73,18 +75,33 @@ ENV DUCKDB_BIN=/usr/local/bin/duckdb ENV DUCKDB_EXTENSION_DIRECTORY=/duckdb_extensions ENV QUACK_PORT=9494 -RUN mkdir -p /duckdb_extensions \ +# Install quack + ducklake first, then lay down our quackscale artifact (must not be overwritten). +RUN mkdir -p /duckdb_extensions /etc/quacktail \ + && duckdb :memory: -batch -c "SET extension_directory='/duckdb_extensions'; INSTALL quack FROM core; LOAD quack; SELECT 1;" \ + || duckdb :memory: -batch -c "SET extension_directory='/duckdb_extensions'; INSTALL quack FROM core_nightly; LOAD quack; SELECT 1;" \ + && duckdb :memory: -batch -c "SET extension_directory='/duckdb_extensions'; INSTALL ducklake FROM core; LOAD ducklake; SELECT 1;" \ + || duckdb :memory: -batch -c "SET extension_directory='/duckdb_extensions'; INSTALL ducklake FROM core_nightly; LOAD ducklake; SELECT 1;" \ && if [ -d /opt/quacktail-build/quackscale-ext ]; then \ - cp -a /opt/quacktail-build/quackscale-ext/. 
/duckdb_extensions/; \ + install -d /duckdb_extensions/quackscale; \ + install -m644 /opt/quacktail-build/quackscale-ext/quackscale.duckdb_extension \ + /duckdb_extensions/quackscale/quackscale.duckdb_extension; \ fi \ - && duckdb :memory: -batch -c "SET extension_directory='/duckdb_extensions'; INSTALL quack FROM core; LOAD quack; SELECT 1;" \ - || duckdb :memory: -batch -c "SET extension_directory='/duckdb_extensions'; INSTALL quack FROM core_nightly; LOAD quack; SELECT 1;" + && if [ -f /opt/quacktail-build/build-info ]; then \ + cp /opt/quacktail-build/build-info /etc/quacktail/build-info; \ + fi \ + && if [ -f /opt/quacktail-build/git-rev ]; then \ + cp /opt/quacktail-build/git-rev /etc/quacktail/git-rev; \ + fi + +COPY scripts/lib/quacktail_ext.sh /usr/local/lib/quacktail_ext.sh +COPY scripts/e2e/quacktail-verify-image.sh /usr/local/bin/quacktail-verify-image.sh +RUN chmod +x /usr/local/bin/quacktail-verify-image.sh \ + && /usr/local/bin/quacktail-verify-image.sh COPY scripts/e2e/quacktail-entrypoint.sh /usr/local/bin/quacktail-entrypoint.sh COPY scripts/e2e/quacktail-compose-bootstrap.sh /usr/local/bin/quacktail-compose-bootstrap.sh COPY scripts/e2e/quacktail-server-run.sh /usr/local/bin/quacktail-server-run.sh COPY scripts/e2e/quacktail-server-healthcheck.sh /usr/local/bin/quacktail-server-healthcheck.sh -COPY scripts/lib/quacktail_ext.sh /usr/local/lib/quacktail_ext.sh RUN chmod +x /usr/local/bin/quacktail-entrypoint.sh /usr/local/bin/quacktail-compose-bootstrap.sh \ /usr/local/bin/quacktail-server-run.sh /usr/local/bin/quacktail-server-healthcheck.sh diff --git a/examples/README.md b/examples/README.md index ba70fe2..5c97753 100644 --- a/examples/README.md +++ b/examples/README.md @@ -1,6 +1,8 @@ # QuackTail Docker Compose example -Two-node **Headscale + QuackTail** demo on Linux: a long-lived **server** DuckDB joins the tailnet and serves Quack on port 9494; a one-shot **client** joins the same tailnet, forwards Quack HTTP through tsnet, and `ATTACH`es 
the remote database. +Two-node **Headscale + QuackTail** demo on Linux: server joins the tailnet and serves Quack; client `ATTACH`es via `tailscale_quack_forward`. + +**Integration guide:** [docs/GUIDE.md](../docs/GUIDE.md) · **DuckLake demo:** [ducklake/README.md](ducklake/README.md) **Requires:** Linux, Docker Compose v2, `/dev/net/tun`, outbound HTTPS. @@ -27,15 +29,29 @@ Quack HTTP uses **kernel TCP**. Embedded tsnet does not route that traffic. `tai ## Run the demo +Source build is required for the DuckLake demo (`attach_ducklake`, `tailscale_down`). + ```bash -git pull && cd examples -docker compose build quacktail-server quacktail-client +git pull +git submodule update --init --recursive +cd examples +docker compose build --no-cache quacktail-server quacktail-client +docker compose run --rm --entrypoint /usr/local/bin/quacktail-verify-image.sh quacktail-client docker compose up -d --force-recreate headscale quacktail-server docker compose --profile test run --rm quacktail-client ``` Use **`--force-recreate`** on the server after script or SQL changes (otherwise the old DuckDB process keeps running). +**Refresh stale `/work` SQL without running the client demo** (one container, no DuckDB session): + +```bash +docker compose run --rm -e QUACKTAIL_ROLE=bootstrap quacktail-client +docker compose --profile test run --rm quacktail-client +``` + +Do **not** use `quacktail-client true` — compose sets `QUACKTAIL_ROLE=client`, so that still runs the full demo. + **Release binary instead of source build:** ```bash @@ -55,26 +71,29 @@ Expect: ```text → waiting for quacktail-server on tailnet ... ✓ quacktail-server on tailnet -✓ client SQL ready — attach quack:127.0.0.1:19494 QuackTail cluster demo ====================== -→ join tailnet, tailscale_ping quacktail-server:9494, quack_query, ATTACH quack:127.0.0.1:19494 ... 
- -CALL tailscale_up(...); → running true -CALL tailscale_quack_forward(...); → active true, quack:127.0.0.1:19494 -CALL tailscale_ping(...); → reachable true -FROM quack_query(...); → probe 1 +→ join tailnet, forward, attach_ducklake, ATTACH quack:127.0.0.1:19494 ... + +CALL tailscale_up(...); → running true +CALL tailscale_quack_forward(...); → quack:127.0.0.1:19494 +CALL tailscale_ping(...); → reachable true +FROM quack_query(...); → probe 1 +CALL attach_ducklake(...); → lake.inventory view created +SELECT * FROM lake.inventory ...; +SELECT 'LAKE_PASSED' ...; ATTACH 'quack:127.0.0.1:19494' AS remote (TYPE quack); -SELECT * FROM remote.e2e_payload LIMIT 5; SELECT 'PASSED' ...; +SELECT 'CLIENT_DEMO_DONE' ...; +CALL tailscale_down(); -✓ Demo passed — two-node QuackTail cluster is working +✓ Demo passed — QuackTail cluster + DuckLake over tailnet ``` The client runs one DuckDB session (`duckdb -batch -echo -f /work/client_session.sql`). Compose waits for `quacktail-server` **healthy** (server.log shows `quack_serve` + `tailscale_serve_local`) before starting the client. -Set `QUACKTAIL_QUIET=0` to print full SQL. libtailscale detail: `/work/client-tsnet.log` (client), `/work/server.log` (server). +Set `QUACKTAIL_QUIET=0` to print full SQL. Server libtailscale logs: `/work/server.log`. ## Services @@ -159,4 +178,4 @@ docker compose --profile test run --rm quacktail-client **Client logs:** `docker compose exec quacktail-server cat /work/client.out` (last run, shared volume) -See also [docs/AUTHENTICATION.md](../docs/AUTHENTICATION.md) (Tailscale + forwarder) and [docs/QUACK_AUTH.md](../docs/QUACK_AUTH.md) (Quack tokens). +See also [docs/GUIDE.md](../docs/GUIDE.md) (integration patterns) and [docs/AUTHENTICATION.md](../docs/AUTHENTICATION.md) (credentials). 
diff --git a/examples/docker-compose.yml b/examples/docker-compose.yml index 70f20d2..7feb915 100644 --- a/examples/docker-compose.yml +++ b/examples/docker-compose.yml @@ -19,8 +19,8 @@ x-env: &env QUACK_FORWARD_LOCAL_PORT: "19494" QUACK_TAILNET_TOKEN: ${QUACK_TAILNET_TOKEN:-quackscale-demo-token} QUACKTAIL_QUIET: "1" - QUACKTAIL_DEMO_TIMEOUT_SEC: "90" - QUACKTAIL_CLIENT_ATTEMPTS: "15" + QUACKTAIL_DEMO_TIMEOUT_SEC: "60" + QUACKTAIL_CLIENT_ATTEMPTS: "3" QUACKTAIL_CLIENT_POLL_SEC: "2" QUACKTAIL_WAIT_ATTEMPTS: "15" QUACKTAIL_WAIT_POLL_SEC: "1" @@ -29,6 +29,11 @@ x-env: &env SERVER_HOST: quacktail-server CLIENT_HOST: quacktail-client DUCKDB_EXTENSION_DIRECTORY: /duckdb_extensions + QUACKTAIL_ENABLE_DUCKLAKE: ${QUACKTAIL_ENABLE_DUCKLAKE:-1} + QUACKTAIL_REQUIRE_ATTACH_DUCKLAKE: ${QUACKTAIL_REQUIRE_ATTACH_DUCKLAKE:-1} + QUACKTAIL_LAKE_NAME: ${QUACKTAIL_LAKE_NAME:-lake} + QUACKTAIL_LAKE_METADATA: ${QUACKTAIL_LAKE_METADATA:-/var/lib/ducklake/metadata/inventory.ducklake} + QUACKTAIL_LAKE_DATA_PATH: ${QUACKTAIL_LAKE_DATA_PATH:-/var/lib/ducklake/data} x-headscale-volumes: &headscale_volumes configs: @@ -48,6 +53,7 @@ x-quacktail: &quacktail BUILD_FROM_SOURCE: ${BUILD_FROM_SOURCE:-1} GITHUB_REPO: ${GITHUB_REPO:-quackscience/duckdb-quackscale} QUACKTAIL_RELEASE_TAG: ${QUACKTAIL_RELEASE_TAG:-v1.0.2} + # DuckLake demo needs attach_ducklake — do not set BUILD_FROM_SOURCE=0 until a release includes it. 
cap_add: [NET_ADMIN] devices: [/dev/net/tun] <<: *headscale_volumes @@ -131,11 +137,17 @@ services: depends_on: headscale: condition: service_healthy + volumes: + - quacktail-work:/work + - ducklake-lake:/var/lib/ducklake + - headscale-data:/var/lib/headscale + - headscale-run:/var/run/headscale environment: <<: *env QUACKTAIL_WORK: /work QUACKTAIL_ROLE: server QUACKTAIL_AUTO_BOOTSTRAP: "1" + QUACKTAIL_MANAGE_CLIENT_SQL: "0" HEADSCALE_CONFIG: /etc/headscale/config.yaml healthcheck: test: [CMD, /usr/local/bin/quacktail-server-healthcheck.sh] @@ -158,6 +170,7 @@ services: <<: *env QUACKTAIL_WORK: /work QUACKTAIL_ROLE: client + QUACKTAIL_MANAGE_CLIENT_SQL: "1" QUACKTAIL_WAIT_SERVER: quacktail-server HEADSCALE_CONFIG: /etc/headscale/config.yaml restart: "no" @@ -185,6 +198,7 @@ volumes: headscale-data: headscale-run: quacktail-work: + ducklake-lake: networks: quacktail: diff --git a/examples/ducklake/README.md b/examples/ducklake/README.md new file mode 100644 index 0000000..e6d06c8 --- /dev/null +++ b/examples/ducklake/README.md @@ -0,0 +1,62 @@ +# DuckLake + Quack on QuackTail + +The compose demo on branch **`ducklake`**: the server attaches a local DuckLake catalog, seeds `inventory`, and exposes it on the tailnet. The client queries the lake via **`attach_ducklake`** (preferred) or `quack_query`. + +## Architecture + +```text +quacktail-server quacktail-client +───────────────── ───────────────── +tailscale_up tailscale_up +ATTACH ducklake:… AS lake (local Parquet) tailscale_quack_forward → quack:127.0.0.1:19494 + └─ ducklake-lake volume attach_ducklake → SELECT FROM lake.inventory +quack_serve(127.0.0.1:9494) ATTACH quack:… AS remote (e2e) +tailscale_serve_local +``` + +Parquet + metadata live on **`ducklake-lake`** on the server only (`/var/lib/ducklake`). 
+ +## Access patterns + +| Pattern | When to use | +|---------|-------------| +| **`CALL attach_ducklake(...)`** | Server owns DuckLake files — **preferred** | +| **`quack_query(uri, '…')`** | Same as above; fallback for older images | +| **`tailscale_quack_forward`** | Required on tsnet clients before Quack ATTACH | +| **`ATTACH 'ducklake:quack:…' (DATA_PATH '…')`** | Client has shared Parquet ([DuckDB 1.5.3](https://duckdb.org/2026/05/20/announcing-duckdb-153.html)) | +| **`ATTACH 'quack:…' AS remote`** | Primary catalog only — **not** `remote.lake.*` | + +Full pattern guide: [docs/GUIDE.md](../../docs/GUIDE.md). + +## Run the demo + +```bash +cd examples +docker compose build quacktail-server quacktail-client +docker compose up -d --force-recreate headscale quacktail-server +docker compose --profile test run --rm quacktail-client +``` + +Expect `LAKE_PASSED`, `PASSED`, and `✓ Demo passed`. + +## Tailnet client SQL (sketch) + +```sql +CALL tailscale_quack_forward(host => 'quacktail-server', port => 9494, local_port => 19494); + +CREATE SECRET (TYPE quack, TOKEN 'quackscale-demo-token', SCOPE 'quack:127.0.0.1:19494'); + +CALL attach_ducklake( + 'quack:127.0.0.1:19494', + remote_catalog => 'lake', + alias => 'lake', + token => 'quackscale-demo-token', + disable_ssl => true +); +SELECT * FROM lake.inventory; + +ATTACH 'quack:127.0.0.1:19494' AS remote (TYPE quack); +SELECT * FROM remote.e2e_payload; +``` + +See [local-demo.sql](local-demo.sql) for a standalone script. diff --git a/examples/ducklake/local-demo.sql b/examples/ducklake/local-demo.sql new file mode 100644 index 0000000..1f3fef9 --- /dev/null +++ b/examples/ducklake/local-demo.sql @@ -0,0 +1,40 @@ +-- DuckLake + Quack on one host (no tailnet). Requires DuckDB 1.5+ with quack + ducklake from core. 
+-- +-- Server session (terminal 1): +-- duckdb server.duckdb +-- +-- Client session (terminal 2): +-- duckdb + +-- === Server === +INSTALL quack FROM core; +INSTALL ducklake FROM core; +LOAD quack; +LOAD ducklake; + +ATTACH 'ducklake:./lake/metadata/inventory.ducklake' AS lake (DATA_PATH './lake/data/'); +USE lake; + +CREATE TABLE IF NOT EXISTS inventory (item_id INT, quantity INT); +INSERT INTO inventory VALUES (101, 50), (102, 120); + +CALL quack_serve( + 'quack:127.0.0.1:9494', + allow_other_hostname => true, + token => 'quackscale-demo-token' +); + +-- === Client (new duckdb process) === +-- INSTALL quack FROM core; +-- INSTALL ducklake FROM core; +-- LOAD quack; +-- LOAD ducklake; +-- +-- CREATE SECRET (TYPE quack, TOKEN 'quackscale-demo-token', SCOPE 'quack:127.0.0.1:9494'); +-- +-- Option A: query lake tables via DuckLake-over-Quack (catalog over Quack, Parquet via DATA_PATH) +-- ATTACH 'quack:127.0.0.1:9494' AS remote (TYPE quack); +-- ATTACH 'ducklake:quack:127.0.0.1:9494' AS lake (DATA_PATH './lake/data/'); +-- SELECT * FROM lake.inventory; +-- +-- Do NOT use remote.lake.inventory — plain quack attach does not expose nested DuckLake catalogs. diff --git a/scripts/ci_compose_e2e.sh b/scripts/ci_compose_e2e.sh new file mode 100755 index 0000000..4cd89c0 --- /dev/null +++ b/scripts/ci_compose_e2e.sh @@ -0,0 +1,30 @@ +#!/usr/bin/env bash +# Local/dev only: full compose e2e with SOURCE-built images (examples/docker-compose.yml). +# CI e2e uses release binaries — see .github/workflows/headscale-e2e.yml (workflow_dispatch). +set -euo pipefail + +ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." 
&& pwd)" +EXAMPLES="$ROOT/examples" +LOG="${CI_COMPOSE_E2E_LOG:-${RUNNER_TEMP:-/tmp}/quacktail-compose-e2e.log}" + +cd "$EXAMPLES" + +echo "=== docker compose build (BUILD_FROM_SOURCE=1 — local dev only) ===" +docker compose build quacktail-server quacktail-client + +echo "=== verify image ===" +docker compose run --rm --entrypoint /usr/local/bin/quacktail-verify-image.sh quacktail-client + +echo "=== start headscale + quacktail-server ===" +docker compose up -d --force-recreate headscale quacktail-server + +echo "=== run quacktail-client (profile test) ===" +: >"$LOG" +docker compose --profile test run --rm quacktail-client 2>&1 | tee "$LOG" + +grep -q 'LAKE_PASSED' "$LOG" || { echo "error: LAKE_PASSED missing" >&2; exit 1; } +grep -q 'PASSED' "$LOG" || { echo "error: PASSED missing" >&2; exit 1; } +grep -qE 'Demo passed|CLIENT_DEMO_DONE' "$LOG" || { echo "error: demo completion marker missing" >&2; exit 1; } +grep -q 'attach_ducklake' "$LOG" || { echo "error: attach_ducklake path not used" >&2; exit 1; } + +echo "ok: compose e2e passed (source build)" diff --git a/scripts/ci_headscale_e2e.sh b/scripts/ci_headscale_e2e.sh index 3f167d4..246342d 100755 --- a/scripts/ci_headscale_e2e.sh +++ b/scripts/ci_headscale_e2e.sh @@ -1,6 +1,6 @@ #!/usr/bin/env bash -# Two-node QuackTail e2e: Headscale + server + client DuckDB containers overlap. -# Server stays up (-d); client starts while server is still booting; client polls then ATTACH. +# CI e2e: release duckdb bind-mounted into minimal containers (no DuckDB compile). +# For source-built compose demo locally, use scripts/ci_compose_e2e.sh instead. set -euo pipefail ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" diff --git a/scripts/e2e/docker-build-quackscale.sh b/scripts/e2e/docker-build-quackscale.sh new file mode 100755 index 0000000..10353a0 --- /dev/null +++ b/scripts/e2e/docker-build-quackscale.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env bash +# Builder-stage script for examples/Dockerfile (BUILD_FROM_SOURCE=1). 
+set -euo pipefail + +OUT="${1:-/out}" +mkdir -p "$OUT" + +need_submodules() { + [[ -f duckdb/CMakeLists.txt ]] \ + && [[ -f extension-ci-tools/makefiles/duckdb_extension.Makefile ]] \ + && [[ -f third_party/libtailscale/go.mod || -f third_party/libtailscale/README.md ]] \ + && return 1 + return 0 +} + +if need_submodules; then + echo "→ initializing git submodules ..." + if [[ -d .git ]]; then + git submodule sync --recursive + git submodule update --init --recursive + else + echo "error: git submodules missing and .git not available in build context" >&2 + echo " clone with: git clone --recurse-submodules …" >&2 + echo " or ensure duckdb/ and extension-ci-tools/ are populated before docker build" >&2 + exit 1 + fi +else + echo "→ submodules present in build context (skipping git submodule update)" +fi + +# Never reuse host build trees (especially dangerous when building linux from macOS). +rm -rf build .cache + +echo "→ make release (GEN=ninja, $(nproc) jobs) ..." +GEN=ninja make release -j"$(nproc)" + +EXT_ART="build/release/extension/quackscale/quackscale.duckdb_extension" +if [[ ! 
-f "$EXT_ART" ]]; then + echo "error: quackscale loadable extension not found at ${EXT_ART}" >&2 + ls -la build/release/extension/quackscale 2>/dev/null >&2 || true + exit 1 +fi + +install -m755 build/release/duckdb "$OUT/duckdb" +cp -a build/release/extension/quackscale "$OUT/quackscale-ext" + +if [[ -d .git ]]; then + git rev-parse HEAD > "$OUT/git-rev" +else + echo "docker-build" > "$OUT/git-rev" +fi +echo "build_from_source=1" > "$OUT/build-info" +echo "✓ quackscale builder done" diff --git a/scripts/e2e/quacktail-compose-bootstrap.sh b/scripts/e2e/quacktail-compose-bootstrap.sh index 5533a07..2051b18 100644 --- a/scripts/e2e/quacktail-compose-bootstrap.sh +++ b/scripts/e2e/quacktail-compose-bootstrap.sh @@ -3,11 +3,24 @@ set -euo pipefail WORK="${QUACKTAIL_WORK:-/work}" +DUCKDB_BIN="${DUCKDB_BIN:-/usr/local/bin/duckdb}" + +# shellcheck source=/dev/null +source /usr/local/lib/quacktail_ext.sh + +duckdb_has_quackscale_function() { + quacktail_has_quackscale_function "$@" +} + SERVER_HOST="${SERVER_HOST:-quacktail-server}" CLIENT_HOST="${CLIENT_HOST:-quacktail-client}" QUACK_PORT="${QUACK_PORT:-9494}" QUACK_FORWARD_LOCAL_PORT="${QUACK_FORWARD_LOCAL_PORT:-19494}" QUACK_TOKEN="${QUACK_TAILNET_TOKEN:-quackscale-demo-token}" +LAKE_NAME="${QUACKTAIL_LAKE_NAME:-lake}" +LAKE_METADATA="${QUACKTAIL_LAKE_METADATA:-/var/lib/ducklake/metadata/inventory.ducklake}" +LAKE_DATA_PATH="${QUACKTAIL_LAKE_DATA_PATH:-/var/lib/ducklake/data}" +ENABLE_DUCKLAKE="${QUACKTAIL_ENABLE_DUCKLAKE:-1}" CONTROL_URL="${HEADSCALE_CONTROL_URL:-http://headscale:8080}" HS_USER="${HEADSCALE_USER:-quackscale-demo}" HS_CFG="${HEADSCALE_CONFIG:-/etc/headscale/config.yaml}" @@ -24,6 +37,16 @@ mkdir -p "$WORK" CLIENT_STATE_DIR="/tmp/client-tailscale" ATTACH_URI="quack:${SERVER_HOST}:${QUACK_PORT}" +client_sql_lake_mode() { + if [[ "$ENABLE_DUCKLAKE" != "1" ]]; then + echo "off" + elif duckdb_has_quackscale_function attach_ducklake; then + echo "attach_ducklake" + else + echo "quack_query" + fi +} + 
resolve_server_tailnet_ip() { "${HS[@]}" nodes list 2>/dev/null | grep -F "$SERVER_HOST" | grep -oE '100\.64\.[0-9]+\.[0-9]+' | head -1 || true } @@ -33,26 +56,14 @@ resolve_attach_uri() { } resolve_client_attach_uri() { - local ext_dir="/duckdb_extensions" - if command -v duckdb >/dev/null 2>&1 \ - && duckdb :memory: -batch -csv -noheader -c \ - "SET extension_directory='${ext_dir}'; LOAD quackscale; SELECT COUNT(*) FROM duckdb_functions() WHERE function_name='tailscale_quack_forward';" \ - 2>/dev/null | grep -qx '1'; then + local ext_dir="${DUCKDB_EXTENSION_DIRECTORY:-/duckdb_extensions}" + if duckdb_has_quackscale_function tailscale_quack_forward; then echo "quack:127.0.0.1:${QUACK_FORWARD_LOCAL_PORT}" else resolve_attach_uri fi } -duckdb_has_quackscale_function() { - local fn="$1" - local ext_dir="/duckdb_extensions" - command -v duckdb >/dev/null 2>&1 \ - && duckdb :memory: -batch -csv -noheader -c \ - "SET extension_directory='${ext_dir}'; LOAD quackscale; SELECT COUNT(*) FROM duckdb_functions() WHERE function_name='${fn}';" \ - 2>/dev/null | grep -qx '1' -} - compose_attach_is_local_forward() { case "$1" in quack:127.0.0.1:* | quack:localhost:* | quack:127.0.0.1 | quack:localhost) return 0 ;; @@ -74,6 +85,62 @@ SQL fi } +compose_sql_attach_ducklake() { + local attach_uri="${1:?attach uri required}" + local lake_name="${2:?lake name required}" + local data_path="${3:?data path required}" + cat < '${QUACK_TOKEN}', + disable_ssl => true +); +SQL +} + +write_server_ducklake_sql() { + [[ "$ENABLE_DUCKLAKE" == "1" ]] || return 0 + mkdir -p "$(dirname "$LAKE_METADATA")" "$LAKE_DATA_PATH" + if [[ -f "$LAKE_METADATA" ]]; then + cat >"$WORK/server_ducklake.sql" <"$WORK/server_ducklake.sql" <"$WORK/client_session.sql" < true ); +${lake_discover_sql} +${lake_attach_sql} +${lake_select} +${lake_passed_sql} $(compose_sql_attach_remote "$attach_uri") SELECT * FROM remote.e2e_payload LIMIT 5; - SELECT 'PASSED' AS status, '${attach_uri}' AS attach_uri, @@ -161,7 +250,17 @@ 
SELECT MAX(CASE WHEN source = 'client' THEN msg END) AS client_row, COUNT(*)::INTEGER AS total_rows FROM remote.e2e_payload; + +DETACH remote; + +SELECT 'CLIENT_DEMO_DONE' AS status; + +${teardown_sql} SQL + if grep -q '\\n' "$WORK/client_session.sql" 2>/dev/null; then + echo "error: generated client_session.sql contains literal \\n" >&2 + exit 1 + fi write_client_init_sql "$authkey" cp "$WORK/client_session.sql" "$WORK/client_demo.sql" } @@ -247,19 +346,47 @@ if [[ -f "$WORK/server_setup.sql" && -f "$WORK/authkey" ]]; then write_server_quack_sql echo "✓ server quack SQL ready — loopback + tailscale_serve_local(:${QUACK_PORT})" fi - if [[ "${COMPOSE_REFRESH_CLIENT_SQL:-}" == "1" ]] \ - || [[ ! -f "$WORK/client_session.sql" ]] \ - || [[ ! -f "$WORK/client_init.sql" ]] \ - || [[ -f "$WORK/client_demo.sql" && ! -f "$WORK/client_quack.sql" ]] \ - || { [[ -f "$WORK/client_quack.sql" ]] && grep -q 'NOT EXISTS' "$WORK/client_quack.sql"; } \ - || { [[ -f "$WORK/client_init.sql" ]] && ! grep -q "${CLIENT_STATE_DIR}" "$WORK/client_init.sql"; } \ - || { [[ -f "$WORK/client_quack.sql" ]] && grep -qE "quack:100\.64\." "$WORK/client_quack.sql"; } \ - || { [[ -f "$WORK/client_session.sql" ]] && ! grep -q 'tailscale_ping' "$WORK/client_session.sql"; } \ - || { [[ -f "$WORK/client_session.sql" ]] && ! grep -q 'quack_query' "$WORK/client_session.sql"; } \ - || { [[ -f "$WORK/client_session.sql" ]] && grep -q 'ON CONFLICT' "$WORK/client_session.sql"; } \ - || { [[ -f "$WORK/client_session.sql" ]] && ! grep -q 'tailscale_quack_proxy' "$WORK/client_session.sql"; }; then - refresh_client_sql "$AUTHKEY" - echo "✓ client SQL ready — attach ${ATTACH_URI}" + if [[ "${COMPOSE_REFRESH_SERVER_DUCKLAKE:-}" == "1" ]] \ + || { [[ "$ENABLE_DUCKLAKE" == "1" ]] && [[ ! -f "$WORK/server_ducklake.sql" ]]; } \ + || { [[ "$ENABLE_DUCKLAKE" == "1" ]] && [[ -f "$WORK/server_ducklake.sql" ]] \ + && ! 
grep -q 'quacktail: lake-' "$WORK/server_ducklake.sql" 2>/dev/null; }; then + write_server_ducklake_sql + echo "✓ server ducklake SQL ready — ${LAKE_NAME} @ ${LAKE_METADATA}" + fi + if [[ "${QUACKTAIL_MANAGE_CLIENT_SQL:-0}" == "1" ]]; then + if [[ "${COMPOSE_REFRESH_CLIENT_SQL:-}" == "1" ]] \ + || [[ ! -f "$WORK/client_session.sql" ]] \ + || [[ ! -f "$WORK/client_init.sql" ]] \ + || [[ -f "$WORK/client_demo.sql" && ! -f "$WORK/client_quack.sql" ]] \ + || { [[ -f "$WORK/client_quack.sql" ]] && grep -q 'NOT EXISTS' "$WORK/client_quack.sql"; } \ + || { [[ -f "$WORK/client_init.sql" ]] && ! grep -q "${CLIENT_STATE_DIR}" "$WORK/client_init.sql"; } \ + || { [[ -f "$WORK/client_quack.sql" ]] && grep -qE "quack:100\.64\." "$WORK/client_quack.sql"; } \ + || { [[ -f "$WORK/client_session.sql" ]] && ! grep -q 'tailscale_ping' "$WORK/client_session.sql"; } \ + || { [[ -f "$WORK/client_session.sql" ]] && ! grep -q 'quack_query' "$WORK/client_session.sql"; } \ + || { [[ -f "$WORK/client_session.sql" ]] && grep -q 'ON CONFLICT' "$WORK/client_session.sql"; } \ + || { [[ -f "$WORK/client_session.sql" ]] && ! grep -q 'tailscale_quack_forward' "$WORK/client_session.sql"; } \ + || { [[ -f "$WORK/client_session.sql" ]] && grep -q '\\n' "$WORK/client_session.sql"; } \ + || { [[ -f "$WORK/client_session.sql" ]] && grep -q 'CALL tailscale_down' "$WORK/client_session.sql" \ + && ! duckdb_has_quackscale_function tailscale_down; } \ + || { [[ -f "$WORK/client_session.sql" ]] && duckdb_has_quackscale_function tailscale_down \ + && ! grep -q 'CALL tailscale_down' "$WORK/client_session.sql"; } \ + || { [[ -f "$WORK/client_session.sql" ]] && ! 
grep -q 'CLIENT_DEMO_DONE' "$WORK/client_session.sql"; } \ + || { [[ -f "$WORK/client_session.sql" ]] && grep -q 'CALL tailscale_down' "$WORK/client_session.sql" \ + && grep -q 'CLIENT_DEMO_DONE' "$WORK/client_session.sql" \ + && [[ "$(grep -n 'CALL tailscale_down' "$WORK/client_session.sql" | head -1 | cut -d: -f1)" \ + -lt "$(grep -n "CLIENT_DEMO_DONE" "$WORK/client_session.sql" | head -1 | cut -d: -f1)" ]]; } \ + || { [[ "$ENABLE_DUCKLAKE" == "1" && -f "$WORK/client_session.sql" ]] && ! grep -q 'DISCOVERED' "$WORK/client_session.sql"; } \ + || { [[ "$ENABLE_DUCKLAKE" == "1" && -f "$WORK/client_session.sql" ]] && grep -q 'quacktail_attach_remote_lake' "$WORK/client_session.sql"; } \ + || { [[ "$ENABLE_DUCKLAKE" == "1" && -f "$WORK/client_session.sql" ]] && duckdb_has_quackscale_function attach_ducklake \ + && ! grep -q 'attach_ducklake' "$WORK/client_session.sql"; } \ + || { [[ "$ENABLE_DUCKLAKE" == "1" && -f "$WORK/client_session.sql" ]] && duckdb_has_quackscale_function attach_ducklake \ + && grep -q 'FROM quack_query' "$WORK/client_session.sql" \ + && grep -q "${LAKE_NAME}.inventory" "$WORK/client_session.sql" \ + && ! grep -q 'attach_ducklake' "$WORK/client_session.sql"; } \ + || { [[ "$ENABLE_DUCKLAKE" == "1" && -f "$WORK/client_session.sql" ]] && ! 
grep -q "${LAKE_NAME}.inventory" "$WORK/client_session.sql"; }; then + refresh_client_sql "$AUTHKEY" + echo "✓ client SQL ready — attach ${ATTACH_URI} (lake: $(client_sql_lake_mode))" + fi fi exit 0 fi @@ -361,9 +488,7 @@ fi if [[ -z "$AUTHKEY" ]]; then echo "error: failed to create Headscale authkey" >&2 - echo "headscale users list:" >&2 "${HS[@]}" users list >&2 || true - echo "headscale preauthkeys create (debug):" >&2 create_authkey >&2 || true exit 1 fi @@ -399,6 +524,8 @@ SQL write_server_quack_sql +write_server_ducklake_sql + refresh_client_sql "$AUTHKEY" echo "✓ Headscale authkey ready — attach URI ${ATTACH_URI}" diff --git a/scripts/e2e/quacktail-entrypoint.sh b/scripts/e2e/quacktail-entrypoint.sh index 60b07a1..549bb06 100755 --- a/scripts/e2e/quacktail-entrypoint.sh +++ b/scripts/e2e/quacktail-entrypoint.sh @@ -23,7 +23,7 @@ fi ensure_quack() { local ext_dir="${DUCKDB_EXTENSION_DIRECTORY:-$(quacktail_ext_container_dir)}" export DUCKDB_EXTENSION_DIRECTORY="$ext_dir" - quacktail_ci_ensure_quack "$DUCKDB" "$ext_dir" load_only + quacktail_ci_ensure_demo_extensions "$DUCKDB" "$ext_dir" load_only } quacktail_sql_extension_directory() { @@ -105,13 +105,20 @@ ensure_server_hosts_mapping() { run_server() { maybe_compose_bootstrap if [[ -f "${WORK}/authkey" ]] && [[ -x /usr/local/bin/quacktail-compose-bootstrap.sh ]]; then - COMPOSE_REFRESH_SERVER_QUACK=1 QUACKTAIL_AUTO_BOOTSTRAP=1 /usr/local/bin/quacktail-compose-bootstrap.sh + COMPOSE_REFRESH_SERVER_QUACK=1 COMPOSE_REFRESH_SERVER_DUCKLAKE=1 QUACKTAIL_AUTO_BOOTSTRAP=1 \ + /usr/local/bin/quacktail-compose-bootstrap.sh fi ensure_quack rm -f "${WORK}/quack_ready" - cat "${WORK}/server_setup.sql" "${WORK}/server_quack.sql" >"$INIT_SQL" + { + cat "${WORK}/server_setup.sql" + if [[ -f "${WORK}/server_ducklake.sql" ]]; then + cat "${WORK}/server_ducklake.sql" + fi + cat "${WORK}/server_quack.sql" + } >"$INIT_SQL" if [[ "$QUIET" == "1" ]]; then - echo "→ quacktail-server: join tailnet + quack_serve(127.0.0.1:${PORT}) + 
tailscale_serve_local" + echo "→ quacktail-server: tailnet + ducklake + quack_serve(127.0.0.1:${PORT}) + tailscale_serve_local" echo " (libtailscale logs → ${WORK}/server.log)" else echo "=== server init SQL ===" @@ -131,12 +138,17 @@ quacktail_filter_demo_stream() { ensure_client_sql() { if [[ -f "${WORK}/authkey" ]] && [[ -x /usr/local/bin/quacktail-compose-bootstrap.sh ]]; then - COMPOSE_REFRESH_CLIENT_SQL=1 QUACKTAIL_AUTO_BOOTSTRAP=1 /usr/local/bin/quacktail-compose-bootstrap.sh + COMPOSE_REFRESH_CLIENT_SQL=1 QUACKTAIL_MANAGE_CLIENT_SQL=1 QUACKTAIL_AUTO_BOOTSTRAP=1 \ + /usr/local/bin/quacktail-compose-bootstrap.sh fi if [[ ! -f "${WORK}/client_session.sql" ]]; then echo "error: ${WORK}/client_session.sql missing" >&2 exit 1 fi + if grep -q '\\n' "${WORK}/client_session.sql" 2>/dev/null; then + echo "error: ${WORK}/client_session.sql contains literal \\n (regenerate bootstrap)" >&2 + exit 1 + fi } client_attach_uri() { @@ -155,39 +167,112 @@ client_attach_uri() { quacktail_dump_client_failure() { local out="${WORK}/client.out" - local tsnet_log="${WORK}/client-tsnet.log" if [[ -s "$out" ]]; then echo "--- client.out (tail) ---" >&2 tail -30 "$out" >&2 fi - if [[ -s "$tsnet_log" ]]; then - echo "--- client-tsnet.log (tail) ---" >&2 - tail -30 "$tsnet_log" >&2 +} + +quacktail_is_signal_rc() { + case "${1:-0}" in + 130|143) return 0 ;; + esac + return 1 +} + +quacktail_client_on_signal() { + echo "Interrupted — stopping client demo" >&2 + exit 130 +} + +quacktail_client_has_fatal_sql_error() { + local out="${1:?client out file}" + grep -qE 'Parser Error:|Catalog Error:|Binder Error:|Syntax Error:' "$out" 2>/dev/null +} + +quacktail_client_session_succeeded() { + local out="${1:?client out file}" + grep -q "CLIENT_DEMO_DONE" "$out" 2>/dev/null || return 1 + grep -q "PASSED" "$out" 2>/dev/null || return 1 + if [[ "${QUACKTAIL_ENABLE_DUCKLAKE:-0}" == "1" ]]; then + grep -q "LAKE_PASSED" "$out" 2>/dev/null || return 1 fi + return 0 +} + +quacktail_stop_process() { + 
local pid="${1:?pid}" + local wait_ms="${2:-1500}" + local elapsed=0 + kill -0 "$pid" 2>/dev/null || return 0 + while (( elapsed < wait_ms )); do + kill -0 "$pid" 2>/dev/null || break + sleep 0.1 + elapsed=$((elapsed + 100)) + done + kill -0 "$pid" 2>/dev/null || { wait "$pid" 2>/dev/null || true; return 0; } + kill -TERM "$pid" 2>/dev/null || true + sleep 0.2 + kill -0 "$pid" 2>/dev/null || { wait "$pid" 2>/dev/null || true; return 0; } + kill -KILL "$pid" 2>/dev/null || true + wait "$pid" 2>/dev/null || true +} + +quacktail_show_client_demo_output() { + local out="${1:-${WORK}/client.out}" + [[ -s "$out" ]] || return 0 + quacktail_filter_demo_stream <"$out" } run_duckdb_client_session() { local session_sql="${1:?session sql file}" local out="${2:?out file}" local demo_timeout="${3:?timeout}" - local tsnet_log="${WORK}/client-tsnet.log" local ext_cmd duckdb_rc=0 + local timeout_cmd=(timeout --foreground --kill-after=3 "$demo_timeout") + local duck_pid=0 + local deadline=0 ext_cmd="$(quacktail_sql_extension_directory)" - : >"$tsnet_log" + : >"$out" - # Same invocation as scripts/local_remote_headscale_test.sh (-f, no -bail, no -init file db). + # Background duckdb → client.out; monitor client.out for CLIENT_DEMO_DONE then SIGTERM/KILL. + # CLIENT_DEMO_DONE is emitted before tailscale_down (tsnet close can block). set +o pipefail if [[ "$QUIET" == "1" ]]; then - timeout "$demo_timeout" stdbuf -oL -eL "$DUCKDB" -batch -echo \ + "${timeout_cmd[@]}" stdbuf -oL -eL "$DUCKDB" -batch -echo \ -cmd "$ext_cmd" -f "$session_sql" \ - 2>>"$tsnet_log" | quacktail_filter_demo_stream | tee "$out" + >"$out" 2>&1 & else - timeout "$demo_timeout" stdbuf -oL -eL "$DUCKDB" -batch -echo \ + "${timeout_cmd[@]}" stdbuf -oL -eL "$DUCKDB" -batch -echo \ -cmd "$ext_cmd" -f "$session_sql" \ - 2>&1 | quacktail_filter_demo_stream | tee "$out" + 2>&1 | tee "$out" & fi - duckdb_rc=$? + duck_pid=$! 
+ deadline=$((SECONDS + demo_timeout + 5)) + + while kill -0 "$duck_pid" 2>/dev/null; do + if quacktail_client_session_succeeded "$out"; then + quacktail_stop_process "$duck_pid" 500 + set -o pipefail + return 0 + fi + if quacktail_client_has_fatal_sql_error "$out"; then + quacktail_stop_process "$duck_pid" 500 + set -o pipefail + return 1 + fi + if (( SECONDS >= deadline )); then + quacktail_stop_process "$duck_pid" 500 + set -o pipefail + echo "error: client demo timed out after ${demo_timeout}s" >&2 + quacktail_dump_client_failure + return 124 + fi + sleep 0.1 + done + + wait "$duck_pid" || duckdb_rc=$? set -o pipefail if [[ "$duckdb_rc" -eq 124 ]]; then @@ -195,21 +280,69 @@ run_duckdb_client_session() { quacktail_dump_client_failure return 124 fi + if quacktail_is_signal_rc "$duckdb_rc"; then + return "$duckdb_rc" + fi return "$duckdb_rc" } +run_bootstrap() { + if [[ ! -f "${WORK}/authkey" ]]; then + echo "error: ${WORK}/authkey missing — start headscale + quacktail-server first" >&2 + exit 1 + fi + if [[ "$QUIET" == "1" ]]; then + echo "→ refreshing /work SQL on volume (no client demo) ..." + fi + COMPOSE_REFRESH_CLIENT_SQL=1 COMPOSE_REFRESH_SERVER_QUACK=1 \ + QUACKTAIL_MANAGE_CLIENT_SQL=1 QUACKTAIL_AUTO_BOOTSTRAP=1 /usr/local/bin/quacktail-compose-bootstrap.sh + if [[ "$QUIET" == "1" ]]; then + echo "✓ bootstrap complete — run: docker compose --profile test run --rm quacktail-client" + else + echo "ok: bootstrap complete" + fi +} + +client_demo_banner() { + local session_sql="${1:?session sql}" + local attach_uri="${2:?attach uri}" + if [[ "${QUACKTAIL_ENABLE_DUCKLAKE:-0}" == "1" ]] \ + && grep -q 'attach_ducklake' "$session_sql" 2>/dev/null; then + echo "→ join tailnet, forward, attach_ducklake, ATTACH ${attach_uri} ..." + else + echo "→ join tailnet, tailscale_ping ${SERVER_HOST}:${PORT}, quack_query, ATTACH ${attach_uri} ..." 
+ fi +} + +quacktail_require_attach_ducklake() { + [[ "${QUACKTAIL_REQUIRE_ATTACH_DUCKLAKE:-0}" == "1" ]] || return 0 + [[ "${QUACKTAIL_ENABLE_DUCKLAKE:-0}" == "1" ]] || return 0 + quacktail_has_quackscale_function attach_ducklake && return 0 + echo "error: attach_ducklake required but not in this image" >&2 + echo "Rebuild: cd examples && docker compose build --no-cache quacktail-client" >&2 + exit 1 +} + run_client() { local session_sql="${WORK}/client_session.sql" local out="${WORK}/client.out" - local demo_timeout="${QUACKTAIL_DEMO_TIMEOUT_SEC:-90}" - local max_attempts="${QUACKTAIL_CLIENT_ATTEMPTS:-15}" + local demo_timeout="${QUACKTAIL_DEMO_TIMEOUT_SEC:-60}" + local max_attempts="${QUACKTAIL_CLIENT_ATTEMPTS:-3}" local poll_sec="${QUACKTAIL_CLIENT_POLL_SEC:-2}" local attach_uri local duckdb_rc=0 local attempt + trap 'quacktail_client_on_signal INT' INT + trap 'quacktail_client_on_signal TERM' TERM + + if [[ "$QUIET" == "1" ]]; then + echo "→ preparing client (tailnet wait, extensions, session SQL) ..." + fi + wait_for_tailnet_server ensure_quack + quacktail_require_attach_ducklake ensure_server_hosts_mapping ensure_client_sql attach_uri="$(client_attach_uri)" @@ -222,7 +355,7 @@ run_client() { echo "" echo "QuackTail cluster demo" echo "======================" - echo "→ join tailnet, tailscale_ping ${SERVER_HOST}:${PORT}, quack_query, ATTACH ${attach_uri} ..." + client_demo_banner "$session_sql" "$attach_uri" echo "" else echo "=== client session SQL (-f) ===" @@ -233,37 +366,47 @@ run_client() { duckdb_rc=0 run_duckdb_client_session "$session_sql" "$out" "$demo_timeout" \ || duckdb_rc=$? 
- if [[ "$duckdb_rc" -eq 0 ]] && grep -q "PASSED" "$out" 2>/dev/null; then + if quacktail_is_signal_rc "$duckdb_rc"; then + exit "$duckdb_rc" + fi + if quacktail_client_has_fatal_sql_error "$out"; then + echo "error: non-retryable SQL failure in client session" >&2 + quacktail_dump_client_failure + exit 1 + fi + if quacktail_client_session_succeeded "$out"; then + duckdb_rc=0 break fi if (( attempt < max_attempts )); then [[ "$QUIET" == "1" ]] && echo "→ retry ${attempt}/${max_attempts} ..." quacktail_dump_client_failure - sleep "$poll_sec" + sleep "$poll_sec" || exit 130 fi done - if [[ "$duckdb_rc" -ne 0 ]]; then - echo "error: client demo failed (exit ${duckdb_rc})" >&2 - quacktail_dump_client_failure - exit 1 - fi - - if ! grep -q "PASSED" "$out" 2>/dev/null; then - echo "error: expected PASSED row missing after ${max_attempts} attempts" >&2 + if ! quacktail_client_session_succeeded "$out"; then + echo "error: client demo failed after ${max_attempts} attempt(s) (exit ${duckdb_rc})" >&2 quacktail_dump_client_failure exit 1 fi if [[ "$QUIET" == "1" ]]; then - echo "✓ Demo passed — two-node QuackTail cluster is working" + quacktail_show_client_demo_output "$out" + echo "" + if [[ "${QUACKTAIL_ENABLE_DUCKLAKE:-0}" == "1" ]]; then + echo "✓ Demo passed — QuackTail cluster + DuckLake over tailnet" + else + echo "✓ Demo passed — two-node QuackTail cluster is working" + fi else - echo "ok: client e2e passed (PASSED row present)" + echo "ok: client e2e passed (CLIENT_DEMO_DONE)" fi } case "$ROLE" in server) run_server ;; client) run_client ;; - *) echo "error: unknown QUACKTAIL_ROLE '$ROLE'" >&2; exit 1 ;; + bootstrap) run_bootstrap ;; + *) echo "error: unknown QUACKTAIL_ROLE '$ROLE' (use server, client, or bootstrap)" >&2; exit 1 ;; esac diff --git a/scripts/e2e/quacktail-verify-image.sh b/scripts/e2e/quacktail-verify-image.sh new file mode 100644 index 0000000..37a4b21 --- /dev/null +++ b/scripts/e2e/quacktail-verify-image.sh @@ -0,0 +1,27 @@ +#!/usr/bin/env bash +# 
Verify required quackscale functions are present in the container image. +set -euo pipefail + +DUCKDB_BIN="${DUCKDB_BIN:-/usr/local/bin/duckdb}" +EXT_DIR="${DUCKDB_EXTENSION_DIRECTORY:-/duckdb_extensions}" + +# shellcheck source=/dev/null +source /usr/local/lib/quacktail_ext.sh + +require_fn() { + local fn="$1" + if ! quacktail_has_quackscale_function "$fn"; then + echo "error: quackscale missing required function: ${fn}" >&2 + echo "extension_directory=${EXT_DIR}" >&2 + ls -la "$EXT_DIR" "$EXT_DIR/quackscale" 2>/dev/null >&2 || true + echo "registered quackscale functions:" >&2 + quacktail_list_quackscale_functions >&2 || true + [[ -f /etc/quacktail/build-info ]] && cat /etc/quacktail/build-info >&2 + exit 1 + fi +} + +require_fn attach_ducklake +require_fn tailscale_down +require_fn tailscale_quack_forward +echo "ok: quackscale image verify (attach_ducklake, tailscale_down, tailscale_quack_forward)" diff --git a/scripts/lib/quacktail_ext.sh b/scripts/lib/quacktail_ext.sh index 8aa7570..5206ac2 100755 --- a/scripts/lib/quacktail_ext.sh +++ b/scripts/lib/quacktail_ext.sh @@ -12,6 +12,38 @@ quacktail_ext_container_dir() { echo "${QUACKTAIL_CONTAINER_EXT_DIR:-/duckdb_extensions}" } +quacktail_has_quackscale_function() { + local fn="${1:?function name required}" + local duckdb_bin="${DUCKDB_BIN:-/usr/local/bin/duckdb}" + local ext_dir="${DUCKDB_EXTENSION_DIRECTORY:-$(quacktail_ext_container_dir)}" + local out count + [[ -x "$duckdb_bin" ]] || return 1 + out="$("$duckdb_bin" :memory: -batch -csv -noheader -c \ + "SET extension_directory='${ext_dir}'; LOAD quackscale; \ + SELECT CAST(COUNT(*) AS VARCHAR) FROM duckdb_functions() WHERE function_name='${fn}';" \ + 2>&1)" || true + count="$(printf '%s\n' "$out" | tail -1 | tr -d '[:space:]')" + [[ "$count" == "1" ]] && return 0 + out="$("$duckdb_bin" :memory: -batch -csv -noheader -c \ + "LOAD quackscale; SELECT CAST(COUNT(*) AS VARCHAR) FROM duckdb_functions() WHERE function_name='${fn}';" \ + 2>&1)" || true + 
count="$(printf '%s\n' "$out" | tail -1 | tr -d '[:space:]')" + [[ "$count" == "1" ]] +} + +quacktail_list_quackscale_functions() { + local duckdb_bin="${DUCKDB_BIN:-/usr/local/bin/duckdb}" + local ext_dir="${DUCKDB_EXTENSION_DIRECTORY:-$(quacktail_ext_container_dir)}" + [[ -x "$duckdb_bin" ]] || return 1 + "$duckdb_bin" :memory: -batch -csv -noheader -c \ + "SET extension_directory='${ext_dir}'; LOAD quackscale; \ + SELECT function_name FROM duckdb_functions() \ + WHERE function_name LIKE 'tailscale_%' \ + OR function_name LIKE 'attach_%' \ + OR function_name IN ('quack_uri', 'quack_token', 'quack_discover') \ + ORDER BY 1;" +} + quacktail_ext_verify_artifact() { local install_path="${1:?install path}" if [[ -f "$install_path" ]]; then @@ -74,6 +106,60 @@ quacktail_ci_ensure_quack() { "${set_ext} LOAD quack; SELECT extension_name, loaded, install_path FROM duckdb_extensions() WHERE extension_name='quack';" } +# Install/load ducklake (core, then core_nightly). +quacktail_ci_ensure_ducklake() { + local duckdb_bin="${1:?duckdb binary}" + local ext_dir="${2:-}" + local mode="${3:-install}" + + if [[ -z "$ext_dir" ]]; then + ext_dir="$(quacktail_ext_container_dir)" + fi + mkdir -p "$ext_dir" + + local set_ext + set_ext="$(quacktail_ext_sql_set "$ext_dir")" + + if [[ "$mode" == "load_only" ]]; then + if ! "$duckdb_bin" :memory: -batch -c "${set_ext} LOAD ducklake; SELECT 1;" >/dev/null; then + echo "error: ducklake not available at ${ext_dir}" >&2 + return 1 + fi + elif ! "$duckdb_bin" :memory: -batch -c "${set_ext} LOAD ducklake; SELECT 1;" >/dev/null; then + echo "Installing ducklake (core, then core_nightly) into ${ext_dir} ..." + if ! 
"$duckdb_bin" :memory: -batch -c "${set_ext} INSTALL ducklake FROM core; LOAD ducklake; SELECT 1;"; then + "$duckdb_bin" :memory: -batch -c "${set_ext} INSTALL ducklake FROM core_nightly; LOAD ducklake; SELECT 1;" + fi + fi + + if [[ "${QUACKTAIL_QUIET:-}" == "1" ]]; then + return 0 + fi + + echo "=== ducklake extension (${mode}) ===" + "$duckdb_bin" :memory: -batch -echo -c \ + "${set_ext} LOAD ducklake; SELECT extension_name, loaded, install_path FROM duckdb_extensions() WHERE extension_name='ducklake';" +} + +quacktail_ci_ensure_demo_extensions() { + local duckdb_bin="${1:?duckdb binary}" + local ext_dir="${2:-}" + local mode="${3:-install}" + quacktail_ci_ensure_quack "$duckdb_bin" "$ext_dir" "$mode" + if [[ "${QUACKTAIL_ENABLE_DUCKLAKE:-1}" == "1" ]]; then + quacktail_ci_ensure_ducklake "$duckdb_bin" "$ext_dir" "$mode" + fi +} + +quacktail_ext_sql_load_demo() { + local ext_dir="${1:?extension directory required}" + echo "$(quacktail_ext_sql_set "$ext_dir")" + echo "LOAD quack;" + if [[ "${QUACKTAIL_ENABLE_DUCKLAKE:-1}" == "1" ]]; then + echo "LOAD ducklake;" + fi +} + # Server init finished: explicit marker and/or quack_serve + tailscale_serve_local output in server.log. 
quacktail_server_log_ready() { local log="${1:?server.log path required}" diff --git a/src/attach_ducklake.cpp b/src/attach_ducklake.cpp new file mode 100644 index 0000000..971a179 --- /dev/null +++ b/src/attach_ducklake.cpp @@ -0,0 +1,180 @@ +#include "attach_ducklake.hpp" + +#include "duckdb/common/exception.hpp" +#include "duckdb/common/string_util.hpp" +#include "duckdb/function/table_function.hpp" +#include "duckdb/main/client_context.hpp" +#include "duckdb/main/connection.hpp" +#include "duckdb/main/materialized_query_result.hpp" + +#include <regex> + +namespace duckdb { + +namespace { + +static constexpr const char *kIdentPattern = "^[A-Za-z_][A-Za-z0-9_]*$"; + +static void ValidateIdentifier(const string &name, const char *label) { + std::regex re(kIdentPattern); + if (!std::regex_match(name, re)) { + throw InvalidInputException("%s must match %s (got '%s')", label, kIdentPattern, name); + } +} + +static string EscapeSqlString(const string &value) { + return StringUtil::Replace(value, "'", "''"); +} + +static string BuildQuackQueryFromClause(const string &quack_uri, const string &remote_sql, const string &token, + bool disable_ssl) { + string sql = "FROM quack_query('" + EscapeSqlString(quack_uri) + "', '" + EscapeSqlString(remote_sql) + "'"; + if (!token.empty()) { + sql += ", token => '" + EscapeSqlString(token) + "'"; + } + if (disable_ssl) { + sql += ", disable_ssl => true"; + } + sql += ")"; + return sql; +} + +static void EnsureQuackLoaded(Connection &conn) { + auto result = conn.Query("SELECT COUNT(*) FROM duckdb_functions() WHERE function_name = 'quack_query'"); + if (result->HasError()) { + throw InvalidInputException("attach_ducklake requires LOAD quack; %s", result->GetError()); + } + auto count = result->GetValue(0, 0).GetValue<int64_t>(); + if (count == 0) { + throw InvalidInputException("attach_ducklake requires LOAD quack (quack_query not registered)"); + } +} + +static void RunStatement(Connection &conn, const string &sql) { + auto result = conn.Query(sql); 
+ if (result->HasError()) { + throw InvalidInputException("attach_ducklake failed: %s\nStatement: %s", result->GetError(), sql); + } +} + +struct RemoteLakeAttachBindData : public TableFunctionData { + string quack_uri; + string remote_catalog; + string alias; + string token; + bool disable_ssl = true; + bool finished = false; + vector<string> created_views; +}; + +static unique_ptr<FunctionData> RemoteLakeAttachBind(ClientContext &context, TableFunctionBindInput &input, + vector<LogicalType> &return_types, vector<string> &names) { + if (input.inputs.empty() || input.inputs[0].IsNull()) { + throw InvalidInputException("attach_ducklake requires quack_uri"); + } + + auto bind = make_uniq<RemoteLakeAttachBindData>(); + bind->quack_uri = input.inputs[0].GetValue<string>(); + + auto catalog_it = input.named_parameters.find("remote_catalog"); + if (catalog_it != input.named_parameters.end()) { + bind->remote_catalog = catalog_it->second.GetValue<string>(); + } else { + bind->remote_catalog = "lake"; + } + auto alias_it = input.named_parameters.find("alias"); + if (alias_it != input.named_parameters.end()) { + bind->alias = alias_it->second.GetValue<string>(); + } else { + bind->alias = bind->remote_catalog; + } + auto token_it = input.named_parameters.find("token"); + if (token_it != input.named_parameters.end() && !token_it->second.IsNull()) { + bind->token = token_it->second.GetValue<string>(); + } + auto ssl_it = input.named_parameters.find("disable_ssl"); + if (ssl_it != input.named_parameters.end()) { + bind->disable_ssl = ssl_it->second.GetValue<bool>(); + } + + ValidateIdentifier(bind->remote_catalog, "remote_catalog"); + ValidateIdentifier(bind->alias, "alias"); + + return_types = {LogicalType::VARCHAR, LogicalType::VARCHAR, LogicalType::VARCHAR}; + names = {"local_view", "remote_table", "status"}; + return std::move(bind); +} + +static void RemoteLakeAttachFunction(ClientContext &context, TableFunctionInput &data_p, DataChunk &output) { + auto &bind = data_p.bind_data->CastNoConst<RemoteLakeAttachBindData>(); + if (bind.finished) { + return; + } + + Connection conn(*context.db); 
EnsureQuackLoaded(conn); + + RunStatement(conn, "CREATE SCHEMA IF NOT EXISTS " + bind.alias); + + const string list_sql = StringUtil::Format( + "SELECT table_name FROM duckdb_tables() WHERE database_name = '%s' ORDER BY table_name", + EscapeSqlString(bind.remote_catalog)); + const auto list_from = BuildQuackQueryFromClause(bind.quack_uri, list_sql, bind.token, bind.disable_ssl); + + auto tables = conn.Query(list_from); + if (tables->HasError()) { + throw InvalidInputException("attach_ducklake: could not list remote tables: %s", + tables->GetError()); + } + + idx_t row_count = 0; + for (idx_t row = 0; row < tables->RowCount(); row++) { + auto table_name = tables->GetValue(0, row).ToString(); + if (table_name.empty()) { + continue; + } + ValidateIdentifier(table_name, "remote table name"); + + const string remote_select = + StringUtil::Format("SELECT * FROM %s.%s", bind.remote_catalog, table_name); + const string view_sql = StringUtil::Format( + "CREATE OR REPLACE VIEW %s.%s AS %s", bind.alias, table_name, + BuildQuackQueryFromClause(bind.quack_uri, remote_select, bind.token, bind.disable_ssl)); + + RunStatement(conn, view_sql); + bind.created_views.push_back(bind.alias + "." + table_name); + row_count++; + } + + if (row_count == 0) { + throw InvalidInputException( + "attach_ducklake: no tables found in remote catalog '%s' (is DuckLake attached on the server?)", + bind.remote_catalog); + } + + output.SetCardinality(row_count); + for (idx_t row = 0; row < row_count; row++) { + const auto &view_name = bind.created_views[row]; + const auto dot = view_name.find('.'); + const string table_only = dot == string::npos ? 
view_name : view_name.substr(dot + 1); + output.SetValue(0, row, Value(view_name)); + output.SetValue(1, row, Value(StringUtil::Format("%s.%s", bind.remote_catalog, table_only))); + output.SetValue(2, row, Value("created")); + } + + bind.finished = true; +} + +} // namespace + +void RegisterAttachDucklakeFunctions(ExtensionLoader &loader) { + TableFunction attach("attach_ducklake", {LogicalType::VARCHAR}, RemoteLakeAttachFunction, + RemoteLakeAttachBind); + attach.named_parameters["remote_catalog"] = LogicalType::VARCHAR; + attach.named_parameters["alias"] = LogicalType::VARCHAR; + attach.named_parameters["token"] = LogicalType::VARCHAR; + attach.named_parameters["disable_ssl"] = LogicalType::BOOLEAN; + loader.RegisterFunction(attach); +} + +} // namespace duckdb diff --git a/src/include/attach_ducklake.hpp b/src/include/attach_ducklake.hpp new file mode 100644 index 0000000..8281719 --- /dev/null +++ b/src/include/attach_ducklake.hpp @@ -0,0 +1,9 @@ +#pragma once + +#include "duckdb/main/extension/extension_loader.hpp" + +namespace duckdb { + +void RegisterAttachDucklakeFunctions(ExtensionLoader &loader); + +} // namespace duckdb diff --git a/src/include/tailscale_bridge.hpp b/src/include/tailscale_bridge.hpp index 4a061b0..367982c 100644 --- a/src/include/tailscale_bridge.hpp +++ b/src/include/tailscale_bridge.hpp @@ -97,6 +97,7 @@ class TailscaleBridge { void RefreshIPs(); string LastErrorMessage() const; void JoinLoginThread(); + void DetachLoginThread(); string ResolveAuthKey(const string &authkey) const; void MaybeStartLoopbackProxy(bool enable); void StartLoopbackProxy(); diff --git a/src/quackscale_extension.cpp b/src/quackscale_extension.cpp index f9003b7..711c0ee 100644 --- a/src/quackscale_extension.cpp +++ b/src/quackscale_extension.cpp @@ -2,6 +2,7 @@ #include "quackscale_extension.hpp" #include "quackscale_defaults.hpp" +#include "attach_ducklake.hpp" #include "tailscale_bridge.hpp" #include "duckdb.hpp" @@ -130,6 +131,28 @@ static void 
QuackscaleStatusFunction(ClientContext &context, TableFunctionInput bind.finished = true; } +struct QuackscaleDownBindData : public TableFunctionData { + bool finished = false; +}; + +static unique_ptr<FunctionData> QuackscaleDownBind(ClientContext &context, TableFunctionBindInput &input, + vector<LogicalType> &return_types, vector<string> &names) { + return_types = {LogicalType::BOOLEAN}; + names = {"shutdown_ok"}; + return make_uniq<QuackscaleDownBindData>(); +} + +static void QuackscaleDownFunction(ClientContext &context, TableFunctionInput &data_p, DataChunk &output) { + auto &bind = data_p.bind_data->CastNoConst<QuackscaleDownBindData>(); + if (bind.finished) { + return; + } + TailscaleBridge::Get().Shutdown(); + output.SetCardinality(1); + output.SetValue(0, 0, Value::BOOLEAN(true)); + bind.finished = true; +} + static void QuackscaleQuackUriFunction(DataChunk &args, ExpressionState &state, Vector &result) { auto uri = TailscaleBridge::Get().QuackListenURI(QUACKSCALE_DEFAULT_QUACK_PORT); result.Reference(Value(uri)); @@ -460,6 +483,9 @@ static void LoadInternal(ExtensionLoader &loader) { RegisterAuthParameters(up_function); loader.RegisterFunction(up_function); + TableFunction down_function("tailscale_down", {}, QuackscaleDownFunction, QuackscaleDownBind); + loader.RegisterFunction(down_function); + TableFunction login_function("tailscale_login", {}, QuackscaleBeginLoginFunction, QuackscaleBeginLoginBind); RegisterAuthParameters(login_function); @@ -501,6 +527,8 @@ static void LoadInternal(ExtensionLoader &loader) { loader.RegisterFunction(ScalarFunction("quack_uri", {}, LogicalType::VARCHAR, QuackscaleQuackUriFunction)); loader.RegisterFunction(ScalarFunction("quack_token", {}, LogicalType::VARCHAR, QuackTokenFunction)); + + RegisterAttachDucklakeFunctions(loader); } } // namespace diff --git a/src/tailscale_bridge.cpp b/src/tailscale_bridge.cpp index 470b8c5..551dc8c 100644 --- a/src/tailscale_bridge.cpp +++ b/src/tailscale_bridge.cpp @@ -137,6 +137,12 @@ void TailscaleBridge::JoinLoginThread() { } } +void 
TailscaleBridge::DetachLoginThread() { + if (login_thread.joinable()) { + login_thread.detach(); + } +} + TailscaleStatus TailscaleBridge::Status() const { TailscaleStatus status; status.linked = @@ -386,18 +392,21 @@ TailscaleLoginStatus TailscaleBridge::LoginStatus() const { void TailscaleBridge::Shutdown() { std::lock_guard guard(g_tailscale_mutex); log_capture.Stop(); - JoinLoginThread(); + // Do not join — interactive login or tsnet teardown can block indefinitely. + DetachLoginThread(); forwarder.Stop(); ClearProxyEnvironment(); #ifdef QUACKSCALE_WITH_TAILSCALE - if (handle >= 0) { - tailscale_clear_serve(handle); - tailscale_close(handle); - handle = -1; + int closing = handle; + handle = -1; + running = false; + if (closing >= 0) { + tailscale_clear_serve(closing); + // tailscale_close waits for AuthLoop; detach so CALL tailscale_down() returns. + std::thread([closing]() { tailscale_close(closing); }).detach(); } #else #endif - running = false; ips.clear(); login_state = "idle"; login_message.clear(); diff --git a/test/e2e/README.md b/test/e2e/README.md index 10925db..83d2b56 100644 --- a/test/e2e/README.md +++ b/test/e2e/README.md @@ -2,55 +2,72 @@ Integration tests for a two-node QuackTail cluster over [Headscale](https://github.com/juanfont/headscale). 
-## Where tests live +## CI e2e (release binary, manual only) -| Test | How to run | -|------|------------| -| **Docker Compose demo** | [examples/README.md](../../examples/README.md) — `docker compose --profile test run --rm quacktail-client` | -| **GitHub Actions e2e** | [`.github/workflows/headscale-e2e.yml`](../../.github/workflows/headscale-e2e.yml) — manual `workflow_dispatch` | -| **Host script** | [`scripts/ci_headscale_e2e.sh`](../../scripts/ci_headscale_e2e.sh) — concurrent server + client containers | -| **Local host DuckDB** | [`scripts/local_remote_headscale_test.sh`](../../scripts/local_remote_headscale_test.sh) — join a running compose stack from the host | +GitHub Actions: [`.github/workflows/headscale-e2e.yml`](../../.github/workflows/headscale-e2e.yml) -All paths share the same client SQL shape (see [`scripts/lib/headscale_ci.sh`](../../scripts/lib/headscale_ci.sh) `headscale_ci_sql_client_session` and [`scripts/e2e/quacktail-compose-bootstrap.sh`](../../scripts/e2e/quacktail-compose-bootstrap.sh) `write_client_session_sql`). +- **Trigger:** `workflow_dispatch` only (never on push/PR) +- **DuckDB:** pre-built from a [GitHub release](https://github.com/quackscience/duckdb-quackscale/releases) via `scripts/ci_download_release_duckdb.sh` (default tag `v1.0.2`, or `latest`) +- **Runner:** `scripts/ci_headscale_e2e.sh` — Headscale + concurrent server/client containers with the release `duckdb` bind-mounted (minimal `test/e2e/Dockerfile.quacktail`, **no compile in CI**) -## Server (`loopback_serve`) +```bash +# Local equivalent (after downloading a release binary): +export DUCKDB=/path/to/release/duckdb +chmod +x scripts/ci_headscale_e2e.sh +./scripts/ci_headscale_e2e.sh +``` -Quack binds loopback; `tailscale_serve_local` publishes port 9494 on the tailnet: +Expect `PASSED`, client `insert-from-client`, and server `seed-from-server` in client logs. 
-```sql -CALL quack_serve('quack:127.0.0.1:9494', allow_other_hostname => true, token => quack_token()); -CALL tailscale_serve_local(port => 9494); +Release binaries may not include newer SQL helpers (`attach_ducklake`, `tailscale_down`) — the release e2e validates **Quack over tailnet** (`tailscale_quack_forward`, `ATTACH`, DML), not the full DuckLake compose demo. + +## Local compose e2e (source build — not CI) + +For the full DuckLake + `attach_ducklake` demo (builds DuckDB in Docker): + +```bash +git submodule update --init --recursive +chmod +x scripts/ci_compose_e2e.sh +./scripts/ci_compose_e2e.sh ``` -Healthcheck: `/work/server.log` contains `quack:127.0.0.1:9494` and `local_forward`. +Same as [examples/README.md](../../examples/README.md). Use this on a dev machine; **do not** wire it to push/PR workflows. + +## PR / push CI (not e2e) -## Client (one DuckDB session, no curl) +| Workflow | Trigger | Builds DuckDB? | +|----------|---------|----------------| +| [headscale-integration.yml](../../.github/workflows/headscale-integration.yml) | PR | Yes — smoke test only | +| [libtailscale-integration.yml](../../.github/workflows/libtailscale-integration.yml) | PR | Go tests | +| [MainDistributionPipeline.yml](../../.github/workflows/MainDistributionPipeline.yml) | PR / release | Extension CI | + +## Client session (release e2e) + +Generated by [`scripts/lib/headscale_ci.sh`](../../scripts/lib/headscale_ci.sh) `headscale_ci_sql_client_session`: ```sql LOAD quackscale; CALL tailscale_up(...); CALL tailscale_quack_forward(host => 'quacktail-server', port => 9494, local_port => 19494); -CALL tailscale_ping(host => 'quacktail-server', port => 9494); -LOAD quack; -CREATE SECRET (TYPE quack, TOKEN '…', SCOPE 'quack:127.0.0.1:19494'); -FROM quack_query('quack:127.0.0.1:19494', 'SELECT 1 AS probe', ...); +CALL tailscale_ping(...); +LOAD quack; CREATE SECRET ...; +FROM quack_query(..., 'SELECT 1 AS probe', ...); ATTACH 'quack:127.0.0.1:19494' AS remote (TYPE quack); 
-SELECT * FROM remote.e2e_payload LIMIT 5; -SELECT 'PASSED' AS status, ... FROM remote.e2e_payload; +INSERT INTO remote.e2e_payload ...; +SELECT 'PASSED' ... FROM remote.e2e_payload; ``` -Invoked as: `duckdb -batch -echo -f client_session.sql` (in-memory; no `-bail` / `-init` file DB). +## Compose demo client session (source / local) -Compose waits for `quacktail-server` **healthy** before starting the client. The client retries the full session until a `PASSED` row appears. +See [`scripts/e2e/quacktail-compose-bootstrap.sh`](../../scripts/e2e/quacktail-compose-bootstrap.sh) — adds DuckLake, `attach_ducklake`, `CLIENT_DEMO_DONE`, `tailscale_down`. -## CI workflow - -[`headscale-e2e.yml`](../../.github/workflows/headscale-e2e.yml): +## Server (`loopback_serve`) -1. Download release binary (`v1.0.2` by default) -2. Start Headscale in Docker -3. Run `scripts/ci_headscale_e2e.sh` (server container + client container) +```sql +CALL quack_serve('quack:127.0.0.1:9494', allow_other_hostname => true, token => quack_token()); +CALL tailscale_serve_local(port => 9494); +``` ## Debug probe -[`examples/docker-compose.yml`](../../examples/docker-compose.yml) profile `debug`: vanilla `tailscale/tailscale` container — isolates tailnet connectivity from DuckDB tsnet. +[examples/docker-compose.yml](../../examples/docker-compose.yml) profile `debug`: vanilla `tailscale/tailscale` container. diff --git a/test/sql/quackscale.test b/test/sql/quackscale.test index fad793b..5d1e775 100644 --- a/test/sql/quackscale.test +++ b/test/sql/quackscale.test @@ -51,3 +51,8 @@ statement error SELECT quack_token(); ---- quack_token(): set QUACK_TAILNET_TOKEN + +statement error +CALL attach_ducklake('quack:127.0.0.1:19494'); +---- +attach_ducklake requires LOAD quack