Add ggsql WebAssembly library#126

Merged
georgestagg merged 9 commits into main from ggsql-wasm
Feb 18, 2026

Conversation

@georgestagg
Collaborator

@georgestagg georgestagg commented Feb 17, 2026

This PR adds a WebAssembly library in ggsql-wasm, with bindings for ggsql. When built using wasm-pack, this produces an npm package that can be used to load ggsql in a web browser.

A GitHub Actions script has been added to build the Wasm binary, but for now it serves mainly as a build test. The result is not deployed anywhere yet. A demo website and Quarto extension using the Wasm binary will come in a future PR.

Some features are disabled in the Wasm build. Notably, we disable building duckdb for the moment. We also add a feature "builtin-data" and turn it off in Wasm to avoid bundling the ggsql sample datasets. In a future PR I'd like to add a feature where such data is fetched from the web on demand; this is a TODO item for now.
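To make the feature gating concrete, here is a minimal sketch of how a Cargo feature can include or exclude bundled data at compile time. Only the feature name "builtin-data" comes from this PR; the function names and the dataset list are hypothetical and are not taken from the ggsql source.

```rust
// Hypothetical sketch: function names and the dataset list are illustrative.
// Only the feature name "builtin-data" comes from the PR description.

/// Names of the sample datasets compiled into the binary.
#[cfg(feature = "builtin-data")]
pub fn builtin_dataset_names() -> Vec<&'static str> {
    // In a native build the sample data would be embedded alongside this list.
    vec!["penguins"]
}

/// With the feature turned off (as in the Wasm build), nothing is bundled.
#[cfg(not(feature = "builtin-data"))]
pub fn builtin_dataset_names() -> Vec<&'static str> {
    Vec::new()
}

/// Report whether a dataset is available without fetching anything.
pub fn has_builtin_dataset(name: &str) -> bool {
    builtin_dataset_names().iter().any(|n| *n == name)
}
```

With this shape, callers do not need their own cfg checks: they ask has_builtin_dataset() and fall back to another source (for example, the on-demand fetch mentioned above) when it returns false.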

For now, we'll rely on Polars SQL in Wasm, disabling some problematic features of Polars so that it can be compiled there. Polars SQL does not support everything duckdb does, so some SQL queries have been rewritten with this in mind. I am mostly confident in these changes, but the boxplot queries were rewritten entirely by Claude and should be reviewed extra carefully.

As part of the above changes, sample data is now added under names such as __ggsql__data__penguins__, because Polars does not accept the : in the previous names. Queries are correspondingly rewritten before execution to handle this mapping.
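The mapping described above could look roughly like the sketch below. It is an assumption-laden illustration, not the code in this PR: the user-facing prefix "ggsql:" is hypothetical (the description only says the original names contain a :), the function names are made up, and the rewrite is a plain text scan that ignores string literals and quoted identifiers.

```rust
// Assumption: the user-facing prefix "ggsql:" is hypothetical. The PR only
// states that the original dataset names contain a ':' that Polars rejects.
const USER_PREFIX: &str = "ggsql:";

/// Build the internal table name, e.g. "penguins" -> "__ggsql__data__penguins__".
pub fn internal_table_name(dataset: &str) -> String {
    format!("__ggsql__data__{}__", dataset)
}

/// Replace every `ggsql:<name>` reference in a query with its internal name.
/// This is a naive scan: it does not skip string literals or quoted identifiers.
pub fn rewrite_query(query: &str) -> String {
    let mut out = String::with_capacity(query.len());
    let mut rest = query;
    while let Some(pos) = rest.find(USER_PREFIX) {
        // Copy everything before the prefix unchanged.
        out.push_str(&rest[..pos]);
        let after = &rest[pos + USER_PREFIX.len()..];
        // The dataset name runs while characters are ASCII alphanumeric or '_'.
        let end = after
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(after.len());
        out.push_str(&internal_table_name(&after[..end]));
        rest = &after[end..];
    }
    out.push_str(rest);
    out
}
```

A real implementation would more likely rewrite table references on the parsed query rather than on raw text, so that a : inside a string literal is left alone.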

In some places, like when materialising CTEs, we want to add internal temporary tables. In Polars this needs to be done by registration, rather than by executing DDL SQL queries. So, in some cases a CREATE TEMP TABLE has been replaced by the reader register() mechanism. To support overwriting existing tables, an argument replace has been added to register(). In the case of the duckdb reader, this extra indirection should not have much of an effect and should eventually lead to similar SQL queries being executed, but the indirection is required for Polars SQL.
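As an illustration of the replace semantics, here is a minimal sketch of a reader trait whose register() takes a replace flag. The trait shape, the error type, and the in-memory table type are all assumptions made for this example and do not reflect the actual ggsql Reader signature.

```rust
use std::collections::HashMap;

// Hypothetical sketch: the trait, the String error type, and the in-memory
// "table" type are stand-ins, not the actual ggsql Reader API.
type Table = Vec<Vec<String>>;

pub trait Reader {
    /// Register `table` under `name`. If the name is already taken, overwrite
    /// it only when `replace` is true; otherwise return an error.
    fn register(&mut self, name: &str, table: Table, replace: bool) -> Result<(), String>;
}

#[derive(Default)]
pub struct InMemoryReader {
    tables: HashMap<String, Table>,
}

impl Reader for InMemoryReader {
    fn register(&mut self, name: &str, table: Table, replace: bool) -> Result<(), String> {
        if !replace && self.tables.contains_key(name) {
            return Err(format!("table '{}' is already registered", name));
        }
        // Insert a new table, or overwrite the existing one when replace is true.
        self.tables.insert(name.to_string(), table);
        Ok(())
    }
}

impl InMemoryReader {
    /// Number of rows in a registered table, if it exists.
    pub fn row_count(&self, name: &str) -> Option<usize> {
        self.tables.get(name).map(|t| t.len())
    }
}
```

A SQL-backed reader could implement the same method by issuing a CREATE OR REPLACE style statement internally, which is why the indirection costs little there while still giving a registration-only backend a way to take part.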

Finally, a minimal Wasm C sysroot has been added to the tree-sitter bindings for use with Wasm. These files have been taken directly from the tree-sitter source tree, and are MIT licensed, so we should be OK to use them verbatim. My only change to the sources is to increase the maximum heap size for our Wasm malloc.

EDIT: @cpsievert With this change, register() becomes a strictly required method for Readers and Python custom readers. It is required for ggsql to work correctly, because we need to create temporary tables during execution. As such, I have removed the supports_register boolean and updated the Python bindings accordingly. It's probably worth taking a quick look before we merge.

Collaborator

@thomasp85 thomasp85 left a comment

AFAICT this LGTM

I like the gradual refactoring of the reader side we have seen

@georgestagg georgestagg merged commit 871a2b1 into main Feb 18, 2026
30 checks passed
@georgestagg georgestagg deleted the ggsql-wasm branch February 18, 2026 11:39

3 participants