37 changes: 37 additions & 0 deletions .github/workflows/ci.yml
@@ -388,6 +388,43 @@ jobs:
            exit 1
          fi

  gfql-pyodide-browser:
    needs: changes
    if: ${{ needs.changes.outputs.docs == 'true' || needs.changes.outputs.gfql == 'true' || needs.changes.outputs.infra == 'true' || github.event_name == 'workflow_dispatch' || github.event_name == 'schedule' }}
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false

      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Set up Node.js 20
        uses: actions/setup-node@v4
        with:
          node-version: "20"

      - name: Install uv
        run: python -m pip install "uv==0.11.3"

      - name: Install browser test dependencies
        run: npm install --prefix demos/gfql/pyodide --no-audit --no-fund

      - name: Install Chromium
        run: npm exec --prefix demos/gfql/pyodide -- playwright install --with-deps chromium

      - name: Build GFQL Pyodide CDN bundle
        run: node demos/gfql/pyodide/build-bundle.mjs /tmp/pygraphistry-gfql-pyodide-browser --flavor cdn

      - name: Browser smoke
        env:
          GFQL_BROWSER_SCREENSHOT: /tmp/gfql-pyodide-browser.png
        run: node /tmp/pygraphistry-gfql-pyodide-browser/test-browser.mjs /tmp/pygraphistry-gfql-pyodide-browser

  check-spark-lockfile:
    needs: changes
    if: ${{ needs.changes.outputs.spark == 'true' || needs.changes.outputs.infra == 'true' || github.event_name == 'workflow_dispatch' || github.event_name == 'schedule' }}
2 changes: 2 additions & 0 deletions .readthedocs.yml
@@ -9,6 +9,7 @@ build:
  os: ubuntu-22.04
  tools:
    python: "3.12"
    nodejs: "20"
  apt_packages:
    # System dependencies - now works because we use jobs instead of commands
    # More closely mirror https://github.com/sphinx-doc/sphinx-docker-images
@@ -42,6 +43,7 @@ build:
      - cp DEVELOP.md docs/source/DEVELOP.md
    build:
      html:
        - node demos/gfql/pyodide/build-bundle.mjs --docs-static --flavor cdn
        - sphinx-build -b html -d docs/doctrees docs/source $READTHEDOCS_OUTPUT/html/
      epub:
        - sphinx-build -b epub -d docs/doctrees docs/source docs/_build/epub
90 changes: 90 additions & 0 deletions demos/gfql/pyodide/README.md
@@ -0,0 +1,90 @@
# Pyodide GFQL proof

This is a small `gfql.js` proof of concept for running PyGraphistry GFQL inside Pyodide.

It uses:

- Pyodide `pandas`, `requests`, `packaging`, and `typing-extensions` packages.
- `micropip` for the pure Python `lark` runtime dependency used by the Cypher parser.
- A pure Python wheel for this repo, installed into Pyodide with `deps=False` after the runtime deps are already present.

For a wheel served from a URL in the browser, `gfql.js` calls `micropip.install(url, deps=False)`. For a local wheel passed as bytes under Node, it writes the wheel into the Pyodide filesystem and extracts it into `purelib`, because `fetch` in Pyodide under Node does not resolve Pyodide filesystem paths as URLs.
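
The two install paths can be pictured as a small dispatch on the shape of the wheel reference. The sketch below is illustrative only and is not taken from `gfql.js`: the helper name `planWheelInstall` and the strategy labels are hypothetical, and the `{ path, data }` shape is assumed from how `benchmark-node.mjs` passes `graphistryWheel`.

```javascript
// Hypothetical helper (not part of gfql.js): decide which install path applies.
function planWheelInstall(wheel) {
  if (typeof wheel === "string") {
    // A URL string can be handed to micropip.install(url, deps=False).
    return { strategy: "micropip-url", url: wheel };
  }
  if (wheel && typeof wheel.path === "string" && wheel.data instanceof Uint8Array) {
    // Raw bytes are written into the Pyodide filesystem and extracted into purelib.
    return { strategy: "fs-extract", path: wheel.path, bytes: wheel.data.byteLength };
  }
  throw new TypeError("Expected a wheel URL string or a { path, data } object");
}

// Example inputs are made up for illustration.
console.log(planWheelInstall("https://example.invalid/graphistry-0-py3-none-any.whl"));
console.log(planWheelInstall({ path: "/tmp/graphistry.whl", data: new Uint8Array([80, 75, 3, 4]) }));
```

Keeping the decision in one place makes it harder to pass a Pyodide filesystem path to `fetch` by accident under Node.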

Build a wheel from a writable copy of the repo:

```bash
rm -rf /tmp/pygraphistry-pyodide-src /tmp/pygraphistry-pyodide-dist
rsync -a --exclude .git --exclude plans --exclude uv.lock --exclude '=2' ./ /tmp/pygraphistry-pyodide-src/
uv run --no-project --with build python -m build --wheel --outdir /tmp/pygraphistry-pyodide-dist /tmp/pygraphistry-pyodide-src
```

Run the Node smoke proof:

```bash
rm -rf /tmp/pygraphistry-pyodide-node
npm install --prefix /tmp/pygraphistry-pyodide-node pyodide@314.0.0
PYODIDE_MODULE=/tmp/pygraphistry-pyodide-node/node_modules/pyodide/pyodide.mjs node demos/gfql/pyodide/run-node.mjs /tmp/pygraphistry-pyodide-dist/graphistry-0+unknown-py3-none-any.whl
```

The smoke test uses `edges.csv` and validates both:

- AST GFQL: `e(edge_match={"weight": ge(2)})`, returning two filtered edges.
- Cypher parser path: `MATCH (a)-[e]->(b) WHERE e.weight >= 2 RETURN e`, returning two projected rows.

Both paths bind a small `id` nodes table derived from the CSV endpoints before running GFQL. That avoids pandas 3.0.2 concat edge cases in the current Pyodide runtime when Graphistry has to synthesize nodes.
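
As a rough picture of what the smoke test checks, the following plain JavaScript sketch derives an `id` nodes table from edge endpoints and applies the `weight >= 2` filter. It does not call GFQL or Pyodide. The sample CSV is made up (the real `edges.csv` contents are not shown here), the `src,dst,weight` columns are assumed from `benchmark-node.mjs`, and all function names are hypothetical.

```javascript
// Parse a tiny CSV with a header row into edge objects (no quoting support).
function parseEdgesCsv(csv) {
  const [header, ...rows] = csv.trim().split("\n");
  const columns = header.split(",");
  return rows.map((row) => {
    const cells = row.split(",");
    const edge = Object.fromEntries(columns.map((name, i) => [name, cells[i]]));
    edge.weight = Number(edge.weight);
    return edge;
  });
}

// Build a one-column nodes table from the unique src/dst endpoints.
function deriveNodeIds(edges) {
  const ids = new Set();
  for (const { src, dst } of edges) {
    ids.add(src);
    ids.add(dst);
  }
  return [...ids].map((id) => ({ id }));
}

// Plain-JS stand-in for the edge filter expressed by ge(2) / WHERE e.weight >= 2.
function edgesWithWeightAtLeast(edges, minWeight) {
  return edges.filter((edge) => edge.weight >= minWeight);
}

// Hypothetical sample data, not the repo's edges.csv.
const sampleEdges = parseEdgesCsv("src,dst,weight\na,b,1\nb,c,2\nc,a,3\n");
console.log(deriveNodeIds(sampleEdges));
console.log(edgesWithWeightAtLeast(sampleEdges, 2));
```

Supplying the nodes table up front means the runtime never has to synthesize nodes from edges, which is the step the paragraph above says to avoid.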

Build a static Pyodide 314 bundle:

```bash
node demos/gfql/pyodide/build-bundle.mjs /tmp/pygraphistry-gfql-pyodide-bundle
```

The builder supports two flavors:

- `self-hosted`: copies Pyodide, Python stdlib, and required Pyodide wheels into
  the bundle. This is the most reproducible/offline option and is about 22 MiB.
- `cdn`: keeps only the demo files plus Graphistry/`lark` wheels, and loads the
  pinned Pyodide 314 runtime/packages from jsDelivr. This is the smallest hosted
  artifact and is about 1 MiB, but first cold load still downloads Pyodide and
  pandas from the CDN.

```bash
node demos/gfql/pyodide/build-bundle.mjs /tmp/gfql-cdn --flavor cdn
node demos/gfql/pyodide/build-bundle.mjs /tmp/gfql-self-hosted --flavor self-hosted
```
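
One practical consequence of the two flavors is that a consumer of `manifest.json` sees either bundle-relative entries (starting with `./`) or absolute URLs. The sketch below shows one way to normalize them, modeled on how `benchmark-node.mjs` treats `./` prefixes. The helper name, the sample manifest values, and the `example.invalid` URL are hypothetical.

```javascript
// Hypothetical helper: resolve a manifest entry against the bundle directory.
// Entries starting with "./" are bundle-relative; anything else passes through.
function resolveManifestEntry(bundleDir, entry) {
  if (typeof entry !== "string" || !entry.startsWith("./")) {
    return entry;
  }
  const base = bundleDir.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/${entry.slice(2)}`;
}

// Made-up manifest fragments for the two flavors.
const selfHosted = { indexURL: "./pyodide/" };
const cdn = { indexURL: "https://cdn.example.invalid/pyodide/" };

console.log(resolveManifestEntry("/tmp/gfql-self-hosted", selfHosted.indexURL));
// → /tmp/gfql-self-hosted/pyodide/
console.log(resolveManifestEntry("/tmp/gfql-cdn", cdn.indexURL));
// → https://cdn.example.invalid/pyodide/
```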

To generate the Read the Docs "Try it live" payload before a Sphinx HTML build:

```bash
node demos/gfql/pyodide/build-bundle.mjs --docs-static --flavor cdn
```

That writes the bundle to `docs/source/static/gfql/pyodide/`, which is ignored
by git because it contains generated docs artifacts and local wheels.

The bundle includes `gfql.js`, `browser.html`, `edges.csv`, `manifest.json`,
`size-report.json`, and wheels under `wheels/`. The `self-hosted` flavor also
includes `pyodide/`. Serve it with:

```bash
cd /tmp/pygraphistry-gfql-pyodide-bundle
python -m http.server 8000
```

Then open `http://localhost:8000/browser.html`.

Run the browser smoke:

```bash
npm install --prefix demos/gfql/pyodide --no-audit --no-fund
npm exec --prefix demos/gfql/pyodide -- playwright install chromium
node demos/gfql/pyodide/test-browser.mjs /tmp/pygraphistry-gfql-pyodide-bundle
```

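Benchmark it. The benchmark needs a local Pyodide module, so use a `self-hosted` bundle or set `PYODIDE_MODULE` to a local `pyodide.mjs`: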

```bash
GFQL_BENCH_SIZES=10,1000,10000 GFQL_BENCH_REPEAT=3 \
node /tmp/pygraphistry-gfql-pyodide-bundle/benchmark-node.mjs \
/tmp/pygraphistry-gfql-pyodide-bundle
```
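
To make the two environment variables concrete, here is a small standalone sketch that mirrors how `benchmark-node.mjs` parses `GFQL_BENCH_SIZES` and summarizes repeated timings. The function name `parseSizes` is hypothetical; the logic follows the script, including its median, which picks the upper middle value rather than averaging when the sample count is even.

```javascript
// Parse a comma-separated size list, dropping non-numeric and non-positive entries.
function parseSizes(raw) {
  return raw
    .split(",")
    .map((value) => Number(value.trim()))
    .filter((value) => Number.isFinite(value) && value > 0);
}

// Median as used by the benchmark: sort ascending, take the element at floor(n / 2).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

console.log(parseSizes("10, 1000,abc,-5,10000")); // → [ 10, 1000, 10000 ]
console.log(median([30, 10, 20])); // → 20
console.log(median([4, 1, 3, 2])); // → 3 (upper middle for an even count)
```

An odd `GFQL_BENCH_REPEAT` such as the default 3 gives a true median; with an even value the reported number leans toward the slower of the two middle runs.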
134 changes: 134 additions & 0 deletions demos/gfql/pyodide/benchmark-node.mjs
@@ -0,0 +1,134 @@
import { readFile } from "node:fs/promises";
import { join, resolve } from "node:path";
import { performance } from "node:perf_hooks";
import { createGFQLRuntime } from "./gfql.js";

const bundleDir = resolve(process.argv[2] || "/tmp/pygraphistry-gfql-pyodide-bundle");
const sizes = (process.env.GFQL_BENCH_SIZES || "10,1000,10000")
  .split(",")
  .map((value) => Number(value.trim()))
  .filter((value) => Number.isFinite(value) && value > 0);
const repeat = Number(process.env.GFQL_BENCH_REPEAT || "3");

function generateCsv(edgeCount) {
  const lines = ["src,dst,weight"];
  for (let i = 0; i < edgeCount; i += 1) {
    lines.push(`n${i},n${i + 1},${i % 5}`);
  }
  return `${lines.join("\n")}\n`;
}

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

async function timed(fn) {
  const start = performance.now();
  const value = await fn();
  return { value, ms: performance.now() - start };
}

function markdownTable(report) {
  const lines = [
    "| edges | AST GFQL median ms | Cypher median ms | returned rows |",
    "| ---: | ---: | ---: | ---: |",
  ];
  for (const row of report.queries) {
    lines.push(
      `| ${row.edges} | ${row.astMedianMs.toFixed(1)} | ${row.cypherMedianMs.toFixed(1)} | ${row.rows} |`,
    );
  }
  return lines.join("\n");
}

async function main() {
  const manifest = JSON.parse(await readFile(join(bundleDir, "manifest.json"), "utf8"));
  const sizeReport = JSON.parse(await readFile(join(bundleDir, "size-report.json"), "utf8"));
  const wheelPath = join(bundleDir, manifest.graphistryWheel.replace("./", ""));
  const wheelData = new Uint8Array(await readFile(wheelPath));
  const pyodideModule = process.env.PYODIDE_MODULE || join(bundleDir, "pyodide/pyodide.mjs");
  if (/^https?:\/\//.test(pyodideModule)) {
    throw new Error("benchmark-node.mjs needs a local Pyodide module. Build with --flavor self-hosted or set PYODIDE_MODULE to a local pyodide.mjs.");
  }
  const requirements = await Promise.all(manifest.requirements.map(async (requirement) => {
    if (!requirement.startsWith("./")) {
      return requirement;
    }
    const path = join(bundleDir, requirement.replace("./", ""));
    return {
      path: `/tmp/${path.split("/").pop()}`,
      data: new Uint8Array(await readFile(path)),
    };
  }));

  const importResult = await timed(() => import(pyodideModule));
  const runtimeResult = await timed(() => createGFQLRuntime({
    loadPyodide: importResult.value.loadPyodide,
    indexURL: manifest.indexURL.startsWith("./")
      ? `${join(bundleDir, manifest.indexURL.replace("./", ""))}/`
      : manifest.indexURL,
    packageBaseUrl: manifest.packageBaseUrl && manifest.packageBaseUrl.startsWith("./")
      ? `${join(bundleDir, manifest.packageBaseUrl.replace("./", ""))}/`
      : manifest.packageBaseUrl,
    pyodidePackages: manifest.pyodidePackages,
    requirements,
    graphistryWheel: {
      path: `/tmp/${wheelPath.split("/").pop()}`,
      data: wheelData,
    },
  }));
  const runtime = runtimeResult.value;

  const warmCsv = generateCsv(10);
  await runtime.runEdgeWeightAtLeast({ csv: warmCsv, minWeight: 3 });
  await runtime.runCypherCsv({
    csv: warmCsv,
    query: "MATCH (a)-[e]->(b) WHERE e.weight >= 3 RETURN e",
  });

  const queries = [];
  for (const edgeCount of sizes) {
    const csv = generateCsv(edgeCount);
    const astTimes = [];
    const cypherTimes = [];
    let rows = 0;
    for (let i = 0; i < repeat; i += 1) {
      const ast = await timed(() => runtime.runEdgeWeightAtLeast({ csv, minWeight: 3 }));
      const cypher = await timed(() => runtime.runCypherCsv({
        csv,
        query: "MATCH (a)-[e]->(b) WHERE e.weight >= 3 RETURN e",
      }));
      astTimes.push(ast.ms);
      cypherTimes.push(cypher.ms);
      rows = ast.value.edges.length;
    }
    queries.push({
      edges: edgeCount,
      rows,
      astMedianMs: median(astTimes),
      cypherMedianMs: median(cypherTimes),
      astMs: astTimes,
      cypherMs: cypherTimes,
    });
  }

  const report = {
    pyodideVersion: manifest.pyodideVersion,
    bundleBytes: sizeReport.totalBytes,
    pyodideBytes: sizeReport.pyodideBytes,
    wheelsBytes: sizeReport.wheelsBytes,
    importPyodideModuleMs: importResult.ms,
    createRuntimeMs: runtimeResult.ms,
    repeat,
    queries,
  };

  console.log(JSON.stringify(report, null, 2));
  console.log("\n" + markdownTable(report));
}

main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});