flyteorg/flyte-pyodide-test

flyte-sdk in Pyodide — local repro harness

Reproduce and iterate on pip install flyte (and import flyte) failures in Pyodide, locally and headlessly via Node — no browser playground needed.

Prerequisites

make test builds and runs a local flyte-sdk wheel, because it needs the async-only local-execution changes, which are not on PyPI yet. Check out the SDK branch next to this repo first:

# sibling of this repo (so ../flyte-sdk resolves)
git clone git@github.com:flyteorg/flyte-sdk.git
cd flyte-sdk
git checkout experimental/pyodide-async-local-executor

Layout expected by the Makefile:

src/
├── flyte-pyodide-test/   # this repo
└── flyte-sdk/            # on branch experimental/pyodide-async-local-executor

Override the location with SDK_DIR=/path/to/flyte-sdk if you keep it elsewhere. The import-only / repro / diagnose / probe targets use PyPI flyte and do not need the branch — only make test does.

Usage (make)

make install      # one-time: install the pyodide npm package
make test         # build ../flyte-sdk wheel, install in Pyodide, RUN a flyte example, print output
make import-only  # import-only smoke test against PyPI flyte (define env+task, no execution)
make repro        # reproduce the raw micropip.install("flyte") failure
make diagnose     # list ALL unresolvable native deps in one pass
make probe        # confirm threading is the only remaining import wall

SDK_DIR overrides the flyte-sdk location for test (default ../flyte-sdk).

make test requires the experimental/pyodide-async-local-executor SDK changes (async-only local execution under emscripten), so it always builds and uses the LOCAL wheel — those runtime changes are not on PyPI yet.

Running your own workflow

make test runs whatever example you point EXAMPLE at (default examples/hello.py):

make test EXAMPLE=examples/my_workflow.py

Write any flyte workflow in that file. Because it runs in async-only mode, use top-level await and the .aio() API, and print(...) whatever output you want to see; examples/hello.py shows the convention. You can also call the runner directly: node run-example.mjs <flyte-wheel> <example.py>.

Scripts (what each make target runs)

script                                     purpose
node run-example.mjs <wheel> <example.py>  install the local wheel + execute any flyte example (top-level await), print its output
node repro.mjs                             reproduce the original micropip.install("flyte") failure
node diagnose.mjs                          list ALL unresolvable native deps in one pass (keep_going=True)
node workaround.mjs                        install via deps=False and report where import flyte fails
node probe-threads.mjs                     stub threading and confirm threading is the only remaining import wall
node solution.mjs [wheel]                  import-only workaround; imports flyte + defines a task. Optional arg: path to a local wheel to install instead of PyPI

Findings

There are two independent layers of incompatibility. obstore is only the first symptom of layer 1.

Layer 1 — install / dependency resolution

micropip resolves the whole Requires-Dist tree from wheel metadata before importing anything. These flyte deps have neither a pure-Python wheel nor a WASM wheel:

  • obstore>=0.7.3 (Rust) — flyte direct dep
  • pyqwest>=0.5.1 (Rust) — via connectrpc
  • cryptography>=49 (Rust) — via pyOpenSSL; Pyodide ships an older cryptography, so the pin can't be met
  • google-re2 (C++) — via flyteidl2

Bypass: install flyte, connectrpc, flyteidl2 with deps=False; install the rest normally. pyOpenSSL/keyring are not needed to import flyte.
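The bypass can be sketched as a two-phase install. This is a minimal illustration, not the code in workaround.mjs or solution.mjs: plan_install and install_layered are hypothetical helpers, and the list of "rest" packages is whatever the caller supplies. The only micropip call assumed is micropip.install(requirements, deps=...), and the micropip module is passed in as an argument so the sketch can be imported outside Pyodide.

```python
# Packages the bypass installs without dependency resolution.
NO_DEPS = ("flyte", "connectrpc", "flyteidl2")


def plan_install(packages, no_deps=NO_DEPS):
    """Split packages into (deps=False batch, normal batch), preserving order."""
    skipped = [p for p in packages if p in no_deps]
    normal = [p for p in packages if p not in no_deps]
    return skipped, normal


async def install_layered(micropip, packages):
    """Install the native-dependent packages with deps=False, the rest normally.

    `micropip` is the module imported inside Pyodide (hypothetical wiring;
    the repo's scripts may structure this differently).
    """
    skipped, normal = plan_install(packages)
    if skipped:
        # No Requires-Dist resolution, so obstore, pyqwest, etc. are never requested.
        await micropip.install(skipped, deps=False)
    if normal:
        # Resolvable packages go through normal dependency resolution.
        await micropip.install(normal)
```

Inside Pyodide this would be driven with something like await install_layered(micropip, [...]), where the package list is chosen by the caller.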

Layer 2 — import time (was the wall; now FIXED on the branch)

import flyte used to fail because flyte.syncify started a daemon background event-loop thread at import time (RuntimeError: can't start new thread — Pyodide has no threads). The experimental/pyodide-async-local-executor branch fixes this in the SDK: flyte._utils.runtime_env.background_loop_disabled() auto-detects emscripten and skips the thread, putting flyte in async-only mode where .aio() runs coroutines directly in the caller's running loop.

It also makes the eager obstore imports optional (in storage/_storage.py, storage/_parallel_reader.py, and io/_dataframe/dataframe.py — the last guard was added as part of wiring up make test, since the type engine imports the dataframe module for every task). With those, async local execution works with no threading stub and no env var — see hello.mjs.
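The optional-import guard described above follows a common Python pattern. The sketch below shows the general shape only; optional_import and require_obstore are hypothetical names, and the actual guards in the SDK modules may look different.

```python
import importlib


def optional_import(name):
    """Return the named module if it can be imported, else None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Evaluated once at import time; stays None where obstore has no wheel.
obstore = optional_import("obstore")


def require_obstore():
    """Fail at the point of use, not at import time, when obstore is missing."""
    if obstore is None:
        raise ImportError(
            "obstore is not available on this platform; "
            "features that depend on it are disabled"
        )
    return obstore
```

Deferring the failure to require_obstore() keeps the module importable, so code paths that never touch obstore keep working.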

What works vs what doesn't (async-only mode on the branch)

  • ✅ import flyte, define TaskEnvironment/@env.task, and run async tasks locally via await flyte.run.aio(...) (then await run.outputs.aio()).
  • ❌ The blocking sync API (flyte.run(...) without .aio()) raises a clear error. Sync tasks, conditions, cloud-URI IO, and DataFrame types are unsupported (native deps absent).
  • Install still needs the deps=False layering for the native-only wheels (Layer 1).
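The split between the awaitable path and the blocking path can be illustrated with a toy wrapper. This is a sketch under stated assumptions, not the SDK's syncify implementation: running_under_emscripten and AsyncOnlyCallable are hypothetical names, and asyncio.run stands in for the background-loop thread used when threads exist. The sys.platform == "emscripten" check is the standard way to detect Pyodide.

```python
import asyncio
import sys


def running_under_emscripten(platform=sys.platform):
    """True on Pyodide, where sys.platform is 'emscripten'."""
    return platform == "emscripten"


class AsyncOnlyCallable:
    """Toy wrapper exposing a blocking call and an awaitable .aio()."""

    def __init__(self, coro_fn, async_only):
        self._coro_fn = coro_fn
        self._async_only = async_only

    def __call__(self, *args, **kwargs):
        if self._async_only:
            # No thread can host a background loop, so fail with a clear message.
            raise RuntimeError(
                "blocking call unavailable in async-only mode; use .aio()"
            )
        # Stand-in for dispatching to a background event-loop thread.
        return asyncio.run(self._coro_fn(*args, **kwargs))

    async def aio(self, *args, **kwargs):
        # Runs directly in the caller's already-running loop; no thread needed.
        return await self._coro_fn(*args, **kwargs)


async def double(x):
    return 2 * x
```

Wrapping a coroutine function with AsyncOnlyCallable(double, async_only=running_under_emscripten()) then gives an object whose .aio() works everywhere, while the plain call is rejected under emscripten.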

Testing the LOCAL repo build (what make test does)

The runtime changes are local-only, so make test builds and uses the local wheel:

  1. In $(SDK_DIR) (default ../flyte-sdk), make dist builds dist/flyte-*-py3-none-any.whl.
  2. run-example.mjs mounts that wheel into Pyodide's FS and installs it with micropip.install("emfs:/<wheel>", deps=False).
  3. It then executes your EXAMPLE .py file (default examples/hello.py), which runs a @env.task and prints the output.
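Steps 1 and 2 amount to locating the built wheel and forming the emfs: URL passed to micropip. The sketch below is illustrative only: find_flyte_wheel and emfs_url are hypothetical helpers (the repo does this from the Makefile and run-example.mjs), and it assumes the wheel is mounted at the root of Pyodide's filesystem, as the emfs:/<wheel> form suggests.

```python
from pathlib import Path


def find_flyte_wheel(dist_dir):
    """Return the most recently modified flyte wheel in dist_dir."""
    wheels = sorted(
        Path(dist_dir).glob("flyte-*-py3-none-any.whl"),
        key=lambda p: p.stat().st_mtime,
    )
    if not wheels:
        raise FileNotFoundError(f"no flyte wheel found in {dist_dir}")
    return wheels[-1]


def emfs_url(wheel):
    """Build the emfs: URL for a wheel mounted at the filesystem root (assumption)."""
    return f"emfs:/{Path(wheel).name}"
```

Inside Pyodide, the resulting URL would be handed to micropip.install(url, deps=False).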

Notes

  • "RuntimeError: loop ... is not the running loop" lines are harmless Pyodide-in-Node async-cleanup noise; ignore them if the script prints OK.
  • Pyodide version pulled here: v314 (Python 3.14). Threads can be enabled in Pyodide only with cross-origin isolation + SharedArrayBuffer (and even then support is partial) — out of scope for this runtime-only workaround.
  • The thread-free syncify fix is now implemented on the experimental/pyodide-async-local-executor branch (see Layer 2 above). A fuller long-term fix would also move the native deps (obstore, pyqwest, google-re2) into an optional extra so that Layer 1's deps=False dance isn't needed.

About

a test repo, to test flyte-sdk works on pyodide
