feat: WASM language runtime + memory leak fixes #4
Open
HexaField wants to merge 27 commits into
HexaField pushed a commit that referenced this pull request Feb 23, 2026
…#652)
* Surreal files per perspective wip 1
* Avoid duplicate links w/ unique index and handle lock and write errors
* Fix new remove_link on surreal service
* Rename update_surreal_cache() to persist_link_diff()
* Temporary perspective data migration from rusqlite to surreal
* fmt
* fix: address CodeRabbit issues #2, #3, coasys#7
  - MIGRATION_REMOVAL_GUIDE.md: Complete sentence in heading
  - migration.rs: Only mark as migrated when error_count == 0 (prevents data loss)
  - surreal_service/mod.rs: Remove overly broad 'index' error check (more precise error handling)
  Addresses CodeRabbit actionable comments on PR coasys#652
* fix: preserve original link status instead of hardcoding Local (issue #4)
  Instead of hardcoding LinkStatus::Local, now reads link.status and uses it (falls back to Local if None). This preserves the original link status during import operations.
  Addresses CodeRabbit actionable comment on PR coasys#652
* fix: propagate SurrealDB write failures to prevent desync (issue #1)
  - retry_surreal_op now returns Result and propagates errors
  - persist_link_diff now returns Result instead of silently swallowing errors
  - Updated all callsites:
    - Functions returning Result: use .await? to propagate
    - Functions returning (): use .await.expect() to fail-fast
  - Critical synchronization operations now fail loudly instead of silently
  Addresses CodeRabbit actionable comment #1 on PR coasys#652
* fix: honor full unique constraint in SurrealDB lookups (issue coasys#6)
  Updated get_link to accept and use author and timestamp parameters:
  - get_link now takes optional author and timestamp
  - When provided, queries using all 5 unique fields (source, target, predicate, author, timestamp)
  - When not provided, falls back to 3-field lookup for backward compatibility
  - Updated all callsites to pass author and timestamp from LinkExpression
  This prevents returning arbitrary links when multiple authors/timestamps exist for the same source/predicate/target combination.
  Addresses CodeRabbit actionable comment coasys#6 on PR coasys#652
* fix: prevent TOCTOU race in initialize_from_db (issue #5)
  Added atomic check-and-insert before storing perspective:
  - Initial read-lock check remains for quick filtering
  - After async initialization completes, do final write-lock check
  - Only insert if another task hasn't already initialized this perspective
  - Discard duplicate work and don't start background tasks if race lost
  This prevents multiple tasks from creating duplicate SurrealDB services for the same perspective UUID.
  Addresses CodeRabbit actionable comment #5 on PR coasys#652
* fix: clone link.status to avoid partial move
  Compilation error: link.status.unwrap_or() moves the value, preventing use of 'link' afterwards. Use clone() to avoid the partial move.
* fix: borrow links in migration loop to avoid partial move
  Changed 'for (link_expr, status) in links' to '&links' and cloned status to avoid moving values out of the vector.
* chore: run cargo fmt and add PR fixes summary
* refactor: remove redundant variable reassignments in get_link
  Addresses CodeRabbit feedback: simplified variable flow by directly assigning query result to response instead of going through intermediate query and results variables.
  Co-authored-by: CodeRabbit AI <coderabbit@example.com>
* fix: address CodeRabbit feedback on PR coasys#652
  1. Remove 'WHERE perspective = $perspective' from test queries
     - Each perspective has isolated database, no filtering needed
     - Fixed 11 test queries in surreal_service/mod.rs
  2. Make status parsing case-insensitive in SurrealLink conversion
     - Now handles 'Shared'/'shared' and 'Local'/'local' correctly
     - Preserves migrated data regardless of case
  3. Require author/timestamp in get_link() signature
     - Changed from Option<&str> to &str for both params
     - Removed fallback branch (always use full unique constraint)
     - Updated 4 callsites in perspective_instance.rs
     - Enforces UNIQUE index (in, out, predicate, author, timestamp)
  Co-authored-by: CodeRabbit AI <coderabbit@example.com>
* reset bootstrapSeed.json
* Handle fallback sync read failure gracefully.
* Don’t proceed when migration fails.
* Fix inconsistent error handling: .expect() vs. map_err()?
* fix: handle SurrealDB service creation failure in initialize_from_db
  Complements commit e57b8f8 which fixed the same issue in add_perspective(). This fix addresses the spawned task in initialize_from_db() which also had a panicking .expect() call.
  Changes:
  - Replace .expect() with match expression
  - Log error and return early on failure
  - Prevents panic if SurrealDB creation fails (RocksDB lock, disk, permissions)
  Addresses CodeRabbit feedback on PR coasys#652 (line 90 issue)
* don't fail silently if links from DB can't be parsed
* don't panic on DB write failures but log error
---------
Co-authored-by: Data <data.coasys@gmail.com>
Co-authored-by: CodeRabbit AI <coderabbit@example.com>
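The TOCTOU fix in this commit (quick read-lock check, initialization outside any lock, final check under the write lock) can be sketched as follows. This is a minimal, synchronous illustration using `std::sync::RwLock`; the real `initialize_from_db` is async, and the `Registry` type, its `String` payload, and the method names here are hypothetical stand-ins rather than the project's actual types.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical registry mapping a perspective UUID to its service handle.
/// A String stands in for the real service type.
pub struct Registry {
    perspectives: RwLock<HashMap<String, String>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry {
            perspectives: RwLock::new(HashMap::new()),
        }
    }

    pub fn get(&self, uuid: &str) -> Option<String> {
        self.perspectives.read().unwrap().get(uuid).cloned()
    }

    /// Returns true only if this call stored the service it built.
    pub fn initialize<F>(&self, uuid: &str, build: F) -> bool
    where
        F: FnOnce() -> String,
    {
        // 1. Cheap read-lock check to skip work that is obviously redundant.
        let already_present = {
            let map = self.perspectives.read().unwrap();
            map.contains_key(uuid)
        };
        if already_present {
            return false;
        }

        // 2. Expensive initialization runs without holding any lock.
        let service = build();

        // 3. Final check and insert happen under one write lock, so two
        //    callers cannot both pass the check and both insert.
        let mut map = self.perspectives.write().unwrap();
        if map.contains_key(uuid) {
            // Lost the race: discard the duplicate work.
            return false;
        }
        map.insert(uuid.to_string(), service);
        true
    }
}
```

A caller would start background tasks only when `initialize` returns `true`, which corresponds to the "don't start background tasks if race lost" bullet above.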
- Baseline profiling: 355 MB startup, 750 MB post-init, ~78 MB per neighbourhood
- Leak investigation: 0% memory recovery on neighbourhood teardown
- perspectiveRemove does not uninstall Holochain hApps or free WASM runtimes
- Bare perspectives leak ~2.4 MB each, language cloning leaks ~4.2 MB each
- Includes reproduction scripts (profiler, leak tester, publish-langs)
Detailed code-level analysis tracing all three categories of memory leaks:
1. CRITICAL: Neighbourhood teardown leaks 100% - perspectiveRemove only sets a flag, never uninstalls Holochain hApps, Prolog pools, SurrealDB, or JS languages
2. Bare perspectives leak ~2.4 MB each (Prolog pools + SurrealDB not freed)
3. Language cloning leaks ~4.2 MB per clone (permanent, no unload path)
Includes exact file/line references, proposed fixes ordered by priority, and architecture recommendations (lifecycle contract, reference counting).
CRITICAL fixes:
- Fix 1: Proper teardown_background_tasks that cleans up Prolog pools, SurrealDB, link language, subscribed queries, and batch store
- Fix 2: Add language_remove method to Rust LanguageController to call JS languageController.languageRemove() during teardown
- Fix 3: Clean up Holochain signal callbacks on language removal (both JS #signalCallbacks and Rust signal stream StreamMap)
- Rename _remove_perspective_pool to remove_perspective_pool
MEDIUM fixes:
- Fix 4: Add reference counting for languages in LanguageController.ts (languageAddRef/languageReleaseRef)
- Fix 5: Add SurrealDB shutdown() method that drops all data and indexes
Baseline vs patched binary comparison confirms:
- AD4M-layer teardown works correctly (SurrealDB, signals, languages)
- Holochain conductor retains ~140MB/neighbourhood after uninstall_app
- 0% memory recovery on both original and patched binaries
- Root cause is conductor-level wasmer/LMDB memory management
Updated leak-investigation.mjs with v2 improvements:
- Fixed GQL schema for DecoratedLinkExpression
- Added detailed smaps breakdown per test phase
- Added large anon mapping tracking across lifecycle
- Added teardown log verification
- Remove languageAddRef/languageReleaseRef and #languageRefCounts from LanguageController.ts; these were never called from any code path
- Simplify SurrealDB shutdown() to just log: SurrealDB uses in-memory storage (Surreal::new::<Mem>), so explicit DELETE/REMOVE INDEX is unnecessary; memory is freed when the Arc<Surreal<Db>> is dropped
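The reasoning behind the simplified shutdown() rests on how `Arc` ownership works: the wrapped value is destroyed only when the last handle goes away. The sketch below illustrates that with a hypothetical `InMemoryStore` type standing in for the in-memory database handle; it does not use the SurrealDB API, and the drop counter exists only to make the behavior observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for an in-memory database handle.
/// It bumps a shared counter when it is actually destroyed.
pub struct InMemoryStore {
    dropped: Arc<AtomicUsize>,
    _data: Vec<u8>,
}

impl Drop for InMemoryStore {
    fn drop(&mut self) {
        self.dropped.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns the drop count observed after releasing the first handle
/// and after releasing the last handle.
pub fn drops_after_releasing_handles() -> (usize, usize) {
    let counter = Arc::new(AtomicUsize::new(0));
    let store = Arc::new(InMemoryStore {
        dropped: Arc::clone(&counter),
        _data: vec![0u8; 1024],
    });

    // A second owner, e.g. a background task holding a clone.
    let second_owner = Arc::clone(&store);

    // Releasing one handle does not free the store while a clone exists.
    drop(store);
    let after_first = counter.load(Ordering::SeqCst);

    // Releasing the last handle runs Drop and frees the data.
    drop(second_owner);
    let after_last = counter.load(Ordering::SeqCst);

    (after_first, after_last)
}
```

The same property cuts both ways: any lingering clone held by a task or cache keeps the whole store alive, so teardown has to release every handle for the memory to be returned.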
Adds a WASM language runtime that enables AD4M Language modules to be compiled to WebAssembly and executed in the Wasmer runtime (same engine Holochain uses). This eliminates the need for V8/Deno for languages that target WASM, reducing per-language memory overhead.
Components:
- rust-executor/src/wasm_core/: WASM loader, ABI, host functions, registry
- wasm-language-sdk/: Rust SDK crate for language authors (types, traits, macros)
- examples/wasm-languages/note-store/: port of note-store language to Rust/WASM
Key design:
- ABI versioned from day one (AD4M_LANGUAGE_ABI_VERSION = 1)
- Fat pointer encoding (u64) for passing data across WASM boundary
- JSON serialisation for structured data
- Per-language isolation (each gets own WASM instance + linear memory)
- Host functions mirror Deno ops: agent_did, agent_sign, hash, etc.
- Feature-gated: cargo check --features wasm-languages
- Does not break existing Deno/JS language path
The example note-store language compiles to a 119KB WASM binary with all required exports (ad4m_alloc, ad4m_dealloc, ad4m_expression_get, etc.) and imports only the host functions it actually uses.
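The fat pointer encoding mentioned under "Key design" packs a location and a length in linear memory into a single u64 so one return value can describe a buffer. The commit message does not state the bit layout, so the sketch below assumes the offset lives in the high 32 bits and the length in the low 32 bits; the function names are hypothetical and not taken from wasm_core.

```rust
/// Pack a 32-bit offset and a 32-bit length into one u64.
/// Assumed layout (not confirmed by the source): offset in the
/// high 32 bits, length in the low 32 bits.
pub fn encode_fat_ptr(offset: u32, len: u32) -> u64 {
    ((offset as u64) << 32) | (len as u64)
}

/// Split a fat pointer back into (offset, len).
pub fn decode_fat_ptr(fat: u64) -> (u32, u32) {
    ((fat >> 32) as u32, fat as u32)
}

/// Resolve a fat pointer against a linear-memory slice.
/// Returns None when the described region falls outside the memory,
/// which a host should treat as a guest error rather than panic.
pub fn read_region(memory: &[u8], fat: u64) -> Option<&[u8]> {
    let (offset, len) = decode_fat_ptr(fat);
    let start = offset as usize;
    let end = start.checked_add(len as usize)?;
    memory.get(start..end)
}
```

On the host side, a region resolved this way would typically be parsed as JSON, matching the "JSON serialisation for structured data" bullet, before the buffer is handed back to the guest's deallocation export.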
- Add LanguageBackend async trait in languages/language.rs abstracting sync, commit, current_revision, render, others, telepresence methods
- Implement LanguageBackend for existing JS Language (unchanged behavior)
- Add WasmLanguage backend (feature-gated behind wasm-languages) wrapping WasmLanguageInstance with sensible defaults for unimplemented methods
- Update LanguageController::language_by_address to check WASM registry first, falling back to JS
- Add install_wasm_language and is_wasm_bundle helpers (wasm-languages)
- Update language_remove to handle WASM languages
- Update perspective_instance.rs to use Arc<Mutex<dyn LanguageBackend>> instead of concrete Language type
- Add async-trait dependency, fix duplicate surrealdb dep in Cargo.toml
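The lookup order described above (WASM registry first, JS as fallback) behind a shared trait object can be sketched as below. This is a simplified, synchronous illustration: the real trait is async and has more methods, and the struct fields, `register_*` helpers, and `backend_name` method are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified, synchronous stand-in for the LanguageBackend trait.
pub trait LanguageBackend: Send {
    fn backend_name(&self) -> &'static str;
}

pub struct JsLanguage;
pub struct WasmLanguage;

impl LanguageBackend for JsLanguage {
    fn backend_name(&self) -> &'static str {
        "js"
    }
}

impl LanguageBackend for WasmLanguage {
    fn backend_name(&self) -> &'static str {
        "wasm"
    }
}

/// Shared handle that callers hold instead of a concrete language type.
pub type SharedBackend = Arc<Mutex<Box<dyn LanguageBackend>>>;

/// Hypothetical controller holding one registry per backend kind.
pub struct LanguageController {
    wasm: HashMap<String, SharedBackend>,
    js: HashMap<String, SharedBackend>,
}

impl LanguageController {
    pub fn new() -> Self {
        LanguageController {
            wasm: HashMap::new(),
            js: HashMap::new(),
        }
    }

    pub fn register_wasm(&mut self, address: &str, backend: Box<dyn LanguageBackend>) {
        self.wasm
            .insert(address.to_string(), Arc::new(Mutex::new(backend)));
    }

    pub fn register_js(&mut self, address: &str, backend: Box<dyn LanguageBackend>) {
        self.js
            .insert(address.to_string(), Arc::new(Mutex::new(backend)));
    }

    /// Check the WASM registry first, then fall back to the JS registry.
    pub fn language_by_address(&self, address: &str) -> Option<SharedBackend> {
        self.wasm
            .get(address)
            .or_else(|| self.js.get(address))
            .cloned()
    }
}
```

Because callers only see `SharedBackend`, code such as perspective_instance.rs does not need to know which runtime serves a given address.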
- Fix schema.gql symlink (core/lib/src -> tests/js)
- Fix AgentContext/did_for_context/sign_for_context -> agent::did()/sign()
- Fix create_signed_expression to use 1-arg API
- Remove conflicting From<WasmLanguageError> impl (blanket covers it)
- Fix perspective_instance to use Box<dyn LanguageBackend> in Arc<Mutex<>>
- Add set_app_data_path to perspectives/mod.rs (merge gap)
- Forward wasm-languages feature through cli/Cargo.toml
- Add LinksAdapter trait to wasm-language-sdk with sync/commit/render/current_revision/others
- Add ad4m_links_adapter! macro for optional WASM export generation
- Add has_links_adapter capability detection in host
- Add sync/commit/render/current_revision/others methods to WasmLanguageInstance
- Wire WasmLanguage backend to call through to WASM instance methods
- Add WASM bundle detection in JS LanguageController (magic bytes check)
- Add WASM install path in Rust LanguageController.install_language
…9 pass)
- New example: link-store WASM language with full LinksAdapter (sync, commit, render, current_revision, others)
- Fix HOST_MODULE_NAME: "ad4m" -> "env" to match extern "C" default imports
- Remove duplicate inline mod tests from wasm_core/mod.rs
- 7 new LinksAdapter tests + rebuilt WASM fixtures
…terface
- Update AbiHcCallRequest: replace dna_hash/agent_pubkey with dna_nick
- Add tokio_handle to HostEnv for sync->async bridging
- Implement host_hc_call using block_in_place + handle.block_on
- Use maybe_get_holochain_service() for defensive error handling
- Update SDK: new holochain_call(dna_nick, zome_name, fn_name, payload) API
- Deprecate old hc_call() in SDK
… and ad4m_init lifecycle hook
- Add hc_install_app, hc_remove_app, hc_get_agent_key host functions to wasm_core
- Register new host functions in WASM imports
- Add ad4m_init lifecycle hook: called after WASM instantiation for DNA setup
- Add SDK bindings: holochain_install_app, holochain_remove_app, holochain_get_agent_key
- Add LanguageInit trait with default no-op init() method
- Generate ad4m_init export in ad4m_language! macro
- New p-diff-sync-wasm example: real Holochain-backed link language
- Embeds 1.1MB Perspective-Diff-Sync .happ bundle via include_bytes!
- Implements full LinksAdapter (sync, commit, render, current_revision, others)
- Uses rmp-serde for msgpack serialization to match zome ABI
- DNA installed via ad4m_init lifecycle hook
- All zome calls proxied through holochain_call host function
- Fix SDK macro: ad4m_teardown was missing closing brace (ad4m_init nested inside)
- Make tokio Handle optional in HostEnv (Handle::try_current)
  - Allows WASM tests to run without tokio runtime
  - Host functions gracefully return null when no runtime available
- 17/17 WASM tests passing (p-diff-sync correctly fails without conductor)
- Compiled WASM: 1.4MB (1.1MB DNA + ~300KB code)
- Fix snapshot not being re-embedded (add cargo:rerun-if-changed to build.rs)
- Restore is_initialized() guard in agent_load() to prevent crash on fresh data
- Add install_wasm_language op to languages extension (JS + Rust)
- Add languageInstallWasm GQL mutation for WASM language installation
- Route expressionCreate/expressionRaw through WASM backend when applicable
- Fix misleading comments about host module namespace (env, not ad4m)
All 21 WASM unit tests passing. Integration test: agent gen, perspective CRUD, WASM install, expression ops all working.
- Add app_data_path to LanguageController for Rust-native path resolution
- Implement install_language WASM detection: checks local bundle.wasm, then fetches from language language and detects base64-encoded WASM (AGFzbQ magic prefix), then falls back to JS install
- Add install_wasm_from_base64: decodes, verifies WASM magic, saves to languages dir, registers in WASM runtime
- Add publish_wasm_language: base64-encodes WASM binary, adds bundleType:wasm to meta, publishes via language language
- Add languagePublishWasm GQL mutation
- language_source query returns base64 WASM for WASM languages
- Integration test v4: 10/10 tests passing (install, expressions, source query, perspective links, publish, base64 detection, memory)
- 21/21 WASM unit tests passing
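The detection order in install_language relies on two facts: every WebAssembly binary starts with the four magic bytes `\0asm`, and base64-encoding those bytes always yields a string beginning with `AGFzbQ`. The sketch below shows one way to classify a fetched bundle on that basis; the `BundleKind` enum and function names are hypothetical, and a real implementation would still decode the base64 payload and re-check the magic bytes before installing, as the install_wasm_from_base64 bullet describes.

```rust
/// The four magic bytes that open every WebAssembly binary: "\0asm".
pub const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6d];

/// Base64 of the magic bytes always starts with this prefix.
pub const WASM_BASE64_PREFIX: &str = "AGFzbQ";

#[derive(Debug, PartialEq, Eq)]
pub enum BundleKind {
    /// Raw WASM binary.
    Wasm,
    /// Text that looks like a base64-encoded WASM binary.
    Base64Wasm,
    /// Anything else is treated as a JS bundle.
    Js,
}

pub fn is_wasm_binary(bytes: &[u8]) -> bool {
    bytes.starts_with(&WASM_MAGIC)
}

pub fn looks_like_base64_wasm(text: &str) -> bool {
    text.trim_start().starts_with(WASM_BASE64_PREFIX)
}

/// Classify a bundle: raw WASM first, then base64 WASM, else JS.
pub fn classify_bundle(source: &[u8]) -> BundleKind {
    if is_wasm_binary(source) {
        return BundleKind::Wasm;
    }
    match std::str::from_utf8(source) {
        Ok(text) if looks_like_base64_wasm(text) => BundleKind::Base64Wasm,
        _ => BundleKind::Js,
    }
}
```

Checking a short prefix is cheap enough to run on every install, which is why the JS install path can stay as the final fallback without a separate metadata lookup.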
- Add LanguageInit impl to note-store and link-store examples (macro requires it)
- Add rustup default stable to container-based CI jobs (coasys/ad4m-ci-linux container lacks default toolchain)
- Fix p-diff-sync teardown to use stored app_id instead of agent DID
- Error on invalid meta JSON in publish_wasm_language instead of silent fallback
- Delete bundle files on WASM language removal
- Fix CI workflow: use github.head_ref for PR branch detection
The coasys/ad4m-ci-linux container was timing out (1h35m) on GitHub Actions runners. Switch to installing deps directly — matches what the WASM SDK job already does successfully.
bundle.js and CUSTOM_DENO_SNAPSHOT.bin are embedded at compile time but only built by the JS build step. Create placeholder files so cargo check can pass without a full JS build.
surrealdb-rocksdb takes 50+ min to compile from source on free runners. The container image has it pre-built. Keep WASM SDK on bare runner (fast). Bump timeout to 120min for container pull + compile.
bd50f7d to 151eb46
HexaField pushed a commit that referenced this pull request Mar 6, 2026
…rdcoding
Address review comments #3 and #4 from Nico:
- subscriptions.rs: When no explicit predicate is provided, load the SHACL class definition and derive the SurrealQL query from its property predicates (IN clause). No more hardcoded flux:// types.
- shacl.rs: Enrich ShaclClass with shape_uri and all_predicates fields, providing enough metadata to construct targeted queries without type-specific knowledge.
- Add load_class_properties_with_uri() to return shape URI alongside properties for richer metadata.
HexaField added a commit that referenced this pull request Apr 21, 2026
- Remove duplicate main.rs from rust-executor (Nico #1)
- Rename SparqlService → SparqlStore, move to perspectives/ (Nico #2)
- Remove oxigraph feature flag, make it a regular dependency (Nico #4)
- Extract duplicate SPARQL query logic in ModelQueryBuilder (Nico coasys#7)
- Replace all SurrealDB syntax with native SPARQL in:
  - decorators.ts: buildConformanceFilter generates SPARQL getters (Nico #2, James coasys#9)
  - query-utils.ts: buildWhereCondition generates SPARQL patterns (James coasys#11)
  - shacl-gen.ts: getter generation uses SPARQL (James coasys#13)
  - hydration.ts: remove convertGetterToSPARQL, use prepareGetterQuery with native SPARQL support + legacy fallback (Nico coasys#8, James coasys#10)
  - relation-filtering.test.ts: update expectations (James coasys#12)
  - model-getters.test.ts: update to SPARQL syntax (James coasys#14)
  - prolog-and-literals.test.ts: update to SPARQL syntax (James coasys#15)
Summary
Combined branch: memory leak investigation/fixes + WASM language runtime for AD4M.
Memory Leak Fixes (from profiling/memory-investigation)
- uninstall_app: see upstream PR #689
WASM Language Runtime
- rust-executor/src/wasm_core/: WASM loader, ABI v1, host functions, registry
- wasm-language-sdk/: Rust SDK crate for language authors (types, traits, ad4m_language! macro)
- examples/wasm-languages/note-store/: port of note-store to Rust WASM (121KB binary)
- wasm-languages: zero impact when disabled
Build & Test Results (Ubuntu 22.04, x86_64, 32GB RAM)
- mold linker on 32GB machines (GNU ld OOMs during link)
What's Needed Next
- .wasm vs .js and route to Rust WASM runtime (part of Nico's JS→Rust refactor)
- #[link(wasm_import_module = "ad4m")] in SDK
Profiling Docs
- docs/profiling/README.md: overview and reproduction steps
- docs/profiling/leak-investigation.mjs: comprehensive profiler script