feat(worlds): add worlds provider with neurosymbolic ingestion for MemoryBench by EthanThatOneKid · Pull Request #4 · wazootech/worlds-client-memorybench

EthanThatOneKid · 2026-05-26T03:45:41Z

Summary

Add @worlds/client as a MemoryBench provider (-p worlds) with a neurosymbolic typed-graph ingestion pipeline.

Motivation

This enables @worlds/client to be evaluated on published memory/RAG benchmarks (LoCoMo, LongMemEval) using the MemoryBench framework, alongside Supermemory, Mem0, Zep, and others, without duplicating harness code or blurring CI semantics in worlds-client-evals.

What changed

Provider bootstrap

Vendored supermemoryai/memorybench (MIT) as the baseline harness
Added @worlds/client (jsr:@jsr/worlds__client@^0.0.14), kept on the same version family as worlds-client-evals
Added @comunica/query-sparql-rdfjs-lite (required by @worlds/client adapters)
New WorldsProvider (src/providers/worlds/):
- In-memory LibSQL backend (self-contained, no external service)
- Session messages ingested as RDF Turtle via client.import()
- client.rebuildSearchIndex() after ingest for FTS/vector discoverability
- client.search({ query }) for retrieval
- Registered as worlds provider in CLI (-p worlds)
Added worlds to ProviderName union and getProviderConfig() (uses OPENAI_API_KEY for judge)
Added .env.example and src/providers/worlds/README.md documenting the phase mapping and smoke run

Neurosymbolic ingestion harness (Phase 1)

ontology.ts: Namespace constants for RDF/RDFS/OWL/XSD, schema.org, PROV-O, SKOS, and a worlds: custom namespace (https://worlds.wazoo.dev/). Exports a reusable TURTLE_PREFIXES block.
shapes.ts: SHACL shape definitions (SessionShape, MessageShape, ClaimShape stub) with a structural validateGraph() checker that runs during ingestion.
index.ts: Sessions are dual-typed as schema:Conversation + prov:Activity, messages as schema:Message + prov:Entity with schema:text, schema:position, schema:author, schema:hasPart, and prov:wasGeneratedBy provenance links. Replaces inline URI strings with ontology constants.
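The typing described above can be illustrated with a minimal sketch that serializes one session to Turtle. The `Session` and `Message` shapes, the IRI layout under https://worlds.wazoo.dev/, the choice of a plain literal for schema:author, and the `PREFIXES` constant (a stand-in for the TURTLE_PREFIXES export) are all assumptions for illustration, not the provider's actual code.

```typescript
// Hypothetical input shapes; the provider's real types may differ.
interface Message {
  author: string;
  text: string;
}

interface Session {
  id: string;
  messages: Message[];
}

// Stand-in for the TURTLE_PREFIXES block exported by ontology.ts (assumed contents).
const PREFIXES = [
  "@prefix schema: <https://schema.org/> .",
  "@prefix prov: <http://www.w3.org/ns/prov#> .",
].join("\n");

// Escape a string so it is safe inside a double-quoted Turtle literal.
function escapeLiteral(value: string): string {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r")
    .replace(/\t/g, "\\t");
}

// Serialize a session as schema:Conversation + prov:Activity, and each
// message as schema:Message + prov:Entity with provenance back to the session.
function sessionToTurtle(session: Session): string {
  const base = "https://worlds.wazoo.dev/"; // assumed IRI layout
  const sessionPath = `${base}session/${encodeURIComponent(session.id)}`;
  const sessionIri = `<${sessionPath}>`;
  const lines: string[] = [PREFIXES, ""];
  lines.push(`${sessionIri} a schema:Conversation, prov:Activity .`);
  session.messages.forEach((message, index) => {
    const messageIri = `<${sessionPath}/message/${index}>`;
    lines.push(`${sessionIri} schema:hasPart ${messageIri} .`);
    lines.push(
      `${messageIri} a schema:Message, prov:Entity ;`,
      `  schema:text "${escapeLiteral(message.text)}" ;`,
      `  schema:author "${escapeLiteral(message.author)}" ;`,
      `  schema:position ${index} ;`,
      `  prov:wasGeneratedBy ${sessionIri} .`,
    );
  });
  return lines.join("\n");
}
```

The resulting string is the kind of payload that could be handed to client.import(); escaping the literal matters because benchmark messages routinely contain quotes and newlines.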

Design decisions

Turtle, not N3: Turtle is a strict subset of N3. The import pipeline parses Turtle natively; N3 extras (formulas, quantifiers) would be silently dropped.
PROV-O for provenance: Tracks where facts came from (which session produced which message). Scaffolding for Phase 2 claim extraction (prov:wasDerivedFrom).
schema:text as literal: Phase 1 reifies message content on typed nodes with provenance. Phase 2 adds claim decomposition on top.

LoCoMo smoke test

bun install
cp .env.example .env.local   # add API keys
bun run src/index.ts run -p worlds -b locomo -l 5 -j gpt-4o -m gemini-3.1-flash-lite

Relationship to worlds-client-evals

| Signal | Repo |
| --- | --- |
| `sparql-handoff-valid`, `updates-blocked`, step budget | `worlds-client-evals` |
| LoCoMo QA accuracy, search latency, context tokens (MemScore) | `worlds-client-memorybench` |

Closes wazootech/worlds-client-evals#34

- Vendor memorybench from supermemoryai/memorybench (MIT) as baseline
- Add @worlds/client (jsr:@jsr/worlds__client@^0.0.14) dependency, same version as worlds-client-evals
- Add @comunica/query-sparql-rdfjs-lite as required by @worlds/client adapters
- Implement WorldsProvider: in-memory LibSQL, RDF/Turtle ingest, rebuildSearchIndex, client.search()
- Add worlds to ProviderName union and register in providers map
- Add worlds case to getProviderConfig() (uses OPENAI_API_KEY for judge)
- Add .env.example with required API keys
- Add src/providers/worlds/README.md documenting phase mapping and smoke run
- Fix formatting across upstream files via prettier

feat(worlds): neurosymbolic ingestion harness

… prompt

- Accept questionDate parameter for temporal reasoning on LoCoMo questions
- Strip RDF metadata (subject, predicate, graph URIs) in prompt builder, showing only text + relevance score to reduce token waste
- Keep full search results in search() for show-failures debuggability
- Add step-by-step reasoning format matching supermemory prompt pattern

FTS5 uses AND between tokens after stopword removal, so long questions like "When did Caroline go to the LGBTQ support group?" match nothing. When the full query returns empty, fall back to per-term OR-style search with best-score dedup. Also log rebuildSearchIndex quad/chunk counts for indexing diagnostics. LoCoMo -l 5 now completes: 20% accuracy, Hit@4=60%, MemScore 20%/35ms/2306tok.
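The fallback described in this commit message can be sketched roughly as follows. The `Hit` shape, the `SearchFn` signature, the stopword list, and the assumption that a higher score means a better match are all illustrative; the provider's actual types and tokenizer may differ.

```typescript
// Hypothetical result shape and search signature for illustration.
interface Hit {
  id: string;
  text: string;
  score: number; // assumed: higher is better
}

type SearchFn = (query: string) => Promise<Hit[]>;

// Small illustrative stopword list, not the one FTS5 or the provider uses.
const STOPWORDS = new Set([
  "a", "an", "the", "to", "of", "in", "on", "did", "do", "does",
  "is", "was", "when", "what", "who", "where",
]);

// Try the full query first; if it returns nothing, search each remaining
// term separately and keep the best-scoring hit per result id.
async function searchWithFallback(
  search: SearchFn,
  query: string,
  limit = 10,
): Promise<Hit[]> {
  const full = await search(query);
  if (full.length > 0) return full.slice(0, limit);

  const terms = Array.from(
    new Set(
      query
        .toLowerCase()
        .split(/[^a-z0-9]+/)
        .filter((term) => term.length > 1 && !STOPWORDS.has(term)),
    ),
  );

  const best = new Map<string, Hit>();
  for (const term of terms) {
    for (const hit of await search(term)) {
      const seen = best.get(hit.id);
      if (!seen || hit.score > seen.score) best.set(hit.id, hit);
    }
  }

  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Issuing one query per term trades a few extra round trips for recall, which is the point of the fallback: a single unmatched token no longer empties the whole result set.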

Wire GeminiEmbeddingService into @worlds/client search index using @ai-sdk/google embedMany(). 768-dim vectors fused with FTS5 keywords via Reciprocal Rank Fusion. Graceful degradation to keyword-only mode when GOOGLE_GENERATIVE_AI_API_KEY is absent.
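For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal standalone sketch of the technique. It is not the @worlds/client implementation; the function name is hypothetical, and k = 60 is only the conventional default, which the PR does not confirm.

```typescript
// Fuse several ranked lists of ids. Each list contributes 1 / (k + rank)
// to an id's score, where rank is 1-based within that list.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60, // conventional constant; assumed, not taken from the PR
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Example: one keyword ranking and one vector ranking.
const fused = reciprocalRankFusion([
  ["a", "b", "c"], // e.g. FTS5 keyword order
  ["b", "d", "a"], // e.g. vector similarity order
]);
```

With a single input list the fused order equals the input order, which mirrors the keyword-only degradation mentioned above when no embedding key is configured.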

…ch chunks

- config.ts: worlds provider now uses googleApiKey (was openaiApiKey)
- gemini-embedding-service: switch from deprecated text-embedding-004 to gemini-embedding-2 with 768d output via outputDimensionality
- gemini-embedding-service: chunk embed() calls at 100 items to stay within BatchEmbedContents API limit
- index.ts: use searchIndexOnImport:false to defer all vectorization to rebuildSearchIndex() in the indexing phase
- README: document two-step ingest-once/iterate workflow with -f search

Smoke test: 0% -> 40% accuracy, Hit@10 60% -> 80%, MRR 0.18 -> 0.36
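Chunking embed() calls at 100 items could look something like the sketch below. The `EmbedBatch` callback is a hypothetical stand-in for whatever the embedding service calls under the hood; only the batch size of 100 comes from the commit message.

```typescript
// Hypothetical callback that embeds one batch of texts.
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

// Split an array into consecutive slices of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Embed all texts in batches, preserving input order in the output.
async function embedAll(
  texts: string[],
  embedBatch: EmbedBatch,
  batchSize = 100,
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of chunk(texts, batchSize)) {
    const result = await embedBatch(batch);
    if (result.length !== batch.length) {
      throw new Error("embedding count does not match batch size");
    }
    vectors.push(...result);
  }
  return vectors;
}
```

Running batches sequentially keeps the sketch simple and is also friendlier to rate limits than firing every batch at once.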

…eaker attribution

Enrich search results with session metadata via a batched SPARQL query before passing context to the answer LLM. Stores speaker names (schema:creator) and session participants (worlds:speakerA/B) during ingestion, resolves them at search time, and surfaces them in the prompt with improved temporal reasoning and speaker attribution instructions.

smoke-004: 80% accuracy (4/5), up from 40% in smoke-003.
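One way to batch the metadata lookup is a single SELECT with a VALUES clause over all hit IRIs. The sketch below only builds the query string. The graph pattern (messages linked to sessions via prov:wasGeneratedBy, participants on the session node) and the function name are assumptions based on the predicates named above, not the provider's actual query.

```typescript
// Build one SPARQL query that resolves speaker and session participants
// for every message IRI in the list (hypothetical graph layout).
function buildMetadataQuery(messageIris: string[]): string {
  const values = messageIris
    .map((iri) => {
      // Reject characters that are not allowed inside a SPARQL IRIREF.
      if (/[<>"{}|^`\\\s]/.test(iri)) {
        throw new Error(`Invalid IRI: ${iri}`);
      }
      return `<${iri}>`;
    })
    .join(" ");

  return [
    "PREFIX schema: <https://schema.org/>",
    "PREFIX prov: <http://www.w3.org/ns/prov#>",
    "PREFIX worlds: <https://worlds.wazoo.dev/>",
    "SELECT ?message ?speaker ?session ?speakerA ?speakerB WHERE {",
    `  VALUES ?message { ${values} }`,
    "  OPTIONAL { ?message schema:creator ?speaker }",
    "  OPTIONAL {",
    "    ?message prov:wasGeneratedBy ?session .",
    "    OPTIONAL { ?session worlds:speakerA ?speakerA }",
    "    OPTIONAL { ?session worlds:speakerB ?speakerB }",
    "  }",
    "}",
  ].join("\n");
}
```

A single query with VALUES avoids one round trip per search hit, and the OPTIONAL blocks keep hits in the result even when some metadata is missing.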

…bility

Extract structured claims at ingest (retry, disk cache) and query them at search time via entity-aware SPARQL. Interleave facts with hybrid search hits, rate-limit embeddings, and use 3-pass majority voting plus equivalence rubric for more consistent judge scores.
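The 3-pass majority vote for judge scores might be sketched as below. The `Verdict` labels and the `judgeOnce` callback are hypothetical; the PR does not say how the judge's output is represented or how the equivalence rubric is applied.

```typescript
// Hypothetical verdict labels for a single judge pass.
type Verdict = "correct" | "incorrect";

// Return the verdict that more than half of the passes agree on.
// An odd number of passes is required so a tie cannot occur.
function majorityVote(verdicts: Verdict[]): Verdict {
  if (verdicts.length === 0 || verdicts.length % 2 === 0) {
    throw new RangeError("need an odd, non-zero number of verdicts");
  }
  const correct = verdicts.filter((v) => v === "correct").length;
  return correct > verdicts.length / 2 ? "correct" : "incorrect";
}

// Run the judge several times and take the majority verdict.
async function judgeWithVoting(
  judgeOnce: () => Promise<Verdict>,
  passes = 3,
): Promise<Verdict> {
  const verdicts: Verdict[] = [];
  for (let i = 0; i < passes; i++) {
    verdicts.push(await judgeOnce());
  }
  return majorityVote(verdicts);
}
```

Voting over repeated passes dampens single-run noise from a nondeterministic judge model, at the cost of tripling judge calls per question.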

Add a 5-question agent runner with AI SDK tool calling, billing check, and JSONL trace logging for prompt/tool iteration via replay. Expose a WorldsProvider client getter for SPARQL tooling.
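JSONL tracing for replay can be as simple as one JSON object per line. The sketch below shows a round trip under an assumed `TraceEvent` shape; the field names are hypothetical and the runner's actual trace schema is not described in the PR.

```typescript
// Hypothetical trace record; the real schema may carry more fields.
interface TraceEvent {
  step: number;
  kind: "prompt" | "tool_call" | "tool_result" | "answer";
  payload: unknown;
}

// Serialize events as JSON Lines: one compact JSON object per line.
function toJsonl(events: TraceEvent[]): string {
  if (events.length === 0) return "";
  return events.map((event) => JSON.stringify(event)).join("\n") + "\n";
}

// Parse JSON Lines back into events, skipping blank lines.
function fromJsonl(text: string): TraceEvent[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TraceEvent);
}
```

Because each line is independent, a trace can be appended to while a run is in progress and replayed step by step afterwards.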

GitHub Sync Agent and others added 13 commits May 25, 2026 20:28

Persist Worlds provider LibSQL state

b9cdb6c

feat(worlds): add ontology module, SHACL shapes, and typed ingestion

9bb70d4

fix(ontology): use worlds.wazoo.dev namespace

cc62ff8

Merge pull request #3 from wazootech/feat/neurosymbolic-ingestion

e7343e4

feat(worlds): neurosymbolic ingestion harness

chore: normalize line endings (LF → CRLF) and apply formatter

8df71c9

feat(worlds): add agent smoke harness with tool calling

67537d3


EthanThatOneKid wants to merge 13 commits into main from feat/worlds-provider
