fix(phenoData): surreal-bridge SurrealDB v3 API migration by KooshaPari · Pull Request #47 · KooshaPari/phenoData

KooshaPari · 2026-05-02T21:17:55Z

User description

Summary

SurrealDB v3 removed surrealdb::sql::Thing, surrealdb::sql::Value, the SurrealValue trait, and the typed .create().content() API.

Changes

RecordId is now a String type alias ("table:id" format) instead of a struct with surrealdb::sql::Thing
All record/Value operations replaced with serde_json::Value
Used raw SQL via db.query() for CREATE ... CONTENT $data RETURN id
select() returns Vec<serde_json::Value> directly deserializable to domain structs
Added extract_record_id() helper handling v3's id-object format ({"tb": "...", "id": "..."})
Added tempfile dev-dependency for test

Test

cargo test --package surreal-bridge → 1 pass, 0 fail

COAUTHORED_BY: Claude Opus 4.7 noreply@anthropic.com

Note

Medium Risk
Updates persistence/query code to use SurrealDB v3’s raw SQL + serde_json::Value flow and changes record ID representation, which can affect data serialization/deserialization and query results. Scope is contained to surreal-bridge plus small config additions.

Overview
Migrates crates/surreal-bridge to SurrealDB v3 by replacing typed record APIs with raw db.query() CREATE ... CONTENT $data RETURN id and serde_json::Value-based (de)serialization for Skill/Embedding and vector search results.

Changes record IDs to a String ("table:id") with a new extract_record_id() helper, updates tests accordingly (adds tempfile dev-dependency), and adds repo hygiene configs (FUNDING.yml update and new trufflehog.yml).

Reviewed by Cursor Bugbot for commit 0a425f0. Bugbot is set up for automated code reviews on this repo.

CodeAnt-AI Description

Move SurrealDB storage to the v3-compatible format

What Changed

Skill and embedding records now save and load through SurrealDB v3’s JSON-based flow, so the bridge keeps working with the newer database API
Record IDs are now returned as plain table:id strings, and existing ID shapes from v3 are handled when records are created
Skill and search results are converted from raw database output into app data before use, so queries still return usable records
Added repo funding and secret-scan config files, and updated the bridge test setup for the new database path format

Impact

✅ SurrealDB v3 compatibility
✅ Fewer record creation failures
✅ Working skill and embedding lookups

SurrealDB v3 removed `surrealdb::sql::Thing`, `surrealdb::sql::Value`, the `SurrealValue` trait, and the typed `.create().content()` API.

- Replaced all record/Value operations with serde_json::Value
- `RecordId` is now a String alias ("table:id" format)
- Used raw SQL via `db.query()` for CREATE + RETURN id
- `select()` returns Vec<serde_json::Value> directly
- Added `extract_record_id()` helper for v3's id-object format
- Added tempfile dev-dependency for test

COAUTHORED_BY: Claude Opus 4.7 <noreply@anthropic.com>

gemini-code-assist · 2026-05-02T21:17:58Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

codeant-ai · 2026-05-02T21:17:58Z

CodeAnt AI is reviewing your PR.

coderabbitai · 2026-05-02T21:18:01Z

Warning

Rate limit exceeded

@KooshaPari has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 27 minutes and 53 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3d54c106-621b-4b90-ba9b-5a0232b88eaa

📥 Commits

Reviewing files that changed from the base of the PR and between 94394dc and 0a425f0.

📒 Files selected for processing (4)

FUNDING.yml
crates/surreal-bridge/Cargo.toml
crates/surreal-bridge/src/lib.rs
trufflehog.yml

codeant-ai · 2026-05-02T21:20:24Z

+                "SELECT *, vector::distance::cosine(embedding, $query) AS score \
+                 FROM embedding ORDER BY score ASC LIMIT $limit",
+            )
+            .bind(("query", serde_json::json!(query)))
            .bind(("limit", limit))
            .await?
            .take(0)?;
-
-        Ok(results)
+
+        let scored: Vec<ScoredEmbedding> = results
+            .into_iter()
+            .filter_map(|r| serde_json::from_value(r).ok())
+            .collect();


🟠 Architect Review — HIGH

search_similar() uses the embedding field in its SQL but persisted records are written from Embedding { vector: ... }, and ScoredEmbedding also expects an embedding field; this schema mismatch means deserialization into ScoredEmbedding fails and filter_map(... .ok()) silently drops all rows, so similarity searches return empty/incomplete results in normal usage.

Suggestion: Align the write/read/search schema on a single vector field name across Embedding, the SQL query, and ScoredEmbedding, and treat decode failures as errors rather than silently omitting rows. Add an end-to-end test that stores an embedding then successfully retrieves it via search_similar().

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.

cursor · 2026-05-02T21:21:02Z

-            .bind(("query", query))
+        let results: Vec<serde_json::Value> = self.db
+            .query(
+                "SELECT *, vector::distance::cosine(embedding, $query) AS score \


Field name mismatch causes empty search results

High Severity

The Embedding struct stores its vector data in a field called vector, but the SQL query in search_similar references a field called embedding (vector::distance::cosine(embedding, $query)), and the ScoredEmbedding struct expects a field called embedding rather than vector. The cosine distance computation operates on a non-existent field, and every result fails deserialization — but filter_map with .ok() silently swallows all errors, so search_similar always returns an empty Vec instead of surfacing the failure.

Additional Locations (1)

crates/surreal-bridge/src/lib.rs#L156-L157

cursor · 2026-05-02T21:21:02Z

+        let skills: Vec<Skill> = records
+            .into_iter()
+            .filter_map(|r| serde_json::from_value(r).ok())
+            .collect();


Silent deserialization drops mask data retrieval failures

Medium Severity

Both query_skills and search_similar use filter_map with .ok() to silently discard any records that fail to deserialize. If SurrealDB v3 returns the id field in object format ({"tb": "...", "id": "..."}) — the very format extract_record_id was added to handle — deserializing it into Option<String> will fail, and every record will be silently dropped, returning an empty Vec with no error.

Additional Locations (1)

crates/surreal-bridge/src/lib.rs#L88-L92

codeant-ai · 2026-05-02T21:21:46Z

 use surrealdb::engine::local::{Db, RocksDb};
 use surrealdb::Surreal;

+pub type RecordId = String;


Suggestion: Using a plain String alias for record IDs is incompatible with SurrealDB v3 responses that can emit structured ID objects ({"tb": "...", "id": ...}), so direct struct deserialization will fail for those records. Replace this alias with a custom ID type/deserializer that accepts both string and object ID formats. [type error]

Severity Level: Critical 🚨

- ❌ RecordId = String mismatches SurrealDB v3 ID objects.
- ⚠️ Skills and embeddings fail deserialization when IDs are objects.

Steps of Reproduction ✅

1. Note that SurrealDB v3 can return record IDs in an object form; the helper `extract_record_id` defined at `crates/surreal-bridge/src/lib.rs:97-116` explicitly documents and handles the case where a response is `{"id": {"tb": "skill", "id": "..."}}`, converting that SurrealDB ID object into a `"table:id"` string.
2. For read operations, `PhenoSurreal::query_skills` at `crates/surreal-bridge/src/lib.rs:50-56` calls `self.db.select("skill")`, which returns JSON rows including an `"id"` field representing the record ID; in v3, this `"id"` may legitimately be an object (e.g., `{"tb": "skill", "id": "..."}`) rather than a plain `"skill:..."` string.
3. The `Skill` struct at `crates/surreal-bridge/src/lib.rs:119-129` declares `pub id: Option<RecordId>` with `RecordId` aliased to `String` at `crates/surreal-bridge/src/lib.rs:15`, so `serde_json::from_value::<Skill>` used in `query_skills` (lines 52-55) expects the JSON `"id"` field to be a string; when SurrealDB returns the documented object form, deserialization fails with a type error because a map is provided where a string is expected.
4. Because `query_skills` wraps `serde_json::from_value` in `.ok()` and `filter_map` (crates/surreal-bridge/src/lib.rs:52-55), any row whose `"id"` is an object is silently dropped, meaning that callers of `query_skills` (and any future reads using `Embedding` or `ScoredEmbedding` IDs) lose access to valid records solely because `RecordId` is a plain `String` alias instead of a custom type or deserializer capable of accepting both string and structured ID-object formats used by SurrealDB v3.

codeant-ai · 2026-05-02T21:21:46Z

+                "SELECT *, vector::distance::cosine(embedding, $query) AS score \
+                 FROM embedding ORDER BY score ASC LIMIT $limit",


Suggestion: The vector similarity query references embedding, but stored records use the vector field, so the cosine expression will evaluate against a non-existent field and fail (or produce invalid scores). Use the stored vector field name in the query so scoring runs against real data. [logic error]

Severity Level: Critical 🚨

- ❌ search_similar fails due to querying nonexistent embedding column.
- ⚠️ Blocks vector similarity features using PhenoSurreal::search_similar API.

Steps of Reproduction ✅

1. Initialize an embedded SurrealDB instance using `PhenoSurreal::new` defined in `crates/surreal-bridge/src/lib.rs:22-28`, which creates a `Surreal<Db>` and selects the `pheno` namespace and `main` database.
2. From any caller (e.g., future API layer or test), invoke `PhenoSurreal::search_similar(&[0.1_f32; 3], 10)` implemented at `crates/surreal-bridge/src/lib.rs:78-94`.
3. Inside `search_similar`, SurrealDB executes the query string at `crates/surreal-bridge/src/lib.rs:81-82`: `SELECT *, vector::distance::cosine(embedding, $query) AS score FROM embedding ORDER BY score ASC LIMIT $limit`, which computes cosine distance against field `embedding`, even though the stored records defined by `Embedding` at `crates/surreal-bridge/src/lib.rs:145-151` use the field `vector: Vec<f32>` as the embedding column.
4. Because the `embedding` column does not exist on the `embedding` table (the actual numeric vector is in `vector`), SurrealDB v3 will evaluate `vector::distance::cosine(embedding, $query)` against a non-existent field, causing the query to fail or produce `NULL` scores, and the `.await?` at `crates/surreal-bridge/src/lib.rs:86` will return an error instead of valid similarity results for any caller of `search_similar`.
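
For reference, the ranking that the query is meant to perform can be sketched in plain Rust. This assumes the conventional definition of cosine distance (1 minus cosine similarity) and does not reproduce SurrealDB's own function; `cosine_distance` and `rank` are hypothetical names. It shows why ascending order puts the closest vectors first, and why scoring only works when the stored field actually holds the vector.

```rust
/// Cosine distance under the conventional definition: 1 - cos(angle between a and b).
/// Returns None for mismatched lengths or zero-length vectors.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

/// Rank stored vectors by ascending distance to `query`, keeping at most `limit`.
/// Any vector that cannot be scored makes the whole call return None.
fn rank<'a>(
    query: &[f32],
    stored: &[(&'a str, Vec<f32>)],
    limit: usize,
) -> Option<Vec<(&'a str, f32)>> {
    let mut scored: Vec<(&str, f32)> = stored
        .iter()
        .map(|(id, v)| cosine_distance(v, query).map(|d| (*id, d)))
        .collect::<Option<Vec<_>>>()?;
    // Ascending: the smallest distance is the most similar vector.
    scored.sort_by(|l, r| l.1.total_cmp(&r.1));
    scored.truncate(limit);
    Some(scored)
}

fn main() {
    let stored: Vec<(&str, Vec<f32>)> = vec![
        ("embedding:far", vec![0.0, 1.0]),
        ("embedding:near", vec![1.0, 0.1]),
    ];
    let top = rank(&[1.0, 0.0], &stored, 1).unwrap();
    assert_eq!(top.len(), 1);
    assert_eq!(top[0].0, "embedding:near");
    // A query of the wrong dimension cannot be scored.
    assert!(rank(&[1.0, 0.0, 0.0], &stored, 1).is_none());
}
```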

codeant-ai · 2026-05-02T21:21:50Z

CodeAnt AI finished reviewing your PR.

Phenotype Agent added 2 commits May 2, 2026 13:54

chore: bootstrap FUNDING.yml trufflehog.yml

2be7cc0

Copilot AI review requested due to automatic review settings May 2, 2026 21:17

resolve: keep our FUNDING.yml from surreal-v3-migration

0a425f0

Copilot started reviewing on behalf of KooshaPari May 2, 2026 21:18 View session

KooshaPari merged commit f141624 into main May 2, 2026
6 of 9 checks passed

KooshaPari deleted the fix/surreal-v3-migration branch May 2, 2026 21:18

codeant-ai Bot added the size:L label (This PR changes 100-499 lines, ignoring generated files) May 2, 2026

codeant-ai Bot reviewed May 2, 2026

cursor Bot reviewed May 2, 2026

codeant-ai Bot reviewed May 2, 2026

KooshaPari review requested due to automatic review settings May 2, 2026 21:38

KooshaPari merged 3 commits into main from fix/surreal-v3-migration
