Commit 03e3a9b

feat(neo4j): add weighted depth-by-depth traversal query
Add EXPLORE_DEPTH_LEVEL Cypher query that scores nodes using:
- Edge weight from relationship properties
- Cosine similarity between node and query embeddings
- Exponential depth decay penalty
1 parent da20992 commit 03e3a9b
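
The three scoring components named in the commit message are multiplied together in the query added by this commit. The arithmetic can be sketched outside Cypher as follows. This is an illustration only: `ScoreInputs`, `depthPenalty`, and `combinedScore` are hypothetical helper names, not part of the commit, and the 0.5 fallbacks mirror the defaults used in the query text.

```typescript
// Hypothetical helpers that mirror the scoring arithmetic of the Cypher query.
// They are not part of the commit; they only restate the formula in TypeScript.

interface ScoreInputs {
  edgeWeight?: number | null;     // relationshipWeight on the edge, if set
  nodeSimilarity?: number | null; // similarity between neighbor and query embeddings, if available
  currentDepth: number;           // 1-indexed depth level
  depthDecay: number;             // e.g. 0.85
}

// Same fallback value the query uses when a weight or embedding is missing.
const DEFAULT_COMPONENT = 0.5;

// Exponential decay: depth 1 has no penalty, each further level multiplies by depthDecay.
function depthPenalty(currentDepth: number, depthDecay: number): number {
  return Math.pow(depthDecay, currentDepth - 1);
}

// combinedScore = edgeWeight * nodeSimilarity * depthPenalty
function combinedScore(inputs: ScoreInputs): number {
  const edgeWeight = inputs.edgeWeight ?? DEFAULT_COMPONENT;
  const nodeSimilarity = inputs.nodeSimilarity ?? DEFAULT_COMPONENT;
  return edgeWeight * nodeSimilarity * depthPenalty(inputs.currentDepth, inputs.depthDecay);
}
```

For example, an edge weight of 0.8 and a similarity of 0.9 at depth 2 with a decay of 0.85 give 0.8 * 0.9 * 0.85 = 0.612. Because the components are multiplied, a low value in any one of them pulls the whole score down.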

1 file changed: +111 -0 lines changed


src/storage/neo4j/neo4j.service.ts

@@ -179,4 +179,115 @@ export const QUERIES = {
       } as result
     `;
   },
+
+  /**
+   * DEPTH-BY-DEPTH WEIGHTED TRAVERSAL
+   *
+   * This query is called once per depth level, so the caller can score and prune
+   * at each level before deciding which nodes to explore further.
+   *
+   * Parameters ($-prefixed names are Cypher parameters; the last two are function arguments):
+   *   $sourceNodeIds: string[] - Node IDs to explore FROM (at depth 1, just the start node)
+   *   $visitedNodeIds: string[] - Node IDs already visited (excluded to avoid cycles)
+   *   $queryEmbedding: number[] - The original query embedding for similarity scoring
+   *   $currentDepth: number - Which depth level we're at (1-indexed)
+   *   $depthDecay: number - Decay factor per depth (e.g., 0.85 means a 15% penalty per level after the first)
+   *   maxNodesPerDepth: number - Maximum rows to return at this depth (function argument, interpolated into LIMIT)
+   *   direction: 'OUTGOING' | 'INCOMING' | 'BOTH' - Function argument that selects the relationship pattern
+   *
+   * How it works:
+   *
+   * 1. UNWIND $sourceNodeIds - For each node we're exploring FROM
+   * 2. MATCH neighbors - Find all immediate neighbors (1 hop only)
+   * 3. Filter out visited nodes - Avoid cycles
+   * 4. Score each neighbor using:
+   *    - edgeWeight: The relationshipWeight property on the edge (how important is this relationship type?), 0.5 if unset
+   *    - nodeSimilarity: Cosine similarity between neighbor's embedding and query embedding, 0.5 if either is missing
+   *    - depthPenalty: Exponential decay, $depthDecay ^ ($currentDepth - 1)
+   * 5. Combine: combinedScore = edgeWeight * nodeSimilarity * depthPenalty
+   * 6. ORDER BY combinedScore DESC, LIMIT to the top maxNodesPerDepth rows across all source nodes
+   * 7. Return scored neighbors - caller decides which to explore at next depth
+   *
+   * Example flow:
+   *   Depth 1: sourceNodeIds=[startNode], returns top 5 neighbors with scores
+   *   Depth 2: sourceNodeIds=[top 3 from depth 1], returns top 5 neighbors of those
+   *   Depth 3: sourceNodeIds=[top 3 from depth 2], returns top 5 neighbors of those
+   *   ...until maxDepth reached or no more neighbors
+   */
+  EXPLORE_DEPTH_LEVEL: (direction: 'OUTGOING' | 'INCOMING' | 'BOTH' = 'BOTH', maxNodesPerDepth: number = 5) => {
+    // Build relationship pattern based on direction
+    let relPattern = '';
+    if (direction === 'OUTGOING') {
+      relPattern = '-[rel]->';
+    } else if (direction === 'INCOMING') {
+      relPattern = '<-[rel]-';
+    } else {
+      relPattern = '-[rel]-';
+    }
+
+    return `
+      // Unwind the source nodes we're exploring from
+      UNWIND $sourceNodeIds AS sourceId
+      MATCH (source) WHERE source.id = sourceId
+
+      // Find immediate neighbors (exactly 1 hop)
+      MATCH (source)${relPattern}(neighbor)
+
+      // Filter: skip already visited nodes to avoid cycles
+      WHERE NOT neighbor.id IN $visitedNodeIds
+
+      // Calculate the three scoring components
+      WITH source, neighbor, rel,
+
+        // 1. Edge weight: how important is this relationship type?
+        // Falls back to 0.5 if not set
+        COALESCE(rel.relationshipWeight, 0.5) AS edgeWeight,
+
+        // 2. Node similarity: how relevant is this node to the query?
+        // Uses cosine similarity if neighbor has an embedding
+        // Falls back to 0.5 if no embedding (structural nodes like decorators)
+        CASE
+          WHEN neighbor.embedding IS NOT NULL AND $queryEmbedding IS NOT NULL
+          THEN vector.similarity.cosine(neighbor.embedding, $queryEmbedding)
+          ELSE 0.5
+        END AS nodeSimilarity,
+
+        // 3. Depth penalty: exponential decay
+        // depth 1: decay^0 = 1.0 (no penalty)
+        // depth 2: decay^1 = 0.85 (if decay=0.85)
+        // depth 3: decay^2 = 0.72
+        // This ensures closer nodes are preferred
+        ($depthDecay ^ ($currentDepth - 1)) AS depthPenalty
+
+      // Combine into final score
+      WITH source, neighbor, rel, edgeWeight, nodeSimilarity, depthPenalty,
+        (edgeWeight * nodeSimilarity * depthPenalty) AS combinedScore
+
+      // Return all neighbor data with scores
+      RETURN {
+        node: {
+          id: neighbor.id,
+          labels: labels(neighbor),
+          properties: apoc.map.removeKeys(properties(neighbor), ['embedding'])
+        },
+        relationship: {
+          type: type(rel),
+          startNodeId: startNode(rel).id,
+          endNodeId: endNode(rel).id,
+          properties: properties(rel)
+        },
+        sourceNodeId: source.id,
+        scoring: {
+          edgeWeight: edgeWeight,
+          nodeSimilarity: nodeSimilarity,
+          depthPenalty: depthPenalty,
+          combinedScore: combinedScore
+        }
+      } AS result
+
+      // Sort by score and limit to top N per depth
+      ORDER BY combinedScore DESC
+      LIMIT ${maxNodesPerDepth}
+    `;
+  },
};
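
The doc comment leaves the choice of which nodes to explore next to the caller ("top 3 from depth 1" in its example flow). One way that step between two depth levels might look is sketched below. Everything here is an assumption rather than code from this commit: `selectNextFrontier`, `DepthResult`, `FrontierStep`, and `beamWidth` are hypothetical names, the result shape is reduced to the fields the sketch needs, and marking every returned neighbor as visited (not only the selected ones) is one possible policy among several.

```typescript
// Hypothetical caller-side helper: given the rows returned by one call of the
// depth-level query, pick the source nodes for the next depth.

// Reduced view of one `result` row returned by the query.
interface DepthResult {
  node: { id: string };
  sourceNodeId: string;
  scoring: { combinedScore: number };
}

interface FrontierStep {
  nextSourceNodeIds: string[]; // value for $sourceNodeIds at the next depth
  visitedNodeIds: string[];    // value for $visitedNodeIds at the next depth
}

function selectNextFrontier(
  results: DepthResult[],
  visitedNodeIds: string[],
  beamWidth: number, // how many nodes to expand next, e.g. 3
): FrontierStep {
  // The same neighbor can be reached from several sources, so keep its best score only.
  const bestScoreById = new Map<string, number>();
  for (const row of results) {
    const previous = bestScoreById.get(row.node.id);
    if (previous === undefined || row.scoring.combinedScore > previous) {
      bestScoreById.set(row.node.id, row.scoring.combinedScore);
    }
  }

  // Rank unique neighbors by score, highest first, and keep the top beamWidth.
  const ranked = Array.from(bestScoreById.entries()).sort((a, b) => b[1] - a[1]);
  const nextSourceNodeIds = ranked.slice(0, beamWidth).map((entry) => entry[0]);

  // Assumed policy: every neighbor seen at this depth counts as visited.
  const visited = new Set<string>(visitedNodeIds);
  bestScoreById.forEach((_score, id) => visited.add(id));

  return { nextSourceNodeIds, visitedNodeIds: Array.from(visited) };
}
```

A caller would run the query with `$currentDepth` set to 1, pass the rows to this helper, then feed the returned IDs into the next call with `$currentDepth` incremented, stopping when `nextSourceNodeIds` is empty or a maximum depth is reached.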
