There's a bug in the planner that over-eagerly prunes aliases that WHERE clauses need in order to filter a query's results. I noticed it when writing multi-match queries (queries that contain more than two match clauses) combined with a predicate filter.
In the planning stage, aliases that are referenced in the WHERE clause (not in RETURN) can get pruned from the logical plan before the filter is built. When the planner then tries to translate WHERE alias.prop into a filter, the schema no longer contains alias__prop, so queries that are perfectly reasonable (and work in Kuzu/Ladybug/Neo4j) fail in lance-graph.
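To make the failure mode described above concrete, here is a toy model of column pruning in plain Python. It is only a sketch under assumptions: `referenced_columns` and `prune` are hypothetical names, the schema list is made up for illustration, and none of this is lance-graph's actual planner code. It shows that collecting required columns from RETURN alone drops the `alias__prop` columns the filter needs, while also collecting them from WHERE keeps them.

```python
import re

# Hypothetical helpers for illustration only; not lance-graph's planner code.
ALIAS_PROP = re.compile(r"\b([A-Za-z_]\w*)\.([A-Za-z_]\w*)")


def referenced_columns(expr: str) -> set[str]:
    """Collect alias.prop references as qualified alias__prop column names."""
    return {f"{alias}__{prop}" for alias, prop in ALIAS_PROP.findall(expr)}


def prune(schema: list[str], return_expr: str, where_expr: str,
          include_where: bool) -> list[str]:
    """Keep only the columns that the query is known to need."""
    required = referenced_columns(return_expr)
    if include_where:
        # Columns used by the filter must survive until the filter is built.
        required |= referenced_columns(where_expr)
    return [col for col in schema if col in required]


# Assumed flattened schema after the joins (illustrative only).
schema = ["c__id", "c__name", "co__id", "co__name", "p__id", "h__id", "h__name"]
where_expr = 'co.name = "France" AND h.name = "Chess"'
return_expr = "p.id AS id"

buggy = prune(schema, return_expr, where_expr, include_where=False)
fixed = prune(schema, return_expr, where_expr, include_where=True)

# Columns the filter needs but the RETURN-only pruning removed.
missing = referenced_columns(where_expr) - set(buggy)
print(sorted(missing))
```

In this toy model, the RETURN-only variant leaves the filter with nothing to bind `co.name` or `h.name` to, which is the same kind of "no field named" situation as in the error reported below.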
Repro
The following code reproduces the issue.
import pyarrow as pa
from lance_graph import CypherQuery, GraphConfig
cfg = (
    GraphConfig.builder()
    .with_node_label("Person", "id")
    .with_node_label("City", "id")
    .with_node_label("Country", "id")
    .with_node_label("Hobby", "id")
    .with_relationship("livesIn", "src", "dst")
    .with_relationship("hasHobby", "src", "dst")
    .with_relationship("inCountry", "src", "dst")
    .build()
)
datasets = {
    "Person": pa.table({"id": [1]}),
    "City": pa.table({"id": [10], "name": ["Paris"]}),
    "Country": pa.table({"id": [100], "name": ["France"]}),
    "Hobby": pa.table({"id": [20], "name": ["Chess"]}),
    "livesIn": pa.table({"src": [1], "dst": [10]}),
    "hasHobby": pa.table({"src": [1], "dst": [20]}),
    "inCountry": pa.table({"src": [10], "dst": [100]}),
}
query = """
MATCH (c:City)-[:inCountry]->(co:Country),
(p:Person)-[:livesIn]->(c),
(p)-[:hasHobby]->(h:Hobby)
WHERE co.name = "France" AND h.name = "Chess"
RETURN p.id AS id
"""
result = CypherQuery(query).with_config(cfg).execute(datasets)
print(result)
Gives:
Traceback (most recent call last):
  File "/Users/prrao/code/graph-benchmark-ldbc-snb/lance_graph/t.py", line 35, in <module>
    result = CypherQuery(query).with_config(cfg).execute(datasets)
ValueError: Query planning error: Failed to build filter: Schema error: No field named co__name. Did you mean 'c__name'?.
This can happen regardless of whether the alias is on the left or right side of a pattern. The common factor in reproducing the issue is a multi-match query that filters in WHERE on an alias that does not appear in RETURN.
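For reference, the result the repro query would be expected to produce can be worked out by hand. The sketch below evaluates the same three patterns and the same WHERE predicate over the repro rows in plain Python, assuming standard Cypher pattern-matching semantics; `expected_ids` is a hypothetical helper and does not use lance-graph at all.

```python
# Same rows as the repro tables, as plain Python structures.
country_name = {100: "France"}
hobby_name = {20: "Chess"}
lives_in = [(1, 10)]      # (person id, city id)
has_hobby = [(1, 20)]     # (person id, hobby id)
in_country = [(10, 100)]  # (city id, country id)


def expected_ids() -> list[int]:
    """Nested-loop evaluation of the three patterns plus the WHERE filter."""
    ids = []
    for c, co in in_country:              # (c:City)-[:inCountry]->(co:Country)
        for p, lives_c in lives_in:       # (p:Person)-[:livesIn]->(c)
            if lives_c != c:
                continue
            for hobby_p, h in has_hobby:  # (p)-[:hasHobby]->(h:Hobby)
                if hobby_p != p:
                    continue
                # WHERE co.name = "France" AND h.name = "Chess"
                if country_name[co] == "France" and hobby_name[h] == "Chess":
                    ids.append(p)         # RETURN p.id AS id
    return ids


print(expected_ids())  # [1]
```

Under these assumptions, a successful run of the repro would return a single row with `id = 1` rather than raising a planning error.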
Environment
The following environment was used to test this:
lance-graph 0.4.0
Python 3.13
macOS Tahoe 26.2