Sort nodeset on demand#330
Open
tompng wants to merge 1 commit into
Open
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adjusts REXML XPath evaluation to return consistently ordered node-sets (via centralized sorting) and updates tests to handle XPath primitive results returned as single-element arrays.
Changes:
- Centralizes ordering by using
XPathParser.sort(...)inmatchand some call sites, and makessorta class method. - Simplifies
step(...)by always de-duplicating via identitySetand removing theaxis_orderparameter. - Updates the Jaxen test helper to unwrap primitive XPath results before calling
REXML::Functions.string.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/test_jaxen.rb | Unwraps primitive XPath results (single-element arrays) to keep Functions.string calls compatible. |
| lib/rexml/xpath_parser.rb | Reworks node-set ordering and de-duplication; changes sort to a class method; removes reverse-axis handling in step. |
| lib/rexml/functions.rb | Sorts node-sets before iterating / stringifying to make results deterministic. |
Comments suppressed due to low confidence (1)
lib/rexml/xpath_parser.rb:1
stepno longer sorts the merged node-set (it used to callsort(...)for the multi-nodeset case). Returningnodes.to_amakes ordering dependent on insertion order throughSet, which can change predicate behavior where ordering is significant (e.g., later[1]filters orposition()), and can introduce non-determinism across Ruby versions/Set behavior. Consider sorting the merged result before returning (and keep de-duplication by identity).
# frozen_string_literal: false
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
155
to
161
| result = expr(path_stack, nodeset) | ||
| case result | ||
| when Array # nodeset | ||
| result.uniq | ||
| XPathParser.sort(result) | ||
| else | ||
| [result] | ||
| end |
Comment on lines
87
to
89
| XPathParser.sort(node_set.to_a).each do |node| | ||
| result << yield(node) if node.respond_to?(:namespace) | ||
| end |
Merged
naitoh
pushed a commit
that referenced
this pull request
Jun 14, 2026
Refactor `step` so that nodeset materialization is deferred. Instead of building the full nodeset up front and filtering through predicates, each axis returns a *scan descriptor* (`[generator_name, generator_argument]`), and `step` picks a scan strategy based on the predicates' shape. Predicates are classified into three groups: | kind | examples | strategy passed to the generator | |---|---|---| | position-independent | `[@A="1"]`, `[name()="foo"]`, `[@A=@b]` | `:uniq` — emit deduplicated matching nodes | | simple positional | `[N]`, `[position()=N]`, `[position()>N]`, `[position()<N]` | `[op, value]` — positional scan with one comparison | | complex / position-dependent | `[position()*@A]`, `[last()-1]`, ... | `:nodesets` — fall back to per-anchor nodesets + the previous `evaluate_predicate` pipeline | Mixed predicate lists are split: position-independent predicates *before* the first positional predicate are folded into the node test; predicates *after* it are applied per-node on the result. Each axis can implement zero, one, or all of the three strategies. If a strategy is not implemented, the generator falls back to producing `:nodesets` and the common slow path (`non_optimized_nodesets_select`) handles dedup / positional filtering on flattened nodesets — i.e. the same behavior as before this PR. This pull request adds fast paths for: - `descendant` / `descendant-or-self`: `:uniq` (single DFS with a seen-set; this is what speeds up `//a//a//a//a`) - `ancestor` / `ancestor-or-self`: `:uniq` (parent-chain walk with a seen-set) - `preceding-sibling` / `following-sibling`: `:uniq` and `[op, value]` (sibling scan with anchor-index tracking) Other axes (`child`, `parent`, `self`, `attribute`, `preceding`, `following`, etc) keeps the previous behavior via the fallback path; they can be optimized in follow-ups without changing call sites. ## Detail For `//a//a//a` style queries, the previous code built nodesets keyed by each anchor, including the same descendant once per anchor. The new `:uniq` path scans every node at most once per step. For `[position() > N]` style predicates on wide trees (e.g. `//a/preceding-sibling::*[position()>2]`), we previously built the full preceding-sibling nodeset for each anchor and then ran `evaluate_predicate`. The new `[op, value]` path scans children once per parent and uses anchor-index bookkeeping to recover per-anchor positions. Note: general XPath cannot be linear — e.g. `*[position() * number(@A) % number(@b) = 1]` is genuinely O(n²) — so the goal is only to add a fast-path for specific case: position-independent predicates and simple-positional predicates. ## Benchmark of best case ```ruby DEPTH = 500 xml = '<a>' * DEPTH + '</a>' * DEPTH doc = REXML::Document.new(xml) WIDTH = 1000 xml_wide = '<root>' + '<child/>' * WIDTH + '</root>' doc_wide = REXML::Document.new(xml_wide) REXML::XPath.match(doc, "//a//a"); # processing time: 30.756939s → 0.126807s REXML::XPath.match(doc_wide, "//*/preceding-sibling::*[position()=10]"); # processing time: 2.446333s → 0.083954s ``` ## Benchmark of various case ### Scenario ```yaml prelude: | require "rexml" xml_wide = "<root>" + (1..1000).map { |i| "<item id='#{i}'/>" }.join + "</root>" wide = REXML::Document.new(xml_wide) xml_deep = "<root>" + (1..1000).map { |i| "<item id='#{i}'>" }.join + '</item>'*1000 + "</root>" deep = REXML::Document.new(xml_deep) benchmark: child: REXML::XPath.match(wide, "root/item") descendant: REXML::XPath.match(deep, "//item") descendant-descendant: REXML::XPath.match(deep, "//item//item") descendant-descendant-wildcard: REXML::XPath.match(deep, "//*//*") ancestor-descendant: REXML::XPath.match(deep, "descendant::*/ancestor::*/descendant::*") preceding-following-sibling: REXML::XPath.match(wide, "//*/preceding-sibling::*/following-sibling::*") preceding-following-sibling-positional: REXML::XPath.match(wide, "//*/preceding-sibling::*[10]/following-sibling::*[10]") ``` ### Compares master, xpath_step_optimize (this pull), sort_on_demand(#330), sort_improve(Emulate ideal sort computation time), and its combinations. There's no implementation of `sort_improve` yet, so I used the code below to emulate the computational cost of ideal sort. ```ruby def sort(array_of_nodes) # Just spend time to emulate the ideal computational cost of sorting nodes parents = Set.new.compare_by_identity array_of_nodes.each { parents << it.parent if it.parent } 4.times do # find the common ancestor nodes = array_of_nodes seen = Set.new.compare_by_identity while nodes.size >= 2 new_nodes = Set.new.compare_by_identity nodes.map(&:parent).each do |parent| if parent && !seen.include?(parent) seen << parent new_nodes << parent end end nodes = new_nodes end # iterate each node's siblings parents.each{it.children.each{}} end array_of_nodes # not sorted end ``` ### Result ``` Comparison: child master: 1288.1 i/s master_sort_improve: 1190.4 i/s - 1.08x slower xpath_step_optimize_sort_on_demand_sort_improve: 875.3 i/s - 1.47x slower xpath_step_optimize_sort_improve: 861.3 i/s - 1.50x slower xpath_step_optimize_sort_on_demand: 92.3 i/s - 13.96x slower xpath_step_optimize: 91.7 i/s - 14.05x slower sort_on_demand: 90.6 i/s - 14.21x slower descendant master_sort_improve: 75.5 i/s xpath_step_optimize_sort_on_demand_sort_improve: 75.1 i/s - 1.01x slower xpath_step_optimize_sort_improve: 68.8 i/s - 1.10x slower sort_on_demand: 21.4 i/s - 3.52x slower xpath_step_optimize_sort_on_demand: 21.4 i/s - 3.52x slower master: 20.9 i/s - 3.61x slower xpath_step_optimize: 11.7 i/s - 6.45x slower descendant-descendant xpath_step_optimize_sort_on_demand_sort_improve: 47.5 i/s xpath_step_optimize_sort_improve: 41.9 i/s - 1.13x slower xpath_step_optimize_sort_on_demand: 17.9 i/s - 2.65x slower master_sort_improve: 8.6 i/s - 5.54x slower sort_on_demand: 6.7 i/s - 7.07x slower xpath_step_optimize: 6.1 i/s - 7.84x slower master: 4.6 i/s - 10.24x slower descendant-descendant-wildcard xpath_step_optimize_sort_on_demand_sort_improve: 339.5 i/s xpath_step_optimize_sort_improve: 155.9 i/s - 2.18x slower xpath_step_optimize_sort_on_demand: 26.4 i/s - 12.86x slower master_sort_improve: 10.0 i/s - 33.96x slower sort_on_demand: 7.7 i/s - 44.30x slower xpath_step_optimize: 6.8 i/s - 50.06x slower master: 4.9 i/s - 68.58x slower ancestor-descendant xpath_step_optimize_sort_on_demand_sort_improve: 377.9 i/s xpath_step_optimize_sort_improve: 203.7 i/s - 1.85x slower xpath_step_optimize_sort_on_demand: 26.3 i/s - 14.39x slower xpath_step_optimize: 8.7 i/s - 43.24x slower master_sort_improve: 7.8 i/s - 48.55x slower sort_on_demand: 6.3 i/s - 59.93x slower master: 5.0 i/s - 75.46x slower preceding-following-sibling xpath_step_optimize_sort_on_demand_sort_improve: 684.1 i/s xpath_step_optimize_sort_improve: 424.5 i/s - 1.61x slower xpath_step_optimize_sort_on_demand: 85.8 i/s - 7.98x slower xpath_step_optimize: 23.7 i/s - 28.91x slower master_sort_improve: 20.9 i/s - 32.72x slower sort_on_demand: 19.3 i/s - 35.39x slower master: 13.9 i/s - 49.11x slower preceding-following-sibling-positional xpath_step_optimize_sort_on_demand_sort_improve: 425.4 i/s xpath_step_optimize_sort_improve: 315.0 i/s - 1.35x slower xpath_step_optimize_sort_on_demand: 84.3 i/s - 5.05x slower xpath_step_optimize: 23.3 i/s - 18.22x slower master_sort_improve: 2.1 i/s - 201.38x slower sort_on_demand: 2.1 i/s - 204.75x slower master: 1.9 i/s - 222.08x slower ``` In scenario "child" and "descendant", this PR is slower than master because it adds one additional `sort` call. The difference will be small when `sort` is improved. In most case, this PR itself does not unleash its full potential because sort is the next bottleneck. Combining with `sort` improvement is important. The difference of "descendant-descendant" and "descendant-descendant-wildcard" shows that after optimizing sort, the bottleneck will be namespace lookup in qname check for deeply nested xml.
Contributor
|
@tompng |
Comment on lines
149
to
153
| else | ||
| case object | ||
| when Array | ||
| string(object[0]) | ||
| string(XPathParser.sort(object)[0]) | ||
| when Float |
Comment on lines
+88
to
92
| if node_set.kind_of? Array | ||
| result = [] | ||
| node_set.each do |node| | ||
| XPathParser.sort(node_set.to_a).each do |node| | ||
| result << yield(node) if node.respond_to?(:namespace) | ||
| end |
| if node_set.kind_of? Array | ||
| result = [] | ||
| node_set.each do |node| | ||
| XPathParser.sort(node_set.to_a).each do |node| |
Comment on lines
156
to
162
| result = expr(path_stack, nodeset) | ||
| case result | ||
| when Array # nodeset | ||
| result.uniq | ||
| XPathParser.sort(result) | ||
| else | ||
| [result] | ||
| end |
Comment on lines
621
to
+624
| nodes << node | ||
| end | ||
| end | ||
| new_nodeset = sort(nodes.to_a) | ||
| new_nodeset = nodes.to_a |
Comment on lines
+90
to
+95
| # XPath.match can be a nodeset or a primitive value wrapped in an array. | ||
| # We need to unwrap primitive value because Functions doesn't accept array which is not a nodeset. | ||
| unless matched.all? { |node| node.is_a?(REXML::Node) } | ||
| raise "[BUG] Primitive value should be a single value: #{matched.inspect}" if matched.size != 1 | ||
| matched = matched.first | ||
| end |
Comment on lines
156
to
161
| result = expr(path_stack, nodeset) | ||
| case result | ||
| when Array # nodeset | ||
| result.uniq | ||
| XPathParser.sort(result) | ||
| else | ||
| [result] |
Comment on lines
85
to
93
| if node_set == nil | ||
| yield @context[:node] if @context[:node].respond_to?(:namespace) | ||
| else | ||
| if node_set.respond_to? :each | ||
| if node_set.kind_of? Array | ||
| result = [] | ||
| node_set.each do |node| | ||
| XPathParser.sort(node_set.to_a).each do |node| | ||
| result << yield(node) if node.respond_to?(:namespace) | ||
| end | ||
| result |
| case object | ||
| when Array | ||
| string(object[0]) | ||
| string(XPathParser.sort(object).first) |
Comment on lines
+92
to
+95
| unless matched.all? { |node| node.is_a?(REXML::Node) } | ||
| assert_equal(1, matched.size, 'Primitive value should be a single value') | ||
| matched = matched.first | ||
| end |
Member
Author
|
Resolve done, updated PR description and added some test that fails when |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Delay sorting, only sort when it is needed.
In most case, sorting nodeset is not needed. Sort is only required in:
Number of sort operations
/a/b/c/d/e(a/b/c/d)[position()>1]/e/f/gnumber(/a/b/c/d/e)count(/a/b/c/d/e)//a//b//c//d//e/a[1]/b[1]/c[1]/d[1]/e#315 removed one
nodesets.size == 1optimization path. This pull request will reduce the performance regression.To reduce more sort calls, we need to mark nodeset ordering: introducing
Nodeset = Struct.new(:nodes, :order)but IMO, it shouldn't be done now. If
sortis optimized, one extra sort won't be a problem. Optimizingstepwill be harder and the code may be complicated.Note
This pull request will slightly add complexity and a risk to forgot sorting the nodeset in some path.
The effect may seem drastic in some case for now, but it's just because
sortis currently worstO(n^2). We can improvesortperformance, so there's an option to leave the sort strategy simple.