Sort nodeset on demand by tompng · Pull Request #330 · ruby/rexml

tompng · 2026-06-09T17:06:08Z

Delay sorting and only sort when it is needed.
In most cases, sorting a nodeset is not needed. A sort is only required for:

- The final result
- Creating nodesets (each nodeset should be axis-ordered) from a single nodeset
  - Ideally, this can be skipped if the following predicate is not position-dependent
- A nodeset passed to a function (the first node in document order is used)
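
As a rough illustration of the idea (not REXML's actual code), the sketch below models nodes as structs carrying a hypothetical `order` field for document position. Intermediate steps only de-duplicate, and the single sort happens when the final result is returned. `DocNode`, `SortCounter`, `child_step`, and `evaluate` are made-up names for this example.

```ruby
require "set"

# Hypothetical stand-in for an XML node; `order` is its document-order index.
DocNode = Struct.new(:name, :order, :children)

# Counts how often we sort, to mirror the "Number of sort operations" table.
class SortCounter
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def sort(nodes)
    @calls += 1
    nodes.sort_by(&:order)
  end
end

# One location step: collect matching children of every node, de-duplicated
# by identity but NOT sorted.
def child_step(nodes, name)
  seen = Set.new.compare_by_identity
  nodes.each do |node|
    node.children.each { |child| seen << child if child.name == name }
  end
  seen.to_a
end

# Evaluate a path such as %w[b c]; sort once, only for the final result.
def evaluate(root, names, sorter)
  nodes = names.reduce([root]) { |current, name| child_step(current, name) }
  sorter.sort(nodes)
end

c1 = DocNode.new("c", 2, [])
c2 = DocNode.new("c", 4, [])
b1 = DocNode.new("b", 1, [c1])
b2 = DocNode.new("b", 3, [c2])
root = DocNode.new("a", 0, [b2, b1]) # children deliberately out of order

sorter = SortCounter.new
result = evaluate(root, %w[b c], sorter)
puts result.map(&:order).inspect
puts sorter.calls
```

The two steps run without sorting, and the counter records a single sort for the whole path, which is the behavior the table below measures for `/a/b/c/d/e`.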

Number of sort operations

| XPath | master (before #315) | master (after #315) | this PR |
|---|---|---|---|
| `/a/b/c/d/e` | 3 | 4 | 1 |
| `(a/b/c/d)[position()>1]/e/f/g` | 5 | 7 | 2 |
| `number(/a/b/c/d/e)` | 3 | 4 | 1 |
| `count(/a/b/c/d/e)` | 3 | 4 | 0 |
| `//a//b//c//d//e` | 8 | 9 | 1 |
| `/a[1]/b[1]/c[1]/d[1]/e` | 0 | 1 | 1 |

#315 removed one `nodesets.size == 1` optimization path. This pull request reduces the resulting performance regression.
To eliminate more sort calls, we would need to mark nodeset ordering, for example by introducing `Nodeset = Struct.new(:nodes, :order)`,
but IMO it shouldn't be done now. If sort is optimized, one extra sort won't be a problem. Optimizing `step` will be harder, and the code may become complicated.
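
To make the `Nodeset = Struct.new(:nodes, :order)` idea concrete, here is a minimal sketch of how an ordering tag could let callers skip redundant sorts. Only the struct definition comes from the description above. The `:document` / `:unknown` tags, `in_document_order`, and the stand-in `Item` struct are assumptions made for illustration.

```ruby
# From the PR description: a nodeset that remembers how its nodes are ordered.
Nodeset = Struct.new(:nodes, :order)

# Hypothetical node stand-in; `position` plays the role of document order.
Item = Struct.new(:position)

# Return a document-ordered nodeset, sorting only when the tag says we must.
# `sort_calls` is a one-element array used as a simple counter.
def in_document_order(nodeset, sort_calls)
  return nodeset if nodeset.order == :document

  sort_calls[0] += 1
  Nodeset.new(nodeset.nodes.sort_by(&:position), :document)
end

sort_calls = [0]
unordered = Nodeset.new([Item.new(3), Item.new(1), Item.new(2)], :unknown)

first = in_document_order(unordered, sort_calls)  # sorts
second = in_document_order(first, sort_calls)     # already tagged: no sort

puts first.nodes.map(&:position).inspect
puts sort_calls[0]
```

The cost is that every producer of a nodeset has to set the tag correctly, which is the extra complexity the description prefers to postpone.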

Note

This pull request adds slight complexity and a risk of forgetting to sort the nodeset in some code paths.
The effect may seem drastic in some cases for now, but that is only because sort is currently O(n^2) in the worst case. We can improve sort performance, so there is also the option of keeping the sort strategy simple.
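
One way a sort could avoid the quadratic worst case, offered purely as an assumption and not as what REXML does or plans to do, is to number nodes in a single pre-order walk and then sort by that number. `TreeNode`, `document_order_index`, and `sort_by_document_order` are hypothetical names.

```ruby
# Hypothetical tree node for illustration.
TreeNode = Struct.new(:name, :children)

# Walk the tree once in pre-order and record each node's position.
# The hash is identity-keyed so structurally equal nodes stay distinct.
def document_order_index(root)
  index = {}.compare_by_identity
  stack = [root]
  until stack.empty?
    node = stack.pop
    index[node] = index.size
    stack.concat(node.children.reverse)
  end
  index
end

# O(n) numbering plus O(k log k) sorting for k selected nodes.
def sort_by_document_order(nodes, index)
  nodes.sort_by { |node| index.fetch(node) }
end

leaf_a = TreeNode.new("x", [])
leaf_b = TreeNode.new("x", []) # equal by value to leaf_a, but a different node
mid = TreeNode.new("m", [leaf_a])
tree = TreeNode.new("r", [mid, leaf_b])

index = document_order_index(tree)
sorted = sort_by_document_order([leaf_b, mid, leaf_a], index)
puts sorted.map { |node| index[node] }.inspect
```

A real implementation would also have to decide when to invalidate such an index after the document is modified, which this sketch ignores.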

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adjusts REXML XPath evaluation to return consistently ordered node-sets (via centralized sorting) and updates tests to handle XPath primitive results returned as single-element arrays.

Changes:

- Centralizes ordering by using `XPathParser.sort(...)` in `match` and some call sites, and makes `sort` a class method.
- Simplifies `step(...)` by always de-duplicating via an identity `Set` and removing the `axis_order` parameter.
- Updates the Jaxen test helper to unwrap primitive XPath results before calling `REXML::Functions.string`.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| test/test_jaxen.rb | Unwraps primitive XPath results (single-element arrays) to keep `Functions.string` calls compatible. |
| lib/rexml/xpath_parser.rb | Reworks node-set ordering and de-duplication; changes `sort` to a class method; removes reverse-axis handling in `step`. |
| lib/rexml/functions.rb | Sorts node-sets before iterating / stringifying to make results deterministic. |

Comments suppressed due to low confidence (1)

lib/rexml/xpath_parser.rb:1

step no longer sorts the merged node-set (it used to call sort(...) for the multi-nodeset case). Returning nodes.to_a makes ordering dependent on insertion order through Set, which can change predicate behavior where ordering is significant (e.g., later [1] filters or position()), and can introduce non-determinism across Ruby versions/Set behavior. Consider sorting the merged result before returning (and keep de-duplication by identity).

# frozen_string_literal: false
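
The ordering concern raised in this comment can be reproduced in isolation. In the sketch below (plain Ruby, with a hypothetical `Elem` struct whose `pos` field stands in for document order), an identity `Set` removes duplicates but hands nodes back in insertion order, so a later `[1]`-style pick depends on which anchor was visited first unless the result is sorted.

```ruby
require "set"

# Hypothetical element; `pos` stands in for document order.
Elem = Struct.new(:name, :pos)

first_in_doc = Elem.new("b", 1)
second_in_doc = Elem.new("b", 2)

# Merge per-anchor results, visiting the later node first.
merged = Set.new.compare_by_identity
[[second_in_doc, first_in_doc], [first_in_doc]].each do |nodes|
  nodes.each { |node| merged << node }
end

unsorted = merged.to_a
sorted = unsorted.sort_by(&:pos)

puts unsorted.size      # duplicates are gone
puts unsorted.first.pos # follows insertion order
puts sorted.first.pos   # follows document order
```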


      result = expr(path_stack, nodeset)
      case result
      when Array # nodeset
-        result.uniq
+        XPathParser.sort(result)
      else
        [result]
      end


+          XPathParser.sort(node_set.to_a).each do |node|
            result << yield(node) if node.respond_to?(:namespace)
          end


@A

Refactor `step` so that nodeset materialization is deferred. Instead of building the full nodeset up front and filtering through predicates, each axis returns a *scan descriptor* (`[generator_name, generator_argument]`), and `step` picks a scan strategy based on the predicates' shape.

Predicates are classified into three groups:

| kind | examples | strategy passed to the generator |
|---|---|---|
| position-independent | `[@A="1"]`, `[name()="foo"]`, `[@A=@b]` | `:uniq`: emit deduplicated matching nodes |
| simple positional | `[N]`, `[position()=N]`, `[position()>N]`, `[position()<N]` | `[op, value]`: positional scan with one comparison |
| complex / position-dependent | `[position()*@A]`, `[last()-1]`, ... | `:nodesets`: fall back to per-anchor nodesets + the previous `evaluate_predicate` pipeline |

Mixed predicate lists are split: position-independent predicates *before* the first positional predicate are folded into the node test; predicates *after* it are applied per-node on the result.

Each axis can implement zero, one, or all of the three strategies. If a strategy is not implemented, the generator falls back to producing `:nodesets` and the common slow path (`non_optimized_nodesets_select`) handles dedup / positional filtering on flattened nodesets, i.e. the same behavior as before this PR.

This pull request adds fast paths for:

- `descendant` / `descendant-or-self`: `:uniq` (single DFS with a seen-set; this is what speeds up `//a//a//a//a`)
- `ancestor` / `ancestor-or-self`: `:uniq` (parent-chain walk with a seen-set)
- `preceding-sibling` / `following-sibling`: `:uniq` and `[op, value]` (sibling scan with anchor-index tracking)

Other axes (`child`, `parent`, `self`, `attribute`, `preceding`, `following`, etc.) keep the previous behavior via the fallback path; they can be optimized in follow-ups without changing call sites.

## Detail

For `//a//a//a` style queries, the previous code built nodesets keyed by each anchor, including the same descendant once per anchor. The new `:uniq` path scans every node at most once per step.

For `[position() > N]` style predicates on wide trees (e.g. `//a/preceding-sibling::*[position()>2]`), we previously built the full preceding-sibling nodeset for each anchor and then ran `evaluate_predicate`. The new `[op, value]` path scans children once per parent and uses anchor-index bookkeeping to recover per-anchor positions.

Note: general XPath cannot be linear (e.g. `*[position() * number(@A) % number(@b) = 1]` is genuinely O(n²)), so the goal is only to add a fast path for specific cases: position-independent predicates and simple positional predicates.

## Benchmark of best case

```ruby
DEPTH = 500
xml = '<a>' * DEPTH + '</a>' * DEPTH
doc = REXML::Document.new(xml)
WIDTH = 1000
xml_wide = '<root>' + '<child/>' * WIDTH + '</root>'
doc_wide = REXML::Document.new(xml_wide)

REXML::XPath.match(doc, "//a//a");
# processing time: 30.756939s → 0.126807s
REXML::XPath.match(doc_wide, "//*/preceding-sibling::*[position()=10]");
# processing time: 2.446333s → 0.083954s
```

## Benchmark of various cases

### Scenario

```yaml
prelude: |
  require "rexml"
  xml_wide = "<root>" + (1..1000).map { |i| "<item id='#{i}'/>" }.join + "</root>"
  wide = REXML::Document.new(xml_wide)
  xml_deep = "<root>" + (1..1000).map { |i| "<item id='#{i}'>" }.join + '</item>'*1000 + "</root>"
  deep = REXML::Document.new(xml_deep)
benchmark:
  child: REXML::XPath.match(wide, "root/item")
  descendant: REXML::XPath.match(deep, "//item")
  descendant-descendant: REXML::XPath.match(deep, "//item//item")
  descendant-descendant-wildcard: REXML::XPath.match(deep, "//*//*")
  ancestor-descendant: REXML::XPath.match(deep, "descendant::*/ancestor::*/descendant::*")
  preceding-following-sibling: REXML::XPath.match(wide, "//*/preceding-sibling::*/following-sibling::*")
  preceding-following-sibling-positional: REXML::XPath.match(wide, "//*/preceding-sibling::*[10]/following-sibling::*[10]")
```

### Compares

`master`, `xpath_step_optimize` (this pull), `sort_on_demand` (#330), `sort_improve` (emulates the ideal sort computation time), and their combinations.

There's no implementation of `sort_improve` yet, so I used the code below to emulate the computational cost of an ideal sort.

```ruby
def sort(array_of_nodes)
  # Just spend time to emulate the ideal computational cost of sorting nodes
  parents = Set.new.compare_by_identity
  array_of_nodes.each { parents << it.parent if it.parent }
  4.times do
    # find the common ancestor
    nodes = array_of_nodes
    seen = Set.new.compare_by_identity
    while nodes.size >= 2
      new_nodes = Set.new.compare_by_identity
      nodes.map(&:parent).each do |parent|
        if parent && !seen.include?(parent)
          seen << parent
          new_nodes << parent
        end
      end
      nodes = new_nodes
    end
    # iterate each node's siblings
    parents.each{it.children.each{}}
  end
  array_of_nodes # not sorted
end
```

### Result

```
Comparison:

child
  master: 1288.1 i/s
  master_sort_improve: 1190.4 i/s - 1.08x slower
  xpath_step_optimize_sort_on_demand_sort_improve: 875.3 i/s - 1.47x slower
  xpath_step_optimize_sort_improve: 861.3 i/s - 1.50x slower
  xpath_step_optimize_sort_on_demand: 92.3 i/s - 13.96x slower
  xpath_step_optimize: 91.7 i/s - 14.05x slower
  sort_on_demand: 90.6 i/s - 14.21x slower

descendant
  master_sort_improve: 75.5 i/s
  xpath_step_optimize_sort_on_demand_sort_improve: 75.1 i/s - 1.01x slower
  xpath_step_optimize_sort_improve: 68.8 i/s - 1.10x slower
  sort_on_demand: 21.4 i/s - 3.52x slower
  xpath_step_optimize_sort_on_demand: 21.4 i/s - 3.52x slower
  master: 20.9 i/s - 3.61x slower
  xpath_step_optimize: 11.7 i/s - 6.45x slower

descendant-descendant
  xpath_step_optimize_sort_on_demand_sort_improve: 47.5 i/s
  xpath_step_optimize_sort_improve: 41.9 i/s - 1.13x slower
  xpath_step_optimize_sort_on_demand: 17.9 i/s - 2.65x slower
  master_sort_improve: 8.6 i/s - 5.54x slower
  sort_on_demand: 6.7 i/s - 7.07x slower
  xpath_step_optimize: 6.1 i/s - 7.84x slower
  master: 4.6 i/s - 10.24x slower

descendant-descendant-wildcard
  xpath_step_optimize_sort_on_demand_sort_improve: 339.5 i/s
  xpath_step_optimize_sort_improve: 155.9 i/s - 2.18x slower
  xpath_step_optimize_sort_on_demand: 26.4 i/s - 12.86x slower
  master_sort_improve: 10.0 i/s - 33.96x slower
  sort_on_demand: 7.7 i/s - 44.30x slower
  xpath_step_optimize: 6.8 i/s - 50.06x slower
  master: 4.9 i/s - 68.58x slower

ancestor-descendant
  xpath_step_optimize_sort_on_demand_sort_improve: 377.9 i/s
  xpath_step_optimize_sort_improve: 203.7 i/s - 1.85x slower
  xpath_step_optimize_sort_on_demand: 26.3 i/s - 14.39x slower
  xpath_step_optimize: 8.7 i/s - 43.24x slower
  master_sort_improve: 7.8 i/s - 48.55x slower
  sort_on_demand: 6.3 i/s - 59.93x slower
  master: 5.0 i/s - 75.46x slower

preceding-following-sibling
  xpath_step_optimize_sort_on_demand_sort_improve: 684.1 i/s
  xpath_step_optimize_sort_improve: 424.5 i/s - 1.61x slower
  xpath_step_optimize_sort_on_demand: 85.8 i/s - 7.98x slower
  xpath_step_optimize: 23.7 i/s - 28.91x slower
  master_sort_improve: 20.9 i/s - 32.72x slower
  sort_on_demand: 19.3 i/s - 35.39x slower
  master: 13.9 i/s - 49.11x slower

preceding-following-sibling-positional
  xpath_step_optimize_sort_on_demand_sort_improve: 425.4 i/s
  xpath_step_optimize_sort_improve: 315.0 i/s - 1.35x slower
  xpath_step_optimize_sort_on_demand: 84.3 i/s - 5.05x slower
  xpath_step_optimize: 23.3 i/s - 18.22x slower
  master_sort_improve: 2.1 i/s - 201.38x slower
  sort_on_demand: 2.1 i/s - 204.75x slower
  master: 1.9 i/s - 222.08x slower
```

In the "child" and "descendant" scenarios, this PR is slower than master because it adds one additional `sort` call. The difference will be small once `sort` is improved.

In most cases, this PR by itself does not reach its full potential because sort is the next bottleneck. Combining it with a `sort` improvement is important.

The difference between "descendant-descendant" and "descendant-descendant-wildcard" shows that, after optimizing sort, the bottleneck will be the namespace lookup in the qname check for deeply nested XML.
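
The `:uniq` fast path for `descendant` described above (a single DFS with a seen-set) might look roughly like the sketch below. It is not the PR's implementation: `XNode` and `descendants_uniq` are hypothetical names, and the tree is built from plain structs rather than REXML objects. The key observation is that once a node has been emitted, its whole subtree has been walked too, so a repeat visit can skip it entirely.

```ruby
require "set"

# Hypothetical tree node standing in for an XML element.
XNode = Struct.new(:name, :children)

# Collect descendants of all anchors, emitting each node at most once.
# Also returns how many nodes were popped, to show the work stays bounded.
def descendants_uniq(anchors)
  seen = Set.new.compare_by_identity
  result = []
  pops = 0
  anchors.each do |anchor|
    stack = anchor.children.reverse
    until stack.empty?
      node = stack.pop
      pops += 1
      # Already emitted: its subtree was fully walked earlier, so skip it.
      next unless seen.add?(node)

      result << node
      stack.concat(node.children.reverse)
    end
  end
  [result, pops]
end

# A nested chain a1 > a2 > a3 > a4, similar in shape to the deep benchmark.
a4 = XNode.new("a", [])
a3 = XNode.new("a", [a4])
a2 = XNode.new("a", [a3])
a1 = XNode.new("a", [a2])

nodes, pops = descendants_uniq([a1, a2, a3])
puts nodes.size
puts pops
```

With three nested anchors, a per-anchor approach would collect `a4` three times before de-duplicating; here each later anchor costs one pop that is immediately skipped.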

naitoh · 2026-06-14T04:32:48Z

@tompng
Could you please resolve the conflicts?

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

      else
        case object
        when Array
-          string(object[0])
+          string(XPathParser.sort(object)[0])
        when Float


+        if node_set.kind_of? Array
          result = []
-          node_set.each do |node|
+          XPathParser.sort(node_set.to_a).each do |node|
            result << yield(node) if node.respond_to?(:namespace)
          end


+        if node_set.kind_of? Array
          result = []
-          node_set.each do |node|
+          XPathParser.sort(node_set.to_a).each do |node|


      result = expr(path_stack, nodeset)
      case result
      when Array # nodeset
-        result.uniq
+        XPathParser.sort(result)
      else
        [result]
      end


            nodes << node
          end
        end
-        new_nodeset = sort(nodes.to_a)
+        new_nodeset = nodes.to_a


+      # XPath.match can be a nodeset or a primitive value wrapped in an array.
+      # We need to unwrap primitive value because Functions doesn't accept array which is not a nodeset.
+      unless matched.all? { |node| node.is_a?(REXML::Node) }
+        raise "[BUG] Primitive value should be a single value: #{matched.inspect}" if matched.size != 1
+        matched = matched.first
+      end
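
The unwrapping rule in this hunk can be exercised on its own. The sketch below is only an approximation of the test helper: `FakeNode` is a hypothetical marker class standing in for `REXML::Node`, and `unwrap_match` is a made-up name.

```ruby
# Hypothetical marker class standing in for REXML::Node.
class FakeNode; end

# A match result is either a nodeset (an array of nodes) or one primitive
# value wrapped in a single-element array. Return the nodeset unchanged, or
# the bare primitive value.
def unwrap_match(matched, node_class)
  return matched if matched.all? { |item| item.is_a?(node_class) }
  raise ArgumentError, "expected exactly one primitive: #{matched.inspect}" if matched.size != 1

  matched.first
end

nodeset = [FakeNode.new, FakeNode.new]
puts unwrap_match(nodeset, FakeNode).size
puts unwrap_match([3.0], FakeNode)
puts unwrap_match(["text"], FakeNode)
```

Note that an empty array counts as a nodeset here, because `all?` is true for an empty collection.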


Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

      result = expr(path_stack, nodeset)
      case result
      when Array # nodeset
-        result.uniq
+        XPathParser.sort(result)
      else
        [result]


      if node_set == nil
        yield @context[:node] if @context[:node].respond_to?(:namespace)
      else
-        if node_set.respond_to? :each
+        if node_set.kind_of? Array
          result = []
-          node_set.each do |node|
+          XPathParser.sort(node_set.to_a).each do |node|
            result << yield(node) if node.respond_to?(:namespace)
          end
          result


        case object
        when Array
-          string(object[0])
+          string(XPathParser.sort(object).first)


+      unless matched.all? { |node| node.is_a?(REXML::Node) }
+        assert_equal(1, matched.size, 'Primitive value should be a single value')
+        matched = matched.first
+      end


tompng · 2026-06-14T06:01:30Z

Conflicts resolved. I updated the PR description and added some tests that fail when sorting is not performed.

Copilot AI review requested due to automatic review settings June 9, 2026 17:06

Copilot AI reviewed Jun 9, 2026


tompng mentioned this pull request Jun 9, 2026

Optimize XPath step #315

Merged

tompng force-pushed the sort_on_demand branch from c90feec to ea2bf33 Compare June 14, 2026 05:05

Copilot AI review requested due to automatic review settings June 14, 2026 05:35

tompng force-pushed the sort_on_demand branch from ea2bf33 to 1c06326 Compare June 14, 2026 05:35

Copilot AI reviewed Jun 14, 2026


tompng force-pushed the sort_on_demand branch from 1c06326 to e169d11 Compare June 14, 2026 05:49

Copilot AI review requested due to automatic review settings June 14, 2026 05:53

tompng force-pushed the sort_on_demand branch from e169d11 to db86106 Compare June 14, 2026 05:53

Copilot AI reviewed Jun 14, 2026


Sort nodeset on demand

789935f

tompng force-pushed the sort_on_demand branch from db86106 to 789935f Compare June 14, 2026 05:57

tompng wants to merge 1 commit into ruby:master from tompng:sort_on_demand
