
Sort nodeset on demand#330

Open
tompng wants to merge 1 commit into ruby:master from tompng:sort_on_demand

Conversation

@tompng

@tompng tompng commented Jun 9, 2026

Member

Delay sorting: sort only when it is needed.
In most cases, sorting a nodeset is not needed. Sorting is only required for:

  • The final result
  • Creating nodesets (each nodeset should be axis-ordered) from a single nodeset
    • Ideally, this can be skipped if the following predicate is not position-dependent
  • A nodeset passed to a function (the first node in document order is used)

Number of sort operations

| XPath | master (before #315) | master (after #315) | this PR |
|---|---|---|---|
| /a/b/c/d/e | 3 | 4 | 1 |
| (a/b/c/d)[position()>1]/e/f/g | 5 | 7 | 2 |
| number(/a/b/c/d/e) | 3 | 4 | 1 |
| count(/a/b/c/d/e) | 3 | 4 | 0 |
| //a//b//c//d//e | 8 | 9 | 1 |
| /a[1]/b[1]/c[1]/d[1]/e | 0 | 1 | 1 |

#315 removed one nodesets.size == 1 optimization path. This pull request reduces the resulting performance regression.
To eliminate more sort calls, we would need to track the ordering of each nodeset, for example by introducing Nodeset = Struct.new(:nodes, :order),
but IMO that shouldn't be done now. Once sort itself is optimized, one extra sort won't be a problem, whereas tracking the order would make optimizing step harder and could complicate the code.
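
For illustration only, the `Nodeset = Struct.new(:nodes, :order)` idea mentioned above might look roughly like the sketch below. This pull request does not add it. The `:document` / `:reverse_document` order values, the `document_ordered` method, and the `sorter` argument are all assumptions made for the example.

```ruby
# Hypothetical sketch: a nodeset that remembers how its nodes are ordered,
# so the document-order sort runs only when the order is unknown.
Nodeset = Struct.new(:nodes, :order) do
  # `sorter` stands in for whatever document-order sort is available.
  def document_ordered(sorter)
    case order
    when :document then nodes                  # already sorted, no work
    when :reverse_document then nodes.reverse  # reverse axis, cheap to fix
    else sorter.call(nodes)                    # unknown order, must sort
    end
  end
end
```

With such a marker, a step that is known to produce document-ordered output could skip the sort entirely, at the cost of every step having to report its ordering correctly.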

Note

This pull request adds slight complexity and a risk of forgetting to sort the nodeset in some code paths.
The effect may seem drastic in some cases for now, but that is only because sort is currently O(n^2) in the worst case. We can improve sort performance, so there is an option to keep the sort strategy simple.

Copilot AI review requested due to automatic review settings June 9, 2026 17:06

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adjusts REXML XPath evaluation to return consistently ordered node-sets (via centralized sorting) and updates tests to handle XPath primitive results returned as single-element arrays.

Changes:

  • Centralizes ordering by using XPathParser.sort(...) in match and some call sites, and makes sort a class method.
  • Simplifies step(...) by always de-duplicating via identity Set and removing the axis_order parameter.
  • Updates the Jaxen test helper to unwrap primitive XPath results before calling REXML::Functions.string.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| test/test_jaxen.rb | Unwraps primitive XPath results (single-element arrays) to keep Functions.string calls compatible. |
| lib/rexml/xpath_parser.rb | Reworks node-set ordering and de-duplication; changes sort to a class method; removes reverse-axis handling in step. |
| lib/rexml/functions.rb | Sorts node-sets before iterating / stringifying to make results deterministic. |
Comments suppressed due to low confidence (1)

lib/rexml/xpath_parser.rb:1

  • step no longer sorts the merged node-set (it used to call sort(...) for the multi-nodeset case). Returning nodes.to_a makes ordering dependent on insertion order through Set, which can change predicate behavior where ordering is significant (e.g., later [1] filters or position()), and can introduce non-determinism across Ruby versions/Set behavior. Consider sorting the merged result before returning (and keep de-duplication by identity).
# frozen_string_literal: false


Comment thread lib/rexml/xpath_parser.rb
Comment on lines 155 to 161
  result = expr(path_stack, nodeset)
  case result
  when Array # nodeset
- result.uniq
+ XPathParser.sort(result)
  else
  [result]
  end
Comment thread lib/rexml/xpath_parser.rb Outdated
Comment thread lib/rexml/functions.rb Outdated
Comment on lines 87 to 89
XPathParser.sort(node_set.to_a).each do |node|
result << yield(node) if node.respond_to?(:namespace)
end
Comment thread lib/rexml/xpath_parser.rb Outdated
@tompng tompng mentioned this pull request Jun 9, 2026
naitoh pushed a commit that referenced this pull request Jun 14, 2026
Refactor `step` so that nodeset materialization is deferred. Instead of
building the full nodeset up front and filtering through predicates,
each axis returns a *scan descriptor* (`[generator_name,
generator_argument]`), and `step` picks a scan strategy based on the
predicates' shape.

Predicates are classified into three groups:

| kind | examples | strategy passed to the generator |
|---|---|---|
| position-independent | `[@A="1"]`, `[name()="foo"]`, `[@A=@b]` | `:uniq` (emit deduplicated matching nodes) |
| simple positional | `[N]`, `[position()=N]`, `[position()>N]`, `[position()<N]` | `[op, value]` (positional scan with one comparison) |
| complex / position-dependent | `[position()*@A]`, `[last()-1]`, ... | `:nodesets` (fall back to per-anchor nodesets + the previous `evaluate_predicate` pipeline) |

Mixed predicate lists are split: position-independent predicates
*before* the first positional predicate are folded into the node test;
predicates *after* it are applied per-node on the result.
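
As a rough illustration of the split described above, here is a minimal sketch. The `Predicate` struct, its `kind` values, and both helper names are hypothetical stand-ins; the real parser works on parsed predicate expressions rather than pre-labeled objects.

```ruby
# Hypothetical, simplified predicate: `kind` is one of
# :independent, :simple_positional, or :complex.
Predicate = Struct.new(:kind, :source)

# Split a predicate list around the first positional predicate.
# Returns [folded, positional, rest]:
#   folded     - leading position-independent predicates (folded into the node test)
#   positional - the first positional predicate, or nil if there is none
#   rest       - predicates after it (applied per node on the result)
def split_predicates(predicates)
  index = predicates.index { |pred| pred.kind != :independent }
  return [predicates, nil, []] if index.nil?

  [predicates[0...index], predicates[index], predicates[(index + 1)..]]
end

# Pick a scan strategy from the first positional predicate.
def strategy_for(positional)
  case positional&.kind
  when nil then :uniq
  when :simple_positional then :positional # stands in for [op, value]
  else :nodesets
  end
end
```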

Each axis can implement zero, one, or all of the three strategies. If a
strategy is not implemented, the generator falls back to producing
`:nodesets` and the common slow path (`non_optimized_nodesets_select`)
handles dedup / positional filtering on flattened nodesets — i.e. the
same behavior as before this PR.

This pull request adds fast paths for:

- `descendant` / `descendant-or-self`: `:uniq` (single DFS with a
seen-set; this is what speeds up `//a//a//a//a`)
- `ancestor` / `ancestor-or-self`: `:uniq` (parent-chain walk with a
seen-set)
- `preceding-sibling` / `following-sibling`: `:uniq` and `[op, value]`
(sibling scan with anchor-index tracking)

Other axes (`child`, `parent`, `self`, `attribute`, `preceding`,
`following`, etc.) keep the previous behavior via the fallback path;
they can be optimized in follow-ups without changing call sites.

## Detail

For `//a//a//a` style queries, the previous code built nodesets keyed by
each anchor, including the same descendant once per anchor. The new
`:uniq` path scans every node at most once per step.
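
The idea can be sketched as follows. This is a simplified illustration, not the code in the commit: `TreeNode` is a stand-in for REXML's element classes, and `descendants_uniq` is a hypothetical name.

```ruby
# Minimal stand-in for an element node (not a REXML class).
TreeNode = Struct.new(:name, :children)

# Collect the descendants of every anchor, visiting each node at most once.
# `seen` is keyed by object identity, so a subtree that sits below several
# anchors is walked only the first time it is reached.
def descendants_uniq(anchors)
  seen = {}.compare_by_identity
  result = []
  anchors.each do |anchor|
    stack = anchor.children.reverse
    until stack.empty?
      node = stack.pop
      next if seen.key?(node) # its whole subtree was already emitted

      seen[node] = true
      result << node
      stack.concat(node.children.reverse)
    end
  end
  result # deduplicated, but not necessarily in document order
end
```

For a chain of nested `<a>` elements where every element is an anchor, the per-anchor approach touches each descendant once per ancestor, while this scan touches it once in total.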

For `[position() > N]` style predicates on wide trees (e.g.
`//a/preceding-sibling::*[position()>2]`), we previously built the full
preceding-sibling nodeset for each anchor and then ran
`evaluate_predicate`. The new `[op, value]` path scans children once per
parent and uses anchor-index bookkeeping to recover per-anchor positions.
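
A minimal sketch of that bookkeeping for a single parent is shown below. It is an illustration under assumptions, not the commit's implementation: `preceding_sibling_scan` is a hypothetical helper, every child is assumed to be an element (so `*` matches all of them), `value` is assumed to be an Integer, and only the `=` and `>` comparisons are covered.

```ruby
# Hypothetical helper: evaluates preceding-sibling::*[position() op value]
# for all anchors that share one parent, using a single pass over `children`.
def preceding_sibling_scan(children, anchors, op, value)
  is_anchor = {}.compare_by_identity
  anchors.each { |anchor| is_anchor[anchor] = true }
  # One pass over the children records where the anchors sit.
  anchor_indexes = children.each_index.select { |i| is_anchor.key?(children[i]) }
  return [] if anchor_indexes.empty?

  # preceding-sibling is a reverse axis: for an anchor at index i, the child
  # at index j < i has position i - j.
  case op
  when :==
    return [] if value < 1

    anchor_indexes.map { |i| i - value }.select { |j| j >= 0 }.map { |j| children[j] }
  when :>
    # position > value for some anchor  <=>  j < (last anchor index) - value
    limit = anchor_indexes.last - [value, 0].max
    limit.positive? ? children.first(limit) : []
  else
    raise ArgumentError, "unsupported operator: #{op.inspect}"
  end
end

children = %w[c0 c1 c2 c3 c4]
anchors = [children[2], children[4]]
preceding_sibling_scan(children, anchors, :==, 2) # => ["c0", "c2"]
preceding_sibling_scan(children, anchors, :>, 2)  # => ["c0", "c1"]
```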

Note: general XPath evaluation cannot be linear. For example,
`*[position() * number(@A) % number(@b) = 1]` is genuinely O(n²). The
goal is therefore only to add fast paths for specific cases:
position-independent predicates and simple positional predicates.

## Benchmark of best case

```ruby
require "rexml/document"

# Deeply nested document: <a><a>...</a></a>
DEPTH = 500
xml   = '<a>' * DEPTH + '</a>' * DEPTH
doc   = REXML::Document.new(xml)
# Wide document: one root with many children
WIDTH = 1000
xml_wide = '<root>' + '<child/>' * WIDTH + '</root>'
doc_wide = REXML::Document.new(xml_wide)

REXML::XPath.match(doc, "//a//a")
# processing time: 30.756939s → 0.126807s

REXML::XPath.match(doc_wide, "//*/preceding-sibling::*[position()=10]")
# processing time: 2.446333s → 0.083954s
```

## Benchmark of various cases

### Scenario
```yaml
prelude: |
  require "rexml"
  xml_wide = "<root>" + (1..1000).map { |i| "<item id='#{i}'/>" }.join + "</root>"
  wide = REXML::Document.new(xml_wide)
  xml_deep = "<root>" + (1..1000).map { |i| "<item id='#{i}'>" }.join + '</item>'*1000 + "</root>"
  deep = REXML::Document.new(xml_deep)

benchmark:
  child: REXML::XPath.match(wide, "root/item")
  descendant: REXML::XPath.match(deep, "//item")
  descendant-descendant: REXML::XPath.match(deep, "//item//item")
  descendant-descendant-wildcard: REXML::XPath.match(deep, "//*//*")
  ancestor-descendant: REXML::XPath.match(deep, "descendant::*/ancestor::*/descendant::*")
  preceding-following-sibling: REXML::XPath.match(wide, "//*/preceding-sibling::*/following-sibling::*")
  preceding-following-sibling-positional: REXML::XPath.match(wide, "//*/preceding-sibling::*[10]/following-sibling::*[10]")
```

### Comparison
master, xpath_step_optimize (this pull request), sort_on_demand (#330),
sort_improve (emulates the computation time of an ideal sort), and their
combinations.

There's no implementation of `sort_improve` yet, so I used the code
below to emulate the computational cost of an ideal sort.
```ruby
def sort(array_of_nodes)
  # Just spend time to emulate the ideal computational cost of sorting nodes
  parents = Set.new.compare_by_identity
  array_of_nodes.each { parents << it.parent if it.parent }
  4.times do
    # find the common ancestor
    nodes = array_of_nodes
    seen = Set.new.compare_by_identity
    while nodes.size >= 2
      new_nodes = Set.new.compare_by_identity
      nodes.map(&:parent).each do |parent|
        if parent && !seen.include?(parent)
          seen << parent
          new_nodes << parent
        end
      end
      nodes = new_nodes
    end
    # iterate each node's siblings
    parents.each{it.children.each{}}
  end
  array_of_nodes # not sorted
end
```

### Result
```
Comparison:
                                              child
                                master:      1288.1 i/s 
                   master_sort_improve:      1190.4 i/s - 1.08x  slower
xpath_step_optimize_sort_on_demand_sort_improve:       875.3 i/s - 1.47x  slower
      xpath_step_optimize_sort_improve:       861.3 i/s - 1.50x  slower
    xpath_step_optimize_sort_on_demand:        92.3 i/s - 13.96x  slower
                   xpath_step_optimize:        91.7 i/s - 14.05x  slower
                        sort_on_demand:        90.6 i/s - 14.21x  slower

                                         descendant
                   master_sort_improve:        75.5 i/s 
xpath_step_optimize_sort_on_demand_sort_improve:        75.1 i/s - 1.01x  slower
      xpath_step_optimize_sort_improve:        68.8 i/s - 1.10x  slower
                        sort_on_demand:        21.4 i/s - 3.52x  slower
    xpath_step_optimize_sort_on_demand:        21.4 i/s - 3.52x  slower
                                master:        20.9 i/s - 3.61x  slower
                   xpath_step_optimize:        11.7 i/s - 6.45x  slower

                              descendant-descendant
xpath_step_optimize_sort_on_demand_sort_improve:        47.5 i/s 
      xpath_step_optimize_sort_improve:        41.9 i/s - 1.13x  slower
    xpath_step_optimize_sort_on_demand:        17.9 i/s - 2.65x  slower
                   master_sort_improve:         8.6 i/s - 5.54x  slower
                        sort_on_demand:         6.7 i/s - 7.07x  slower
                   xpath_step_optimize:         6.1 i/s - 7.84x  slower
                                master:         4.6 i/s - 10.24x  slower

                     descendant-descendant-wildcard
xpath_step_optimize_sort_on_demand_sort_improve:       339.5 i/s 
      xpath_step_optimize_sort_improve:       155.9 i/s - 2.18x  slower
    xpath_step_optimize_sort_on_demand:        26.4 i/s - 12.86x  slower
                   master_sort_improve:        10.0 i/s - 33.96x  slower
                        sort_on_demand:         7.7 i/s - 44.30x  slower
                   xpath_step_optimize:         6.8 i/s - 50.06x  slower
                                master:         4.9 i/s - 68.58x  slower

                                ancestor-descendant
xpath_step_optimize_sort_on_demand_sort_improve:       377.9 i/s 
      xpath_step_optimize_sort_improve:       203.7 i/s - 1.85x  slower
    xpath_step_optimize_sort_on_demand:        26.3 i/s - 14.39x  slower
                   xpath_step_optimize:         8.7 i/s - 43.24x  slower
                   master_sort_improve:         7.8 i/s - 48.55x  slower
                        sort_on_demand:         6.3 i/s - 59.93x  slower
                                master:         5.0 i/s - 75.46x  slower

                        preceding-following-sibling
xpath_step_optimize_sort_on_demand_sort_improve:       684.1 i/s 
      xpath_step_optimize_sort_improve:       424.5 i/s - 1.61x  slower
    xpath_step_optimize_sort_on_demand:        85.8 i/s - 7.98x  slower
                   xpath_step_optimize:        23.7 i/s - 28.91x  slower
                   master_sort_improve:        20.9 i/s - 32.72x  slower
                        sort_on_demand:        19.3 i/s - 35.39x  slower
                                master:        13.9 i/s - 49.11x  slower

             preceding-following-sibling-positional
xpath_step_optimize_sort_on_demand_sort_improve:       425.4 i/s 
      xpath_step_optimize_sort_improve:       315.0 i/s - 1.35x  slower
    xpath_step_optimize_sort_on_demand:        84.3 i/s - 5.05x  slower
                   xpath_step_optimize:        23.3 i/s - 18.22x  slower
                   master_sort_improve:         2.1 i/s - 201.38x  slower
                        sort_on_demand:         2.1 i/s - 204.75x  slower
                                master:         1.9 i/s - 222.08x  slower
```

In the "child" and "descendant" scenarios, this PR is slower than master
because it adds one additional `sort` call. The difference will be small
once `sort` is improved.
In most cases, this PR alone does not reach its full potential because
sort is the next bottleneck. Combining it with the `sort` improvement is
important.
The difference between "descendant-descendant" and
"descendant-descendant-wildcard" shows that, after optimizing sort, the
bottleneck will be the namespace lookup in the qname check for deeply
nested XML.
@naitoh

naitoh commented Jun 14, 2026

Contributor

@tompng
Could you please resolve the conflicts?

Copilot AI review requested due to automatic review settings June 14, 2026 05:35

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

Comment thread lib/rexml/functions.rb
Comment on lines 149 to 153
  else
  case object
  when Array
- string(object[0])
+ string(XPathParser.sort(object)[0])
  when Float
Comment thread lib/rexml/functions.rb
Comment on lines +88 to 92
  if node_set.kind_of? Array
  result = []
- node_set.each do |node|
+ XPathParser.sort(node_set.to_a).each do |node|
  result << yield(node) if node.respond_to?(:namespace)
  end
Comment thread lib/rexml/functions.rb Outdated
  if node_set.kind_of? Array
  result = []
- node_set.each do |node|
+ XPathParser.sort(node_set.to_a).each do |node|
Comment thread lib/rexml/xpath_parser.rb
Comment on lines 156 to 162
  result = expr(path_stack, nodeset)
  case result
  when Array # nodeset
- result.uniq
+ XPathParser.sort(result)
  else
  [result]
  end
Comment thread lib/rexml/xpath_parser.rb
Comment on lines 621 to +624
  nodes << node
  end
  end
- new_nodeset = sort(nodes.to_a)
+ new_nodeset = nodes.to_a
Comment thread test/test_jaxen.rb
Comment on lines +90 to +95
# XPath.match can be a nodeset or a primitive value wrapped in an array.
# We need to unwrap primitive value because Functions doesn't accept array which is not a nodeset.
unless matched.all? { |node| node.is_a?(REXML::Node) }
raise "[BUG] Primitive value should be a single value: #{matched.inspect}" if matched.size != 1
matched = matched.first
end
Copilot AI review requested due to automatic review settings June 14, 2026 05:53

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Comment thread lib/rexml/xpath_parser.rb
Comment on lines 156 to 161
  result = expr(path_stack, nodeset)
  case result
  when Array # nodeset
- result.uniq
+ XPathParser.sort(result)
  else
  [result]
Comment thread lib/rexml/functions.rb
Comment on lines 85 to 93
  if node_set == nil
  yield @context[:node] if @context[:node].respond_to?(:namespace)
  else
  if node_set.respond_to? :each
  if node_set.kind_of? Array
  result = []
- node_set.each do |node|
+ XPathParser.sort(node_set.to_a).each do |node|
  result << yield(node) if node.respond_to?(:namespace)
  end
  result
Comment thread lib/rexml/functions.rb
  case object
  when Array
- string(object[0])
+ string(XPathParser.sort(object).first)
Comment thread test/test_jaxen.rb
Comment on lines +92 to +95
unless matched.all? { |node| node.is_a?(REXML::Node) }
assert_equal(1, matched.size, 'Primitive value should be a single value')
matched = matched.first
end
@tompng

tompng commented Jun 14, 2026

Member Author

Conflicts resolved. I also updated the PR description and added some tests that fail when sorting is not performed.
