Skip to content

Conversation

@Earlopain
Copy link
Collaborator

#3808 (comment)

Looking at things like node.compact_child_nodes.each maybe we should have node.each_child_node? It would avoid the Array allocation and since there is no point to yield nil there is no need to name it node.each_compact_child_node.

This also came up someplace else before but I don't remember where. I agree this makes sense. cc @eregon


compact_child_nodes allocates an array. We can skip that step by simply yielding the nodes.

Benchmark for visiting the rails codebase:

require "prism"
require "benchmark/ips"

files = Dir.glob("../rails/**/*.rb")
results = files.map { Prism.parse_file(it) }
visitor = Prism::Visitor.new

Benchmark.ips do |x|
  x.config(warmup: 3, time: 10)

  x.report do
    results.each do
      visitor.visit(it.value)
    end
  end
end

RubyVM::YJIT.enable

Benchmark.ips do |x|
  x.config(warmup: 3, time: 10)

  x.report do
    results.each do
      visitor.visit(it.value)
    end
  end
end

Before:

ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          2.691 (± 0.0%) i/s  (371.55 ms/i) -     27.000 in  10.089422s
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          7.278 (±13.7%) i/s  (137.39 ms/i) -     70.000 in  10.071568s

After:

ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          3.429 (± 0.0%) i/s  (291.65 ms/i) -     35.000 in  10.208580s
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                         16.815 (± 0.0%) i/s   (59.47 ms/i) -    169.000 in  10.054668s

~21% faster on the interpreter, ~56% with YJIT

`compact_child_nodes` allocates an array. We can skip that step by simply yielding the nodes.

Benchmark for visiting the rails codebase:

```rb
require "prism"
require "benchmark/ips"

files = Dir.glob("../rails/**/*.rb")
results = files.map { Prism.parse_file(it) }
visitor = Prism::Visitor.new

Benchmark.ips do |x|
  x.config(warmup: 3, time: 10)

  x.report do
    results.each do
      visitor.visit(it.value)
    end
  end
end

RubyVM::YJIT.enable

Benchmark.ips do |x|
  x.config(warmup: 3, time: 10)

  x.report do
    results.each do
      visitor.visit(it.value)
    end
  end
end
```

Before:
```
ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          2.691 (± 0.0%) i/s  (371.55 ms/i) -     27.000 in  10.089422s
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          7.278 (±13.7%) i/s  (137.39 ms/i) -     70.000 in  10.071568s
```
After:
```
ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          3.429 (± 0.0%) i/s  (291.65 ms/i) -     35.000 in  10.208580s
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                         16.815 (± 0.0%) i/s   (59.47 ms/i) -    169.000 in  10.054668s
```

~21% faster on the interpreter, ~56% with YJIT
def child_nodes: () -> Array[Prism::node?]
def comment_targets: () -> Array[Prism::node | Location]
def compact_child_nodes: () -> Array[Prism::node]
def each_child_node: () { (Prism::node) -> void } -> void | () -> Enumerator[Prism::node]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure if this is correct. I'm also missing sorbet signatures which I don't know how to write for this

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks right to me, and it would have failed to typecheck otherwise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are sorbet signatures not needed? I don't use these systems so no idea when/where it gets used. Other functions are present so I assume there is a reason for them.

end
# def compact_child_nodes: () -> Array[Node]
def compact_child_nodes
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can just be implemented as each_child_node.to_a but performance is not so great

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, you're right to keep this the same.

@kddnewton kddnewton merged commit 4f86847 into ruby:main Dec 29, 2025
64 checks passed
@eregon
Copy link
Member

eregon commented Dec 30, 2025

Nice, thank you!
Probably I should revisit https://eregon.me/blog/2024/10/27/benchmarking-ruby-parsers.html with this, Prism would be even faster for Parsing and Walking :)


FWIW in the description you mention

~21% faster on the interpreter, ~56% with YJIT

but the numbers above look much better than that to me, I would say 16.815/7.278 = 2.3x "as fast" / 130% "faster" (and I think "2.3x faster" is clear for most people too).
I guess you counted as 1 - 7.278/16.815 = 56% "less time", which is correct too but maybe confusing together with "faster"?
I'm not sure if there is standard for this 😅
FWIW Benchmark.ips compare! uses "2x faster":

$ ruby -rbenchmark/ips -e 'Benchmark.ips { it.report(:a) { sleep 0.2 }; it.report(:b) { sleep 0.1 }; it.compare!(order: :baseline) }'
ruby 3.4.7 (2025-10-08 revision 7a5688e2a2) +PRISM [x86_64-linux]
Warming up --------------------------------------
                   a     1.000 i/100ms
                   b     1.000 i/100ms
Calculating -------------------------------------
                   a      4.998 (± 0.0%) i/s  (200.08 ms/i) -     25.000 in   5.002088s
                   b      9.984 (± 0.0%) i/s  (100.16 ms/i) -     50.000 in   5.008167s

Comparison:
a:        5.0 i/s
b:       10.0 i/s - 2.00x  faster

<%- when Prism::Template::OptionalNodeField -%>
yield <%= field.name %> if <%= field.name %>
<%- when Prism::Template::NodeListField -%>
<%= field.name %>.each {|node| yield node }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<%= field.name %>.each {|node| yield node }
<%= field.name %>.each(&block)

might be faster because it avoids an extra block (that literal block) and consumes less stack (both good for performance and to avoid stack overflow errors).
I'll give it a try.

Copy link
Member

@eregon eregon Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried

diff --git i/templates/lib/prism/node.rb.erb w/templates/lib/prism/node.rb.erb
index c97c029d3..76cbb2608 100644
--- i/templates/lib/prism/node.rb.erb
+++ w/templates/lib/prism/node.rb.erb
@@ -343,8 +343,8 @@ module Prism
     end
 
     # def each_child_node: () { (Prism::node) -> void } -> void | () -> Enumerator[Prism::node]
-    def each_child_node
-      return to_enum(:each_child_node) unless block_given?
+    def each_child_node(&_block)
+      return to_enum(:each_child_node) unless _block
 
       <%- node.fields.each do |field| -%>
       <%- case field -%>
@@ -353,7 +353,7 @@ module Prism
       <%- when Prism::Template::OptionalNodeField -%>
       yield <%= field.name %> if <%= field.name %>
       <%- when Prism::Template::NodeListField -%>
-      <%= field.name %>.each  {|node| yield node }
+      <%= field.name %>.each(&_block)
       <%- end -%>
       <%- end -%>
     end

and it seems a little bit faster in interpreter but much slower in YJIT (2 runs each to get an idea of variance, for some reason the ± 0.0% seems broken in benchmark-ips):

BEFORE:
$ ruby -Ilib bench_visitor.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          2.720 (± 0.0%) i/s  (367.65 ms/i) -     28.000 in  10.294286s
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                         14.286 (± 0.0%) i/s   (70.00 ms/i) -    143.000 in  10.009988s

$ ruby -Ilib bench_visitor.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          2.740 (± 0.0%) i/s  (364.94 ms/i) -     28.000 in  10.218390s
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                         14.309 (± 0.0%) i/s   (69.88 ms/i) -    144.000 in  10.064832s

AFTER:
$ ruby -Ilib bench_visitor.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          2.790 (± 0.0%) i/s  (358.37 ms/i) -     28.000 in  10.034481s
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                         11.968 (± 0.0%) i/s   (83.55 ms/i) -    120.000 in  10.026674s

$ ruby -Ilib bench_visitor.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          2.805 (± 0.0%) i/s  (356.53 ms/i) -     29.000 in  10.339397s
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                         12.116 (± 0.0%) i/s   (82.54 ms/i) -    122.000 in  10.069442s

@Earlopain
Copy link
Collaborator Author

but the numbers above look much better than that to me

Yeah, I could have written that more clearly. benchmark/ips has https://github.com/evanphx/benchmark-ips?tab=readme-ov-file#independent-benchmarking but I was too lazy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants