Optimize ruby visitor #3826

Earlopain · 2025-12-23T14:14:20Z

Looking at things like node.compact_child_nodes.each maybe we should have node.each_child_node? It would avoid the Array allocation and since there is no point to yield nil there is no need to name it node.each_compact_child_node.

This also came up someplace else before but I don't remember where. I agree this makes sense. cc @eregon

compact_child_nodes allocates an array. We can skip that step by simply yielding the nodes.

Benchmark for visiting the rails codebase:

require "prism"
require "benchmark/ips"

files = Dir.glob("../rails/**/*.rb")
results = files.map { Prism.parse_file(it) }
visitor = Prism::Visitor.new

Benchmark.ips do |x|
  x.config(warmup: 3, time: 10)

  x.report do
    results.each do
      visitor.visit(it.value)
    end
  end
end

RubyVM::YJIT.enable

Benchmark.ips do |x|
  x.config(warmup: 3, time: 10)

  x.report do
    results.each do
      visitor.visit(it.value)
    end
  end
end

Before:

ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          2.691 (± 0.0%) i/s  (371.55 ms/i) -     27.000 in  10.089422s
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          7.278 (±13.7%) i/s  (137.39 ms/i) -     70.000 in  10.071568s

After:

ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                          3.429 (± 0.0%) i/s  (291.65 ms/i) -     35.000 in  10.208580s
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
                         1.000 i/100ms
Calculating -------------------------------------
                         16.815 (± 0.0%) i/s   (59.47 ms/i) -    169.000 in  10.054668s

~21% faster on the interpreter, ~56% with YJIT

`compact_child_nodes` allocates an array. We can skip that step by simply yielding the nodes. Benchmark for visiting the rails codebase: ```rb require "prism" require "benchmark/ips" files = Dir.glob("../rails/**/*.rb") results = files.map { Prism.parse_file(it) } visitor = Prism::Visitor.new Benchmark.ips do |x| x.config(warmup: 3, time: 10) x.report do results.each do visitor.visit(it.value) end end end RubyVM::YJIT.enable Benchmark.ips do |x| x.config(warmup: 3, time: 10) x.report do results.each do visitor.visit(it.value) end end end ``` Before: ``` ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 2.691 (± 0.0%) i/s (371.55 ms/i) - 27.000 in 10.089422s ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 7.278 (±13.7%) i/s (137.39 ms/i) - 70.000 in 10.071568s ``` After: ``` ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 3.429 (± 0.0%) i/s (291.65 ms/i) - 35.000 in 10.208580s ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 16.815 (± 0.0%) i/s (59.47 ms/i) - 169.000 in 10.054668s ``` ~21% faster on the interpreter, ~56% with YJIT

Earlopain · 2025-12-23T14:14:57Z

templates/sig/prism/node.rbs.erb

    def child_nodes: () -> Array[Prism::node?]
    def comment_targets: () -> Array[Prism::node | Location]
    def compact_child_nodes: () -> Array[Prism::node]
+    def each_child_node: () { (Prism::node) -> void } -> void | () -> Enumerator[Prism::node]


Unsure if this is correct. I'm also missing sorbet signatures which I don't know how to write for this

This looks right to me, and it would have failed to typecheck otherwise.

So are sorbet signatures not needed? I don't use these systems so no idea when/where it gets used. Other functions are present so I assume there is a reason for them.

Earlopain · 2025-12-23T14:15:38Z

templates/lib/prism/node.rb.erb

+    end
+
    # def compact_child_nodes: () -> Array[Node]
    def compact_child_nodes


This can just be implemented as each_child_node.to_a but performance is not so great

Nope, you're right to keep this the same.

eregon · 2025-12-30T10:33:38Z

Nice, thank you!
Probably I should revisit https://eregon.me/blog/2024/10/27/benchmarking-ruby-parsers.html with this, Prism would be even faster for Parsing and Walking :)

FWIW in the description you mention

~21% faster on the interpreter, ~56% with YJIT

but the numbers above look much better than that to me, I would say 16.815/7.278 = 2.3x "as fast" / 130% "faster" (and I think "2.3x faster" is clear for most people too).
I guess you counted as 1 - 7.278/16.815 = 56% "less time", which is correct too but maybe confusing together with "faster"?
I'm not sure if there is standard for this 😅
FWIW Benchmark.ips compare! uses "2x faster":

$ ruby -rbenchmark/ips -e 'Benchmark.ips { it.report(:a) { sleep 0.2 }; it.report(:b) { sleep 0.1 }; it.compare!(order: :baseline) }'
ruby 3.4.7 (2025-10-08 revision 7a5688e2a2) +PRISM [x86_64-linux]
Warming up --------------------------------------
                   a     1.000 i/100ms
                   b     1.000 i/100ms
Calculating -------------------------------------
                   a      4.998 (± 0.0%) i/s  (200.08 ms/i) -     25.000 in   5.002088s
                   b      9.984 (± 0.0%) i/s  (100.16 ms/i) -     50.000 in   5.008167s

Comparison:
a:        5.0 i/s
b:       10.0 i/s - 2.00x  faster

eregon · 2025-12-30T10:37:55Z

templates/lib/prism/node.rb.erb

+      <%- when Prism::Template::OptionalNodeField -%>
+      yield <%= field.name %> if <%= field.name %>
+      <%- when Prism::Template::NodeListField -%>
+      <%= field.name %>.each  {|node| yield node }


Suggested change

<%= field.name %>.each {|node| yield node }

<%= field.name %>.each(&block)

might be faster because it avoids an extra block (that literal block) and consumes less stack (both good for performance and to avoid stack overflow errors).
I'll give it a try.

I tried

diff --git i/templates/lib/prism/node.rb.erb w/templates/lib/prism/node.rb.erb index c97c029d3..76cbb2608 100644 --- i/templates/lib/prism/node.rb.erb +++ w/templates/lib/prism/node.rb.erb @@ -343,8 +343,8 @@ module Prism end # def each_child_node: () { (Prism::node) -> void } -> void | () -> Enumerator[Prism::node] - def each_child_node - return to_enum(:each_child_node) unless block_given? + def each_child_node(&_block) + return to_enum(:each_child_node) unless _block <%- node.fields.each do |field| -%> <%- case field -%> @@ -353,7 +353,7 @@ module Prism <%- when Prism::Template::OptionalNodeField -%> yield <%= field.name %> if <%= field.name %> <%- when Prism::Template::NodeListField -%> - <%= field.name %>.each {|node| yield node } + <%= field.name %>.each(&_block) <%- end -%> <%- end -%> end

and it seems a little bit faster in interpreter but much slower in YJIT (2 runs each to get an idea of variance, for some reason the ± 0.0% seems broken in benchmark-ips):

BEFORE: $ ruby -Ilib bench_visitor.rb ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 2.720 (± 0.0%) i/s (367.65 ms/i) - 28.000 in 10.294286s ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 14.286 (± 0.0%) i/s (70.00 ms/i) - 143.000 in 10.009988s $ ruby -Ilib bench_visitor.rb ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 2.740 (± 0.0%) i/s (364.94 ms/i) - 28.000 in 10.218390s ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 14.309 (± 0.0%) i/s (69.88 ms/i) - 144.000 in 10.064832s AFTER: $ ruby -Ilib bench_visitor.rb ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 2.790 (± 0.0%) i/s (358.37 ms/i) - 28.000 in 10.034481s ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 11.968 (± 0.0%) i/s (83.55 ms/i) - 120.000 in 10.026674s $ ruby -Ilib bench_visitor.rb ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 2.805 (± 0.0%) i/s (356.53 ms/i) - 29.000 in 10.339397s ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux] Warming up -------------------------------------- 1.000 i/100ms Calculating ------------------------------------- 12.116 (± 0.0%) i/s (82.54 ms/i) - 122.000 in 10.069442s

Earlopain · 2025-12-31T11:24:27Z

but the numbers above look much better than that to me

Yeah, I could have written that more clearly. benchmark/ips has https://github.com/evanphx/benchmark-ips?tab=readme-ov-file#independent-benchmarking but I was too lazy

Earlopain commented Dec 23, 2025

View reviewed changes

kddnewton approved these changes Dec 29, 2025

View reviewed changes

kddnewton merged commit 4f86847 into ruby:main Dec 29, 2025
64 checks passed

eregon reviewed Dec 30, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Optimize ruby visitor #3826

Optimize ruby visitor #3826

Earlopain commented Dec 23, 2025

Uh oh!

Earlopain Dec 23, 2025

Uh oh!

kddnewton Dec 29, 2025

Uh oh!

Earlopain Dec 31, 2025

Uh oh!

Earlopain Dec 23, 2025

Uh oh!

kddnewton Dec 29, 2025

Uh oh!

Uh oh!

eregon commented Dec 30, 2025

Uh oh!

eregon Dec 30, 2025

Uh oh!

eregon Dec 31, 2025 •

edited

Loading

Uh oh!

Earlopain commented Dec 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	<%= field.name %>.each {\|node\| yield node }
	<%= field.name %>.each(&block)

Optimize ruby visitor #3826

Optimize ruby visitor #3826

Conversation

Earlopain commented Dec 23, 2025

Uh oh!

Earlopain Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

kddnewton Dec 29, 2025

Choose a reason for hiding this comment

Uh oh!

Earlopain Dec 31, 2025

Choose a reason for hiding this comment

Uh oh!

Earlopain Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

kddnewton Dec 29, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

eregon commented Dec 30, 2025

Uh oh!

eregon Dec 30, 2025

Choose a reason for hiding this comment

Uh oh!

eregon Dec 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Earlopain commented Dec 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

eregon Dec 31, 2025 •

edited

Loading