- Start a new branch.
- Build a JVM evaluator benchmark tool similar to `cli/src/main/scala/dev/bosatsu/LibCheckProfileMain.scala`.
- Benchmark must exercise:
  - function call
  - tail loops
  - non-tail recursion
  - integer operations
  - string operations
- Support looped execution so we can attach async-profiler (`~/Downloads/async-profiler-4.3-macos/bin/asprof`).
- Iteratively:
  - identify evaluator performance hotspots
  - make one improvement
  - run `sbt coreJVM/testOnly dev.bosatsu.EvaluationTest`
  - if green, checkpoint commit
  - repeat until no significant improvements remain
- Track each checkpoint in this document with relative performance changes.
- Do a final comparison against the baseline and summarize total gains.
- Add `EvalBenchmarkMain` in `cli` to compile a fixed Bosatsu package once and repeatedly execute all five workload categories.
- Record baseline throughput and async-profiler hotspots using the new benchmark.
- Apply one evaluator-focused optimization at a time.
- After each change:
  - re-run benchmark for relative delta
  - run `sbt coreJVM/testOnly dev.bosatsu.EvaluationTest`
  - checkpoint commit and log the change here
- Stop when additional changes are neutral/regressive or too risky for marginal gain.
- Run a final baseline-vs-final comparison and summarize gains.
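The measurement loop in the plan above can be sketched roughly as follows. This is a minimal illustration, not the actual `EvalBenchmarkMain`: `BenchLoopSketch`, `runWorkloads`, and `measure` are hypothetical names, and the workloads are plain Scala stand-ins for what the real tool would run through the Bosatsu evaluator.

```scala
object BenchLoopSketch {
  // Hypothetical stand-in for "execute all workload categories once".
  // The real benchmark evaluates a compiled Bosatsu package instead.
  def runWorkloads(work: Int, nonTailDepth: Int): Long = {
    // tail loop + integer operations
    @annotation.tailrec
    def sumTo(i: Int, acc: Long): Long =
      if (i <= 0) acc else sumTo(i - 1, acc + i)

    // non-tail recursion
    def depth(n: Int): Int =
      if (n <= 0) 0 else 1 + depth(n - 1)

    // string operations
    val strLen = ("x" * work).length.toLong

    sumTo(work, 0L) + depth(nonTailDepth) + strLen
  }

  // Returns average nanoseconds per iteration, measured after a warmup phase.
  def measure(warmup: Int, iterations: Int)(body: () => Long): Double = {
    var sink = 0L // accumulate results so the work is not trivially dead
    var i = 0
    while (i < warmup) { sink += body(); i += 1 }
    val start = System.nanoTime()
    i = 0
    while (i < iterations) { sink += body(); i += 1 }
    val elapsed = System.nanoTime() - start
    if (sink == Long.MinValue) println(sink) // keep `sink` observable
    elapsed.toDouble / iterations
  }

  def main(args: Array[String]): Unit = {
    val nsPerIter =
      measure(warmup = 800, iterations = 2000)(() => runWorkloads(300, 180))
    val itersPerSec = 1e9 / nsPerIter
    println(f"$itersPerSec%.2f iters/s ($nsPerIter%.0f ns/iter)")
  }
}
```

Separating warmup from the timed phase keeps JIT compilation out of the reported number; a loop mode would simply repeat the timed phase indefinitely so a profiler can attach to the running process.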
- Command: `sbt "cli/runMain dev.bosatsu.EvalBenchmarkMain --once --warmup 800 --iterations 2000 --work 300 --non-tail-depth 180"`
- Throughput: `764.40 iters/s` (`1,308,217 ns/iter`)
- Top async-profiler observations:
  - Large time in interface/virtual dispatch stubs (`itable stub`, `vtable stub`), consistent with dynamic function/value dispatch.
  - Hot frames in evaluator scope and closures (`MatchlessToValue$Impl$Dynamic.apply`, `MatchlessToValue$Impl$Env$$Lambda...`).
  - Noticeable integer overhead (`java.math.BigInteger.getInt`, `scala.math.BigInt$.apply`, `java.math.BigInteger.subtract`).
  - Frequent map/identifier equality work (`Identifier.equals`, `String.equals`, `LongMap.get/updated/apply`).
- Change:
  - Added `EvalBenchmarkMain` to repeatedly exercise function calls, tail loops, non-tail recursion, integer ops, and string ops.
  - Added `loop` mode and printed PID guidance for `asprof` attachment.
  - Switched evaluator local bindings from immutable `Map` updates/lookups to a lightweight linked local env (`LocalEnv`) in `MatchlessToValue.Scope`.
  - Added integer union representation plumbing (`java.lang.Integer | java.math.BigInteger`) in `Value`, with Predef fast paths for common integer arithmetic/bitwise operations and int-aware literal/nat equality in evaluator boolean checks.
  - Reduced product construction overhead by avoiding `NonEmptyList -> List -> Array` conversion in `makeCons`.
- Relative perf vs previous:
  - Long-running loop mode baseline (before): `~839 iters/s` (steady-state from `--loop --warmup 50 --report-every 200`).
  - Long-running loop mode after checkpoint: `~920 iters/s` steady-state on the same loop command.
  - Relative change: `~+9.7%`.
- Test status:
  - `sbt "coreJVM/testOnly dev.bosatsu.EvaluationTest"`: passed (`71/71`).
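To illustrate the linked local environment idea from this checkpoint, here is a minimal sketch. It is an assumption about the general shape only: the real `LocalEnv` in `MatchlessToValue.Scope` may key on something other than `String` and may differ in structure; the `Bind`, `Empty`, `updated`, and `get` names here are hypothetical.

```scala
// Hypothetical sketch of a linked local environment; not the actual
// LocalEnv used by MatchlessToValue.Scope.
sealed abstract class LocalEnv[+A] {
  // Binding is O(1): allocate one cell that points at the enclosing env.
  def updated[A1 >: A](name: String, value: A1): LocalEnv[A1] =
    LocalEnv.Bind(name, value, this)

  // Lookup walks from the innermost binding outward, so shadowing works.
  def get(name: String): Option[A] = {
    @annotation.tailrec
    def loop(env: LocalEnv[A]): Option[A] =
      env match {
        case LocalEnv.Empty => None
        case LocalEnv.Bind(n, v, parent) =>
          if (n == name) Some(v) else loop(parent)
      }
    loop(this)
  }
}

object LocalEnv {
  case object Empty extends LocalEnv[Nothing]
  final case class Bind[+A](name: String, value: A, parent: LocalEnv[A])
      extends LocalEnv[A]

  val empty: LocalEnv[Nothing] = Empty
}
```

The trade-off is that lookup is linear in the number of enclosing bindings, whereas an immutable `Map` pays hashing and node allocation on every update. For the short chains typical of local scopes, the linked form can be cheaper, which is consistent with the measured improvement but not proven by this sketch.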
- Change:
  - Small integer fast-path cleanup in `Predef` arithmetic (`addInt`, `subInt`, `mulInt`) to avoid duplicate `longValue` extraction in overflow checks.
  - JSON runtime compatibility fix for new int representation in `ValueToJson`:
    - `Bosatsu/Json::JInt` encoding now accepts both `java.lang.Integer` and `java.math.BigInteger`.
    - JSON integer decoding for `Int` and `Bosatsu/Json::JInt` now uses `VInt(...)` to normalize into the mixed int representation.
- Relative perf vs previous:
  - No measured throughput change (correctness/cleanup checkpoint).
- Test status:
  - `sbt coreJVM/test`: passed (`1368 passed, 0 failed`).
  - `sbt cli/test`: passed (`57 passed, 0 failed`).
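The mixed integer representation and the single-extraction fast path can be sketched as below. This is a hedged illustration, not the `Predef` code: it uses `AnyRef` instead of the `java.lang.Integer | java.math.BigInteger` union so it compiles on Scala 2 and 3, and `IntFastPathSketch`, `normalize`, `normalizeBig`, and `toBig` are hypothetical helpers.

```scala
import java.math.BigInteger

object IntFastPathSketch {
  // Prefer a boxed Int when the value fits, else fall back to BigInteger.
  def normalize(l: Long): AnyRef =
    if (l >= Int.MinValue && l <= Int.MaxValue) java.lang.Integer.valueOf(l.toInt)
    else BigInteger.valueOf(l)

  // bitLength excludes the sign bit, so <= 31 means the value fits in an Int.
  def normalizeBig(b: BigInteger): AnyRef =
    if (b.bitLength() <= 31) java.lang.Integer.valueOf(b.intValue()) else b

  def toBig(v: AnyRef): BigInteger = v match {
    case i: java.lang.Integer => BigInteger.valueOf(i.longValue())
    case b: BigInteger        => b
    case other => throw new IllegalArgumentException(s"not an integer: $other")
  }

  def addInt(a: AnyRef, b: AnyRef): AnyRef = a match {
    case x: java.lang.Integer =>
      b match {
        case y: java.lang.Integer =>
          // Widen each operand exactly once; the Long sum of two Ints cannot
          // overflow, so one range check in `normalize` is enough.
          normalize(x.longValue() + y.longValue())
        case _ => normalizeBig(toBig(a).add(toBig(b)))
      }
    case _ => normalizeBig(toBig(a).add(toBig(b)))
  }

  def mulInt(a: AnyRef, b: AnyRef): AnyRef = a match {
    case x: java.lang.Integer =>
      b match {
        case y: java.lang.Integer =>
          // The product of two Ints always fits in a Long.
          normalize(x.longValue() * y.longValue())
        case _ => normalizeBig(toBig(a).multiply(toBig(b)))
      }
    case _ => normalizeBig(toBig(a).multiply(toBig(b)))
  }
}
```

Normalizing every result back to the smallest representation keeps equality checks simple: two equal integers always share the same runtime class, which is the same property the JSON decoding change relies on when it routes values through `VInt(...)`.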
- Change:
  - Not used.
- Relative perf vs previous:
  - Not used.
- Test status:
  - Not used.
- Baseline throughput: `~839 iters/s` steady-state (`--loop --warmup 50 --report-every 200`).
- Final throughput: `~920 iters/s` steady-state on the same loop command.
- Net change: `~+9.7%`.
- Notes on remaining hotspots:
  - Dispatch overhead (`itable`/`vtable` stubs) remains the largest bucket.
  - Remaining evaluator overhead is concentrated in `LongMap` operations for anon/mut slots and dynamic closure dispatch.
  - Further gains likely require deeper structural changes (slot storage layout and dispatch specialization), which are higher risk than this checkpoint.
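For reference, the net change in the summary above follows directly from the two steady-state throughput figures. A small sketch of the arithmetic, with `PerfDelta` and its methods as hypothetical helper names:

```scala
object PerfDelta {
  // Relative throughput change in percent; positive means faster.
  def relativeChangePct(before: Double, after: Double): Double =
    (after - before) / before * 100.0

  // Convert throughput (iterations per second) to average ns per iteration.
  def nsPerIter(itersPerSec: Double): Double = 1e9 / itersPerSec

  def main(args: Array[String]): Unit = {
    // (920 - 839) / 839 is roughly 0.0965, i.e. about +9.7%
    println(f"${relativeChangePct(839.0, 920.0)}%.1f%%")
  }
}
```

Because both inputs are approximate steady-state readings, the result should be read as roughly +9.7% rather than as a precise figure.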