Implement benchmarks for Ark.jl #83

Merged
ameligrana merged 17 commits into main from ark on Feb 2, 2026

Conversation

@ameligrana

@ameligrana ameligrana commented Feb 1, 2026

Member

Still need to verify some details, but this implements the benchmarks for Ark.jl. Mostly because Ark doesn't have spatial facilities baked in (even though they could be added in an extension without too much difficulty), the code is 2-3x longer than the Agents.jl one, though the performance is better (except for Schelling, which is probably too simple to benefit).

@ameligrana

Member Author

final results:

[Image: Screenshot from 2026-02-01 16-17-42]

actually Ark is faster in all cases, there were some wrong params before

@ameligrana

ameligrana commented Feb 1, 2026

Member Author

if you want to take a look @Datseris, this should be correct, I tested that both simulators produce identical results on a set of features

@Datseris

Datseris commented Feb 1, 2026

Member

I am trying to understand the flocking model, in particular how searching for nearest neighbors works. Do I understand correctly that you implemented, effectively from scratch, a rudimentary version of what the Agents.jl continuous space does, where it tracks agents in discretized cell grids?

Comment thread Flocking/Ark/Flocking.jl Outdated
Comment on lines +107 to +112
for dr in -radius:radius, dc in -radius:radius
    r, c = mod1(row + dr, grid.rows), mod1(col + dc, grid.cols)
    for neighbor_entity in grid.entities[r, c]
        if neighbor_entity == entity
            continue
        end

Member

I am wondering whether this is an accurate reflection of what the Agents.jl code (and, I assume, other frameworks) is doing. That code is based on Euclidean distance, which requires at a minimum some additional filtering. I don't think it would change the benchmarks a lot, but it is good to make sure every implementation does the same thing.

Member Author

Agents uses approximate search in flocking at the moment which corresponds to this

Member Author

Also, MASON actually uses approximate search, so it should be okay. We could probably switch to exact search for NetLogo in the future.

Member

> Agents uses approximate search in flocking at the moment which corresponds to this

That is not how I have understood continuous space searches so far. I think what is happening is that cells are still selected using Euclidean distance. The approximate part of Agents.jl means that agents are not further filtered by their actual Euclidean distance; as long as their cell is selected, that is considered accurate enough. I thought MASON also uses Euclidean distance. Your code above uses Chebyshev distance.

In any case, this won't incur any performance penalty for Ark, as the cells can be preselected via offsets. But really doing the same thing would complicate the code a bit more.
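To make the distinction in this thread concrete, here is a minimal sketch of preselecting cells via offsets under both criteria. All names (`chebyshev_offsets`, `euclidean_offsets`, `neighbor_ids`) and the `Matrix{Vector{Int}}` grid are assumptions made for illustration; they are not taken from the PR, from Ark.jl, or from Agents.jl, and the Euclidean criterion shown (offset length at most `radius` cells) is only one plausible choice.

```julia
# Chebyshev selection: every cell in the (2*radius + 1)^2 square around the center.
chebyshev_offsets(radius::Int) =
    [(dr, dc) for dr in -radius:radius for dc in -radius:radius]

# Euclidean selection (assumed criterion): keep only the cells whose offset
# vector has length <= radius, measured in cell widths.
euclidean_offsets(radius::Int) =
    [(dr, dc) for dr in -radius:radius for dc in -radius:radius
     if dr^2 + dc^2 <= radius^2]

# Collect the ids stored in the selected cells of a periodic grid.
# `cells[r, c]` is the vector of agent ids currently located in cell (r, c).
# No per-agent distance filtering is done, i.e. the search is "approximate".
function neighbor_ids(cells::Matrix{Vector{Int}}, row::Int, col::Int, id::Int, offsets)
    rows, cols = size(cells)
    found = Int[]
    for (dr, dc) in offsets
        r, c = mod1(row + dr, rows), mod1(col + dc, cols)  # wrap around the edges
        for other in cells[r, c]
            other == id && continue  # skip the querying agent itself
            push!(found, other)
        end
    end
    return found
end
```

Because the offset list is computed once per radius, the inner loop is identical for both criteria; only the number of visited cells changes.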

Member Author

ah right, will update this, thank you for the catch

Member Author

ok, should be done

@Datseris

Datseris commented Feb 1, 2026

Member

I am curious to understand where such a drastic change in performance is coming from. My understanding was that the retrieval of properties in Ark.jl would be much faster due to the fundamentally different data layouts. But beyond that, I don't see what other things would be done much faster. At least looking at the code, the remaining operations are done in a fairly similar "object-oriented-like" way, where we are iterating over, e.g., every neighbor (or the element of a given property that corresponds to neighboring IDs). Would this change alone really lead to a 2-4x performance difference?

With the previous frameworks, there was the difference in programming language, as well as the new ways Agents.jl was implementing algorithms such as neighborhood searches (at least vs. Mesa; MASON has the same neighborhoods as Agents). But here I assume the algorithms are the same, e.g., for random sampling, as you wrote them for both frameworks. So I am not sure where this performance change is coming from, if you have any ideas.

@ameligrana

Member Author

> I am trying to understand the flocking model, in particular how searching for nearest neighbors works. Do I understand correctly that you implemented, effectively from scratch, a rudimentary version of what the Agents.jl continuous space does, where it tracks agents in discretized cell grids?

Yes, exactly. It should be equivalent to the Agents.jl internals.

@ameligrana

ameligrana commented Feb 1, 2026

Member Author

The data layout has a major effect, so I'm not that surprised by the results. Actually, these models are not even ideal for Ark's data layout, which favors linear iteration over random access and is much more effective when the simulation is more heterogeneous (or even just has more fields per agent) than these. I think in more interesting models Ark could be 1-2 orders of magnitude faster than Agents on a single CPU because of this.

Though, in this case I think two major effects are at play:

  • the data layout used by Ark uses the cache better, because it uses only the data needed in each loop. This can have a big effect.

  • For WolfSheep, Ark doesn't use any Union/branches for the "types", and it uses a better grid that holds only the sheep, since wolves do not need to be on it.
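The first point can be illustrated with a minimal plain-Julia sketch that contrasts a vector of mutable objects with one contiguous column per field. The type and function names (`SheepObj`, `SheepColumns`, `drain_energy!`) are hypothetical and exist only for this illustration; this is not Ark.jl's API or its actual storage, and the sketch makes no claim about the size of the speedup.

```julia
# Array-of-structs: every agent is a separate heap-allocated mutable object.
mutable struct SheepObj
    x::Float64
    y::Float64
    energy::Float64
end

# Struct-of-arrays: one contiguous column per field.
struct SheepColumns
    x::Vector{Float64}
    y::Vector{Float64}
    energy::Vector{Float64}
end

# Follows one pointer per agent; the unused x and y fields travel through
# the cache together with energy.
function drain_energy!(agents::Vector{SheepObj}, cost::Float64)
    for a in agents
        a.energy -= cost
    end
    return agents
end

# Streams through a single contiguous column; x and y are never touched.
function drain_energy!(cols::SheepColumns, cost::Float64)
    energy = cols.energy
    for i in eachindex(energy)
        energy[i] -= cost
    end
    return cols
end
```

Both methods compute the same result; the difference is only in which memory each loop has to touch.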

@ameligrana

Member Author

Do you see any blockers, @Datseris? Otherwise I think it should be mergeable.

@Datseris

Datseris commented Feb 2, 2026

Member

no all good you can merge this!

Just to double check: have you compared the output of the simulations, e.g., the data collection of wolves and sheep over time, to confirm that this new code does the same and populations are stable? I am not sure about the arguments claiming orders of magnitude of performance increase in more complex models. If anything, the more complex the model, the less the agent retrieval should matter, as this is a fixed cost (per agent) that does not scale with the model complexity. So I would expect a smaller performance deficit in more complex simulations.

I also just noticed that the Agents.jl wolf-sheep example doesn't use @multiagent, which would also avoid the branching (if I have understood this correctly). But that's another story, irrelevant to this PR of course.

@ameligrana

Member Author

No, I think it's much more important in complex simulations where the conditions I mentioned are met. The data layout is arguably the most important thing in most simulations, actually. See e.g. this video https://www.youtube.com/watch?v=kKGiEz1enzw: the best single-threaded ECS is 30x faster than the OOP way. This maps similarly to the Julia world, since mutable structs are very similar to OOP objects in terms of performance (in fast languages).

@ameligrana

Member Author

> I also just noticed that the Agents.jl wolf-sheep example doesn't use @multiagent, which would also avoid the branching (if I have understood this correctly). But that's another story, irrelevant to this PR of course.

It doesn't use it because if I remember correctly I benchmarked it and it was slower in this case.

@Datseris

Datseris commented Feb 2, 2026

Member

I thought the data layout was very similar to the struct vector data layout you recently added to Agents, and this did not have anywhere near as much of a performance impact. Of course, ultimately the struct vector in Agents.jl has to play well with all other existing infrastructure, so I assume this limits its potential. But then again, there is no "query" logic in Agents.jl for the struct vector, so maybe this is where the performance change is coming from. (Initially I thought the "query" logic of Ark.jl was, at least for these simple simulations, essentially keeping track of the agent id, now just the index in a vector, to reuse it for different attributes.)

@ameligrana

Member Author

Yes, the query logic adds more power, since queries are the fastest operation in an ECS. We don't actually use them in any model here because the scheduler requires random ordering (though now that I think about it, it's probably possible to add a general facility to Ark to shuffle the columns themselves, so that one can use queries after that). It's true, though, that for the Schelling and Flocking simulations the StructVector of Agents is very similar, and I think that using StructVector will indeed help to close the gap on them (not completely, I guess, since I think it still does more computation when you retrieve multiple properties). WolfSheep is more complex, though, and the StructVector approach of Agents is not made for simulations with multiple agent types; that is what an ECS is for. WolfSheep is already complex enough in terms of data layout that an ECS is better, and if you add more and more "types", an ECS will become much faster than the Dict of Unions used in Agents; a dictionary like this will be very slow.
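The column-shuffling idea mentioned in passing could look roughly like the sketch below: one shared random permutation is applied to every column, so that a subsequent linear pass visits agents in random order while row `i` still refers to the same agent in all columns. `shuffle_columns!` is a hypothetical helper written for this illustration only; it is not an existing Ark.jl facility, and a real implementation would also have to keep any entity-to-row index in sync.

```julia
using Random

# Hypothetical helper: reorder all columns with the same random permutation.
function shuffle_columns!(rng::AbstractRNG, columns::Vector...)
    n = length(first(columns))
    all(c -> length(c) == n, columns) ||
        throw(ArgumentError("all columns must have the same length"))
    perm = randperm(rng, n)
    for col in columns
        col .= col[perm]  # allocates a temporary copy; acceptable for a sketch
    end
    return perm
end
```

For example, `shuffle_columns!(MersenneTwister(1), ids, energy)` reorders `ids` and `energy` together, and the returned permutation can be used to update any external lookup table.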

@ameligrana

ameligrana commented Feb 2, 2026

Member Author

Ah, right, indeed StructVector supports neither multiple types nor removals, while an ECS supports all of this in a very fast way.

@ameligrana ameligrana merged commit a64a803 into main Feb 2, 2026
3 checks passed
@ameligrana ameligrana deleted the ark branch February 2, 2026 23:35
@ameligrana

Member Author

By the way, yes, I verified that some things match statistically (like the numbers of wolves and sheep).
