improve ccd memory utilization with arena memory#966
improve ccd memory utilization with arena memory#966thowell wants to merge 2 commits intogoogle-deepmind:mainfrom
Conversation
Kenny-Vilella
left a comment
There was a problem hiding this comment.
Very good work !
Just have a few minor comments.
| print( | ||
| f"Data\n nworld: {d.nworld} naconmax: {d.naconmax} njmax: {d.njmax}" + f" naccdmax: {d.naccdmax}\n" | ||
| if d.naccdmax != d.naconmax | ||
| else "\n" |
There was a problem hiding this comment.
Is there a reason to not print anything if naccdmax != naconmax?
| ncollision: collision count from broadphase (1,) | ||
| naccdmax: Maximum number of CCD contacts | ||
| nccd: geom-geom pair counter for arena slots (len(GeomType)*(len(GeomType)+1)/2,) | ||
| arena: Arena memory for CCD (narena,) |
There was a problem hiding this comment.
[nitpick] Not sure if it is clear what narena is.
| ncollision: array(1, int) | ||
|
|
||
| # warp only: preallocated arena for convex collision scratch memory | ||
| naccdmax: int # max number of CCD contacts |
There was a problem hiding this comment.
[nitpick] I would remove the comments as it is the only place with such comments in the file
| njmax: Number of constraints to allocate per world. Constraint arrays are | ||
| batched by world: no world may have more than njmax constraints. | ||
| naconmax: Number of contacts to allocate for all worlds. Overrides nconmax. | ||
| naccdmax: Maximum number of CCD contacts. Defaults to naconmax. |
There was a problem hiding this comment.
Should we say clearly that naccdmax value has priority over nccdmax value?
Same for nconmax/naconmax actually.
| epa_vert1, epa_vert2, epa_vert_index1, epa_vert_index2, epa_face. | ||
| """ | ||
| MJ_MAX_EPAFACES = 5 | ||
| MJ_MAX_EPAHORIZON = 12 |
There was a problem hiding this comment.
These two values should probably be imported from types
| return naccdmax * epa_total_per_collision | ||
|
|
||
| # multiccd arrays | ||
| # polygon, clipped: 2 * nmaxpolygon vec3s each |
There was a problem hiding this comment.
Is there a reason to use "vec3s" with a s?
| epa_map, epa_horizon). The multicontact inputs are placed first: | ||
| epa_vert1, epa_vert2, epa_vert_index1, epa_vert_index2, epa_face. | ||
| """ | ||
| epa_vert_dim = 5 + epa_iterations |
There was a problem hiding this comment.
[nitpick] It is a bit strange to sometimes use epa_iterations and sometimes ccd_iterations.
I assume that these two terms will always be equal, if it is not the case, then I will double check that they are use appropriately throughout the code.
| # epa_horizon: index pair (i j) of edges on horizon | ||
| epa_horizon = wp.empty(shape=(d.naconmax, 2 * MJ_MAX_EPAHORIZON), dtype=int) | ||
| # reset ccd arena counter | ||
| d.nccd.zero_() |
There was a problem hiding this comment.
Is this actually needed?
| epa_horizon = epa_horizon_in[tid] | ||
| # construct epa arrays from arena | ||
| # multicontact inputs first (epa_vert1, epa_vert2, epa_vert_index1, epa_vert_index2, epa_face) | ||
| base_offset = arenaid * wp.static(per_collision_size) |
There was a problem hiding this comment.
This is interesting.
We are now allocating per-collision within the array, while the former approach is allocating per-array.
I wonder if we see any perf difference with the different memory layout.
| # epa arrays used by multicontact | ||
|
|
||
| # epa_vert1: vertices in EPA polytope in geom 1 space | ||
| layout["epa_vert1"] = (offset, epa_vert_dim) |
There was a problem hiding this comment.
[nitpick] Not totally convinced that we should keep the dim here, it makes the code a bit inconsistent below.
But it seems to be quite subjective so please feel free to follow what you think is best.
|
@thowell given the recent warp memory improvements do we still need this? |
improve memory utilization for ccd by implementing arena memory.
wp.arraythat is added toDataasarenanaccdmaxis added toDataand specifies the arena size. this value is the maximum number of expected contacts for any ccd collision pair. since a kernel is launched for each ccd collision pair type, it is only necessary to allocate enough memory for the collision pair with the most contacts. (for scenes with many different types of ccd collision pairs, this memory savings can probably be significant)aloha pot
this pr
this pr with
--nccdmax=12the reduction in ccd memory is ~50%
main (e3bd7c6)
note: ccd memory utilization is not currently reported on main branch