Claude/optimize packmol turbo wl4 eu #122

Closed
MicheleBonus wants to merge 41 commits into m3g:master from MicheleBonus:claude/optimize-packmol-turbo-Wl4EU

Conversation

@MicheleBonus

No description provided.

…-packmol-memgen

Optimize hot cell-neighbor loops in computef/computeg
Optimize pairwise collision kernels for faster Packmol iterations
…s-using-offset-helper

Refactor computef/computeg to use shared forward cell offsets
…parc-and-gparc

Split pair-interaction kernels into fast/short/fixed paths and prefilter active neighbor heads
…h-arrays-in-compute_data.f90

Add hot-path scalar buffers and use them in pairwise kernels
…-with-per-cell-maxima

Add per-cell radius bounds and PBC-aware cell-pair pruning
…e-calculations

Optimize squared terms in hot atom-pair kernels
…ions-and-documentation

Add build profiles (baseline, perf-native, devel, sanitize, static) and numerics check
…provements-for-computing

docs: add Michele Bonus as contributor and update README build/profile and performance notes
…nd-ambiguous-references

Fix pgencan build failure caused by ambiguous `x` symbol
…ents-and-errors

Import `init1` from `compute_data` in `pgencan`
…ys-in-compute_data.f90

Rename compute_data hot buffers to x_hot/y_hot/z_hot and restrict imports
…gh-risk-fortran-files

Narrow compute_data imports in collision-prone routines
…ation-and-warnings-in-gencan.f

gencan: initialize CG/trust-region scalars and guard conditional reads
…-arguments-in-subroutines

Restore full `evalhd` ABI and mark dummy API arguments as intentionally unused in GENCAN stubs
MicheleBonus and others added 11 commits March 11, 2026 16:09
…-resetcells-subroutine

Wrap long short-radius assignment in resetcells
…zation-in-packmol.f90

Defensively handle missing atom-level restriction mapping
…alidate-residue-bounds

Validate fixed-molecule residue bounds before residue counting
…n-computef.f90

Reformat long continued assignments in src/computef.f90
…mance-and-refactor-expressions

Optimize cell-neighbor evaluation and fix overlong Fortran lines
Four key changes that together yield ~25% speedup:

1. Remove hot buffer abstraction (x_hot, y_hot, z_hot, ibtype_hot, ibmol_hot)
   - xcart(:,1) is already contiguous in Fortran column-major layout,
     so separate x_hot(:) arrays provide zero cache benefit for the
     random linked-list access pattern in fparc/gparc
   - Eliminates refresh_hot_buffers_full (5 full-array copies per call)
     and refresh_hot_buffers_atom (per-atom copy overhead)
   - Use xcart/ibtype/ibmol directly in inner loops

2. Remove dead cell-level radius pruning infrastructure
   - cell_pair_min_dist2 always returns 0 for the 14 forward neighbor
     offsets because cells are sized >= interaction radius, making all
     neighbor pairs adjacent (zero gap distance)
   - The pruning check (min_cell_dist2 > max_reach2) was never true
   - Removes cell_max_radius/cell_max_short_radius tracking from both
     the cell-placement loop and resetcells, plus the reach computation
     and cell_pair_min_dist2 function from the neighbor-offset loop

3. Incremental cell reset
   - Walk the previous iteration's occupied-cell linked list to clear
     only those cells, instead of zeroing all ncells^3 entries
   - Also skip clearing latomnext since all placed atoms overwrite it

4. Pre-compute fixed_short_marker once at init instead of per-atom
   per-iteration (fixedatom and use_short_radius are static)
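
The argument in item 2 can be checked with a small sketch. This is not the removed Fortran routine: the function below is a simplified, non-periodic stand-in that reuses the name `cell_pair_min_dist2` for readability, and the particular choice of 14 forward offsets (the cell itself plus half of its 26 neighbors) is an assumption about the stencil. For any offset whose components are all in {-1, 0, 1}, the two cells share at least a corner, so the gap is zero on every axis and no pruning threshold can ever be exceeded.

```python
from itertools import product


def forward_offsets():
    """Assumed forward stencil: (0,0,0) plus the 13 lexicographically positive neighbors."""
    return [o for o in product((-1, 0, 1), repeat=3) if o >= (0, 0, 0)]


def cell_pair_min_dist2(offset, cell_length):
    """Squared minimum distance between two axis-aligned cells.

    offset      -- integer cell offset (di, dj, dk) between the two cells
    cell_length -- cell edge lengths (lx, ly, lz)
    Hypothetical stand-in; periodic wrapping is ignored here.
    """
    d2 = 0.0
    for k, length in zip(offset, cell_length):
        # Cells that are identical or adjacent along an axis have no gap on it.
        gap = max(abs(k) - 1, 0) * length
        d2 += gap * gap
    return d2


offsets = forward_offsets()
gaps = [cell_pair_min_dist2(o, (5.0, 5.0, 5.0)) for o in offsets]
```

Here `gaps` is zero for all 14 offsets, so a test of the form `min_cell_dist2 > max_reach2` cannot succeed for a non-negative reach. A nonzero gap only appears once an offset component reaches 2, which the neighbor loop never visits.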

Benchmarks (baseline -O2):
  water_box_pbc:    0.41s -> 0.32s  (22% faster)
  solvprotein_pbc:  6.12s -> 4.56s  (26% faster)
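
The incremental cell reset from item 3 can be illustrated with a minimal sketch, assuming a linked-cell structure with one chain of occupied cells and one chain of atoms per cell. The class and field names (`CellList`, `atom_first`, `atom_next`, `cell_first`, `cell_next`) are hypothetical and only loosely mirror the Fortran arrays; the sketch also assumes every atom is placed again before its chain is read.

```python
EMPTY = -1


class CellList:
    def __init__(self, ncells, natoms):
        self.atom_first = [EMPTY] * ncells  # head atom of each cell
        self.atom_next = [EMPTY] * natoms   # next atom in the same cell
        self.cell_next = [EMPTY] * ncells   # next occupied cell
        self.cell_first = EMPTY             # head of the occupied-cell chain

    def place(self, atom, cell):
        if self.atom_first[cell] == EMPTY:
            # First atom in this cell: link the cell into the occupied chain.
            self.cell_next[cell] = self.cell_first
            self.cell_first = cell
        # atom_next is overwritten on every placement, so it never needs clearing.
        self.atom_next[atom] = self.atom_first[cell]
        self.atom_first[cell] = atom

    def atoms_in(self, cell):
        atoms, a = [], self.atom_first[cell]
        while a != EMPTY:
            atoms.append(a)
            a = self.atom_next[a]
        return atoms

    def reset(self):
        """Clear only the cells occupied in the previous iteration; return how many."""
        cleared, cell = 0, self.cell_first
        while cell != EMPTY:
            nxt = self.cell_next[cell]
            self.atom_first[cell] = EMPTY
            self.cell_next[cell] = EMPTY
            cell, cleared = nxt, cleared + 1
        self.cell_first = EMPTY
        return cleared


grid = CellList(ncells=1000, natoms=3)
grid.place(0, 10)
grid.place(1, 10)
grid.place(2, 500)
```

With three atoms in two cells, `grid.reset()` touches two cells rather than all 1000, and stale `atom_next` entries are harmless because `place` rewrites them before any traversal.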

https://claude.ai/code/session_01SgmhQ2p78sPjFPxZPmLkk7
@MicheleBonus closed this by deleting the head repository on Mar 12, 2026

2 participants