
[PR-3] feat: implement ETR (Eq. 66) and primitive contraction for OS+HGP pipeline #237

Draft
San1357 wants to merge 6 commits into theochem:master from San1357:feat/ETR-contraction

Conversation

San1357 commented Mar 4, 2026

Summary

What Changed

| File | Lines | What |
| --- | --- | --- |
| gbasis/integrals/_two_elec_int_improved.py | +150 | ETR + contraction implementation |
| tests/test_two_elec_int_improved.py | +80 | 5 ETR + norm tests |

Mathematical Reference

The Electron Transfer Recursion (Eq. 66) is:

$$[c+1_i, d | a, b] = [c, d | a+1_i, b] + (A-C)_i [c, d | a, b]$$

where angular momentum is transferred from center A to center C.
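As a structural illustration, the one-dimensional form of this recursion can be sketched as follows. This is a toy example, not the gbasis implementation: the seed values `[0 | a]` would come from the vertical recursion in practice, and the real integrals carry all three Cartesian directions and four centers.

```python
import numpy as np

def etr_1d(base, AC, c_max):
    """Toy 1D electron-transfer recursion (Eq. 66):
    t[c+1][a] = t[c][a+1] + (A - C) * t[c][a].
    `base` holds [0 | a] for a = 0..len(base)-1; each transfer step
    moves one unit of angular momentum from center A to center C,
    so the usable range of a shrinks by one per step.
    """
    table = [np.asarray(base, dtype=float)]
    for _ in range(c_max):
        prev = table[-1]
        table.append(prev[1:] + AC * prev[:-1])
    return table

# [0 | a] seeds for a = 0, 1, 2 and (A - C)_i = 0.5
t = etr_1d([1.0, 2.0, 3.0], 0.5, 2)
# t[1] = [2.5, 4.0], t[2] = [5.25]
```

Note the sequential dependence: each `t[c+1]` needs the full `t[c]`, which is why the c-loop itself cannot be vectorized away, only the per-step array arithmetic.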

How To Test

# Run tests
pytest tests/test_two_elec_int_improved.py -v

# Check coverage
pytest tests/test_two_elec_int_improved.py --cov=gbasis.integrals._two_elec_int_improved --cov-report=term-missing

# Check formatting & linting
black --check gbasis/integrals/_two_elec_int_improved.py
ruff check gbasis/integrals/_two_elec_int_improved.py

# Verify existing tests not broken
pytest tests/ --timeout=120

Proof That It Works

Tests (9 passed)


Check coverage


Black: Clean & Ruff: Clean


Existing tests: Not broken

300 passed, 279 skipped, 1 xfailed, 4 warnings in 67.86s (0:01:07)

First Checklist

  • Tests added for each new function
  • All 9 tests pass
  • Black formatted
  • Ruff linting clean
  • Existing tests not broken
  • Docstrings in numpydoc format
  • No new dependencies added (uses existing numpy)
  • Pure Python only (no C/Fortran)

Scope of this PR

  • This PR adds ETR and contraction as part of the OS+HGP pipeline (Week 3)
  • Adds _electron_transfer_recursion (ETR)
  • Adds _optimized_contraction with per-center vectorized norm computation
  • Adds functools.cache to _get_factorial2_norm for performance
  • Does NOT modify any existing code
  • Does NOT change any existing tests
  • Next step: HRR (Week 4)

Second Checklist

  • Write a good description of what the PR does.
  • Add tests for each unit of code added (e.g. function, class)
  • Update documentation
  • Squash commits that can be grouped together
  • Rebase onto master

Type of Changes

  • 🐛 Bug fix
  • ✨ New feature
  • 🔨 Refactoring
  • 📜 Docs

from gbasis.utils import factorial2

# Cache for factorial2 values to avoid repeated computation
_FACTORIAL2_CACHE = {}
Collaborator

Python has a simple tool for this type of caching since Python 3.9:
https://docs.python.org/3.14/library/functools.html#functools.cache

You can just decorate your function:

@functools.cache
def _get_factorial2_norm(angmom_components):
   ...

Author

Hi @msricher
Done! Replaced _FACTORIAL2_CACHE = {} with @functools.cache decorator.
Since functools.cache requires hashable arguments and NumPy arrays are not hashable, the function now accepts a tuple of tuples instead of an array.

return _FACTORIAL2_CACHE[key]


def _optimized_contraction(
Collaborator

I wonder if a better function signature would just be def f(integrals, exps, coeffs, angmoms):?

Then you can use it like this, if all of the exps, coeffs, and angmoms come from indexing the corresponding arrays:
f(integrals, exps[(a, b, c, d),], coeffs[(a, b, c, d),], angmoms[(a, b, c, d),])

I'm not sure if the angmoms are extracted from an array, though... but in general using fewer function arguments by applying NumPy's advanced indexing, or by other grouping strategies, is cleaner. Is it faster? Not sure, but the difference is probably not much.

Just some thoughts I had reading this part.
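For what it's worth, the `exps[(a, b, c, d),]` call above uses NumPy advanced indexing to pick one row per center in a single operation; a minimal demonstration with made-up numbers:

```python
import numpy as np

# one row of primitive exponents per shell in some global array
exps = np.array([[0.5, 1.2],
                 [0.8, 2.0],
                 [1.5, 3.1],
                 [0.3, 0.9]])

a, b, c, d = 0, 2, 1, 3
# the trailing comma makes the index a sequence, triggering advanced
# indexing: rows a, b, c, d are gathered, in that order, into a (4, K) array
picked = exps[(a, b, c, d),]
```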

Author

Done! Simplified the signature as suggested:

_optimized_contraction(integrals_etransf, exps, coeffs, angmoms)

where:

  • exps = (exps_a, exps_b, exps_c, exps_d)
  • coeffs = (coeffs_a, coeffs_b, coeffs_c, coeffs_d)
  • angmoms = (angmom_a, angmom_b, angmom_c, angmom_d)

The values are unpacked inside the function.
Is this OK, @msricher?

Contracted integrals with shape (c_x, c_y, c_z, a_x, a_y, a_z, M_a, M_c, M_b, M_d).
"""
# Precompute normalization constants (1D arrays)
norm_a = (2 * exps_a / np.pi) ** 0.75 * (4 * exps_a) ** (angmom_a / 2)
Collaborator

This part would definitely benefit from using arrays and vectorized operations with NumPy.

Author

San1357 commented Mar 5, 2026

@msricher

For suggestion #3 (vectorizing line 82):

The ETR loops have sequential dependencies where each step c+1 uses the previous step c, so full vectorization isn't straightforward.

Could you clarify which specific part you had in mind?

Collaborator

angmoms = np.array([l_a, l_b, l_c, l_d])
exps = np.array([e_a, e_b, e_c, e_d])
coeffs = np.array([c_a, c_b, c_c, c_d])
# etc. ...
norms = ((2 / np.pi) * exps) ** 0.75 * (4 * exps) ** (angmoms / 2)
coeffs_norm = coeffs * norms[:, :, np.newaxis] # or something like this, there's some
                                               # broadcasting you can do to make it work

I just mean that if you are going to pass these arguments in as single arrays, then there's a bit more vectorization you can do with the (a, b, c, d) centers.
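Assuming every shell has the same number of primitives K so the arrays stack cleanly, that suggestion works out to something like the following (illustrative values, not gbasis data):

```python
import numpy as np

angmoms = np.array([1, 0, 2, 1])        # (4,)   l for centers a, b, c, d
exps = np.array([[0.5, 1.2],
                 [0.8, 2.0],
                 [1.5, 3.1],
                 [0.3, 0.9]])           # (4, K) primitive exponents
coeffs = np.ones((4, 2))                # (4, K) contraction coefficients

# one broadcast expression covers all four centers; angmoms needs a
# trailing axis so it lines up against the (4, K) exponent array
norms = ((2 / np.pi) * exps) ** 0.75 * (4 * exps) ** (angmoms[:, None] / 2)
coeffs_norm = coeffs * norms            # (4, K)
```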

Author

Understood, @msricher!
I was confused earlier; I've now vectorized the norm computation over all 4 centers as suggested. The norms are computed in one shot using NumPy arrays and broadcasting, so all four centers (a, b, c, d) are handled together instead of separately.

The caller-side passing of single arrays (e.g. exps[(a,b,c,d),]) will be handled in the full pipeline when HRR is integrated. For now, the conversion to arrays is done internally inside the function.

Does this look good to you? @msricher

Author

Actually, after testing, I realized that np.array stacking breaks when shells have different numbers of primitives (K). For example, if K_a = 2 and K_b = 1, np.array([exps_a, exps_b]) creates a ragged array and fails. So I've used a per-center list comprehension instead: the norm computation is still vectorized with NumPy ops per center, just not across all four centers simultaneously.
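A sketch of the ragged case and the per-center fallback (illustrative exponents):

```python
import numpy as np

# shells with different primitive counts: K_a=2, K_b=1, K_c=3, K_d=1
exps_per_center = [np.array([0.5, 1.2]),
                   np.array([2.0]),
                   np.array([1.5, 3.1, 0.2]),
                   np.array([0.3])]
angmoms = [1, 0, 2, 1]

# np.array(exps_per_center) would give a ragged object array (or raise on
# modern NumPy), so compute per center instead; each list entry is still
# a vectorized NumPy expression over that shell's K primitives
norms = [
    ((2 / np.pi) * e) ** 0.75 * (4 * e) ** (l / 2)
    for e, l in zip(exps_per_center, angmoms)
]
```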

- zeta_over_eta * integrals_etransf[:, :, c, :, :, 2:, ...]
)

return integrals_etransf
Collaborator

The rest of it, the actual contractions and recurrences, seems good to me.

Author

Thank you for the feedback! @msricher
