feat: support tensor properties in flat batches #38

Merged: Ramlaoui merged 1 commit into main from feat/tensor-property-batch-flat on Jun 5, 2026

Ramlaoui (Collaborator) commented Jun 5, 2026

Summary

  • support stacked tensor ndarrays in add_arrays_batch for molecule and atom custom properties
  • flat-batch tensor sections when shapes are compatible, with clear errors for ragged shapes
  • reject list/tuple tensor inputs while preserving list[str] molecule properties
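
The input rules in the summary can be illustrated with a small NumPy sketch. This is not the atompack implementation: `check_tensor_property` and `flatten_atom_tensors` are hypothetical helpers, and the exact error types and messages are assumptions. The sketch only shows the general idea of accepting a stacked ndarray, passing `list[str]` through, rejecting other list/tuple inputs, and concatenating per-molecule atom tensors when their trailing shapes agree.

```python
import numpy as np


def check_tensor_property(name, values, n_items):
    """Hypothetical validator (not atompack's API) for one custom property.

    Accepts a stacked ndarray whose leading dimension equals n_items,
    passes a list of strings through unchanged, and rejects any other
    list/tuple input.
    """
    if isinstance(values, list) and all(isinstance(v, str) for v in values):
        if len(values) != n_items:
            raise ValueError(f"{name}: expected {n_items} strings, got {len(values)}")
        return values
    if isinstance(values, (list, tuple)):
        raise TypeError(
            f"{name}: pass tensor properties as a stacked numpy.ndarray, "
            f"not {type(values).__name__}"
        )
    if not isinstance(values, np.ndarray):
        raise TypeError(f"{name}: unsupported type {type(values).__name__}")
    if values.ndim < 1 or values.shape[0] != n_items:
        raise ValueError(
            f"{name}: leading dimension must be {n_items}, got shape {values.shape}"
        )
    return values


def flatten_atom_tensors(per_molecule):
    """Concatenate per-molecule arrays of shape (n_atoms_i, *trailing) along axis 0.

    Returns the flat array and the offsets that mark where each molecule
    starts. Raises ValueError when the trailing shapes differ (ragged input).
    """
    trailing = per_molecule[0].shape[1:]
    for i, arr in enumerate(per_molecule):
        if arr.shape[1:] != trailing:
            raise ValueError(
                f"molecule {i}: trailing shape {arr.shape[1:]} does not match {trailing}"
            )
    counts = [arr.shape[0] for arr in per_molecule]
    offsets = np.concatenate(([0], np.cumsum(counts)))
    return np.concatenate(per_molecule, axis=0), offsets


# Two molecules with 2 and 5 atoms, each atom carrying a 3x3 tensor.
flat, offsets = flatten_atom_tensors([np.zeros((2, 3, 3)), np.ones((5, 3, 3))])
print(flat.shape, offsets.tolist())
```

Returning offsets alongside the flat array is one common way to recover per-molecule slices later (`flat[offsets[i]:offsets[i + 1]]`); whether atompack stores offsets this way is not stated in this PR.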

Validation

  • cargo fmt --all --check
  • cargo check -p atompack -p atompack-py
  • uv run --no-sync --extra dev --locked --with "maturin>=1.4,<2.0" maturin develop
  • uv run --no-sync --extra dev --locked pytest tests/test_database.py -q
  • cargo clippy --workspace --all-targets -- -D warnings
  • uv run --no-sync --extra dev --locked ruff check tests/test_database.py
  • git diff --check

Ramlaoui (Collaborator, Author) commented Jun 5, 2026

Performance check for the tensor property stack, measured against detached worktrees at origin/main and origin/feat/tensor-property-batch-flat with release builds and ATOMPACK_PERF_COLOR=never.

Canonical smoke targets found and run:

  • make perf-smoke-py -> atompack-py/tests/test_throughput_smoke.py
  • make perf-smoke-rust -> atompack/tests/throughput_smoke.rs

Median throughput over 4 samples each:

| Path | main | feature | delta |
| --- | --- | --- | --- |
| Python write add_arrays_batch | 648,502 mol/s | 625,127 mol/s | -3.6% |
| Python write add_molecules | 377,789 mol/s | 344,842 mol/s | -8.7% |
| Python sequential get_molecule | 1,119,850 mol/s | 1,089,962 mol/s | -2.7% |
| Python shuffled get_molecule | 1,115,842 mol/s | 1,096,318 mol/s | -1.7% |
| Python batch get_molecules | 1,117,152 mol/s | 1,150,652 mol/s | +3.0% |
| Python flat get_molecules_flat | 5,706,899 mol/s | 5,605,078 mol/s | -1.8% |
| Rust write | 428,378 mol/s | 438,570 mol/s | +2.4% |
| Rust sequential read | 932,678 mol/s | 944,008 mol/s | +1.2% |
| Rust random read | 896,412 mol/s | 934,798 mol/s | +4.3% |
| Rust shuffled batch read | 1,503,248 mol/s | 1,569,769 mol/s | +4.4% |

The flat and batched read paths, which are the sensitive paths for tensor concatenation, show no meaningful slowdown in this smoke run (get_molecules +3.0%, get_molecules_flat -1.8%). The Python object write path (add_molecules, -8.7%) is the only notable dip in these short samples, but it still passes the existing smoke threshold by a wide margin. I also ran a feature-branch-only ad hoc check for explicit set_property(..., scope="atom") object writes: the molecule-scope vector median was ~495k mol/s and the atom-scope vector median was ~484k mol/s over 3 samples.
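
For anyone rechecking the delta column, a minimal sketch of the arithmetic follows. It assumes the delta is the relative change of the feature median against the main median, rounded to one decimal place; `delta_percent` is a hypothetical helper name, and only a few rows of the reported medians are included.

```python
def delta_percent(main, feature):
    """Relative change of feature vs. main throughput, in percent (one decimal)."""
    return round((feature / main - 1.0) * 100.0, 1)


# Median throughputs (mol/s) copied from the comparison above: (main, feature).
medians = {
    "Python write add_arrays_batch": (648_502, 625_127),
    "Python write add_molecules": (377_789, 344_842),
    "Python batch get_molecules": (1_117_152, 1_150_652),
    "Rust shuffled batch read": (1_503_248, 1_569_769),
}

for path, (main, feature) in medians.items():
    print(f"{path}: {delta_percent(main, feature):+.1f}%")
```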

Ramlaoui force-pushed the feat/tensor-property-batch-flat branch from 3ce31b1 to 57b0454 on June 5, 2026 10:31

Ramlaoui (Collaborator, Author) commented Jun 5, 2026

Update after the add_molecules performance diagnosis in the core PR:

The earlier Python object-write dip came from clone overhead in the Python bindings, not from the batch/flat tensor concatenation path. Core PR #36 now includes 792b1bb, which passes borrowed references to the owned molecules into Rust instead of cloning every PyMolecule during Database.add_molecules.

This branch was rebased onto the updated core branch and force-pushed. Post-rebase validation:

  • uv run --no-sync --extra dev --locked pytest tests/test_database.py -q -> 60 passed
  • ATOMPACK_PERF_COLOR=never make perf-smoke-py -> passed

Latest smoke numbers on this branch:

  • write add_arrays_batch: ~679,706 mol/s
  • write add_molecules: ~585,616 mol/s
  • batch get_molecules: ~1,142,901 mol/s
  • flat get_molecules_flat: ~4,942,872 mol/s

Batch and flat read paths remain comfortably above the smoke thresholds, and the object-write path no longer shows the previous regression shape.

Base automatically changed from feat/tensor-property-core to main on June 5, 2026 12:31
Ramlaoui force-pushed the feat/tensor-property-batch-flat branch from 57b0454 to 60d1c42 on June 5, 2026 12:33
Ramlaoui merged commit 40be9cb into main on Jun 5, 2026
4 of 5 checks passed
Ramlaoui deleted the feat/tensor-property-batch-flat branch on June 5, 2026 12:37