
Optimize cache reuse and enhance benchmarking scripts #7440

Open
Aunixt wants to merge 16 commits into deepmodeling:develop from mystic-qaq:feat/cache-reuse
Aunixt commented Jun 6, 2026

Reminder

  • Have you linked an issue with this pull request?
  • Have you added adequate unit tests and/or case tests for your pull request?
  • Have you noticed possible changes of behavior below or in the linked issue?
  • Have you explained the changes of codes in core modules of ESolver, HSolver, ElecState, Hamilt, Operator or Psi? (ignore if not applicable)

Linked Issue

Fix #...

Unit Tests and/or Case Tests for my changes

  • A unit test is added for each new feature or bug fix.

What's changed?

This pull request introduces a comprehensive caching system for the plane-wave basis (PW_Basis) and its k-point extension (PW_Basis_K). The main goals are to improve memory management, reduce redundant computation, and provide detailed cache statistics. The changes replace manual memory management with smart pointers, add thread-safe cache validation with hit/miss counters, and introduce new methods to inspect and reset cache usage. The most important changes are:

Caching and Memory Management Improvements

  • Replaced manual delete[]/new[] for cache arrays (gg, gdirect, gcar, ig2igg, gg_uniq, gk2) with std::unique_ptr-backed storage in both PW_Basis and PW_Basis_K, ensuring safer and more robust memory management. Public pointers now act as non-owning views.
  • Added thread-safe cache validation flags and mutexes to prevent redundant computation and ensure correctness in concurrent environments.
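The ownership and validation scheme described in these two bullets can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the class `BasisCacheSketch`, its members, and the fill logic are hypothetical, and only the general pattern (a `std::unique_ptr<double[]>` that owns the buffer, a public raw pointer that acts as a non-owning view, and an atomic flag checked before and after taking a mutex) follows the description.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a basis class; names are illustrative only.
class BasisCacheSketch
{
  public:
    double* gg = nullptr; // non-owning view into gg_storage_

    // Fill the buffer on first use; later calls return early while the cache is valid.
    void collect(std::size_t n)
    {
        assert(n > 0);
        if (valid_.load(std::memory_order_acquire)) { return; } // fast path, no lock
        std::lock_guard<std::mutex> lock(mutex_);
        if (valid_.load(std::memory_order_relaxed)) { return; } // filled while we waited
        gg_storage_.reset(new double[n]);
        for (std::size_t i = 0; i < n; ++i)
        {
            gg_storage_[i] = static_cast<double>(i * i); // placeholder for the real computation
        }
        gg = gg_storage_.get();
        fills_.fetch_add(1, std::memory_order_relaxed);
        valid_.store(true, std::memory_order_release);
    }

    // Drop the buffer and clear the view so stale pointers are not reused.
    void invalidate()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        valid_.store(false, std::memory_order_release);
        gg = nullptr;
        gg_storage_.reset();
    }

    int fills() const { return fills_.load(std::memory_order_relaxed); }

  private:
    std::atomic<bool> valid_{false};
    std::atomic<int> fills_{0};
    std::mutex mutex_;
    std::unique_ptr<double[]> gg_storage_;
};
```

In this sketch, calling `collect(4)` twice computes the buffer only once, and `invalidate()` resets both the owned storage and the public view. A caller that still holds the raw pointer across an `invalidate()` call would be left with a dangling pointer, so the sketch assumes invalidation does not run concurrently with readers.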

Cache Statistics and API Enhancements

  • Introduced cache hit/miss counters and new API methods (get_cache_stats, reset_cache_stats) in both PW_Basis and PW_Basis_K for monitoring cache effectiveness and memory usage.
  • Added explicit cache invalidation on device/precision changes and during key setup operations to ensure cache coherence.

Performance and Code Quality Improvements

  • Optimized setupIndGk in PW_Basis_K to avoid redundant G+K cutoff scans by reusing selected indices, reducing computational overhead.
  • Replaced manual heap allocation for temporary arrays in collect_uniqgg with std::vector for exception safety and clarity.
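The idea of reusing selected indices instead of scanning the cutoff twice can be sketched as below. This is a simplified, hypothetical one-dimensional version: `select_gk_indices` and `max_selected` are illustrative names, G and k are scalars rather than 3D vectors, and the cutoff test is reduced to `(G + k)^2 <= ecut`. It is not the actual `setupIndGk` code. The point is that a single pass stores the accepted indices in a `std::vector`, so the per-k count and the index map both come from the same scan.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D simplification: one scan per k-point that records which
// G indices satisfy (G + k)^2 <= ecut, so no second counting scan is needed.
std::vector<std::vector<int>> select_gk_indices(const std::vector<double>& g,
                                                const std::vector<double>& kpts,
                                                double ecut)
{
    assert(ecut >= 0.0);
    std::vector<std::vector<int>> selected(kpts.size());
    for (std::size_t ik = 0; ik < kpts.size(); ++ik)
    {
        selected[ik].reserve(g.size());
        for (std::size_t ig = 0; ig < g.size(); ++ig)
        {
            const double gk = g[ig] + kpts[ik];
            if (gk * gk <= ecut)
            {
                selected[ik].push_back(static_cast<int>(ig));
            }
        }
    }
    return selected;
}

// The per-k counts and their maximum follow from the stored indices.
std::size_t max_selected(const std::vector<std::vector<int>>& selected)
{
    std::size_t result = 0;
    for (const std::vector<int>& idx : selected)
    {
        result = std::max(result, idx.size());
    }
    return result;
}
```

Because the temporaries are `std::vector` objects, they are released automatically if an exception is thrown partway through, which is the same motivation given for the `collect_uniqgg` change.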

These changes make the codebase safer, more efficient, and easier to maintain, while providing better visibility into cache performance.

Any changes of core modules? (ignore if not applicable)

  • Example: I have added a new virtual function in the esolver base class in order to ...

Copilot AI review requested due to automatic review settings June 6, 2026 04:55

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR introduces host/device caching for plane-wave basis data (both Γ-only and k-dependent variants), adds cache statistics APIs, and updates tests to validate cache hits/misses and invalidation behavior.

Changes:

  • Added cache buffers, validity flags, mutex/atomics, and stats reporting in PW_Basis and PW_Basis_K.
  • Added cache invalidation hooks across initialization paths (initmpi, initgrids, initparameters, etc.).
  • Updated unit tests to assert caching correctness and fixed an invalid base-class call pattern.
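The invalidation hooks mentioned in the second bullet might look roughly like the following sketch. The classes `BaseSketch` and `KSketch` and the setter `set_cutoff` are hypothetical stand-ins (the real initialization functions take different arguments). The sketch only illustrates the pattern of calling a virtual `invalidate_cache()` whenever a parameter changes, with the derived class chaining to the base class so that both layers of cache are marked stale.

```cpp
#include <cassert>

// Hypothetical base class holding one cache-valid flag.
class BaseSketch
{
  public:
    virtual ~BaseSketch() = default;

    // Stand-in for an initialization path: changing a parameter invalidates caches.
    void set_cutoff(double ecut)
    {
        assert(ecut > 0.0);
        ecut_ = ecut;
        this->invalidate_cache(); // virtual dispatch reaches the derived override
    }

    virtual void invalidate_cache() { base_valid = false; }

    double cutoff() const { return ecut_; }

    bool base_valid = false;

  private:
    double ecut_ = 0.0;
};

// Hypothetical k-point extension with its own cache-valid flag.
class KSketch : public BaseSketch
{
  public:
    void invalidate_cache() override
    {
        BaseSketch::invalidate_cache(); // clear the base caches first
        k_valid = false;                // then the k-dependent ones
    }

    bool k_valid = false;
};
```

One caveat with this design is that virtual dispatch does not reach the derived override when the hook is called from a base-class constructor or destructor, so the pattern assumes the hooks run only on fully constructed objects.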

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| source/source_basis/module_pw/test_serial/pw_basis_k_test.cpp | Extends tests to cover PW_Basis_K cache behavior and invalidation. |
| source/source_basis/module_pw/test/test1-1-1.cpp | Adds cache stats assertions for PW_Basis (collect_local_pw, collect_uniqgg) and invalidation checks. |
| source/source_basis/module_pw/test/test-big.cpp | Fixes base setuptransform() invocation now that PW_Basis is non-copyable. |
| source/source_basis/module_pw/pw_init.cpp | Invalidates caches when MPI/grid/parameter/fullpw settings change. |
| source/source_basis/module_pw/pw_gatherscatter.h | Minor whitespace-only adjustments. |
| source/source_basis/module_pw/pw_distributeg.cpp | Invalidates caches when distribution maps are rebuilt. |
| source/source_basis/module_pw/pw_basis_k.h | Adds k-cache stats and k-cache invalidation/storage members. |
| source/source_basis/module_pw/pw_basis_k.cpp | Implements k-cache storage + stats, avoids duplicate cutoff scans, and adds device sync helpers. |
| source/source_basis/module_pw/pw_basis.h | Adds cache stats API, non-copyable semantics, and cache storage/locks. |
| source/source_basis/module_pw/pw_basis.cpp | Implements cached collect_local_pw/collect_uniqgg, cache stats, and owned-cache cleanup. |


Comment on lines +164 to +174
virtual void invalidate_cache()
{
    this->local_pw_cache_valid.store(false);
    this->uniqgg_cache_valid.store(false);
    this->gg = nullptr;
    this->gdirect = nullptr;
    this->gcar = nullptr;
    this->ig2igg = nullptr;
    this->gg_uniq = nullptr;
    this->ig_gge0 = -1;
}
Comment on lines +176 to +186
void clear_owned_cache();

// Public gg/gcar/gdirect pointers are non-owning views of these cache buffers.
std::atomic<bool> local_pw_cache_valid{false};
std::atomic<bool> uniqgg_cache_valid{false};
mutable std::mutex cache_mutex;
std::unique_ptr<double[]> gg_cache_storage;
std::unique_ptr<ModuleBase::Vector3<double>[]> gdirect_cache_storage;
std::unique_ptr<ModuleBase::Vector3<double>[]> gcar_cache_storage;
std::unique_ptr<int[]> ig2igg_cache_storage;
std::unique_ptr<double[]> gg_uniq_cache_storage;
Comment on lines +114 to +120
void invalidate_cache() override
{
    PW_Basis::invalidate_cache();
    this->gcar_cache_valid.store(false);
    this->gk_cache_valid.store(false);
    this->gk2 = nullptr;
}
Comment on lines 13 to +16
this->poolnproc = poolnproc_in;
this->poolrank = poolrank_in;
this->pool_world = pool_world_in;
this->invalidate_cache();
Comment on lines +202 to +215
    EXPECT_EQ(stats_after_hits.gcar_hits, 2);
    EXPECT_EQ(stats_after_hits.gcar_misses, 1);
    EXPECT_EQ(stats_after_hits.gk2_hits, 1);
    EXPECT_EQ(stats_after_hits.gk2_misses, 2);
    EXPECT_GT(stats_after_hits.cache_bytes, 0);
    basis_k.initparameters(gamma_only_in, gk_ecut_in, nks_in, kvec_d_in, distribution_type_in, xprime_in);
    EXPECT_EQ(basis_k.gcar, nullptr);
    EXPECT_EQ(basis_k.gk2, nullptr);
    EXPECT_EQ(basis_k.get_gcar_data<double>(), nullptr);
    EXPECT_EQ(basis_k.get_gk2_data<double>(), nullptr);
    EXPECT_EQ(basis_k.get_k_cache_stats().cache_bytes, 0);
    EXPECT_EQ(basis_k.npw, 3695);
    EXPECT_EQ(basis_k.npwk_max, 2721);
}