Fix: relax the parameter constraints of device DSP to allow bndpar+kpar #370

Closed
Cstandardlib wants to merge 462 commits into abacusmodeling:develop from deepmodeling:dsp/loose

Conversation

@Cstandardlib
Contributor

Linked Issue

After #7357, large-scale calculations with KPAR/BNDPAR on multiple DSP nodes are possible.

It is therefore no longer necessary to restrict NPROC to KPAR.

dyzheng and others added 30 commits January 15, 2026 21:46
Co-authored-by: root <root@LAPTOP-C2B3O75T.localdomain>
…6857)

* Fix: do not overwrite k-point weights for non-MP k-point lists

* Docs: update k-point weight documentation for symmetry handling
Co-authored-by: Fei Yang <2501213217@stu.pku.edu.cn>
* Add GPU tests for dav_subspace

* Correct readme for pw gpu test

* Update CASES_GPU test PW_DS_GPU
* Increase md_nstep from 3 to 4

* Increase md_nstep from 3 to 4 in INPUT file
* Feature: Support ML EXX for training script.

* Update the interface to libnpy

* Refactor: Update the interface of libnpy in ml_tools

* Refactor: Implement the class ML_Base, which is the base class of KEDF_ML

* Feature: Add support to ML_EXX for KSDFT and OFDFT

* Fix: Update hamilt_pw.cpp

* Update ml_base.h and ml_base.cpp

* Fix: Modify pot_ml_exx.cpp to avoid negative value of rho

* Divide ml_base.cpp to ml_base.cpp and ml_base_pot.cpp

* Fix: Update pot_ml_exx.cpp
* Refactor: save memory for kinetic and overlap force and stress

* Test: add UT for ekinetic_new and overlap_new

* Fix: error of force and stress after refactor

* Fix: UT for ekinetic and overlap

* Fix: gamma_only error of force_stress of edm

* Refactor: unify force/stress calculation for overlap and ekinetic operators

* Fix: overlap force stress error for nspin=2

* split test to serial part and parallel part

---------

Co-authored-by: dyzheng <zhengdy@bjaisi.com>
…les (#6878)

* update the examples of 02_NAO_Gamma

* update

* update

* update

* update tests in 02_NAO_Gamma

* small updates of write_HS.hpp

* update the format of H(k) and S(k)

* update write_HS.hpp

* update

* update the number of md steps to match the input parameter; md steps now start from 1 instead of 0

* update 02_NAO_Gamma examples

* add examples 002 and 003 in 02_NAO_Gamma

* update examples 41 and 42

* updates of 43 and 57 examples

* update example 17 in 03_NAO_multik

* update 44 example of 03_NAO_multik

* update 092 in 01_PW

* update 01_PW examples

* update 04_FF examples

* update 05_rtTDDFT examples

* update 06_SDFT examples

* update 07_OFDFT examples

* update 15_rtTDDFT_GPU examples

* update 16 and 17 examples in 15_rtTDDFT_GPU

* update 02

* fix bug

* fix bug

* update

* update 16_SDFT_GPU

* update

* update 02 data

* update 005 example in 02_NAO_Gamma

* add 006 in 02

* update CASES_CPU.txt

* fix a bug in 08_EXX 06

* fix bugs

* update alllog test

* fix a bug: when reading the orbital files fails, the code should not quit immediately

* update of some formats

* fix a small bug

* update examples in 03_NAO_multik

* update

* update 35 example for pchg

* update dipole output in rt examples

* update 01 example in rt-TDDFT

* update rt-TDDFT input files

* update some INPUT files in rt-TDDFT

* Fix: Add missing return true in read_orb_file function to prevent double free error

* fix unittests

* update CASES_CPU.txt in 03_NAO_multik

* Modify output filename from INPUT to INPUT.info in driver.cpp

* update catch_properties

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
Replace token-based authentication with OIDC (OpenID Connect) for codecov-action.
This is more secure and eliminates the need to manage upload tokens.

Changes:
- Add use_oidc: true to codecov-action configuration
- Add id-token: write permission at workflow level
- Remove token parameter from codecov-action (ignored when using OIDC)

This improves security and follows codecov-action best practices.

Generated by the task: njzjz-bot/njzjz-bot#25.
…esolver (#6892)

* Refactor: Encapsulate timer functionality in timer_wrapper.h

* Refactor timer code and clean_esolver function

1. Remove #ifdef __MPI from timer code, encapsulate in timer_wrapper.h
2. Move ESolver clean logic to after_all_runners method
3. Replace clean_esolver calls with direct delete p_esolver
4. Remove #ifdef __MPI from delete p_esolver
5. Add Cblacs_exit(1) in after_all_runners for LCAO calculations

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
* Refactor: Encapsulate timer functionality in timer_wrapper.h

* Refactor timer code and clean_esolver function

1. Remove #ifdef __MPI from timer code, encapsulate in timer_wrapper.h
2. Move ESolver clean logic to after_all_runners method
3. Replace clean_esolver calls with direct delete p_esolver
4. Remove #ifdef __MPI from delete p_esolver
5. Add Cblacs_exit(1) in after_all_runners for LCAO calculations

* Refactor: Move heterogeneous parallel code to source_base/module_device

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
* Feature: add Hessian operator <\phi|\nabla_x\nabla_y|\phi>

* fix: UT of twocenterintegral

---------

Co-authored-by: dyzheng <zhengdy@bjaisi.com>
* Refactor: Encapsulate timer functionality in timer_wrapper.h

* Refactor timer code and clean_esolver function

1. Remove #ifdef __MPI from timer code, encapsulate in timer_wrapper.h
2. Move ESolver clean logic to after_all_runners method
3. Replace clean_esolver calls with direct delete p_esolver
4. Remove #ifdef __MPI from delete p_esolver
5. Add Cblacs_exit(1) in after_all_runners for LCAO calculations

* Refactor: Move heterogeneous parallel code to source_base/module_device

* Refactor heterogeneous parallel code and migrate exx_info to module_xc

1. Refactor global.h:
   - Removed heterogeneous parallel code (CUDA/ROCm error checking macros)
   - Added include for source_base/module_device/device_check.h
   - Removed GlobalC::exx_info declaration

2. Migrate exx_info:
   - Added GlobalC::exx_info declaration to exx_info.h
   - Created exx_info.cpp with GlobalC::exx_info definition
   - Removed exx_info definition from global.cpp
   - Removed duplicate exx_info definition from exx_helper.cpp

3. Update build system:
   - Added exx_info.cpp to xc_ library in CMakeLists.txt
   - Added exx_info.o to OBJS_XC in Makefile.Objects
   - Fixed formatting in Makefile.Objects

4. Ensure compatibility:
   - Verify pure PW compilation works with exx_info.cpp
   - Verify GPU compilation works with refactored code

This refactoring improves code modularity by separating heterogeneous parallel functionality from global variables and moving EXX-related global variables to their own module.

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
…kage (#6898)

* Refactor: Encapsulate timer functionality in timer_wrapper.h

* Refactor timer code and clean_esolver function

1. Remove #ifdef __MPI from timer code, encapsulate in timer_wrapper.h
2. Move ESolver clean logic to after_all_runners method
3. Replace clean_esolver calls with direct delete p_esolver
4. Remove #ifdef __MPI from delete p_esolver
5. Add Cblacs_exit(1) in after_all_runners for LCAO calculations

* Refactor: Move heterogeneous parallel code to source_base/module_device

* Refactor heterogeneous parallel code and migrate exx_info to module_xc

1. Refactor global.h:
   - Removed heterogeneous parallel code (CUDA/ROCm error checking macros)
   - Added include for source_base/module_device/device_check.h
   - Removed GlobalC::exx_info declaration

2. Migrate exx_info:
   - Added GlobalC::exx_info declaration to exx_info.h
   - Created exx_info.cpp with GlobalC::exx_info definition
   - Removed exx_info definition from global.cpp
   - Removed duplicate exx_info definition from exx_helper.cpp

3. Update build system:
   - Added exx_info.cpp to xc_ library in CMakeLists.txt
   - Added exx_info.o to OBJS_XC in Makefile.Objects
   - Fixed formatting in Makefile.Objects

4. Ensure compatibility:
   - Verify pure PW compilation works with exx_info.cpp
   - Verify GPU compilation works with refactored code

This refactoring improves code modularity by separating heterogeneous parallel functionality from global variables and moving EXX-related global variables to their own module.

* Move GlobalC::restart to source_io/restart

1. Move GlobalC::restart declaration from global.h to restart.h
2. Move GlobalC::restart definition from global.cpp to restart.cpp
3. Keep the same functionality and usage
4. Improve code modularity by centralizing restart-related code in source_io module
5. Ensure compatibility with both pure PW and GPU compilation modes

* Remove unnecessary global.h includes and fix line_search.cpp compilation error

* update global.h

* update global.h

* update global.h

* update global.h

* update stress_pw.cpp

* update global.h

* update global.h in module_pwdft

* update global.h

* update module_stodft

* delete global.h in source_io

* fix source_io

* delete inclusion of global.h in source_io

* Refactor: Remove unnecessary includes and clean up global.h references

* delete global.h in source_lcao

* update

* update

* fix

* update

* fix

* update source_cell

* update source_esolver

* update esolver

* update

* update module_charge

* update module_pot

* continue

* update fix

* fix

* update dftu

* update deepks

* fix

* delete globalc.h in module_ri

* fix

* fix

* fix dftu_io

* fix diago_lapack.cpp

* updates

* update rdmft

* update module_rt

* update td operator

* update module_pwdft/operator etc

* solve fft

* update xc

* update

* delete global.h and global.cpp, finally after nearly 20 years

* fix op_exx_lcao

* fix

* fix

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
The gint_gpu_vars.h file already exists in the kernel directory.
This temp_gint directory was left over from a previous refactoring.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* Delete source/ctrl_output_td.h

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* Update td_info.cpp

* Update td_current_io_comm.cpp

---------

Co-authored-by: Mohan Chen <mohanchen@pku.edu.cn>
* Fix: Add override to Pot_ML_EXX::cal_v_eff to avoid compilation warning.

* Fix: Provide a clearer, friendlier error when ML KEDF is used without ENABLE_MLALGO.

* Fix: Add validation for out_elf and spin=4 combo.
* Refactor: Encapsulate timer functionality in timer_wrapper.h

* Refactor timer code and clean_esolver function

1. Remove #ifdef __MPI from timer code, encapsulate in timer_wrapper.h
2. Move ESolver clean logic to after_all_runners method
3. Replace clean_esolver calls with direct delete p_esolver
4. Remove #ifdef __MPI from delete p_esolver
5. Add Cblacs_exit(1) in after_all_runners for LCAO calculations

* Refactor spar_exx.h: add English comments and improve dependency structure

- Added detailed English comments to cal_HR_exx function
- Moved implementation to cpp file and added explicit instantiations
- Improved header file organization with sections
- Removed unnecessary LCAO_hamilt.hpp include
- Enhanced endif comments for better code readability

* Remove empty LCAO_hamilt.hpp file

The LCAO_hamilt.hpp file was empty after moving its implementation to spar_exx.cpp.
This commit removes the unused header file and updates all references to it.

* Fix circular dependency between exx_info.h and xc_functional.h

- Removed #include xc_functional.h from exx_info.h
- Removed #include exx_info.h from xc_functional.h
This breaks the circular dependency between these two header files, allowing them to compile independently.

* Fix dependencies in LCAO sparse format headers

- Removed unnecessary #include source_lcao/hamilt_lcao.h from spar_dh.h, spar_hsr.h, and spar_u.h
- Added direct dependencies to spar_dh.h: matrix.h, parallel_orbitals.h, two_center_bundle.h, ORB_read.h
- Adjusted include order in spar_hsr.h and spar_u.h
- Added necessary include to spar_hsr.cpp for HamiltLCAO access

* Add necessary includes to cpp files for compilation

- Added xc_functional.h include to esolver_ks_pw.cpp for XC_Functional class access
- Added xc_functional.h include to input_conv.cpp for XC_Functional class access
- Added parallel_comm.h include to op_exx_pw.cpp for KP_WORLD communication
- Added global_variable.h and exx_info.h includes to stress_pw.cpp for GlobalC namespace access
These changes fix compilation errors caused by the dependency refactoring.

* add #include <RI/global/Tensor.h> in spar_hsr.h

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
Co-authored-by: Xiaoyang Zhang <tsfxwbbzxy@163.com>
Critsium-xy and others added 29 commits May 15, 2026 06:28
Force UTF-8 at the top of install-abacus.bat, uninstall-abacus.bat, and
the generated abacus.cmd / abacus-mpi.cmd launchers via `chcp 65001` and
`set WSL_UTF8=1`. Without these, wsl.exe emits UTF-16LE when its stdout
is piped, so the `for /f` that captures `wslpath` output mangles any
CJK characters in the repository or case directory path, and the
launchers' `wsl --cd "%CD%"` fails from a Chinese-named directory.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* change a long name 'transfer_dm_2d_to_gint' to 'transfer_dm2d'

* print out the 'real' number of k points to screen

* convert output START CHARGE to upper case

* update memory warning information

* update outputs

* fix

* fix

* fix memory

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
* move memory.h to record_memory.h

* fix

* fix

* refactor: rename record_memory to memory_recorder

This commit renames the memory recording module from record_memory to memory_recorder.

**File Renames:**
- source/source_base/record_memory.h → memory_recorder.h
- source/source_base/record_memory.cpp → memory_recorder.cpp

**Header File Updates (35 files):**
Updated all #include references from "source_base/record_memory.h" to "source_base/memory_recorder.h" in:
- source_pw: sto_wf.cpp, sto_elecond.cpp, vnl_pw.cpp, structure_factor.cpp
- source_lcao: biggrid_info.cpp, gint_info.cpp, dftu.cpp, hcontainer.cpp, dspin_lcao.cpp,
               hamilt_lcao.cpp, FORCE_gamma.cpp, FORCE_k.cpp, edm.cpp, klist_1.cpp
- source_hsolver: hsolver_lcao.cpp, parallel_k2d.cpp, diago_cg.cpp
- source_io: cal_test.cpp, write_wfc_nao.cpp
- source_estate: density_matrix.cpp, density_matrix_io.cpp, potential_new.cpp, charge.cpp
- source_esolver: esolver_sdft_pw.cpp, esolver_of_tool.cpp
- source_basis: pw_basis_k.cpp, two_center_bundle.cpp, ORB_gaunt_table.cpp
- source_base: memory_recorder.cpp, memory_test.cpp, sph_bessel_recursive-d2.cpp,
               memory_op.cpp, memory_op.cu
- source_psi: psi_prepare.cpp
- source_main: driver.cpp

**CMakeLists.txt Updates (11 files):**
Updated all source file references from record_memory.cpp to memory_recorder.cpp in:
- source_relax/test/CMakeLists.txt
- source_pw/module_pwdft/test/CMakeLists.txt
- source_md/test/CMakeLists.txt
- source_io/test/CMakeLists.txt
- source_hamilt/module_xc/test/CMakeLists.txt
- source_cell/test/CMakeLists.txt
- source_basis/module_pw/kernels/test/CMakeLists.txt
- source_basis/module_pw/test/CMakeLists.txt
- source_basis/module_ao/test/CMakeLists.txt
- source_base/test/CMakeLists.txt
- source_base/test_parallel/CMakeLists.txt
- source_base/CMakeLists.txt

**Makefile Updates (1 file):**
- source_lcao/module_deepks/test/Makefile.Objects: Updated record_memory.o to memory_recorder.o

This change improves the naming consistency and clarity of the memory recording module.

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
* Refactor: split RI_2D_Comm::split_m2D_ktoR()

* Feature: add OpenMP in RI_2D_Comm::split_m2D_ktoR()

* Feature: update OpenMP in RI_2D_Comm::split_m2D_ktoR_gamma()

* Feature: update OpenMP in RI_2D_Comm::split_m2D_ktoR_k()

---------

Co-authored-by: linpz <linpz@mail.ustc.edu.cn>
* change the step output of electron evolution so that it starts from 1

* update dipole outputs

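* Improve the dipole_io module: optimize performance and code structure

1. Enhance the write_dipole function:
   - Add an ofs_running parameter to support a custom output stream
   - Remove unused header includes (charge.h, evolve_elec.h)
   - Unify the electronic dipole moment algorithm and remove the ifdef MPI branch
   - Precompute the inverses of the grid dimensions to improve performance
   - Add a division-by-zero guard
   - Add OpenMP parallelization, accumulating with a reduction
   - Extract the shared printing function printDipoleMoment

2. Improve the prepare function:
   - Use a switch statement instead of an if-else chain
   - Remove intermediate variables and return the result directly

3. Update dipole_io.h:
   - Add the UnitCell header include
   - Reorder the parameters, placing ofs_running first

4. Add detailed comments:
   - Function documentation comments
   - Explanations of the physical formulas
   - Code implementation details

5. Improve code style:
   - Use a lowercase constant name (small_value)
   - Tighten variable scope
   - Improve error handling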

* delete useless #ifdef __LCAO

* Refactor: migrate write_dipole from ctrl_output_td to ctrl_output_fp

This refactoring extends the dipole output functionality to all DFT solvers
by moving it from the TDDFT-specific module to the common output module.

Changes:
1. source/source_io/module_ctrl/ctrl_output_fp.cpp
   - Add include for dipole_io.h
   - Add dipole output functionality after out_xc_r section
   - Now supports out_dipole parameter for all FP-based solvers

2. source/source_io/module_ctrl/ctrl_output_td.cpp
   - Remove duplicate dipole output code (migrated to ctrl_output_fp)
   - Remove unnecessary dipole_io.h include
   - Renumber comments (1) for current, (3) for restart

3. source/source_io/module_dipole/write_dipole.cpp
   - Rename printDipoleMoment to print_dipole_moment (snake_case convention)

Benefits:
- OFDFT, KSDFT (PW/LCAO), SDFT, and TDDFT all gain dipole output capability
- Single implementation point reduces maintenance burden
- Consistent behavior across all solver types
- Better code organization following the base class aggregation pattern

* Cleanup: remove unused 'is' parameter from write_dipole

The 'is' (spin channel index) parameter was not used in the write_dipole function:
- Electron dipole moment uses rho_save (already selected by spin)
- Ionic dipole moment uses ucell (spin-independent)
- File naming is handled by the caller through 'fn' parameter

Changes:
1. dipole_io.h - Remove 'is' parameter from function declaration
2. write_dipole.cpp - Remove 'is' parameter from function definition
3. ctrl_output_fp.cpp - Update function call to remove 'is' argument

* update index.rst

* add back #ifdef __LCAO

* add output_dipole.md

* add INPUT

* fix

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
* fix data race

* fix potential data race

* fix data race in pw

* fix potential data race

* update format

* fix data race in fft

* fix

* Fix data race in evolve_ofdft.cpp for TDOFDFT calculation. Cache shared member variables (nspin, nrxx, npw, gg, tpiba, tpiba2, gcar) to local const variables before OpenMP parallel regions to eliminate ThreadSanitizer false positive warnings.

* Fix OpenMP barrier placement in pw_gatherscatter.h. The #pragma omp barrier directive must be inside an explicit #pragma omp parallel region, not after #pragma omp for which ends the implicit parallel region. This fixes undefined behavior when compiled with OpenMP enabled.

* Remove obsolete OMP barrier comments from pw_gatherscatter.h

* Rename local variables with underscore suffix in pw_gatherscatter.h to distinguish from member variables (nst -> nst_, nz -> nz_, etc.)

* Rename local variables with underscore suffix in pw_transform.cpp to distinguish from member variables (nrxx -> nrxx_, npw -> npw_, nxyz -> nxyz_, nst -> nst_, nz -> nz_, nx -> nx_, ny -> ny_, nplane -> nplane_, ig2isz -> ig2isz_)

* Rename local variables with underscore suffix in fft_cpu.cpp to distinguish from member variables (npy -> npy_, nx -> nx_, lixy -> lixy_, rixy -> rixy_, nplane -> nplane_, and FFTW plan objects)

* Remove redundant #pragma omp barrier directives in pw_gatherscatter.h. The #pragma omp for directive already performs implicit synchronization at the end of the loop, making explicit barrier redundant.

* add WARNING_QUIT

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
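The evolve_ofdft.cpp fix in the commit above copies shared member variables into local const variables before the OpenMP parallel region. A minimal sketch of that pattern, assuming a hypothetical `Grid` struct and function name (neither is taken from the ABACUS sources):

```cpp
#include <vector>

// Hypothetical container standing in for a class with shared members.
struct Grid
{
    int nrxx;               // number of local grid points
    double tpiba2;          // scaling factor
    std::vector<double> gg; // per-point values, size >= nrxx
};

// Scale gg by tpiba2. The members are cached in local const variables
// before the parallel loop, so the threads only read thread-local copies
// and a raw pointer instead of going through the shared object.
inline std::vector<double> scaled_gg(const Grid& g)
{
    const int nrxx = g.nrxx;
    const double tpiba2 = g.tpiba2;
    const double* const gg = g.gg.data();

    std::vector<double> out(nrxx);
#pragma omp parallel for
    for (int i = 0; i < nrxx; ++i)
    {
        out[i] = gg[i] * tpiba2; // each iteration writes a distinct element
    }
    return out;
}
```

Without OpenMP enabled the pragma is ignored and the loop runs serially with the same result.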
…ount) (#7357)

dspInitHandle uses MY_RANK % dsp_count but dspDestoryHandle used raw MY_RANK, causing heap corruption when MY_RANK >= dsp_count. Fixes issue #7269.
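The mismatch described here can be illustrated with a small sketch. It assumes a hypothetical `device_index_for_rank` helper and a toy `HandleTable`; neither is the actual ABACUS DSP interface. The point is only that creation and destruction must derive the device index from the rank in the same way.

```cpp
#include <vector>

// Hypothetical helper: map an MPI rank onto one of dsp_count devices.
// Assumes dsp_count > 0 and my_rank >= 0.
inline int device_index_for_rank(int my_rank, int dsp_count)
{
    return my_rank % dsp_count;
}

// Toy stand-in for a per-device handle table.
struct HandleTable
{
    explicit HandleTable(int dsp_count) : count(dsp_count), open(dsp_count, 0) {}

    // Both paths use the same rank-to-device mapping. Indexing open[] with
    // the raw rank in destroy() would run past the end of the table
    // whenever my_rank >= count.
    void init(int my_rank) { ++open[device_index_for_rank(my_rank, count)]; }
    void destroy(int my_rank) { --open[device_index_for_rank(my_rank, count)]; }

    int count;
    std::vector<int> open; // number of live handles per device
};
```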
…md > 1` evolution strategy (#7360)

* Remove unnecessary cout in TDDFT current file

* Fix RT-TDDFT EXX bug when using estep_per_md

* Modify cout format

* Fix a compiling issue with respect to std::vector

* Update test 08_EXX/14_NO_TDDFT_PBE0
…guard (#7361)

* refactor(device): remove dead code from DeviceContext, add dsp_count guard

Remove unused device_type subsystem from DeviceContext:

- Delete set_device_type(), get_device_type(), is_cpu(), is_gpu(), is_dsp() methods (all zero callers verified via exhaustive search)
- Delete is_initialized(), is_gpu_enabled() (zero callers)
- Delete device_type_ private field (only consumed by removed methods)
- Delete standalone get_device_type(const DeviceContext*) function (zero callers; all 48 call sites use the template version get_device_type(const Device*))
- Delete forward declaration in device_helpers.h

Add assert(PARAM.inp.dsp_count > 0) guard in driver.cpp to prevent
modulo-by-zero undefined behavior.

All other DeviceContext members retained (init(), get_device_id(),
get_device_count(), get_local_rank() — all have active callers).
Build verified with cmake --build (MPI+LCAO).

* fix(dsp): replace assert with runtime WARNING_QUIT for dsp_count

assert() is removed in release builds (NDEBUG), leaving modulo-by-zero
unprotected. Replace with WARNING_QUIT that works in all builds.

Also remove now-unused #include <cassert> from the #ifdef __DSP block.

Addresses PR review feedback on #7361.
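A minimal sketch of why a runtime check is preferable to `assert` for this guard. The function names are hypothetical, and a thrown exception stands in for WARNING_QUIT, whose signature is not reproduced here:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical runtime guard. Unlike assert(), this check is compiled in
// regardless of whether NDEBUG is defined.
inline void require_positive_dsp_count(int dsp_count)
{
    if (dsp_count <= 0)
    {
        throw std::runtime_error("dsp_count must be > 0, got " + std::to_string(dsp_count));
    }
}

// Hypothetical device selection: the guard runs before the modulo, so a
// zero dsp_count is reported instead of causing undefined behavior.
inline int select_device(int my_rank, int dsp_count)
{
    require_positive_dsp_count(dsp_count);
    return my_rank % dsp_count;
}
```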
* Remove useless headers

* add type.h include

* Remove headers in test files
- set_phi_dphi_kernel: add WantPhi non-type template parameter and
  dispatch from the launch site. The dphi-only callers (gint_tau)
  pass phi=nullptr; with WantPhi==false the compiler drops the
  phi[] stores and the per-iw `phi != nullptr` branch entirely.
- phi_dot_dphi_kernel / phi_dot_dphi_r_kernel: replace the shared-
  memory tree reduce with a single-warp warpReduceSum and drop the
  dynamic shared-memory allocation at the launch sites. Launch
  configuration is pinned at blockDim.x == 32; a comment guards the
  invariant.
- Plain `if` (not `if constexpr`) on WantPhi keeps the code
  C++11-compliant — ABACUS targets C++11 and nvcc otherwise emits
  warning #2912-D. WantPhi is still a non-type template parameter,
  so the compiler folds the constant and eliminates the dead branch.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
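The WantPhi dispatch described in this commit can be sketched on the host side. The function names and the arithmetic below are illustrative assumptions, not the actual CUDA kernel:

```cpp
#include <cstddef>

// Hypothetical host-side analogue of a kernel templated on WantPhi.
template <bool WantPhi>
void fill_phi_dphi(const double* r, std::size_t n, double* phi, double* dphi)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        // Plain `if` on a compile-time constant is valid C++11; the
        // optimizer can drop the branch when WantPhi is false.
        if (WantPhi)
        {
            phi[i] = r[i] * r[i];
        }
        dphi[i] = 2.0 * r[i];
    }
}

// Dispatch once at the call site instead of testing phi in every iteration.
inline void fill(const double* r, std::size_t n, double* phi, double* dphi)
{
    if (phi != nullptr)
    {
        fill_phi_dphi<true>(r, n, phi, dphi);
    }
    else
    {
        fill_phi_dphi<false>(r, n, phi, dphi);
    }
}
```

Unlike `if constexpr`, a plain `if` requires both branches to compile in every instantiation, which holds here because the untaken branch only contains a pointer store.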
…ecision loss (#7368)

Across CPU and GPU gint paths, accumulator buffers (hr_gint, phi_dm, rho,
and the vbatched GEMM C output) are now always allocated as double, even
when the input phi/dm/vr_eff are fp32. Multiplies stay in fp32 (cheap),
but per-block and global reductions are widened to fp64 so that summing
many atom-pair contributions into the same element does not drift.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
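The idea of keeping multiplies in fp32 while widening the accumulator to fp64 can be sketched as follows. The function names are hypothetical and the code is a plain CPU loop, not the gint implementation:

```cpp
#include <cstddef>
#include <vector>

// Multiply in float, accumulate in double.
// Assumes a and b have the same length.
inline double dot_widened(const std::vector<float>& a, const std::vector<float>& b)
{
    double acc = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        const float prod = a[i] * b[i];   // fp32 multiply
        acc += static_cast<double>(prod); // fp64 accumulation
    }
    return acc;
}

// Same products, accumulated in float, for comparison.
inline float dot_fp32(const std::vector<float>& a, const std::vector<float>& b)
{
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        acc += a[i] * b[i];
    }
    return acc;
}
```

With `a = {1e8f, 1, 1, 1, 1}` and `b` all ones, `dot_widened` returns 100000004 exactly. On typical IEEE-754 hardware the float accumulator cannot represent the small increments next to 1e8, so `dot_fp32` is expected to lose them.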
* Remove parameter.h

* Continue remove parameter.h

* Remove parameter.h dependency in pw_basis

* Remove dependency in pw_basis_k
)

1. Add check_value callback to yukawa_potential to error out when
   both yukawa_potential and uramping are enabled simultaneously
2. Skip uramping_update() when Yukawa is enabled (U calculated directly
   from charge density every iteration)
3. Return true from u_converged() when Yukawa is enabled (U is
   self-consistently calculated, no ramping convergence needed)
* Remove parameter.h

* Continue remove parameter.h

* Remove parameter.h dependency in pw_basis

* Remove dependency in pw_basis_k

* refactor(source_basis): remove last parameter.h dependencies

Decouple module_ao and module_nao from source_io/parameter.h:

- ORB_atomic_lm / ORB_nonlocal_lm: replace PARAM.globalv.global_out_dir
  with ModuleBase::get_quit_out_dir() (new getter mirroring the existing
  set_quit_out_dir injection point).
- two_center_bundle: thread orbital_dir as a build_orb parameter; replace
  the two deepks_setorb guards with ndesc>0 / alpha_ non-null checks that
  are equivalent under the build_alpha invariant.

source_basis is now free of parameter.h.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(tool_quit): rename get_quit_out_dir to get_global_out_dir

Align the getter name with the original PARAM.globalv.global_out_dir it replaces.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(tests): add tests/17_DS_DFTU test suite for DFT+U with deep potential spin constraints

* feat(tests): add tests/17_DS_DFTU with CI-disabled configs and READMEs

Add the 17_DS_DFTU test suite for DeltaSpin and DFT+U functionality:
- 47 test cases covering LCAO/PW basis, collinear/noncollinear spin,
  DFT+U, DeltaSpin, and their combinations
- Comment out tests in tests/CMakeLists.txt and tests/17_DS_DFTU/CMakeLists.txt
  to prevent CI failure until DeltaSpin code is merged into develop
- Add single-line README to each test directory (printed during Autotest.sh)
- Rewrite CASES_CPU.txt with clear English comments explaining disabled tests
* Remove debug output in renormalize_psi

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Fix: CD potential now applied to correct spin channel instead of always spin 0

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Refactor: Replace raw new/delete with std::vector in cal_vw_potential_phi for automatic memory management

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Refactor: Replace raw new/delete with std::vector in cal_CD_potential for automatic memory management

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Refactor: Replace abs(x)*abs(x) with std::norm for clarity and consistency with RK2

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Refactor: Remove unused <iostream> include

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Refactor: Remove dead nspin <= 0 checks that can never trigger

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
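The `abs(x)*abs(x)` to `std::norm` replacement mentioned in the commit above can be illustrated with a short sketch. The function names are hypothetical; `std::norm` on a `std::complex` value returns the squared magnitude directly, without the intermediate square root that `std::abs` computes.

```cpp
#include <complex>

// Squared magnitude via two abs() calls (the pattern being replaced).
inline double density_from_abs(const std::complex<double>& psi)
{
    return std::abs(psi) * std::abs(psi);
}

// Squared magnitude via std::norm: re*re + im*im.
inline double density_from_norm(const std::complex<double>& psi)
{
    return std::norm(psi);
}
```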
… diagonalization (#7388)

* fix(sdft): add CT (Chebyshev Trace) iter_header for pure SDFT without diagonalization

Pure SDFT (nbands=0) does not perform KS diagonalization, yet the SCF
iteration table borrowed the ks_solver label (CG/DA/etc.). Add a "CT"
entry to iter_header_dict and use it when esolver_type=sdft with nbands=0.
Mixed SDFT (nbands>0) keeps the actual ks_solver label since it still
diagonalizes KS orbitals.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(sdft): add unit tests for SDFT iter_header CT label

Verify pure SDFT (nbands=0) outputs "CT" in ITER column, and mixed
SDFT (nbands>0) outputs the actual ks_solver label (e.g. "DA").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
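The label selection described in the SDFT commit above amounts to a small decision rule. A sketch under the assumption of a hypothetical free function (the real code uses iter_header_dict, whose layout is not shown here):

```cpp
#include <string>

// Hypothetical helper: choose the ITER column label.
// Pure SDFT (esolver_type == "sdft" with nbands == 0) performs no KS
// diagonalization and gets "CT"; every other case keeps the ks_solver label.
inline std::string iter_label(const std::string& esolver_type,
                              int nbands,
                              const std::string& ks_solver_label)
{
    if (esolver_type == "sdft" && nbands == 0)
    {
        return "CT";
    }
    return ks_solver_label;
}
```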
* update 18_md examples

* update out_chg 2

* update out_pot function

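* feat(module_io): improve initial charge density/potential output, supporting out_freq_ion control and dynamic file names

- Add the gen_ini_filename() helper to generate initial charge density/potential file names in one place
- With out_freq_ion=0, write a single file with a fixed name (without g#)
- With out_freq_ion>0, write a separate file for each geometry step (with g#)
- Update the documentation to explain the difference between the two modes

Modified files:
- docs/advanced/output_files/output-specification.md
- source/source_io/module_chgpot/write_init.cpp
- source/source_io/module_chgpot/write_init.h
- source/source_io/module_parameter/read_input_item_output.cpp

* fix(module_io): correct the initial charge density/potential output logic for out_freq_ion=0

- With out_freq_ion=0, output at every geometry step (overwriting the same file)
- With out_freq_ion>0, output only when istep is a multiple of out_freq_ion
- Update all related documentation and comments

Modified files:
- docs/advanced/output_files/output-specification.md
- source/source_io/module_chgpot/write_init.cpp
- source/source_io/module_chgpot/write_init.h
- source/source_io/module_parameter/read_input_item_output.cpp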

* fix a bug about out_pot

* fix bugs

* update

* update ELF and add openmp parallel

* update elf

* update elf

* update example reference data

* enable elf for rt-tddft, but results are wrong

* fix elf test

* fix elf test in 03_NAO_multik

* fix output of write_elf

* fix bug

* update potential file, fix bug

* fix elf test in ofdft

* fix: Move write_pot_init to ElecState::init_scf for correct timing

The write_pot_init was being called in ESolver_FP::before_scf before the
effective potential was computed. This caused pot_ini.cube to contain:
- All zeros for calculation=scf / first ionic step (istep=0)
- Converged potential from previous ionic step for relax/md with istep>0

The fix moves write_pot_init to ElecState::init_scf, which is called after
pot->init_pot(charge) computes the effective potential from the initial
charge density. This ensures pot_ini.cube correctly contains the effective
potential corresponding to the initial charge density.

Changes:
- Modified ElecState::init_scf signature to accept istep, out_dir, inp parameters
- Added write_pot_init call after pot->init_pot() in init_scf
- Updated pw::setup_pot to pass through the new parameters
- Updated all callers (LCAO and PW) to provide the new parameters
- Removed the premature write_pot_init call from ESolver_FP::before_scf

* Remove unused parameters from ElecState::init_scf

- Removed unused 'symm' and 'wfcpw' parameters from init_scf function
- Updated all call sites to match the new signature
- Simplified function interface by removing parameters not used in implementation

* Fix missing io_basic library link in elecstate tests

- Added io_basic library dependency to MODULE_ESTATE_elecstate_base test
- Added io_basic library dependency to MODULE_ESTATE_elecstate_pw test
- Fixes undefined reference to ModuleIO::write_pot_init

* update init_scf

* fix

* fix bug

* remove dependence of parameter for write_cube.cpp

* fix bugs

* fix bug

* add a new file init_scf

* update estate tests

* delete useless inclusion

---------

Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
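The out_freq_ion behavior described in the write_init commits above can be sketched as two small helpers. Both function names, the position of the g# tag, and the 1-based numbering are assumptions for illustration; the real gen_ini_filename() signature is not shown in this log.

```cpp
#include <string>

// Hypothetical: decide whether the initial charge density/potential is
// written at this geometry step. Assumes out_freq_ion >= 0.
inline bool should_write_initial(int istep, int out_freq_ion)
{
    if (out_freq_ion == 0)
    {
        return true; // every step, overwriting the same file
    }
    return istep % out_freq_ion == 0; // only on multiples of out_freq_ion
}

// Hypothetical: build the file name. With out_freq_ion == 0 the name is
// fixed; otherwise a geometry-step tag g# is appended (assumed 1-based).
inline std::string ini_filename(const std::string& stem, int istep, int out_freq_ion)
{
    if (out_freq_ion == 0)
    {
        return stem + ".cube";
    }
    return stem + "g" + std::to_string(istep + 1) + ".cube";
}
```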
* Fix: use correct PSI for GPU in cal_tau for ELF

* Refactor: split long cast into readable lines in after_scf
…#7418)

* Remove obsolete cross-device copy constructor in HamiltPW

* Delete corresponding .h code
