This repository contains the research and code for the development of a machine-learned Hamiltonian model designed to integrate directly into an iterative Self-Consistent Field (SCF) framework. The primary goal of this project is to move beyond static, "one-shot" ML models by creating a framework that can predict the Fock matrix as a function of the electronic density, enabling iterative refinement and deeper integration with established quantum chemical workflows.
Traditional machine-learned Hamiltonians predict the electronic structure of a molecule from its geometry alone. While powerful, these models are not directly compatible with the iterative nature of the SCF procedure, where the Hamiltonian and electronic density are refined in tandem until a consistent solution is found.
This project addresses that limitation by developing a model that is density-dependent. Our framework learns to predict the Fock matrix in a blockwise fashion using electronically-informed descriptors derived from the density and overlap matrices. This allows the model to act as a direct replacement for the Fock build step within an SCF cycle, opening the door to truly iterative and physically grounded ML-SCF methods.
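To make the idea concrete, here is a minimal sketch of a Roothaan-style SCF loop in which the conventional Fock build is replaced by a pluggable predictor `fock_predictor(D, S)`. All names and the closed-shell toy setup are illustrative assumptions, not the repository's actual implementation:

```python
import numpy as np

def scf_loop(h_core, S, n_occ, fock_predictor, max_iter=50, tol=1e-8):
    """Emulated SCF cycle where the Fock build step is a pluggable
    (e.g. machine-learned) map F = f(D, S). Closed-shell toy setup."""
    # Loewdin orthogonalizer X = S^{-1/2}
    s_vals, s_vecs = np.linalg.eigh(S)
    X = s_vecs @ np.diag(s_vals**-0.5) @ s_vecs.T
    D = np.zeros_like(S)
    for it in range(max_iter):
        F = fock_predictor(D, S)                 # ML model replaces the Fock build
        Fp = X.T @ F @ X                         # transform to orthonormal basis
        eps, Cp = np.linalg.eigh(Fp)
        C = X @ Cp
        D_new = 2.0 * C[:, :n_occ] @ C[:, :n_occ].T  # closed-shell density
        if np.linalg.norm(D_new - D) < tol:
            return F, D_new, it
        D = D_new
    return F, D, max_iter
```

With a density-independent predictor the loop converges immediately, which makes it a convenient harness for testing learned, density-dependent Fock maps.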
This research provides several foundational advances toward a fully iterative ML-SCF framework:
- **A Novel SCF-Compatible Dataset:** We developed the Point Charge Distorted (PCD) method, a data generation procedure that uses a QM/MM approach to create a diverse, physically consistent dataset of non-converged SCF states. This allows the model to be trained on the entire SCF solution landscape, not just the final converged ground state.
- **Electronically-Informed Descriptors:** We validate the use of blockwise density ($\mathbf{D}$) and overlap ($\mathbf{S}$) matrices as powerful, SCF-compatible descriptors for predicting the Fock matrix. This moves beyond static, geometry-only inputs and incorporates the very quantities that evolve during the SCF cycle.
- **Identification of Iterative Instability:** By embedding our high-accuracy, single-step models in an emulated SCF loop, we identified iterative instability as the core algorithmic challenge. We found that minor prediction errors, while negligible in a single step, can accumulate catastrophically over multiple iterations. This finding clarifies the primary obstacle to practical ML-SCF implementation and highlights the need for ML-specific convergence accelerators.
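The PCD idea can be caricatured in a few lines. In this sketch, random symmetric perturbations of the core Hamiltonian stand in for the QM/MM point charges, and a toy Fock build (`toy_fock`, an invented stand-in for the real Coulomb/exchange terms) generates $(\mathbf{D}, \mathbf{F})$ training pairs from non-converged iterations. All names and the orthonormal-basis assumption are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_fock(h, D):
    # Caricature of a mean-field Fock build: core term plus a small
    # density-dependent contribution (stands in for Coulomb/exchange).
    return h + 0.1 * D

def pcd_samples(h0, n_occ, n_distortions=5, n_scf_steps=3, strength=0.05):
    """Sketch of the Point Charge Distorted (PCD) idea: perturb the core
    Hamiltonian with random external potentials (playing the role of the
    QM/MM point charges) and record (D, F) pairs from non-converged SCF
    iterations, not just the converged ground state."""
    n = h0.shape[0]
    samples = []
    for _ in range(n_distortions):
        V = rng.normal(scale=strength, size=(n, n))
        h = h0 + 0.5 * (V + V.T)                  # symmetrized external potential
        D = np.zeros_like(h)
        for _ in range(n_scf_steps):
            F = toy_fock(h, D)                    # Fock built from current density
            samples.append((D.copy(), F.copy()))  # (descriptor, target) pair
            eps, C = np.linalg.eigh(F)            # orthonormal AO basis assumed
            D = 2.0 * C[:, :n_occ] @ C[:, :n_occ].T
    return samples
```

Each distortion contributes one short, deliberately unconverged SCF trajectory, so the dataset samples the solution landscape rather than only its fixed points.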
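The error-accumulation mechanism can be illustrated with a one-dimensional toy model (not the repository's analysis, just a geometric-series sketch). A per-step prediction error of size `eps` fed through an iterated linear update grows as a geometric sum, so it is amplified by $1/(1-\rho)$ when the iteration contracts slowly, and diverges when the map is even mildly expansive:

```python
def accumulated_error(rho, eps, n_iter):
    """Toy model of error accumulation in an iterated map:
        err_{k+1} = rho * err_k + eps
    Geometric series: err_n -> eps / (1 - rho) for rho < 1,
    and grows without bound for rho >= 1."""
    err = 0.0
    for _ in range(n_iter):
        err = rho * err + eps
    return err

eps = 1e-6
single_step = accumulated_error(0.0, eps, 1)         # just eps: negligible
slow = accumulated_error(0.99, eps, 2000)            # ~ eps / (1 - 0.99) = 1e-4
unstable = accumulated_error(1.05, eps, 200)         # geometric blow-up
```

This is why single-step accuracy alone does not guarantee a usable ML-SCF method: the spectral properties of the learned iteration matter as much as its pointwise error.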
Our model is built on a hierarchical, blockwise prediction scheme:
- **Onsite Blocks:** $F_{pq}^{A_i} = f_{pq}^{A}(\mathbf{D}^{A_i}, \ldots)$
- **Offsite Blocks:** $F_{pq}^{A_iB_j} = f_{pq}^{AB}(\mathbf{D}^{A_iB_j}, \mathbf{S}^{A_iB_j}, \ldots)$
Separate Multi-Layer Perceptron (MLP) ensembles are trained for each unique Fock matrix element class.
Please Note: This repository is currently a placeholder for the research project. The relevant code, trained models, and curated datasets will be cleaned, documented, and uploaded at a later date.