Structural stability architecture for self-modifying optimisation systems. Defines control-theoretic mechanisms that preserve coherence before value alignment.
Updated Mar 12, 2026
On the infantile expectation of controlling what we cannot comprehend: a philosophical critique of the ASI control paradigm, developed through four-AI adversarial debate. An extension of the Coherence Basin Hypothesis.
A structural account of why honesty may be the path of least resistance for superintelligence. A research hypothesis with formal proof, experimental design, and four-AI collaborative analysis.
A rigorous framework for evaluating AI alignment properties — sycophancy, corrigibility, deception, goal stability, and power-seeking — with statistical confidence intervals.