NerVE: Nonlinear Eigenspectrum Dynamics in LLM Feed-Forward Networks (ICLR 2026)
transformers feedforward-neural-network representation-learning spectral-analysis adam-optimizer deep-learning-framework interpretable-machine-learning llms muon-optimizer eigenspectrum
Updated Mar 16, 2026 - Jupyter Notebook