Unstable Power: How Sharpness Drives Deep Network Learning
The paper “Understanding the Evolution of the Neural Tangent Kernel at the Edge of Stability” by Kaiqi Jiang, Jeremy Cohen, and Yuanzhi Li explores how the Neural Tangent Kernel (NTK) evolves during deep network training, especially in the Edge of Stability (EoS) regime.

What is the NTK?

The Neural Tangent Kernel is a matrix whose entries measure how small weight changes jointly affect the network's outputs on pairs of training examples: $K_{ij} = \langle \nabla_w f(x_i; w), \nabla_w f(x_j; w) \rangle$. It lets us analyze neural networks with tools from kernel methods, offering theoretical insight into learning dynamics.

What is the Edge of Stability?

When training with a large learning rate $\eta$, the largest eigenvalue of the loss Hessian (the "sharpness", closely tied to the largest NTK eigenvalue) rises to the stability threshold $2/\eta$ and then oscillates around it rather than settling below it. This phenomenon, called the Edge of Stability, combines local instability with phases of rapid learning.

Key Findings

Alignment Shift. A higher learning rate $\eta$ leads to stronger final Kernel Target Alignment (KTA) between the NTK and the label vector $y$. ...
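The quantities above can be made concrete on a toy model. The sketch below (an illustrative assumption, not the paper's setup) uses a linear model, where each per-example gradient is just the input $x$, so the empirical NTK reduces to the Gram matrix $XX^\top$. It computes the KTA between the NTK and $y$, and shows the $2/\eta$ threshold for plain gradient descent on MSE: below it the loss shrinks, above it the loss grows. Note that on a fixed quadratic the unstable regime simply diverges; the EoS phenomenon is precisely that real networks instead self-stabilize and hover near the threshold.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))   # 20 examples, 5 features
y = rng.normal(size=20)
n = len(y)

# For a linear model f(x; w) = w·x the per-example gradient is x itself,
# so the empirical NTK K_ij = <∇_w f(x_i), ∇_w f(x_j)> is the Gram matrix.
K = X @ X.T

def kta(K, y):
    """Kernel Target Alignment: cosine similarity between K and y y^T."""
    return float(y @ K @ y) / (np.linalg.norm(K) * (y @ y))

# Sharpness of L(w) = ||Xw - y||^2 / (2n) is λ_max(X^T X)/n = λ_max(K)/n;
# gradient descent with step size η is stable only while sharpness < 2/η.
sharpness = np.linalg.eigvalsh(K).max() / n

def gd_losses(eta, steps=50):
    w = np.zeros(X.shape[1])
    losses = []
    for _ in range(steps):
        w -= eta * X.T @ (X @ w - y) / n
        losses.append(float(np.mean((X @ w - y) ** 2) / 2))
    return losses

stable = gd_losses(eta=1.8 / sharpness)    # η·sharpness < 2: loss decreases
unstable = gd_losses(eta=2.2 / sharpness)  # η·sharpness > 2: loss blows up
print(kta(K, y), stable[-1] < stable[0], unstable[-1] > unstable[0])
```

For a PSD kernel the KTA lies in $[0, 1]$ by Cauchy–Schwarz; the paper's observation is that larger $\eta$ pushes this alignment higher by the end of training.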