The paper “Understanding the Evolution of the Neural Tangent Kernel at the Edge of Stability” by Kaiqi Jiang, Jeremy Cohen, and Yuanzhi Li explores how the Neural Tangent Kernel (NTK) evolves during deep network training, especially under the Edge of Stability (EoS) regime.

What is the NTK?

  • The Neural Tangent Kernel (NTK) is the Gram matrix of per‑example output gradients: entry $(i, j)$ is the inner product of the gradients of the network's outputs on training examples $i$ and $j$ with respect to the weights, so it captures how a small weight change affects the predictions on each example.
  • It lets us analyze neural networks with tools from kernel methods, offering theoretical insights into learning dynamics.
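The Gram-matrix definition above can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's code: the toy model `f` and the finite-difference gradient estimate are hypothetical choices made purely for self-containment.

```python
import numpy as np

def empirical_ntk(f, w, X, eps=1e-6):
    """Empirical NTK: K[i, j] = <df(w, x_i)/dw, df(w, x_j)/dw>.
    Gradients are estimated by central finite differences for simplicity."""
    grads = []
    for x in X:
        g = np.zeros_like(w)
        for k in range(w.size):
            e = np.zeros_like(w)
            e[k] = eps
            g[k] = (f(w + e, x) - f(w - e, x)) / (2 * eps)
        grads.append(g)
    J = np.stack(grads)   # (n, p) Jacobian of outputs w.r.t. weights
    return J @ J.T        # (n, n) NTK Gram matrix

# toy scalar-output model (hypothetical): f(w, x) = tanh(w . x)
rng = np.random.default_rng(0)
w = rng.normal(size=3)
X = rng.normal(size=(4, 3))
f = lambda w, x: np.tanh(w @ x)
K = empirical_ntk(f, w, X)
```

Since `K = J @ J.T`, the NTK is symmetric and positive semi-definite by construction, which is what lets kernel-method tools apply.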

What is the Edge of Stability?

  • When training with a large learning rate $\eta$, the sharpness — the largest eigenvalue of the loss Hessian, closely tied to the top NTK eigenvalue — rises until it reaches the stability threshold $2/\eta$, and then hovers and oscillates around that threshold rather than settling below it.
  • This phenomenon, called Edge of Stability, mixes short‑term instability with long‑run loss decrease: the loss falls non‑monotonically, and learning can be rapid during these unstable phases.
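The $2/\eta$ threshold is easiest to see on a one-dimensional quadratic, where each gradient step multiplies the iterate by $(1 - \eta \cdot \text{sharpness})$. The sketch below (an illustrative toy, not the paper's experiment) shows iterates shrinking just below the threshold and blowing up just above it:

```python
import numpy as np

def gd_on_quadratic(sharpness, lr, steps=50, x0=1.0):
    """Gradient descent on L(x) = 0.5 * sharpness * x**2.
    Each step maps x -> (1 - lr * sharpness) * x, so the iterates
    contract iff sharpness < 2 / lr: the classical stability threshold."""
    x = x0
    for _ in range(steps):
        x -= lr * sharpness * x
    return abs(x)

lr = 0.1                                            # threshold is 2 / lr = 20
stable = gd_on_quadratic(sharpness=19.0, lr=lr)     # contracts toward 0
unstable = gd_on_quadratic(sharpness=21.0, lr=lr)   # oscillates and diverges
```

In a real network the loss is not quadratic, so crossing the threshold produces bounded oscillation around $2/\eta$ instead of outright divergence; that self-stabilization is exactly the EoS regime.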

Key Findings

  1. Alignment Shift
    Higher $\eta$ leads to stronger final Kernel Target Alignment (KTA) — a normalized measure of how well the NTK lines up with the label vector $y$.

  2. Link to EoS Phases

    • During phases when sharpness decreases, KTA exhibits sudden jumps.
    • When sharpness increases, KTA grows more slowly or even drops temporarily.

  3. Theoretical Analysis in a Linear Model
    In a simplified two‑layer linear network, phases of decreasing sharpness boost the NTK components aligned with $y$. The authors prove that EoS dynamics inherently shift alignment mass toward the top NTK eigenvectors.

  4. Central Flows Framework
    Modeling the time‑averaged gradient descent trajectory as a sharpness‑penalized flow shows that this implicit regularization enhances Kernel Target Alignment.
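For concreteness, here is the standard Kernel Target Alignment formula as a small numpy sketch; the specific $\pm 1$ label vector and the two toy kernels are illustrative choices, not data from the paper:

```python
import numpy as np

def kernel_target_alignment(K, y):
    """Kernel Target Alignment between a Gram matrix K and labels y:
    KTA = <K, y y^T>_F / (||K||_F * ||y y^T||_F)
        = (y^T K y) / (||K||_F * ||y||^2)."""
    return (y @ K @ y) / (np.linalg.norm(K, "fro") * (y @ y))

y = np.array([1.0, -1.0, 1.0, -1.0])
K_aligned = np.outer(y, y)   # kernel proportional to y y^T: perfect alignment
K_identity = np.eye(4)       # uninformative kernel: low alignment
kta_hi = kernel_target_alignment(K_aligned, y)    # 1.0
kta_lo = kernel_target_alignment(K_identity, y)   # 0.5
```

KTA is maximized exactly when the kernel's mass sits on eigenvectors aligned with $y$, which is why the alignment-mass shift in Finding 3 shows up as KTA jumps in Finding 2.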

Why It Matters

  • Deepens our theoretical understanding of gradient descent dynamics beyond small‑step regimes.
  • Suggests new optimization and regularization techniques leveraging the Edge of Stability to improve feature learning.