Some Math behind Neural Tangent Kernel
Analysis
The article introduces the Neural Tangent Kernel (NTK) as a tool for understanding the training dynamics of over-parameterized neural networks. It highlights that such networks, even with far more parameters than data points, can fit the training data perfectly yet still generalize well. The article promises a deep dive into the motivation, definition, and convergence properties of the NTK, particularly in the infinite-width limit.
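To make the "training dynamics" claim concrete, here is a minimal sketch of the standard NTK definition and the gradient-flow dynamics it induces under squared loss; the notation is an assumption on my part rather than taken from the summary above.

```latex
% Sketch (assumed notation): the NTK and the training dynamics it describes.
% f_\theta is the network output, (x_i, y_i) are the training pairs, \eta the learning rate.
\[
K(x, x') \;=\; \big\langle \nabla_\theta f_\theta(x),\, \nabla_\theta f_\theta(x') \big\rangle,
\qquad
\frac{\mathrm{d} f_\theta(x)}{\mathrm{d} t}
\;=\; -\,\eta \sum_{i=1}^{n} K(x, x_i)\,\big(f_\theta(x_i) - y_i\big).
\]
% In the infinite-width limit, K stays essentially constant during training,
% which is what makes the convergence analysis tractable.
```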
Key Takeaways
- NTK is a kernel used to explain the training dynamics of neural networks (see the sketch after this list).
- Over-parameterized neural networks can generalize well despite fitting training data perfectly.
- The article will explore the convergence properties of NTK, especially in infinite-width networks.
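As a concrete illustration of the kernel mentioned in the first takeaway, the sketch below computes the empirical NTK of a small MLP in JAX, i.e. the Gram matrix of inner products between per-example parameter gradients. The network, widths, and function names are illustrative assumptions, not code from the article.

```python
# Minimal sketch (assumed setup): empirical NTK of a tiny MLP via parameter Jacobians.
import jax
import jax.numpy as jnp

def init_params(key, widths=(1, 512, 1)):
    """Random Gaussian weight matrices for a small fully connected network."""
    params = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        key, sub = jax.random.split(key)
        params.append(jax.random.normal(sub, (n_in, n_out)))
    return params

def forward(params, x):
    """MLP forward pass with NTK-style 1/sqrt(fan_in) scaling; one scalar output per row."""
    h = x
    for i, w in enumerate(params):
        h = h @ (w / jnp.sqrt(w.shape[0]))
        if i < len(params) - 1:
            h = jax.nn.relu(h)
    return h.squeeze(-1)

def empirical_ntk(params, x1, x2):
    """K[i, j] = <df(x1_i)/dtheta, df(x2_j)/dtheta>, summed over all weight matrices."""
    jac1 = jax.jacobian(forward)(params, x1)  # one (n1, n_in, n_out) array per layer
    jac2 = jax.jacobian(forward)(params, x2)
    return sum(
        j1.reshape(x1.shape[0], -1) @ j2.reshape(x2.shape[0], -1).T
        for j1, j2 in zip(jac1, jac2)
    )

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jnp.linspace(-1.0, 1.0, 5).reshape(-1, 1)
K = empirical_ntk(params, x, x)
print(K.shape)  # (5, 5) kernel Gram matrix over the 5 inputs
```

For a wide enough hidden layer, this Gram matrix changes little over the course of training, which is the property the article's convergence discussion relies on.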
Reference
“Neural networks are well known to be over-parameterized and can often easily fit data with near-zero training loss with decent generalization performance on test dataset.”