1 point by Yuriy_Bakhvalov 5 hours ago | 1 comment
  • Yuriy_Bakhvalov 5 hours ago
    Hi HN, I'm the author. This is a cycle of four papers in which I derive the kernel from first principles. Key features: no SGD, 500 layers. Happy to answer questions!