1 point by enochyearn 2 hours ago | 1 comment
  • enochyearn 2 hours ago
    I watched a video about DeepSeek’s mHC (Manifold-Constrained Hyper-Connections) and couldn’t stop thinking about it, so I implemented a minimal version in MLX over the weekend, swapped it in for the residual connections, and compared it against a baseline ResNet.

    I ran a quick stability check at depth=500 on Fashion-MNIST: no divergence, and the results were better than I expected.
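
For readers who haven't seen the idea, here is a rough sketch of what an mHC-style block might look like. It is not the MLX code from the experiment above, and it rests on my reading of the technique: the residual path is widened into several parallel streams, and the stream-mixing matrix is pushed toward a doubly stochastic matrix with Sinkhorn-Knopp normalization so that stacking many blocks neither amplifies nor shrinks the signal. The names `sinkhorn`, `mhc_block`, `toy_layer`, `h_pre`, and `h_post` are all hypothetical, and plain Python lists stand in for tensors to keep it dependency-free.

```python
import math

def sinkhorn(logits, iters=50):
    """Push a square matrix of logits toward a doubly stochastic matrix
    by alternately normalizing rows and columns (Sinkhorn-Knopp)."""
    n = len(logits)
    m = [[math.exp(v) for v in row] for row in logits]  # strictly positive entries
    for _ in range(iters):
        row_sums = [sum(row) for row in m]
        m = [[m[i][j] / row_sums[i] for j in range(n)] for i in range(n)]
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m

def mhc_block(streams, layer, res_logits, h_pre, h_post):
    """One block over n parallel residual streams (each a list of d floats).

    Assumed structure, not the experiment's code:
      x       = sum_k h_pre[k] * streams[k]        (read the layer input)
      y       = layer(x)                           (the usual sub-layer)
      out[i]  = sum_k H_res[i][k] * streams[k] + h_post[i] * y
    where H_res = sinkhorn(res_logits) is approximately doubly stochastic.
    """
    n, d = len(streams), len(streams[0])
    h_res = sinkhorn(res_logits)
    x = [sum(h_pre[k] * streams[k][j] for k in range(n)) for j in range(d)]
    y = layer(x)
    return [
        [sum(h_res[i][k] * streams[k][j] for k in range(n)) + h_post[i] * y[j]
         for j in range(d)]
        for i in range(n)
    ]

def toy_layer(x):
    """Hypothetical stand-in for a conv/MLP sub-layer."""
    return [math.tanh(v) for v in x]

# Toy usage: 4 streams of width 3, stacked 500 blocks deep with shared weights.
n_streams = 4
res_logits = [[math.sin(i * n_streams + j) for j in range(n_streams)]
              for i in range(n_streams)]
streams = [[0.5, -1.0, 2.0] for _ in range(n_streams)]
for _ in range(500):
    streams = mhc_block(streams, toy_layer, res_logits,
                        h_pre=[1.0 / n_streams] * n_streams,
                        h_post=[0.01] * n_streams)
```

Because the columns of the mixing matrix sum to 1, the mixing step leaves the per-feature total across streams unchanged, which is the property that would plausibly keep a 500-deep stack from blowing up. A real implementation would learn `res_logits`, `h_pre`, and `h_post` per block rather than sharing fixed values.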