2 points by g4omingron 4 hours ago | 1 comment
  • g4omingron 4 hours ago
    Author here. The short version: softmax's partition function has complex zeros (they come from e^{iπ}+1=0) that are invisible on the real line but cap the safe step size at ρₐ = π/Δₐ. That radius takes one JVP to compute. The repo has Colab notebooks if you want to poke at it. Happy to answer questions.

    Full paper: https://arxiv.org/html/2603.13552v1
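
For anyone who wants to see the e^{iπ}+1=0 connection concretely, here is a minimal two-logit sketch. It is not taken from the paper or the repo. It assumes that Δ is the spread (max minus min) of the logit derivatives along the step direction, which is the quantity a single JVP of the logits would return; the comment above does not define Δₐ, so treat that reading as a guess. The names `partition` and `radius_bound` and the hand-picked slopes are hypothetical.

```python
import cmath
import math


def partition(t, logits, slopes):
    """Z(t) = sum_k exp(z_k + t * d_k), evaluated at a possibly complex step t."""
    return sum(cmath.exp(z + t * d) for z, d in zip(logits, slopes))


def radius_bound(slopes):
    """pi / Delta, where Delta is ASSUMED to be max(slopes) - min(slopes)."""
    delta = max(slopes) - min(slopes)
    return math.inf if delta == 0 else math.pi / delta


# Two equal logits whose directional derivatives differ by Delta = 2.
# In practice the slopes would come from a JVP; here they are hand-picked.
logits = [0.0, 0.0]
slopes = [1.5, -0.5]

rho = radius_bound(slopes)  # pi / 2

# Stepping the same distance along the imaginary axis hits a zero of Z,
# because exp(i * rho * Delta) = exp(i * pi) = -1 cancels the two terms.
z_at_zero = abs(partition(complex(0.0, rho), logits, slopes))

# Stepping that distance along the real axis keeps Z strictly positive.
z_on_real_line = abs(partition(rho, logits, slopes))

print(rho, z_at_zero, z_on_real_line)
```

With more than two logits the same quantity still works as a lower bound on the distance to the nearest zero: while |t|·Δ < π, all terms of Z(t) have arguments within an open half-plane, so they cannot sum to zero.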

    • yorwba 2 hours ago
      Nice work! The paper feels verbose at times and could use some editing to slim it down (also, equation 6 is just equation 5 in a box) but I enjoyed it a lot nonetheless.