2 points by eric2675 4 hours ago | 1 comment
  • eric2675 4 hours ago
    The author is here.

    I'm writing this purely out of curiosity about a "flaw" in DeepSeek-R1. It solves AIME problems (a closed loop) flawlessly, yet confidently hallucinates once the domain is open-ended.

    My background is applied topological modeling. I wanted to see whether it's mathematically provable that once the "real-world anchor" (which I call Delta_Phi) is removed, "insight" (fast convergence) and "hallucination" (settling into a local minimum) are actually the same function.
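    To make the claim concrete, here is a toy sketch (my own illustration, not the author's actual formalism): treat the model's internal objective as a multi-welled loss f(x). Gradient descent converges quickly into *some* well either way; only a grounding penalty toward a real-world anchor (standing in for Delta_Phi; the names `X_REAL` and `descend` are hypothetical) distinguishes the well it lands in from the one that matches reality.

```python
import math

def f(x):
    # Internal objective with several local minima (the "belief landscape").
    return math.sin(3 * x) + 0.1 * x * x

def grad_f(x):
    return 3 * math.cos(3 * x) + 0.2 * x

X_REAL = 2.0  # hypothetical real-world anchor point

def descend(x, lam, steps=2000, lr=0.01):
    """Gradient descent on f(x) + lam * (x - X_REAL)^2.

    lam = 0 is the "anchor removed" case: descent is just as fast,
    but it settles into whichever well is nearest the start.
    """
    for _ in range(steps):
        g = grad_f(x) + 2 * lam * (x - X_REAL)
        x -= lr * g
    return x

ungrounded = descend(x=0.0, lam=0.0)  # converges to the nearest well (x < 0)
grounded = descend(x=0.0, lam=1.0)    # the anchor term pulls it near X_REAL
```

    Both runs use the identical update rule; "insight" and "hallucination" differ only in whether the lam-weighted grounding term is present, which is one way to read the claim that they are the same function once Delta_Phi is removed.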

    I know the notation here is a bit abstract, but I'm curious: how do you think we can reintroduce "pain" or "grounding" into a model trained purely with reinforcement learning?

    I'd love to discuss the underlying mathematical principles or philosophical ideas.