45 pointsby PaulHoule7 hours ago3 comments
  • brantmv2 hours ago
    Maybe I'm wrong, but it looks like the authors did not actually have any LLMs write or verify any code for their experiments. Instead, their experiments consist of simulating the simplified Markov chain model itself. They simulated their simple Markov chain and checked if the theorem's predictions matched empirical statistics. This amounts to a test not of their model, but of basic Markov chain theory.

    Did I misread or miss something?

  • mapontosevenths2 hours ago
    This line made me pause:

    "We prove that for any non-zero stage success probability, the system reaches the verified state almost surely"

    What's the point if its still stochastic?

    • jaggederest2 hours ago
      "almost surely" means "happens with a probability 1", which in infinite set contexts doesn't mean that there aren't other outcomes, but that they have probability 0.

      So like, imagine that you had some finite list of integers, and you were picking a random number from 0 to infinity - because the domain is infinite, any finite set has 0 probability, but that doesn't mean it doesn't exist.

      https://en.wikipedia.org/wiki/Almost_surely

      • mapontosevenths21 minutes ago
        Thank you. That makes this a pretty big deal doesn't it?

        The ability to deterministcly identify that code eventually reaches a halting state, implies that we can use these stochastic tools to generate deterministic outcomes reliably in the future doesn't it?

    • IanCal2 hours ago
      Hash collisions are possible but can be provably so rare that they’re not a relevant concern.
  • edwardtay5 hours ago
    [flagged]
    • XzAeRosho4 hours ago
      >2. How do you handle the semantic gap? LLMs operate in natural language/fuzzy logic space, while formal methods require precise specifications. What's the translation layer like?

      From what I understood, this validates the output correctness, not that the output aligns with the user goals, so there's still room for the LLM to get the user goals wrong, and this is only to validate the mathematical consistency between the output code and the formal specification (in this paper, within the ESBMC framework for C++ code).

      So it's kind of tight scoped, in this case, but I think it points in the right direction for coding assistants, which usually get some language primitives wrong.

      • 4 hours ago
        undefined
    • westurner4 hours ago
      > intersection of LLMs and formal verification

      /? TLA LLM https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu... : 1 submission, ~20 comments

      "AI will make formal verification go mainstream" (2025-12) https://news.ycombinator.com/item?id=46294574