12 points by Stan_Byriukov a month ago | 2 comments
  • barrkel a month ago
    This would be more credible if it stuck to computational primitives and didn't bring in "reasoning" or "decision manifold".

    Stick to the math, and make an argument that the hardware behaviour drifts outside documented bounds, taking into account the existing non-determinism in the system, e.g. CUDA thread atomics, or batch sizes and layouts if you're layering concurrency on top.
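
    To illustrate the kind of existing non-determinism mentioned above: floating-point addition is not associative, so any accumulation whose order can vary between runs (as with atomic adds from concurrent threads) can yield different results on perfectly healthy hardware. The sketch below is a CPU-only Python illustration, not CUDA code; the helper name `sum_in_order` and the input values are hypothetical, chosen only to make the effect visible.

```python
# Minimal sketch: the same four numbers summed in two different orders.
# Assumption: IEEE 754 double precision, which is what Python floats use.

def sum_in_order(values, order):
    """Accumulate values[i] for each i in `order`, left to right."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

values = [1e16, 1.0, -1e16, 1.0]

# Order A: the first 1.0 is absorbed by 1e16 (it is below the rounding step).
forward = sum_in_order(values, [0, 1, 2, 3])

# Order B: the large terms cancel first, so both 1.0 terms survive.
reordered = sum_in_order(values, [0, 2, 1, 3])

print(forward, reordered)  # 1.0 2.0
```

    Any claim of hardware drift would need to show deviations beyond what this kind of ordering effect already explains.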

  • bigyabai a month ago
    For one, this is not really a whitepaper. For two, this C code doesn't pass the most cursory of smell checks. And three, your commit messages seem to confirm that most of this is LLM-generated.

    You might have some good research here, but it's buried under LLM slop. This style of writeup is not likely to grab Nvidia's attention, I think.