3 pointsby maxloh5 hours ago2 comments

boxcakean hour ago
Ok - but what was it trained on? How much was copied from source material and how much was novel.
I could train a model that creates Shakespearean plays with a 10 word prompt - a 219 word spec is pretty meaningless for a model specifically trained on prior architectures, and specifically for this task.
I think this sounds more impressive than it is.
theihtisham4 hours ago
The interesting part isn’t “AI designed a CPU” so much as what counts as design completion here. For hardware, I’d want to separate spec interpretation, RTL generation, simulation coverage, synthesis results, timing closure, and debug effort after first failure. A lot of the real work shows up in the edges between those steps. If they can publish pass/fail criteria and the amount of human correction required after generation, that would make this much easier to evaluate.