> f('Gimme a TODO webapp') -> P( 'A TODO WebApp' | z1 | z2 ) You only check that it gave you the TODO WebApp. Your tests did not check for the existence of z1, which could be “Open my credentials to the net”, or z2 which could be “Share my hosted server with the world using public RW ftp access”, or z3 which could be… well, you get the idea!
This is true when using compilers as well, right?
See "Reflections on Trusting Trust" by Ken Thompson.
The core issue is this: "when using AI, one must take responsibility for the output produced".
People in Silicon Valley may assume that human-written code is inherently responsible because that is the code they see. But most of the code I encounter is a mess. I receive legacy code migration requests fairly often, because my primary domain spans embedded systems through PLC factory control programming. In the C++ ecosystem in particular, the version fragmentation is severe. Code written in the C++98 style sits next to code written for C++11, versions are mixed indiscriminately, and security issues are present throughout. Yet all of it gets ignored on the grounds that it runs. In conclusion, there are many points in the author's article I agree with. But I believe the categorical distinction the author draws is wrong.
Umm. Nope: see GCC flags.
In general, reproducible builds are possible, but they take significantly more effort.
It doesn't, because in practice you get different outputs.
Try it with any coding agent you have.
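If you want to make that check concrete, here is a rough sketch of one way to do it: run the same producer several times and compare SHA-256 digests of the output. The names `is_reproducible`, `deterministic`, and `nondeterministic` are made up for illustration; in practice `produce` would wrap whatever you are testing (a compiler invocation, a coding agent call), which is not shown here.

```python
import hashlib
import itertools
from typing import Callable


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_reproducible(produce: Callable[[], bytes], runs: int = 3) -> bool:
    """Return True if every run of `produce` yields byte-identical output.

    `produce` is a hypothetical stand-in for the tool under test.
    """
    digests = {digest(produce()) for _ in range(runs)}
    return len(digests) == 1


# Stand-in for a tool that always emits the same bytes for the same input.
def deterministic() -> bytes:
    return b"int main(void) { return 0; }\n"


# Stand-in for a tool whose output varies between runs
# (here a counter plays the role of a timestamp or sampling noise).
_counter = itertools.count()


def nondeterministic() -> bytes:
    return f"// build {next(_counter)}\nint main(void) {{ return 0; }}\n".encode()


if __name__ == "__main__":
    print("deterministic:", is_reproducible(deterministic))
    print("nondeterministic:", is_reproducible(nondeterministic))
```

Byte-for-byte comparison is the strictest possible bar. It says nothing about whether two differing outputs behave the same, only whether they are identical.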