Most skeptics, imo, have not yet spent enough tokens on a codebase, in loops with multiple models and all kinds of regression tests, to say that it's not on par with great developers. It might not think, it might not understand what it does, but the outcome is there, and it's getting solid. Whatever gap there is now, we should already assume it's solved when we talk about this "technology". I dare anyone to use 1B tokens on any sort of codebase and tell me there isn't a drastic improvement over a fully human-made one.
In my team, we are at a point where even reading the AI's output is irrelevant; it should be summarized/enhanced by another model (in loops, always). People still using Claude Code or basic tools like it and waiting on them are using it wrong imo. The Claude Code/Codex/Gemini CLIs should just be leveraged to avoid paying the API price, that's pretty much it. But "vibe coding" makes no sense: why do you need to "follow up" if you are sure of the goal? To me it seems that users who vibe code just didn't fully define the goal/specs/tests beforehand, and this is why they still have to follow what the agent is doing. That is far from the real capability, which is practically being able to dump hundreds of tasks and have most of them just get done.
It should always be AI first, with a human in the loop only when debating in a loop between models isn't enough, or when they can't figure out the best course of action.
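For anyone wondering what that loop looks like in practice, here is a rough sketch in Python, not our actual setup. Everything in it is an assumption for illustration: a "model" is just a callable that takes a prompt and returns text, the test suite is a callable that returns pass/fail, and the names `run_task`, `run_batch`, `Outcome` and the `fake_*` stand-ins are made up. The idea it shows: one model writes, the tests gate, a second model reviews, and a human only sees the task if the loop runs out of rounds.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical stand-ins: a "model" is any callable mapping a prompt to text,
# and a "test suite" is any callable mapping a patch to pass/fail.
Model = Callable[[str], str]
TestSuite = Callable[[str], bool]


@dataclass
class Outcome:
    task: str
    status: str            # "done" or "needs_human"
    patch: Optional[str]   # last patch produced, if any
    rounds: int            # rounds used before stopping


def run_task(task: str, author: Model, reviewer: Model,
             tests: TestSuite, max_rounds: int = 3) -> Outcome:
    """Author writes, tests gate, reviewer judges; escalate only on deadlock."""
    feedback = ""
    patch: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        patch = author(f"TASK: {task}\nFEEDBACK: {feedback}")
        if not tests(patch):
            feedback = "tests failed"
            continue
        verdict = reviewer(f"TASK: {task}\nPATCH: {patch}")
        if verdict.strip().upper().startswith("APPROVE"):
            return Outcome(task, "done", patch, round_no)
        feedback = verdict  # feed the objection back to the author
    # The models could not converge: this is the only point a human is needed.
    return Outcome(task, "needs_human", patch, max_rounds)


def run_batch(tasks: List[str], author: Model, reviewer: Model,
              tests: TestSuite) -> List[Outcome]:
    """Dump a list of tasks and collect what finished vs. what escalated."""
    return [run_task(t, author, reviewer, tests) for t in tasks]


# Deterministic fakes so the sketch runs without any real model or API.
def fake_author(prompt: str) -> str:
    # First attempt is broken; it only gets it right after a test failure.
    return "good patch" if "tests failed" in prompt else "broken patch"


def fake_tests(patch: str) -> bool:
    return patch == "good patch"


def fake_reviewer(prompt: str) -> str:
    return "REJECT: unclear spec" if "ambiguous" in prompt else "APPROVE"


if __name__ == "__main__":
    for outcome in run_batch(["rename helper", "ambiguous migration"],
                             fake_author, fake_reviewer, fake_tests):
        print(outcome.task, "->", outcome.status, f"({outcome.rounds} rounds)")
```

With the fakes above, the well-specified task finishes on its own and the ambiguous one is the only thing that lands on a human's desk, which is the point: the spec and tests do the "following up", not a person watching the agent.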