I’m especially curious about the “self-learning loop” — in practice, does it actually improve outcomes over time, or does it tend to reinforce suboptimal patterns?
And How much autonomy do the agents actually have in practice?
I’ve found that fully autonomous loops tend to need a lot of guardrails to stay useful.