I hadn't heard the RLHF feedback loop argument before, but it seems like a solid explanation for all the bad tics LLMs have when writing and why it's so hard to break them of those habits.
All of that stuff... "That's not ____; it's _________" and the like... you can imagine human evaluators going "Wow, that's profound! Great output!" and now we all get to suffer through reading that for the rest of our lives.