The hardest thing I think is judging whether your AGENTS.md is still good based on each model release. OpenAI does release prompting guidance however to help this (and have added a skills to apply this to your prompts IIRC)
Seems like the progressive disclosure approach is the best for context efficiency; I wound up with a somewhat tight generic AGENTS.md, and the .cursor/rules individual files with glob matching for file names. Cursor honored those well.
I must have spent a couple hundred on the company dime having the models rephrase/rewrite or change where instructions were found, what made sense as a skill vs a rule, trying to keep things as portable as possible. At this point the Cursor-specific files would need to be ported to a different agent/framework if it needed to be. But the content should be pretty solid.
It was an interesting (and productive) exploration for me
And even when we do compare, the thermal values, the entropy of our systems: that alone can lead us down very different paths. Even when all the rigging is controlled. (Which implies we need multiple experiments to compare against.)