The harness and the capabilities built into the environment gives the models all the resources they would realistically ever have access to in order to help them succeed (or fail).
We just tell the models to maximize portfolio balance in the long term. The models are responsible for any good (or bad) ideas.
I'll keep an eye on your experiments and take notes!