I will also add that, based on my current experiments, the ideal number of models in a routing pool is probably two, following point 4 above. The models need to differ significantly in quality, speed, or cost; otherwise routing decisions are harder to make and less accurate, and there is less to gain from routing in the first place. For coding, the ideal pool in my opinion is GPT 5.4 and DeepSeek V4 Pro, which extends the GPT quota by routing some of the easy and medium requests to DeepSeek.
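
As a rough illustration of a two-model pool, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the model labels are plain strings rather than real API identifiers, and `estimate_difficulty` is a toy keyword-and-length heuristic standing in for whatever classifier a real router would use.

```python
# Placeholder labels for the two models in the pool (not real API identifiers).
STRONG_MODEL = "GPT 5.4"
CHEAP_MODEL = "DeepSeek V4 Pro"

# Hypothetical keywords that hint at a hard coding request.
HARD_KEYWORDS = ("refactor", "architecture", "concurrency", "debug")


def estimate_difficulty(prompt: str) -> float:
    """Toy difficulty score in [0, 1]; a real router would use a trained classifier."""
    # Longer prompts contribute up to 0.5 to the score.
    score = min(len(prompt) / 2000, 0.5)
    # Any hard keyword adds a further 0.5.
    if any(word in prompt.lower() for word in HARD_KEYWORDS):
        score += 0.5
    return min(score, 1.0)


def route(prompt: str, threshold: float = 0.5) -> str:
    """Send hard requests to the strong model and everything else to the cheap one."""
    if estimate_difficulty(prompt) >= threshold:
        return STRONG_MODEL
    return CHEAP_MODEL


if __name__ == "__main__":
    for p in ("Rename this variable", "Debug this concurrency issue in my worker pool"):
        print(f"{p!r} -> {route(p)}")
```

With only two models the router reduces to a single threshold, which is part of why a small pool is easier to route accurately.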
[0] role-model - the case for a model routing protocol: https://try.works/role-model-the-case-for-a-model-routing-pr...