Nice idea. Can I choose a token-reduction strategy based on what I'm optimizing for? For example, I might be OK with a quality drop in exchange for big cost savings.
Yeah, you can set an approximate target ratio through the API (for example, target_ratio=0.3). The API treats it as a rough target, though: it tries to maximize quality for that ratio, and it might keep a couple more tokens than the target to do so.
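To make the "rough target" behavior concrete, here's a toy sketch in Python. It is not the actual API: the function name `compress`, the per-token `scores`, and the `slack` parameter are all made up for illustration, and the real service presumably scores tokens in a much smarter way. The only thing it's meant to show is how a ratio can act as a soft budget that gets slightly exceeded when that helps quality.

```python
def compress(tokens, scores, target_ratio=0.3, slack=2):
    """Toy illustration (hypothetical, not the real API).

    Keeps roughly `target_ratio` of the tokens, preferring higher scores.
    Up to `slack` extra tokens are kept if they score as high as the
    lowest-scoring token already inside the budget.
    """
    if not 0 < target_ratio <= 1:
        raise ValueError("target_ratio must be in (0, 1]")
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not tokens:
        return []

    # Soft budget: number of tokens implied by the target ratio.
    budget = max(1, round(len(tokens) * target_ratio))

    # Token indices ordered from most to least important.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:budget])

    # Allow a few extra tokens that tie with the last token in the budget.
    cutoff = scores[ranked[budget - 1]]
    for i in ranked[budget:budget + slack]:
        if scores[i] >= cutoff:
            keep.add(i)

    # Return kept tokens in their original order.
    return [tokens[i] for i in sorted(keep)]


tokens = list("abcdefghij")
scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.7, 0.3, 0.1, 0.2, 0.1]
print(compress(tokens, scores, target_ratio=0.3))  # -> ['a', 'c', 'e', 'f']
```

Here the budget for 10 tokens at 0.3 is 3, but 4 come back because two tokens tie at the cutoff score. A lower target_ratio trades more quality for more savings; the overshoot stays bounded by `slack`.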