However, OpenAI had, and still has, a real lead in voice model interactions. That’s where Chinese AI companies don’t do as well: DeepSeek doesn’t have anything, and models like Kimi can’t speak any language other than English or Chinese.
Anyone know how much audio is 1M tokens? I have no way of knowing if this is fine or prohibitively expensive.
Says the team at OpenAI whose job it is to ensure you think that.
Presumably because it’s genuinely useful - I can easily think of applications to build with a powerful voice UI.