I know almost everyone is interested just in the SWE stuff, but this has been a good eval for me to think about how big the model is, how "creative" it is at generating new ideas, etc.
More results from fable, with comparisons against Gemini, Opus, and some open-source models: https://mesmer.tools/benchmarks/ai-video-generation