For pure coding tasks and refactoring, Claude 3.5 Sonnet is currently the strongest performer. It tends to hallucinate library-specific syntax less often than the others.
However, for creative writing or "reasoning" through complex logic puzzles, I've found Gemini (specifically the Advanced/Ultra tiers) to have a more natural "voice" and to follow instructions better over long contexts.
GPT-4o is still the best generalist, but it feels like it has lost a bit of its edge while the others have specialized.