Now GPT-4.1 was another story last year. I remember cooking at 4am Pacific and feeling the whole thing slam to a halt as the US East Coast came online.
The most reliable time to see it fall apart is when Google makes a public announcement that is likely to cause a sudden influx of people using it.
And there are multiple levels of failure. First you start seeing iffy responses of obviously lower quality than usual. Then, if things get really bad, you start seeing random errors where Gemini suddenly loses all of its context (even in a new chat) or starts failing at the UI level by not bothering to finish answers, etc.
The obvious likely reason is that when the models are under high load, the service probably does some kind of dynamic load balancing: falling back to lighter models, or limiting the amount of time/resources allowed for any particular prompt.
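To make that guess concrete, here is a rough sketch of what load-based fallback could look like. This is purely illustrative: the tier names, load thresholds, and token budgets are all made up, and nothing here reflects how Google, OpenAI, or Anthropic actually route requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str          # hypothetical label, not a real product name
    max_load: float    # use this tier while load is at or below this fraction
    token_budget: int  # hypothetical cap on output tokens per prompt


# Ordered from most to least capable. All numbers are invented for illustration.
TIERS = [
    Tier("full-model", 0.70, 8192),
    Tier("lighter-model", 0.90, 4096),
    Tier("smallest-model", 1.00, 1024),
]


def pick_tier(load: float) -> Tier:
    """Return the most capable tier whose load threshold has not been exceeded."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be a fraction between 0 and 1")
    for tier in TIERS:
        if load <= tier.max_load:
            return tier
    return TIERS[-1]


if __name__ == "__main__":
    for load in (0.30, 0.80, 0.97):
        tier = pick_tier(load)
        print(f"load={load:.2f} -> {tier.name}, budget={tier.token_budget} tokens")
```

If something like this were in place, the "levels of failure" would line up with the tiers: slightly worse answers first, then truncated ones as the budget shrinks.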
I just assume it went to the bar, got wasted, and needed time to sober up!
I jokingly (and not so jokingly) thought that it was trained on data that made it think it should be tired at the end of the day.
But it is happening daily and at night.
[1] https://www.anthropic.com/engineering/a-postmortem-of-three-...
People put forward many theories for this (routing to a weaker model, be it Sonnet, Haiku, or a more heavily quantized Opus, seems the most popular); Anthropic says none of it is happening.