There is a YC company called Arpio [0] that does this sort of thing as a service. It can replicate a ton of stuff beyond what Backup does (it also uses Backup for certain things from what I remember). It works as advertised and for most companies is probably worth it vs doing this yourself. I am not affiliated, just worked with it at a customer.
Backup to S3, use the above to copy it elsewhere.
Moreover, AWS Backup is the _Terraform_ of backup in AWS. You can control all your backups through a single interface, with various policies (scheduling, retention, access...).
For instance, by default, you are limited to 100 Manual RDS Snapshots per account. With AWS Backup, you can do what you want. You can define dozens of different rules for the same services/resources.
So you can let teams manage their resources as they want, and have a backup team manage backing everything up from AWS Backup without having to interact with the services/resources themselves.
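For anyone who wants to see what "several rules for the same resources" looks like, here is a rough sketch of a plan document in the shape that boto3's `create_backup_plan` takes. The plan name, vault names, account ID, ARN, schedules, and retention periods are all made up for illustration, and `make_rule` / `make_plan` / `apply_plan` are hypothetical helpers, not anything AWS ships.

```python
"""Sketch: one AWS Backup plan holding several rules (all names hypothetical)."""


def make_rule(name, vault, schedule, retain_days, copy_to_vault_arn=None):
    """Build one rule dict in the shape boto3's create_backup_plan expects."""
    rule = {
        "RuleName": name,
        "TargetBackupVaultName": vault,
        "ScheduleExpression": schedule,  # AWS cron syntax, evaluated in UTC
        "Lifecycle": {"DeleteAfterDays": retain_days},
    }
    if copy_to_vault_arn:
        # Optionally copy each recovery point to a vault in another region.
        rule["CopyActions"] = [
            {
                "DestinationBackupVaultArn": copy_to_vault_arn,
                "Lifecycle": {"DeleteAfterDays": retain_days},
            }
        ]
    return rule


def make_plan(plan_name, rules):
    """Wrap a list of rules into a backup plan document."""
    return {"BackupPlanName": plan_name, "Rules": list(rules)}


# Hypothetical destination vault in a second region, for illustration only.
DR_VAULT_ARN = "arn:aws:backup:eu-west-1:111122223333:backup-vault:dr-vault"

plan = make_plan(
    "central-backups",
    [
        # Frequent, short-lived recovery points kept in the primary vault.
        make_rule("hourly-short", "main-vault", "cron(0 * * * ? *)", 7),
        # Daily recovery points that are also copied to the second region.
        make_rule("daily-offsite", "main-vault", "cron(0 5 * * ? *)", 35, DR_VAULT_ARN),
    ],
)


def apply_plan(plan_document):
    """Submit the plan. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3

    return boto3.client("backup").create_backup_plan(BackupPlan=plan_document)
```

Resources get attached to the plan separately (a backup selection, typically by tag or ARN), which is what lets a central backup team own the plan while other teams own the resources.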
Even if all your apps and data stores are active-active multi-region, you can be in a world of risk with no DR for a long time if your DR region fails. If your data size is small, that vulnerability window might be small, but if you’ve got petabytes you’ll be without a lifeboat for days or weeks until you can take another “full” DR copy.
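A back-of-the-envelope sketch of that window, assuming the re-seed is limited purely by sustained copy throughput. The data sizes and the 10 Gbit/s figure are illustrative assumptions, not measurements; real copies are usually slower because of API limits, snapshot creation time, and contention.

```python
def reseed_days(data_bytes, throughput_bits_per_s, efficiency=1.0):
    """Days needed to take a fresh full copy at a sustained throughput.

    efficiency is the fraction of the nominal throughput actually achieved.
    """
    if throughput_bits_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("throughput must be positive and efficiency in (0, 1]")
    seconds = data_bytes * 8 / (throughput_bits_per_s * efficiency)
    return seconds / 86_400  # seconds per day


TB = 10**12   # decimal terabyte, in bytes
PB = 10**15   # decimal petabyte, in bytes
GBIT = 10**9  # bits per second

# Assumed sizes and link speed, for illustration only.
small = reseed_days(5 * TB, 10 * GBIT)
large = reseed_days(2 * PB, 10 * GBIT)

print(f"5 TB at 10 Gbit/s: {small * 24:.1f} hours")
print(f"2 PB at 10 Gbit/s: {large:.1f} days")
```

Under those assumptions the 5 TB case is a bit over an hour, while the 2 PB case is roughly 18.5 days with no second copy.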
> then the country is probably under attack, and absolutely no one will give a shit that your SaaS product is dead.
Or there’s a severe natural disaster, or a flooded data center due to unforeseen conditions, or any number of things.
If your country is attacked, all business does not immediately halt. War is not an instantaneous phenomenon where an entire country is destroyed overnight. People continue living their lives as best they can because they still need to put food on the table and life must go on. I have a number of friends and past coworkers in Ukraine who can attest to how you continue doing your best, pick up the pieces, and keep moving back toward normalcy.
GCP, IAM (global; just like a week and a half ago!)
GCP, VMs etc. (regional!¹)
Azure, application GW (global)
Cloudflare (global)
Azure, IAM (global)
Azure, IAM (global)
You can tell IAM is a point of weakness. (As it kinda must be.)
¹though I wasn't affected by this one, as it was in Europe.
Note that even the intended configuration change was designed to be Regional, not just limited to one AZ.
AWS's definitions for AZ & Regions are by far the strongest in the industry.
GCP has AZ in the same physical complex. Azure Regions would be AZs under AWS's definition.
If I go waaaaay back (like mid 2010s), we did have an S3 outage. It was regional, even!
> GCP has AZ in the same physical complex.
I can't say if that's correct or not; GCP says,
> Zones should be considered a single failure domain within a region. To deploy fault-tolerant applications with high availability and help protect against unexpected failures, deploy your applications across multiple zones in a region.
That's an AZ, to me. (Or, alternatively & synonymously, a failure domain.)
¹IME over my career, though, AWS is fairly stable. GCP is too. AWS has its foibles, though. When last I worked with RDS (circa 2019), there were bugs.
If you document and drill a cross-region recovery, in *most* (not all) cases you will be able to predict more confidently when things will be running again. You'll also know what information is there and what isn't, and can build processes to communicate expectations to customers and/or regulators.
There are also benefits for many apps in being closer to the customer. If you’re building out infrastructure in a remote region for that purpose, the marginal cost of getting more out of it may be compelling.