Yes, it's the nuclear option, but I'll take an hour of downtime over an unexpected $100k bill.
If they weren't turned off at the billing cap, but were given some leeway instead, either that becomes the new hard limit, or GCP will have to give away the difference.
And there's no "middle ground" you could implement that makes sense either - like a "frozen" state. Preventing new writes to a GCS bucket breaks the writer app. Freezing VMs serving web traffic takes the site down.
Even if the service was shut down once the billing limit is hit, how long would GCP wait for the user to add funds or raise the limit? GCP would need to either keep the services in a hidden/frozen state or not turn them off / freeze them at all (in which case GCP would be giving away resources for free).
Maybe GCP can give users a heads-up when they're about to hit the limit? GCP already does - billing alerts do exist. It's just possible to blow past them if your usage is a massive spike.
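To make the "blow past them" point concrete, here is a rough sketch of how far spend can run before a periodically evaluated alert fires. The check interval and burn rates are made-up assumptions for illustration, not documented GCP behavior, and `spend_when_alert_fires` is a hypothetical helper.

```python
def spend_when_alert_fires(threshold, burn_rate_per_hour, check_interval_hours):
    """Return total spend at the first periodic check that sees spend >= threshold.

    Assumes a constant burn rate and an alert evaluated only every
    `check_interval_hours` (an illustrative model, not GCP's actual pipeline).
    """
    if burn_rate_per_hour <= 0 or check_interval_hours <= 0:
        raise ValueError("burn rate and check interval must be positive")
    hours = 0
    spend = 0
    while spend < threshold:
        # Nothing is observed between checks, so spend keeps accruing.
        hours += check_interval_hours
        spend = burn_rate_per_hour * hours
    return spend


# Slow burn: the alert fires close to the threshold.
slow = spend_when_alert_fires(threshold=1000, burn_rate_per_hour=1, check_interval_hours=6)
# Massive spike: the very first check is already far past the threshold.
spike = spend_when_alert_fires(threshold=1000, burn_rate_per_hour=5000, check_interval_hours=6)
print(slow, spike)
```

Under this model the overshoot is bounded by burn rate times check interval, which is small for normal usage and large for a spike.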
Moreover, getting the hundreds of GCP services to implement a "frozen" state is difficult. It's hard enough getting everyone to listen to the "billing account disabled" signal, and (soft-)delete the resources (based on the resource, after some time interval). Given these billing overruns happen for smaller customers, it's not really worth solving the problem - which I don't think has a great solution to begin with.
There is a better option here, but Google won't implement it because it would undercut their incentive to take more money from customers.
I understand that $18k is probably a drop in the bucket, but surely there's a middle ground here.
Things like this are the exact reason that companies end up having to comply with all kinds of regulations. It's just easier to screw the customer first.
Predation, pure and simple.
Google doesn't care, no one is holding them responsible for predatory behavior. It's profitable to steal from your customers and there's no downside to doing so.
You'll all keep using them either way.
With that said, when you go to set a budget it warns you "Setting a budget does not cap resource or API consumption. Learn more." with a hyperlink to https://docs.cloud.google.com/billing/docs/how-to/budgets?_g...
The fact that Google redefines what "budget" means and puts up a warning doesn't make it OK.
* By clicking here you agree to kill it
And you're defending that?
The article, and the comment I was replying to, make it seem like an error in the Google Budget system. I'm simply trying to say this system is working as designed and documented.
If I, without their budget or engineers, can have Prometheus alerts reacting to metrics pretty much instantly, surely it wouldn't be too hard for them to have triggers like this. Somehow their AI can automatically ban people based on something; can't they do something for the customers?
They can, they just don't want to.
And the system automatically upgraded them to higher spending limits when they crossed $1000 in usage costs.
They could definitely make that an opt-in feature.
Also, if a cap is a desired feature that justifies trade-offs, then it is possible to translate the budget cap (in terms of money) back into service-specific caps that are easier to keep consistent. For example, "autoscale this set of VMs" plus "my budget cap is $1000/hour", with the VM type priced at $10/hour, translates to "autoscale to at most 100 instances". That would need dev work (i.e. this feature being considered important) and would not automatically respect the budget cap across services, but it is still another piece of the puzzle.
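The translation from a money cap to an autoscaling cap is just a floor division. A minimal sketch, assuming a single VM type with a flat hourly price; `max_instances` is a hypothetical helper, not a GCP API:

```python
from decimal import Decimal


def max_instances(budget_per_hour, price_per_instance_hour):
    """Largest instance count whose combined hourly cost stays within the budget.

    Both arguments are dollar amounts (int, or str for fractional prices).
    Decimal avoids float rounding surprises with money.
    """
    budget = Decimal(str(budget_per_hour))
    price = Decimal(str(price_per_instance_hour))
    if budget < 0:
        raise ValueError("budget must be non-negative")
    if price <= 0:
        raise ValueError("price must be positive")
    # Floor division: never round up past the budget.
    return int(budget // price)


# The example from the comment: $1000/hour budget, $10/hour VMs.
print(max_instances(1000, 10))
```

The result would then be fed into whatever "maximum replicas" setting the autoscaler exposes. It only bounds this one service, which is the cross-service caveat mentioned above.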
But a big part of the value in large clouds like GCP is the network's interconnectedness. Plus even if there was some global event that made communications impossible only for the billing service, I'd still expect charges to top out roughly proportional to the number of partitions as they each independently exceed the threshold. GCP only has 120ish zones.
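A toy model of that worst case, assuming every partition enforces the full cap on its own local counter and never hears from the others. The numbers and `simulate_partitioned_cap` are illustrative assumptions, not a description of how GCP billing actually works:

```python
def simulate_partitioned_cap(cap, partitions, charge_per_tick, ticks):
    """Total spend when each partition stops only once ITS OWN spend reaches the cap."""
    local_spend = [0] * partitions
    for _ in range(ticks):
        for i in range(partitions):
            # A partition keeps charging until its local view hits the cap.
            if local_spend[i] < cap:
                local_spend[i] += charge_per_tick
    return sum(local_spend)


# 120 fully isolated zones, $1000 cap, $10 charged per tick in each zone.
total = simulate_partitioned_cap(cap=1000, partitions=120, charge_per_tick=10, ticks=500)
print(total)
```

In this model the total tops out at cap times partitions no matter how long the partition lasts, which is the "roughly proportional" bound from the comment.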
Deleting those when a customer hits a limit will lose customer data or remove things that might be hard to add back. The "I hit my AWS limit and they deleted all my data" headlines will result.
And excluding those things makes the limit soft again.
(Generally, tech seems to skate by on creating insanely complicated things, knowing that given enough pain, people will start blogging about their solutions, i.e., effectively outsourcing the cost and effort of doing something about it.)