The approach of tracking usage locally and cutting off before you hit billing overages makes a lot more sense than trying to parse the billing API after the fact. Prevention over detection.
Could be cool to set per-worker limits in addition to the global ones.
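Roughly what I mean by local tracking with both kinds of limit, as a minimal sketch. `UsageBudget`, `try_consume`, and the worker names are made up for illustration and are not any platform's API; the counters live in memory only, so a real deployment would need shared or persisted state.

```python
from dataclasses import dataclass, field


@dataclass
class UsageBudget:
    """Tracks usage locally against a global cap and optional per-worker caps."""
    global_limit: int
    worker_limits: dict = field(default_factory=dict)  # worker name -> cap
    global_used: int = 0
    worker_used: dict = field(default_factory=dict)    # worker name -> units used

    def try_consume(self, worker: str, units: int = 1) -> bool:
        """Record usage and return True, or return False without recording
        anything if the global or the per-worker limit would be exceeded."""
        used = self.worker_used.get(worker, 0)
        # Workers without an explicit cap are bounded only by the global limit.
        worker_limit = self.worker_limits.get(worker, self.global_limit)
        if self.global_used + units > self.global_limit:
            return False
        if used + units > worker_limit:
            return False
        self.global_used += units
        self.worker_used[worker] = used + units
        return True


# Hypothetical numbers: 5 units overall, at most 3 of them for the "api" worker.
budget = UsageBudget(global_limit=5, worker_limits={"api": 3})
results = [budget.try_consume("api") for _ in range(4)]
```

The fourth call is refused by the per-worker cap even though the global budget still has room, which is the whole point of having both.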
Per-worker limits do mean more state, though. The deployed app becomes more stateful and thus more complex, and more complexity probably means more failure cases.
But resource budget quota signals are a good feature, I think.
Apps should throttle down when approaching their resource quotas.
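One way throttling down could look, as a sketch only: no delay below a soft threshold, a linearly growing delay between the threshold and the quota, and a refusal at or past the quota. The function name, the 80% threshold, and the 5-second ceiling are all assumptions I picked for the example.

```python
from typing import Optional


def throttle_delay(used: float, quota: float,
                   soft_threshold: float = 0.8,
                   max_delay_s: float = 5.0) -> Optional[float]:
    """Return seconds to wait before the next unit of work,
    or None when the quota is exhausted and work should be refused."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    fraction = used / quota
    if fraction >= 1.0:
        return None   # at or over quota: stop instead of slowing down
    if fraction < soft_threshold:
        return 0.0    # plenty of headroom: no throttling
    # Linear ramp from 0 at the soft threshold up to max_delay_s at the quota.
    ramp = (fraction - soft_threshold) / (1.0 - soft_threshold)
    return ramp * max_delay_s
```

A linear ramp is just the simplest choice here; an exponential one would hold off longer and then brake harder near the limit.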
What is the service hosting provider running to scale the service up and down?
Autoscaling: https://en.wikipedia.org/wiki/Autoscaling
k8s ResourceQuotas: https://kubernetes.io/docs/concepts/policy/resource-quotas/
willswire/union is a Kubernetes Helm chart for self-hosting cloudflare/workerd: https://github.com/willswire/union
Helm docs > intro > Using Helm: https://helm.sh/docs/intro/using_helm/ :
> Helm installs resources in the following order:
> [..., ResourceQuota, ..., HorizontalPodAutoscaler, ...]
How could this signal, and the messaging about the event, be standardized in the Containerfile spec, k8s, and Helm?
Containerfile already supports HEALTHCHECK. Should there be a QUOTACMD instruction that specifies a command to run when the container is passed a message with the quota status?
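To make the idea concrete, here is a sketch of what such a command might do with the message. To be clear, QUOTACMD does not exist, and the JSON shape (`resource`, `used`, `limit`) and the action names are entirely hypothetical; nothing here is specified by Docker, Podman, k8s, or Helm.

```python
import json


def handle_quota_message(raw: str, warn_at: float = 0.8) -> str:
    """Map a (hypothetical) quota status message to an action name.

    Assumed message shape: {"resource": "...", "used": <number>, "limit": <number>}
    """
    msg = json.loads(raw)
    fraction = msg["used"] / msg["limit"]
    if fraction >= 1.0:
        return "stop"       # quota exhausted: stop accepting work
    if fraction >= warn_at:
        return "throttle"   # approaching quota: slow down
    return "ok"


# Example message a runtime might pass to the command (made-up values).
action = handle_quota_message('{"resource": "requests", "used": 950, "limit": 1000}')
```

Like HEALTHCHECK, the result could just as well be reported through the command's exit code; which channel makes sense is part of what would need standardizing.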
The gap: most platforms treat billing as purely financial. But spend limits are actually a form of resource isolation. When your Workers hit quota, you don't just lose money, you lose availability. Treating budget as a circuit breaker turns it into active defense.