Looking at the relevant limit, "Consecutive Authorization Failures per Hostname per Account"[0], it seems there's no way to hit that specific limit if you only run once per day.
Ah, to think how many cronjobs are out there running certbot on * * * * *!
[0]: https://letsencrypt.org/docs/rate-limits/#consecutive-author...
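For contrast, a crontab sketch along the lines of what certbot's own docs suggest; "renew" is essentially a no-op unless a cert is actually close to expiry, so checking twice a day is plenty:

    # check twice a day; certbot only renews certs that are near expiry
    0 */12 * * * certbot renew --quiet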
It's very under-engineered: maybe a trifold pamphlet on light A11, printed with a laser jet that's running out of ink.
I've probably spent more time talking about how much it sucks than I've spent actually considering a proper solution, at this point.
I respect this. Reading someone else write this makes me feel more comfortable thinking about the things in my life I could be doing more to improve, which makes me respect this even more.
I hope they don't go any shorter than a month. Let the user pick; any value up to a year should do.
Incidentally, the fact that it took them 4 days to respond to that issue is why I'll be wary of getting 6-day certs from them. The only reason it wasn't a problem there was that it was a 30d cert with plenty of time remaining, so I was in no rush. (Also, ideally they'd have a better support channel than an open forum where an idiot "Community Leader" who doesn't know what he's talking about wastes your time, as happened in that thread.)
To make 24-hour certs practical you would need to generate them ahead of time and locally switch them out. This would be a lot more reliable if systems supported two certs with 50% overlapping validity periods at the same time.
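Roughly what I have in mind, as a Python sketch; fetch_new_cert() and reload_server() are hypothetical stand-ins for whatever ACME client and web server you actually run, and the cryptography library just reads the expiry:

    # Sketch: pre-stage the next cert at ~50% of the live cert's lifetime,
    # then swap it in locally shortly before expiry.
    import os
    from datetime import datetime, timedelta
    from cryptography import x509

    LIVE = "/etc/ssl/site/current.pem"   # assumed paths
    NEXT = "/etc/ssl/site/next.pem"

    def time_left(path):
        with open(path, "rb") as f:
            cert = x509.load_pem_x509_certificate(f.read())
        return cert.not_valid_after - datetime.utcnow()

    def fetch_new_cert() -> bytes:
        raise NotImplementedError("ask your ACME client for a fresh PEM chain")

    def reload_server():
        raise NotImplementedError("e.g. tell nginx/haproxy to reload certs")

    def rotate():
        # Halfway through a 24h cert's life, stage the replacement on disk.
        if time_left(LIVE) < timedelta(hours=12) and not os.path.exists(NEXT):
            with open(NEXT, "wb") as f:
                f.write(fetch_new_cert())
        # Close to expiry, swap atomically and reload the server.
        if time_left(LIVE) < timedelta(hours=2) and os.path.exists(NEXT):
            os.replace(NEXT, LIVE)   # atomic rename on POSIX
            reload_server()

Run rotate() from cron every hour or so: the next cert is already sitting on disk for roughly half of the live cert's lifetime, so a short issuance outage on the CA side doesn't immediately take the site down.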
(90 days will remain the default though)
SES? Around $0.0001 per e-mail
https://letsencrypt.org/2025/01/22/ending-expiration-emails/
The only emails they're keeping are the mailing lists, which you need to subscribe to separately and which are presumably run by an external provider.
3% and "3,200 people manually unpaused issuance" seem much higher than expected to me, and no cause for celebration, especially at this scale.
Are there no better patterns to be exploited to identify 'zombies'? Running experiments with blocking and then unblocking to validate should work here.
I guess this falls into the bucket of: sure we can do that, given sufficient time and resources
I understood a zombie to be a client that is dead and will never come back to life. Since these came back to life, they were not actually zombies, so manual action from actually-alive clients was required. That may be OK, since their behavior was not acceptable, but in the spirit of not penalizing users it would be better not to block those clients if they can be identified and sufficient resources are available to shoulder their misbehaviour.
> The pause may have simply been the reason that someone became aware there was even a problem.
I didn't take that into account and it would be neat. But why would they become aware after this change? Because the error message(/code?) is now different?
If this is the error that you're getting, then hitting unpause won't make the certificate requests start working. You'll just go back to receiving the persistent error messages from before the pause.
What do you gain by automating it? This isn't an error that you'll experience in day-to-day successful operation. It's not an error that reoccurs after resolution because it can be removed for years with one action. This lock will only happen if a cert request is consistently broken for a really long time.
Fixing the underlying cause of the cert issuance failures requires human intervention anyway; that human can easily click the button. They also provide first-class support for bulk enablement.
The motivation for automating the button is extremely small.