Maybe this weekend I'll finally get the energy up to just do it.
> such that all online caches get updated
There's no such thing. Apart from millions of dedicated caching servers, each end device has its own cache. You can't invalidate DNS entries at that scope.
It's "common" to lower a TTL in preparation for a change to an existing RR, but you need to make sure you lower it at least as long as the current TTL prior to the change. Keeping the TTL low after the change isn't beneficial unless you're planning for the possibility of reverting the change.
A low TTL on a new record will not speed propagation. Resolvers either have the new record cached or they don't. If it's cached, the TTL doesn't matter because the resolver already has the record (it has propagated). If it isn't cached, the resolver doesn't know the TTL yet, so it doesn't matter whether it's 1 second or 1 month.
And a similar version of the same blog post appeared on a personal blog in 2019: https://news.ycombinator.com/item?id=21436448 (thanks to ChrisArchitect for noting this in the only comment on a 2024 repost).
Of course, as internet speeds increase and resources become cheaper to abuse, people lose sight of the downstream impacts of impatience and poor planning.
Failover is different and more of a concern, especially if the client doesn't respect multiple returned IPs.
And then if you're dealing with browsers, they're not the best at trying everything, or they may wait a long time before trying another host if the first is non-responsive. For browsers and rotations that really do change, I like a 60-second TTL. If the rotation is stable most of the time, 15 minutes, cranked down before intentional changes.
If you've got a smart client that will fetch all the answers and try them sensibly, then 5-60 minutes seems reasonable, depending on how often you make big changes.
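For what "try them sensibly" looks like, here's a minimal sketch in plain stdlib Python; the host and port are hypothetical placeholders, not anything from the thread:

```python
import socket

def connect_any(host: str, port: int, timeout: float) -> socket.socket:
    """Try every resolved address in turn; return the first that connects."""
    last_err = None
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            if sock is not None:
                sock.close()
            last_err = err  # note the failure, move on to the next address
    raise last_err or OSError(f"no addresses for {host!r}")

# Hypothetical usage: a dead first IP costs one short timeout, not the
# whole connection attempt.
conn = connect_any("service.example.com", 443, timeout=3.0)
```

The stdlib's `socket.create_connection` already implements essentially this loop; the point is just that the client falls through unresponsive IPs quickly instead of hanging on the first one.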
All that said, some caches will keep your records basically forever, and there's not much you can do about that. Just gotta live with it.
And a BGP failure is a good example too. It doesn't matter how resilient the failover mechanisms for one IP are if the routing tables are wrong.
Agreed about some providers enforcing a larger minimum TTL, though. DNS propagation is wildly inconsistent.
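You can see the inconsistency directly by asking a few public resolvers for the same record and comparing how much TTL each cache has left (again a sketch assuming dnspython; the domain is a placeholder):

```python
# Compare the remaining TTL for one record across a few public resolvers.
# Each resolver counts its cached TTL down independently.
import dns.exception
import dns.resolver

RESOLVERS = {"Google": "8.8.8.8", "Cloudflare": "1.1.1.1", "Quad9": "9.9.9.9"}
DOMAIN = "example.com"  # hypothetical record to inspect

for name, ip in RESOLVERS.items():
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [ip]
    try:
        answer = resolver.resolve(DOMAIN, "A")
        # The reported TTL is the time left in *that* resolver's cache.
        print(f"{name}: {answer.rrset.ttl}s remaining")
    except dns.exception.DNSException as err:
        print(f"{name}: lookup failed ({err})")
```

The big public resolvers are also anycast with many independent cache nodes behind one IP, so even repeated queries to the same address can return different countdowns.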
Relatively simple inside a network range you control, but I have no idea how that works across different networks in geographically redundant setups.
Seems like you'd be trying to work against the basic design principles of Internet routing at that point.