Caching and TTL
How DNS responses are cached across multiple layers to speed up the internet
The migration that took 48 hours instead of 5 minutes
A team finishes migrating their application to a new cloud provider. The new servers are running, load balancers are healthy, and they update the DNS A record to point to the new IP address. Five minutes later, they check — the domain still resolves to the old IP. An hour later, same thing. They open a ticket with their DNS provider, who confirms the record is correct. Eight hours in, some users can reach the new servers but others cannot. The full cutover does not complete until nearly two days later.
Nothing was broken. The team simply did not lower their TTL before the migration. Their A record had a 24-hour TTL, meaning every resolver that had cached the old record would continue serving it for up to 24 hours after the change. Some browsers added their own caching on top. The result: a “5-minute change” that took 48 hours to fully take effect.
Understanding DNS caching and TTL is the difference between a smooth migration and a partial outage that drags on for two days.
Why caching matters
Without caching, every DNS lookup would require a full traversal of the DNS hierarchy — root server, TLD server, and authoritative server — adding 50-120 ms or more to every new connection. At the scale of the modern internet (Akamai alone handles approximately 7 trillion DNS requests per day), this would be unsustainable. The root and TLD servers would be overwhelmed, and every page load would feel noticeably slower.
Caching solves this by storing previous answers at multiple layers. A query that takes 80 ms the first time resolves in 1-5 ms on subsequent lookups. For popular domains, the cache hit rate at major recursive resolvers is extremely high — the vast majority of queries never leave the resolver.
The caching hierarchy
DNS responses are cached at four layers, from closest to the user to farthest:
```
+-------------------+
| 1. Browser Cache  |  Fastest: in-process memory
+-------------------+
          |
+-------------------+
| 2. OS Stub Cache  |  System-wide shared cache
+-------------------+
          |
+-------------------+
| 3. Router Cache   |  Local network level (optional)
+-------------------+
          |
+-------------------+
| 4. Recursive      |  ISP or public resolver
|    Resolver Cache |  Serves many clients; highest hit rate
+-------------------+
```

Each layer checks its own cache before forwarding the query to the next layer. A cache hit at any layer means the query stops there; no further network traffic is needed.
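The layered lookup can be sketched in a few lines of Python. This is a simplified illustration rather than how any real resolver is implemented: the names `resolve`, `Layer`, and `fake_lookup` are hypothetical, and TTL expiry is deliberately left out so that only the "check nearest cache first" behavior is visible.

```python
from typing import Callable, Dict, List, Tuple

Layer = Tuple[str, Dict[str, str]]  # (label, cache mapping name -> address)


def resolve(name: str, layers: List[Layer],
            full_lookup: Callable[[str], str]) -> Tuple[str, str]:
    """Return (address, source), checking the nearest cache layer first."""
    for i, (label, cache) in enumerate(layers):
        if name in cache:
            # Cache hit: the query stops here. Nearer layers store the answer too.
            for _, nearer in layers[:i]:
                nearer[name] = cache[name]
            return cache[name], label
    address = full_lookup(name)  # miss everywhere: walk the DNS hierarchy
    for _, cache in layers:
        cache[name] = address  # each layer stores the answer on the way back
    return address, "authoritative"


def fake_lookup(name: str) -> str:
    """Hypothetical stand-in for a real root -> TLD -> authoritative lookup."""
    return "93.184.216.34"


layers = [("browser", {}), ("os", {}), ("router", {}), ("resolver", {})]
print(resolve("example.com", layers, fake_lookup))  # first call misses every layer
print(resolve("example.com", layers, fake_lookup))  # second call hits the browser cache
```

The first call falls through all four layers; the second is answered by the nearest one, which is why repeat lookups are so much cheaper.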
Layer 1: Browser cache
Your browser maintains its own DNS cache, separate from the operating system. This is the fastest layer — lookups resolve in microseconds from in-process memory.
| Browser | Cache Size | Minimum TTL | Behavior |
|---|---|---|---|
| Chrome / Chromium | Up to 1,000 entries | 60 seconds | When using the OS resolver (getaddrinfo()), all entries are cached for exactly 60 seconds regardless of the authoritative TTL, because the OS API does not return TTL information. When using Chrome’s built-in DoH resolver, it respects the DNS TTL with a 60-second minimum. |
| Firefox | Up to 400 entries | 60 seconds | Caches for 60-120 seconds. After 60 seconds, serves the stale result for up to another 60 seconds while asynchronously refreshing it in the background. |
| Safari | Not publicly documented | Respects TTL | Follows the authoritative TTL value. |
| Edge | Same as Chrome | 60 seconds | Chromium-based; identical behavior to Chrome. |
A practical consequence: even if you set a TTL of 30 seconds on your DNS record, Chrome and Firefox will cache the result for at least 60 seconds. This browser-level floor means that sub-minute DNS changes are not fully effective for browser-based traffic.
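As a rough model of the Chrome row in the table above, the effective cache time can be written as a small function. The function name and the constant are illustrative; the numbers come from the table, not from Chrome's source code.

```python
BROWSER_FLOOR_SECONDS = 60  # floor described in the table above


def chrome_cache_seconds(record_ttl: int, via_os_resolver: bool = True) -> int:
    """Approximate how long Chrome keeps a DNS answer, per the table above."""
    if via_os_resolver:
        # getaddrinfo() returns no TTL, so every entry is kept for a fixed 60 s.
        return BROWSER_FLOOR_SECONDS
    # Built-in DoH resolver: the record TTL is respected, but never below the floor.
    return max(record_ttl, BROWSER_FLOOR_SECONDS)


print(chrome_cache_seconds(30))                           # 60
print(chrome_cache_seconds(3600, via_os_resolver=False))  # 3600
```

Under this model, a 30-second record TTL buys nothing for browser traffic: the answer is still held for a full minute.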
You can inspect Chrome’s DNS cache by visiting chrome://net-internals/#dns in the address bar.
Layer 2: Operating system cache
The OS maintains a system-wide DNS cache shared by all applications.
| OS | Resolver Service | Default Behavior |
|---|---|---|
| macOS | mDNSResponder | Caches DNS responses; respects TTL |
| Windows | DNS Client service | Caches up to 1,000 entries; max TTL honored: 86,400 seconds (1 day) |
| Linux | systemd-resolved (if configured) | Many Linux systems do not cache DNS at the OS level by default unless systemd-resolved or nscd is running |
To flush the OS cache when testing DNS changes:
```bash
# macOS
sudo dscacheutil -flushcache && sudo killall -HUP mDNSResponder

# Windows
ipconfig /flushdns

# Linux (systemd-resolved; older releases use: sudo systemd-resolve --flush-caches)
sudo resolvectl flush-caches
```

Layer 3: Router / local network cache
Home routers and enterprise DNS appliances often run a lightweight caching resolver such as dnsmasq. This layer is optional — its presence depends on the hardware and firmware configuration. When present, it caches responses for all devices on the local network, which is particularly useful when multiple devices query the same domains.
Layer 4: Recursive resolver cache
This is the most impactful cache layer. Recursive resolvers like Google Public DNS (8.8.8.8), Cloudflare (1.1.1.1), Quad9 (9.9.9.9), or your ISP’s resolver serve thousands to millions of clients. The probability that someone has recently queried for the same domain is very high, which produces excellent cache hit rates for popular domains.
| Public Resolver | IP Addresses | Provider |
|---|---|---|
| Google Public DNS | 8.8.8.8, 8.8.4.4 | Google |
| Cloudflare DNS | 1.1.1.1, 1.0.0.1 | Cloudflare |
| OpenDNS | 208.67.222.222, 208.67.220.220 | Cisco |
| Quad9 | 9.9.9.9, 149.112.112.112 | Quad9 Foundation |
TTL: Time to Live
TTL is a value in seconds, attached to every record in a DNS response, that tells each caching layer how long the record may be stored before it must be discarded and re-queried.
How TTL works, step by step
- The authoritative server returns a record with a TTL:

  ```
  example.com. 3600 IN A 93.184.216.34
  ```

- The recursive resolver caches the record and starts a countdown from 3,600 seconds.
- Any queries during those 3,600 seconds receive the cached answer immediately.
- The resolver passes the remaining TTL (not the original) to downstream clients. If 1,000 seconds have elapsed, the client receives a TTL of 2,600.
- After 3,600 seconds, the cached entry expires and the next query triggers a fresh lookup from the authoritative server.
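The countdown described in these steps can be sketched as a tiny cache that stores an expiry time and reports the remaining TTL. The class name `TtlCache` is hypothetical, and the clock is passed in explicitly (`now`, in seconds) so the behavior is easy to follow; a real resolver would read the system clock.

```python
from typing import Dict, Optional, Tuple


class TtlCache:
    """Minimal TTL cache: stores an expiry time, hands out the remaining TTL."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, int]] = {}  # name -> (value, expires_at)

    def put(self, name: str, value: str, ttl: int, now: int) -> None:
        self._entries[name] = (value, now + ttl)

    def get(self, name: str, now: int) -> Optional[Tuple[str, int]]:
        """Return (value, remaining_ttl), or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        remaining = expires_at - now
        if remaining <= 0:
            del self._entries[name]  # expired: the next query needs a fresh lookup
            return None
        return value, remaining


cache = TtlCache()
cache.put("example.com", "93.184.216.34", ttl=3600, now=0)
print(cache.get("example.com", now=1000))  # ('93.184.216.34', 2600)
print(cache.get("example.com", now=3600))  # None
```

The second lookup mirrors the 1,000-second example above: the downstream client sees 2,600, not the original 3,600.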
This countdown behavior is important: the TTL you see in a dig response is the remaining cache time at that resolver, not necessarily the authoritative value. To see the original TTL, query the authoritative server directly:
```bash
# Query the resolver (shows remaining TTL)
dig example.com
;; ANSWER SECTION:
example.com. 2847 IN A 93.184.216.34

# Query the authoritative server directly (shows original TTL)
dig example.com @a.iana-servers.net
;; ANSWER SECTION:
example.com. 86400 IN A 93.184.216.34
```

Common TTL values
| TTL | Seconds | Typical Use Case |
|---|---|---|
| 30 seconds | 30 | Minimum practical TTL; DNS migrations or active failover scenarios |
| 5 minutes | 300 | Frequently changing records; Cloudflare’s default for proxied records |
| 15 minutes | 900 | Dynamic environments; pre-migration preparation |
| 1 hour | 3,600 | Standard for most A/AAAA/CNAME records in stable production |
| 4 hours | 14,400 | Moderate stability; common default for many DNS providers |
| 12 hours | 43,200 | Stable records that change occasionally |
| 24 hours | 86,400 | Static records (NS, MX, TXT); industry-recommended default |
| 1 week | 604,800 | Maximum practical TTL; very stable infrastructure records |
TTL best practices
The right TTL depends on your operational needs. Higher TTLs improve cache efficiency and reduce load on authoritative servers. Lower TTLs allow faster changes but increase query volume.
| Scenario | Recommended TTL | Rationale |
|---|---|---|
| Normal operations (stable) | 3,600 - 86,400 s | Balance between caching efficiency and change responsiveness |
| Pre-migration (24-48h before change) | 300 - 600 s | Lower the TTL so caches expire quickly when the change happens |
| During migration | 60 - 300 s | Minimum practical TTL for rapid failover |
| Post-migration (after confirming success) | 3,600 - 86,400 s | Restore caching efficiency |
| CDN / load balancer records | 60 - 300 s | IPs change frequently for traffic management |
| MX records | 3,600 - 86,400 s | Mail servers rarely change; longer TTLs reduce lookup overhead |
| NS records | 3,600 - 86,400 s | Name servers are critical infrastructure; should be stable |
The migration pattern is worth emphasizing: always lower TTL before making changes, not at the same time. If your current TTL is 24 hours and you lower it simultaneously with changing the IP address, resolvers that cached the old record up to 24 hours ago will continue serving it until their cache expires. You need the low TTL to be in effect before the change so that caches hold only short-lived copies of the old value.
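To make the timing concrete, here is a small planning sketch. It assumes every resolver honors TTLs exactly and ignores browser-level floors; the function name `migration_plan` and its dictionary keys are illustrative, not part of any tool.

```python
from datetime import datetime, timedelta
from typing import Dict


def migration_plan(cutover: datetime, old_ttl: int, low_ttl: int = 300) -> Dict[str, datetime]:
    """Work backward from the cutover time to the latest safe TTL change."""
    return {
        # Copies cached under the old TTL need one full old-TTL period to expire.
        "lower_ttl_by": cutover - timedelta(seconds=old_ttl),
        "change_record_at": cutover,
        # After the change, stale copies live for at most the lowered TTL.
        "old_ip_gone_from_caches_by": cutover + timedelta(seconds=low_ttl),
    }


plan = migration_plan(datetime(2025, 3, 10, 12, 0), old_ttl=86400)
for step, when in plan.items():
    print(f"{step}: {when:%Y-%m-%d %H:%M}")
```

With a 24-hour TTL and a planned cutover at noon on 10 March, the TTL has to be lowered by noon on 9 March; the old address then disappears from well-behaved caches within five minutes of the change.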
Negative caching
Caching is not just for successful responses. When a DNS query results in NXDOMAIN (the name does not exist) or NODATA (the name exists but has no records of the queried type), resolvers cache this negative result too.
The TTL for a negative cache entry is the minimum of:
- The TTL on the SOA record returned in the authority section of the response
- The Minimum TTL field within the SOA record’s RDATA (the last field in the SOA record)
Negative caching prevents resolvers from repeatedly querying for names that do not exist, which would waste bandwidth and increase load on authoritative servers. However, it also means that if you create a new subdomain, resolvers that recently got an NXDOMAIN for that name will continue returning the negative result until the negative cache entry expires.
```bash
# Query a non-existent subdomain
dig nonexistent.example.com
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN
;; AUTHORITY SECTION:
example.com. 86400 IN SOA ns1.example.com. admin.example.com. (
                2025021301 3600 900 1209600 86400 )
```

Both the SOA record's own TTL and its minimum TTL field are 86,400 seconds in this example, so the NXDOMAIN result may be cached for up to 24 hours (a resolver that enforces a lower cap on negative caching will drop it sooner). If you then create nonexistent.example.com, some resolvers will not see it until their negative cache entry expires. This is why some operators use a lower SOA minimum TTL (e.g., 300 or 900 seconds) in zones where new subdomains are frequently created.
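The "minimum of the two values" rule can be expressed as a short helper. This sketch assumes the SOA record has been joined onto a single line in the same presentation format dig prints; the function names are hypothetical, and resolver-side caps on negative caching are not modeled.

```python
from typing import Tuple


def parse_soa_ttls(soa_line: str) -> Tuple[int, int]:
    """Return (record_ttl, minimum_field) from a one-line SOA record."""
    fields = soa_line.replace("(", " ").replace(")", " ").split()
    record_ttl = int(fields[1])     # TTL on the SOA record itself
    minimum_field = int(fields[-1])  # last RDATA field: the minimum TTL
    return record_ttl, minimum_field


def negative_cache_ttl(soa_line: str) -> int:
    """Negative answers are cached for the smaller of the two values."""
    return min(parse_soa_ttls(soa_line))


soa = ("example.com. 86400 IN SOA ns1.example.com. admin.example.com. "
       "( 2025021301 3600 900 1209600 86400 )")
print(negative_cache_ttl(soa))  # 86400
```

Lowering only the last SOA field to 300 would cut the negative cache time to five minutes, even though the SOA record's own TTL stays at 24 hours.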
“DNS propagation” is actually cache expiration
The term “DNS propagation” is one of the most persistent misnomers in web development. DNS changes do not actively propagate, push, or replicate across the internet. The authoritative server updates immediately. What takes time is cache expiration.
Here is what actually happens when you change a DNS record:
- You update the A record on the authoritative server. The change is live instantly on that server.
- Recursive resolvers worldwide still have the old record cached from previous lookups.
- Each resolver’s cached copy has a TTL countdown ticking. When it reaches zero, the resolver discards the old record.
- The next query after expiration triggers a fresh lookup, which fetches the new record from the authoritative server.
- After at most one full TTL period (the worst case), every resolver that honors the TTL has the new data.
There is no “propagation network.” There is no replication protocol between resolvers. Each resolver independently expires its cache and fetches fresh data on the next query.
This is why TTL management matters so much. A 24-hour TTL means a worst-case wait of 24 hours. A 5-minute TTL means a worst-case wait of 5 minutes. And because different resolvers cached the record at different times, you will see a rolling transition where some users get the new IP and others still get the old one — the resolver they happen to use determines when they see the change.
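The rolling transition can be illustrated with a small calculation. It assumes each resolver honors the TTL exactly and re-queries as soon as its copy expires; the function names and the four sample resolvers are made up for the example.

```python
from typing import List


def seconds_until_update(ttl: int, cached_seconds_before_change: int) -> int:
    """How long after the change a resolver keeps serving the old record."""
    return max(ttl - cached_seconds_before_change, 0)


def fraction_updated(ttl: int, cache_ages: List[int], elapsed: int) -> float:
    """Share of resolvers serving the new record `elapsed` seconds after the change."""
    updated = sum(1 for age in cache_ages
                  if elapsed >= seconds_until_update(ttl, age))
    return updated / len(cache_ages)


# Four hypothetical resolvers that cached the old record 0, 6, 12 and 18 hours
# before the change, with a 24-hour TTL.
ages = [0, 21600, 43200, 64800]
for hours in (6, 12, 24):
    print(hours, fraction_updated(86400, ages, hours * 3600))
```

Six hours after the change only the resolver with the oldest cached copy has switched; the one that cached the old record just before the change needs the full 24 hours.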
Browser-specific caching quirks
The browser caching layer introduces behavior that surprises many developers:
Chrome’s 60-second floor. When Chrome uses the OS resolver (which is the default unless DNS-over-HTTPS is enabled), it caches every DNS result for exactly 60 seconds regardless of the authoritative TTL. This is because the getaddrinfo() system call does not return TTL information, so Chrome defaults to 60 seconds. Even a TTL of 0 results in 60 seconds of caching.
Firefox’s stale-while-revalidate. Firefox caches DNS results for 60 seconds, then continues serving the stale result for up to another 60 seconds while it refreshes the record asynchronously in the background. This means a DNS change can take up to 120 seconds to take effect in Firefox, even with a very low TTL.
Hard refresh does not clear DNS cache. Pressing Ctrl+Shift+R (hard refresh) clears the browser’s HTTP cache but does not clear the DNS cache. To clear Chrome’s DNS cache, you must visit chrome://net-internals/#dns and click “Clear host cache.”
Incognito/private windows share the DNS cache. Opening an incognito window does not give you a fresh DNS cache. The browser’s DNS cache is process-wide, not tied to browsing sessions.
Checking TTL and cache status
A few useful commands for diagnosing caching behavior:
```bash
# Check the TTL of a record (remaining time at your resolver)
dig example.com

# Check the original TTL from the authoritative server
dig example.com @a.iana-servers.net

# See which name servers are authoritative
dig NS example.com

# Query a specific public resolver to compare
dig example.com @8.8.8.8
dig example.com @1.1.1.1
```

If dig @8.8.8.8 returns the new IP but dig @1.1.1.1 returns the old one, it means Google's resolver has already expired its cache and fetched the update, while Cloudflare's resolver still has the old record cached. Both are correct; they simply cached the record at different times.
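When comparing several resolvers by hand gets tedious, the answer lines can be parsed and grouped. The sketch below works on answer-section lines that have already been captured (for example by copying dig output); it does not run dig itself. The function names and the sample address 203.0.113.10 are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_answer_line(line: str) -> Tuple[str, int, str, str]:
    """Split a dig answer line into (name, remaining_ttl, record_type, value)."""
    name, ttl, _record_class, record_type, value = line.split(None, 4)
    return name, int(ttl), record_type, value.strip()


def group_resolvers_by_answer(answers: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each returned value to the list of resolvers that returned it."""
    grouped: Dict[str, List[str]] = {}
    for resolver, line in answers.items():
        _, _, _, value = parse_answer_line(line)
        grouped.setdefault(value, []).append(resolver)
    return grouped


# Hypothetical captured answers: one resolver has the new address, one the old.
captured = {
    "8.8.8.8": "example.com. 300 IN A 203.0.113.10",
    "1.1.1.1": "example.com. 1200 IN A 93.184.216.34",
}
print(group_resolvers_by_answer(captured))
```

More than one key in the result means the cutover is still in its rolling phase; the remaining TTL on the old answer shows roughly how long that resolver will keep serving it.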
Summary: the TTL decision framework
When setting TTL values, consider three factors:
How often does this record change? Records that rarely change (NS, MX for established mail servers) benefit from high TTLs (12-24 hours). Records that change regularly (CDN endpoints, load balancer IPs) need low TTLs (1-5 minutes).
What is the cost of serving stale data? If a stale record means a brief slowdown, longer TTLs are acceptable. If it means a complete outage (e.g., post-migration, the old IP is no longer serving traffic), you need short TTLs.
What is the query volume? High-traffic domains with low TTLs generate enormous query volume on authoritative servers. Balance speed-of-change against infrastructure cost.
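A back-of-the-envelope estimate shows how strongly TTL drives authoritative load. The model is deliberately crude: it assumes each busy resolver re-fetches the record as soon as its copy expires, so the result is an upper bound per record. The function name and the resolver count are illustrative assumptions.

```python
import math

SECONDS_PER_DAY = 86400


def max_authoritative_queries_per_day(resolver_count: int, ttl_seconds: int) -> int:
    """Upper bound on daily re-fetches of one record across all resolvers."""
    refreshes_per_resolver = math.ceil(SECONDS_PER_DAY / ttl_seconds)
    return resolver_count * refreshes_per_resolver


# Hypothetical domain queried through 10,000 distinct resolvers.
for ttl in (60, 300, 3600, 86400):
    print(ttl, max_authoritative_queries_per_day(10_000, ttl))
```

Under these assumptions, moving from a 1-hour TTL to a 60-second TTL multiplies the worst-case query volume by 60.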
And always remember the migration rule: lower the TTL first, wait for one full cycle of the old TTL so that every cached copy carries the new, shorter value, then make the change.