Refactor public-facing DYNA scheme for primary project hostnames in our DNS
Closed, Resolved · Public

Description

This came up in a conversation with @faidon a couple of weeks ago. Our current scheme is a direct DYNA record for every public service hostname, which means separate A-records with TTL=600 (more on reducing that in T140365 , which I think this will change a little).

So a fresh lookup on any given service hostname always returns distinct RRs for caches, with examples like:

en.wikipedia.org 600 A 192.0.2.1 ; +ednsc-splitting in all such caches
fr.wikipedia.org 600 A 192.0.2.1 ; +ednsc-splitting in all such caches
fr.wiktionary.org 600 A 192.0.2.1 ; +ednsc-splitting in all such caches

For domainnames which are very popular, such as en.wikipedia.org, this scheme works out fine, and large DNS caches tend to keep it hot. However, for less-used languages and/or projects (we have something on the order of ~3000 such combinations possible), it's not so efficient, as they're not likely to be re-used from many DNS caches before they expire.

Using CNAMEs with longer TTLs pointing at a central short-TTL lb record may be more-efficient in the net for all of the other names. In such a scheme we might do something like:

en.wikipedia.org 86400 CNAME text-lb.wikiMedia.org ; globally-static, no ednsc-splitting
fr.wikipedia.org 86400 CNAME text-lb.wikiMedia.org ; globally-static, no ednsc-splitting
fr.wiktionary.org 86400 CNAME text-lb.wikiMedia.org ; globally-static, no ednsc-splitting
text-lb.wikiMedia.org 600 A 192.0.2.1 ; +ednsc-splitting in all such caches

or:
en.wikipedia.org 86400 CNAME text-lb.wikiPedia.org ; globally-static, no ednsc-splitting
fr.wikipedia.org 86400 CNAME text-lb.wikiPedia.org ; globally-static, no ednsc-splitting
fr.wiktionary.org 86400 CNAME text-lb.wikiPedia.org ; globally-static, no ednsc-splitting
text-lb.wikiPedia.org 600 A 192.0.2.1 ; +ednsc-splitting in all such caches

(which is potentially more-efficient for the most-popular-by-far project domain, at the cost of perhaps some pedants complaining about e.g. wikivoyage traffic mentioning the wikipedia.org domainname at this low level of technical matters).
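
To make the intuition above concrete, here is a rough back-of-envelope model in Python. It is a sketch under stated assumptions, not a measurement: a single shared cache, Poisson-distributed client lookups, a TTL timer that starts when the cache fetches a record, and no edns-client-subnet splitting. Under those assumptions a name looked up at rate λ with TTL T causes one upstream query every T + 1/λ seconds. The function names and the per-name lookup rates are invented for illustration.

```python
def upstream_qps(rate: float, ttl: float) -> float:
    """Upstream queries/sec one cache sends for a record that clients
    look up at `rate` lookups/sec and that has a TTL of `ttl` seconds.
    Assumes Poisson arrivals: one miss per cycle of ttl + 1/rate seconds."""
    if rate <= 0:
        return 0.0
    return 1.0 / (ttl + 1.0 / rate)


def direct_scheme(rates, a_ttl=600):
    # Every hostname has its own short-TTL address record.
    return sum(upstream_qps(r, a_ttl) for r in rates)


def cname_scheme(rates, cname_ttl=86400, a_ttl=600):
    # Every hostname is a long-TTL CNAME; all of them share one
    # short-TTL address record, which sees the combined lookup rate.
    cname_cost = sum(upstream_qps(r, cname_ttl) for r in rates)
    shared_cost = upstream_qps(sum(rates), a_ttl)
    return cname_cost + shared_cost


# Hypothetical lookup rates (per second) at one cache: one hot name
# plus a long tail of rarely-used language/project combinations.
rates = [5.0] + [1 / 3600] * 3000

direct = direct_scheme(rates)
cname = cname_scheme(rates)
print(f"direct: {direct:.4f} q/s, cname: {cname:.4f} q/s")
```

In this toy model the hot name costs about the same either way, and nearly all of the difference comes from the long tail, which matches the reasoning above. The real outcome depends on the cache-sharing and client-subnet questions listed below.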

There's some tricky tradeoffs to work through about different DNS cache scenarios:

  • private vs widely-shared caches
  • how this affects cache splits on edns-client-subnet
  • whether the gains for less-popular hostnames (among clients of a given cache) sufficiently offset the possible minor/rare regression for the more-popular
  • what kind of role regional patterns of hostname popularity play.
  • How we prioritize the p50 vs p99 kinds of tradeoffs here, across all global domains/traffic.

There are fewer tradeoffs with the second scheme where the canonical LB name lives in the most-popular project's domain.

[Note: There was a time in the distant past when we also used a CNAME-based scheme, but this scheme above differs from the old one]

Event Timeline

BBlack triaged this task as Medium priority. Oct 29 2018, 8:01 PM
BBlack created this task.
Restricted Application added a subscriber: Aklapper.

There's some complexities here that I've been stewing on for a while, mostly noted in the original description, but I like this general direction. Most of the concerns briefly mentioned earlier aren't actually a big deal in practice, but there remains a key issue around CNAME + edns-client-subnet, and the decision between putting the terminal DYNA record in either wikipedia.org or some other domain (preferably one not used by current canonicals at all, e.g. maybe this variant would be a good use for wikimedia.net?). Where I'm at now in thinking on these two paths:

  • A: The wikimedia.net -style solution (CNAME target for all projects' canonicals is a DYNA somewhere in wikimedia.net or another domain that's not canonical for projects):
    • Pros:
      • Because all the CNAMEs are cross-zone, there are no complex issues with edns-client-subnet.
    • Cons:
      • All lookups, including for wikipedia.org (special because of its outsized traffic level), will cross domain boundaries while following the CNAME, which possibly incurs some additional lookup time for cold caches (because they have to find the NS records for two different 2LDs in the process), but with long CNAME TTLs the impact should be pretty minimal (but definitely non-zero).
  • B: The wikipedia.org -style solution (CNAME target for all projects' canonicals is a DYNA somewhere in wikipedia.org):
    • Pros:
      • More efficient for everyone (wikipedia and others) under cold-cache conditions because of the NS delegation staying the same for wikipedia.org
    • Cons:
      • edns-client-subnet behavior is sub-optimal for wikipedia hostnames with the current DNS implementation, because the client-subnet-splitting will be attached to a singular chained response containing both of e.g. en.wikipedia.org CNAME text-lb.wikipedia.org and text-lb.wikipedia.org A ..., basically negating a lot of the intended benefit.

gdnsd-3.1.0 will have an experimental_no_chain option which eliminates the only downside of option B, making it the best option. However, we still need to deploy that code and test it in the wild to figure out if it causes any issues with real live recursors on the Internet (it shouldn't, but there is a reason the option is experimental!) before we can try a naming scheme that relies on it for optimality.

What I'd like to do at this point is try that experimentation first, and if it's a success take option B, and if it's a failure take option A.

Change 500731 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Turn on non-chaining CNAMEs experimental option

https://gerrit.wikimedia.org/r/500731

Change 500731 merged by BBlack:
[operations/dns@master] Turn on non-chaining CNAMEs experimental option

https://gerrit.wikimedia.org/r/500731

Change 501628 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Add CNAME-variant langlist template

https://gerrit.wikimedia.org/r/501628

Change 501629 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] wiktionary: test with zone-local CNAME->DYNA

https://gerrit.wikimedia.org/r/501629

Krinkle added a subscriber: Krinkle.

(Adding to our radar to look at navtiming/dns metrics impact after it is rolled out.)

We may try the wiktionary patch early next week. The goal with that test is just to see if we get any user complaints about wiktionary.org resolution being broken, so we'll leave it in place for a week or so if we don't get complaints, or revert if we do. Either way it will eventually get reverted, and if it's successful then we'll start patching for the "real" version where everything centralizes into a wikipedia.org hostname, so that's probably still at least a couple weeks out.

The wiktionary CNAME experiment is going out today, and I'm intending to keep it running for at least a week, assuming no issues arise.

For SRE, the important TL;DR is:

  • There's a small chance a small number of DNS recursors on the Internet won't understand what we're doing in this experiment and fail users trying to resolve wiki hostnames in wiktionary.org.
  • If we get a reliable wiktionary DNS breakage report (or multiple unreliable ones!), please revert https://gerrit.wikimedia.org/r/c/operations/dns/+/501629 and push the revert via the usual authdns-update mechanism. This should fix the issue in 10 minutes or less for affected users.

The deep part of this is that what's really being tested isn't the CNAME itself in that patch (which is pretty legitimate-looking). An experimental option is already enabled in our AuthDNS servers which can potentially confuse broken DNS recursors when looking up any zone-local CNAME chain served by our authservers. Without the wiktionary change, no major production wiki hostnames are actually relying on zone-local CNAMEs, so this change presents the experiment to a much broader range of users. The experimental behavior is that the authserver won't complete the CNAME chain on behalf of the recursor, requiring a cold recursor to ask us two separate questions to complete the chain and resolve the address. This is known to work with major open source resolvers and all the major public shared resolvers, but there's a small real chance there are real users behind some legacy/broken resolvers for which this all falls apart. An existing example of this behavioral change would be:

Old standard behavior, returning a complete CNAME chain within zone-local data in one query:

blblack@dallas:~$ dig @ns0.wikimedia.org icinga.wikimedia.org
[...]
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41740
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
[...]
;; ANSWER SECTION:
icinga.wikimedia.org.   300     IN      CNAME   icinga1001.wikimedia.org.
icinga1001.wikimedia.org. 3600  IN      A       208.80.154.84

New (current) behavior with the experimental CNAME stuff running, requiring two separate queries from a cold-cache state:

blblack@dallas:~$ dig @ns0.wikimedia.org icinga.wikimedia.org
[...]
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44324
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
[...]
;; ANSWER SECTION:
icinga.wikimedia.org.   300     IN      CNAME   icinga1001.wikimedia.org.

blblack@dallas:~$ dig @ns0.wikimedia.org icinga1001.wikimedia.org A
[...]
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26685
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
[...]
;; ANSWER SECTION:
icinga1001.wikimedia.org. 3600  IN      A       208.80.154.84
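
The difference from the recursor's point of view can be sketched with a toy in-memory model in Python. This is only an illustration: the `ZONE` dictionary, `authoritative_answer`, and `resolve` are hypothetical names, and the `chain` flag loosely mimics the behavior described above rather than gdnsd's actual implementation.

```python
# Toy zone data mirroring the dig output above.
ZONE = {
    "icinga.wikimedia.org.": ("CNAME", "icinga1001.wikimedia.org."),
    "icinga1001.wikimedia.org.": ("A", "208.80.154.84"),
}


def authoritative_answer(qname, chain):
    """Answer section for an A query, as a list of (name, type, data)."""
    answer = []
    while qname in ZONE:
        rtype, data = ZONE[qname]
        answer.append((qname, rtype, data))
        if rtype != "CNAME" or not chain:
            break  # a non-chaining server stops after the first CNAME
        qname = data
    return answer


def resolve(qname, chain):
    """Cold-cache recursor: keep asking until an address is found.
    Returns (address, number_of_queries_sent)."""
    queries = 0
    while True:
        queries += 1
        answer = authoritative_answer(qname, chain)
        if not answer:
            return None, queries
        _name, rtype, data = answer[-1]
        if rtype == "A":
            return data, queries
        qname = data  # last record was a CNAME: ask again for its target


print(resolve("icinga.wikimedia.org.", chain=True))
print(resolve("icinga.wikimedia.org.", chain=False))
```

A recursor that handles the non-chained case correctly ends up with the same address, just after one extra round trip. The risk being tested is a legacy resolver that never sends that second query.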

Change 501628 merged by BBlack:
[operations/dns@master] Add CNAME-variant langlist template

https://gerrit.wikimedia.org/r/501628

Change 501629 merged by BBlack:
[operations/dns@master] wiktionary: test with zone-local CNAME->DYNA

https://gerrit.wikimedia.org/r/501629

Change 504587 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Revert "wiktionary: test with zone-local CNAME->DYNA"

https://gerrit.wikimedia.org/r/504587

Change 504588 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] wikipedia.org: test with zone-local CNAME->DYNA

https://gerrit.wikimedia.org/r/504588

Change 504587 merged by BBlack:
[operations/dns@master] Revert "wiktionary: test with zone-local CNAME->DYNA"

https://gerrit.wikimedia.org/r/504587

Change 504588 merged by BBlack:
[operations/dns@master] wikipedia.org: test with zone-local CNAME->DYNA

https://gerrit.wikimedia.org/r/504588

Mentioned in SAL (#wikimedia-operations) [2019-04-18T16:20:42Z] <bblack> Experimental DNS-level changes deploying for wikipedia.org domain - if wikipedia.org DNS problems appear, revert https://gerrit.wikimedia.org/r/c/operations/dns/+/504588 - T208263

Change 505249 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] wikipedia.org CNAME experiment: 4H CNAMEs

https://gerrit.wikimedia.org/r/505249

Change 505249 merged by BBlack:
[operations/dns@master] wikipedia.org CNAME experiment: 4H CNAMEs

https://gerrit.wikimedia.org/r/505249

Status update on the experiments above:

  • No known reports or evidence of resolution failures so far, which is the big important thing!
  • The rationale for pointing at the domain root as opposed to a "text-lb" sort of name is that the domain roots are the one case we can't fix with a CNAME, and the wikipedia.org zone root is probably a common-enough lookup even though it's not a true wiki hostname.
  • With just wikipedia.org (and its symlinked very-low-traffic alias domains) under experiment, during the first day at DYNA and CNAME TTLs of 600 and 600, the query rate change was small enough as to be difficult to measure (gains roughly cancelling out losses) whereas with a 4H TTL on the CNAMEs the query rate reduction is ~23%. The other projects and less-popular langs form a very long tail, so it's hard to predict what the final query rate reduction will be, but it should be greater than what we're seeing now in any case.
  • Wikipedia.org test has only been live for ~2 days, will keep going for a full week before declaring this fairly successful.
  • For the final production version of this scheme, current thinking is:
    • Point all the project/language CNAMEs for all the canonical domains explicitly at wikipedia.org. with a TTL of 1 day.
    • Leave all the canonical zone roots as DYNA, because there's no other choice there (can't CNAME at zone root).
    • Because the wikipedia.org zone root has other non-address records (MX, CAA, TXT for SPF/DKIM, etc), there will be some minority cases that can't use this scheme: these are the cases where today we have a DYNA to text-addrs, but it sits alongside other records which don't match up with our common set at the wikipedia.org zone root. There are only 2 such examples in our current data (mechanically verified!):
      • phabricator.wikimedia.org - Has a unique SPF record, so must remain an independent DYNA entry.
      • benefactors.wikimedia.org - Same as above
  • The upload-addrs case is special and different from text-addrs: there are only two hostnames that use it and no zone roots, so we'll leave it as-is for upload.wikimedia.org and just make maps.wikimedia.org a 1-day CNAME to upload.wikimedia.org for some slight further gains.
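
The eligibility rules in the list above could be checked mechanically along these lines. This Python sketch is not the verification that was actually run, and it uses a simplified rule (a name qualifies only if it carries nothing but its DYNA record). The `records` layout, the `can_become_cname` helper, and the record sets shown are assumptions made up for illustration.

```python
def can_become_cname(name, zone_roots, records):
    """True if `name` may be replaced by a CNAME to the shared target.

    `records` maps hostname -> set of record types present at that name.
    """
    if name in zone_roots:
        return False  # a CNAME cannot live at a zone root
    # A CNAME cannot sit alongside other data, so the name must carry
    # nothing but its address (DYNA) record today.
    return records.get(name, set()) == {"DYNA"}


zone_roots = {"wikipedia.org", "wikimedia.org"}
records = {
    "wikipedia.org": {"DYNA", "MX", "TXT", "CAA"},
    "en.wikipedia.org": {"DYNA"},
    "phabricator.wikimedia.org": {"DYNA", "TXT"},  # has its own SPF record
}

for name in sorted(records):
    print(name, can_become_cname(name, zone_roots, records))
```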

I am not sure if it is influential, but I still have to report it.

I'm from mainland China. As we all know, the Chinese government uses a system to block websites that it doesn't want people to view, and its techniques include DNS pollution. The 'zh' and 'ja' subdomains of wikipedia.org were already known to be polluted, while 'en' and others were not affected. On April 23, someone in our community found that the entire wikipedia.org domain had been polluted, and then someone found this project.

Here are some ideas. Because that system's techniques also include TCP resets triggered by SNI, our community has an unstable bypass method: establish a TLS connection using a non-polluted domain name (such as 'en'), then reuse the cached TLS session to keep accessing sites under polluted domain names. If a root domain that contains both blocked and non-blocked subdomains is used this way, maybe the system decides that the whole root domain needs to be blocked, which is the status quo.

Since that system is a black box, all behavioral principles are based on guesswork. I am only reporting the phenomenon that our community has encountered. At least I have the technology to avoid these problems, such as using SNIproxy relay to access a completely blocked site, using a self-built domain name server to get non-polluted domain names. I hope this performance improvement makes more sense.

Change 505896 had a related patch set uploaded (by CDanis; owner: CDanis):
[operations/dns@master] Revert "wikipedia.org CNAME experiment: 4H CNAMEs"

https://gerrit.wikimedia.org/r/505896

Change 505896 merged by CDanis:
[operations/dns@master] Revert "wikipedia.org CNAME experiment: 4H CNAMEs"

https://gerrit.wikimedia.org/r/505896

Change 505901 had a related patch set uploaded (by CDanis; owner: CDanis):
[operations/dns@master] Revert "wikipedia.org: test with zone-local CNAME->DYNA"

https://gerrit.wikimedia.org/r/505901

Change 505901 merged by CDanis:
[operations/dns@master] Revert "wikipedia.org: test with zone-local CNAME->DYNA"

https://gerrit.wikimedia.org/r/505901

authdns-update complete as of ~20:33:56 UTC.

@Cwek Thank you very much for the detailed report! I've rolled back the experimental change to our DNS records, and by now, more than enough time should have passed for the TTLs to expire on the records that seemed to cause inaccessibility. Hopefully this will rectify things.

This comment was removed by Cwek.

@CDanis Thanks for your help, but it seems the side effect may already have taken hold. I extracted some subdomains of wikipedia.org and queried them on some well-known public DNS servers, both in mainland China and outside of China, and all of them are still polluted.

The deployment was extended on April 20th, but our community found problems on April 23, which may be too late.

I hope time can wash everything. Maybe. - -

@Cwek - Thanks for the reports! Have you tried other Wikimedia projects (e.g. wikiversity, wikiquote, wiktionary, etc) for SNI testing and/or DNS lookups from within? That may provide some level of insight as well. Currently we suspect the DNS changes here were not related to the new blockage, but obviously we'd like to gather all the data we can. The initial deployment date of the structural change was actually Apr 18th; the changes on the 20th merely extended the TTLs of that scheme from 10 minutes to 4 hours. Our own analytics seems to confirm that the dropoff of CN traffic was actually on the 23rd (same as when the community noticed).

@BBlack You can read this Help:如何访问维基百科

Only 'zh.wikinews.org' is polluted, with no reset; it seems the 'wikinews.org' domain is fine, except for 'zh'.
The other domains still seem to be fine so far.

This is what I wanted to say but hadn't said yet, because I had doubts.

BTW, my idea was that the pollution of a blocked domain name can be carried over to another domain name via a CNAME. So I did a test to check it.
'www.youtube.com' is a CNAME to 'youtube-ui.l.google.com'.
I queried 'youtube-ui.l.google.com' on some well-known public DNS servers in mainland China and the result was non-polluted. 'www.youtube.com' was also non-polluted. 'youtube.com', which has no CNAME, was polluted. Both 'www.youtube.com' and 'youtube.com' are polluted when queried on a foreign server.
It seems that my idea is not correct, although that could be explained by the names not sharing the same root domain name, which is very common with CDNs. If my idea is confirmed to be wrong, these blockades are nothing more than a reservation on the official paper.

Maybe it's not your problem. It is just purely coincidental.

Change 507321 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Revert "Add CNAME-variant langlist template"

https://gerrit.wikimedia.org/r/507321

Change 507321 merged by BBlack:
[operations/dns@master] Revert "Add CNAME-variant langlist template"

https://gerrit.wikimedia.org/r/507321

Change 507399 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Convert most DYNA into CNAME to wikipedia.org

https://gerrit.wikimedia.org/r/507399

Change 507400 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Change CNAME->DYNA TTLs from 1H to 1D

https://gerrit.wikimedia.org/r/507400

The current iteration of the proposed broadly-applied production version is in PS3 of the patch @ https://gerrit.wikimedia.org/r/c/operations/dns/+/507399/3 (and then a followup switch from 1H to 1D CNAME TTLs to go out shortly afterwards), will likely shoot for deployment early next week.

This version uses www.wikipedia.org as the focal point for the CNAMEs, and excludes the cases that can't be converted (some minor domainnames with email metadata records and the zone roots).

I was originally targeting the wikipedia.org zone root because it's shorter and also a common lookup, and the zone root can't be CNAME-converted anyways, which saves us one more significant case that lies outside the optimization.

However, after taking some sample data on address lookup prevalence (recorded in P8465), and considering the tricky issues around the zone root's metadata*, it seems like www is the better option (it's 8% vs the zone root's 3% in the lookup samples). Using an invented name (e.g. text-lb, or something shorter that doesn't conflict with langs/wikis) is an option too, but using a natural real lookup as the target seems more efficient vs an invented and otherwise-useless one.

* - TL;DR - we would have to account for the hopefully null effect of, and remember long into the future, that any further names mapped to that schema share the wikipedia.org zone root's various MX/TXT/CAA records, when originally most of these names had no such metadata

Change 507399 merged by BBlack:
[operations/dns@master] Convert most DYNA into 1H CNAME records

https://gerrit.wikimedia.org/r/507399

Can you please stop? login.wikimedia.org is now a CNAME to www.wikipedia.org, and the wikipedia.org domain is polluted.

Hi there, thanks for your work but it demonstrates unexpected issues in mainland China.

The *.wikipedia.org domain is poisoned inside mainland China, but the other domains are not. CNAMEing any domain to www.wikipedia.org makes all mainland China users receive poisoned answers for those domains when the recursive resolver is inside mainland China.

Change 509055 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Create dyna.wikimedia.org for text-addrs target

https://gerrit.wikimedia.org/r/509055

Change 509056 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Switch CNAME->DYNA scheme to dyna.wikimedia.org

https://gerrit.wikimedia.org/r/509056

Change 509057 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/dns@master] Undo "www.wikipedia.org" direct DYNA

https://gerrit.wikimedia.org/r/509057

Change 509055 merged by BBlack:
[operations/dns@master] Create dyna.wikimedia.org for text-addrs target

https://gerrit.wikimedia.org/r/509055

Change 509056 merged by BBlack:
[operations/dns@master] Switch CNAME->DYNA scheme to dyna.wikimedia.org

https://gerrit.wikimedia.org/r/509056

@Cwek @lilydjwg - Thanks for the reports! I apologize, this time around the fallout should've been predictable, given what we know from https://ooni.io/post/2019-china-wikipedia-blocking/ about the mechanisms, we just didn't think it through. I've pushed some changes above to move the CNAME target over to a new hostname dyna.wikimedia.org, which should fix things assuming CN's censorship tactics remain otherwise-stable. It will take up to roughly an hour for global DNS caches to catch up with the change and then we can continue investigations from there.

Our analytics seems to indicate the changes above had the intended effect in restoring normal levels of traffic from CN for affected projects:

Untitled drawing.jpg (720×960 px, 30 KB)

Change 509057 merged by BBlack:
[operations/dns@master] Undo "www.wikipedia.org" direct DYNA

https://gerrit.wikimedia.org/r/509057

@BBlack maybe a short blog post would be helpful to clarify the situation? I've seen a handful of news articles saying China now blocks all language editions of Wikipedia (all the ones I've seen cite this article) but from what I can tell from reading the comments above, this isn't the case.

@kostajh - The OONI article you linked ( https://ooni.torproject.org/post/2019-china-wikipedia-blocking/ ) is accurate, and it's outside of our scope (more with our Legal and Communications teams at a high level) to communicate publicly and officially on that situation, if at all (they are aware!).

There are two interactions between that situation and the low-level technical work in this ticket (which wasn't intended to be related), which make things confusing if you're following this ticket:

  • The initial switch of censorship policy on CN's side happened to occur while we were conducting a DNS experiment in this ticket earlier on (mid - late April). There was concern/fear that our experiment was accidentally causing the newly-broadened CN censorship, but this turned out to not be the case. To be clear: the previous CN policy was just to block zh.wikipedia.org, and their new/current policy is to block *.wikipedia.org as well as zh.wikinews.org, but not to block any of our other projects/languages (e.g. en.wikiversity.org or zh.wikiquote.org). This is what the OONI article documents, and nothing we did here is actually involved in that change.
  • Later, when we implemented the production version of these technical changes around May 8 in T208263#5168164 , we actually did inadvertently cause all the rest of our domains to be blocked (even en.wikiversity.org and zh.wikiquote.org, and importantly login.wikimedia.org which all the projects use for authentication), because a technical detail of our change clashed with the technical details of how CN is doing their censorship. This was predictable and avoidable, but we failed to realize the impact until it was reported. The followup change in T208263#5170044 eliminated the clashing technical details, which eliminated the accidentally-expanded censorship, which is what the recovery graph in T208263#5170846 is showing: we recovered normal CN access levels to all the other unintentionally-censored domains that were caused by the technical clash, but CN continues to intentionally censor *.wikipedia.org and zh.wikinews.org.
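
The clash described in the second bullet can be illustrated with a small Python sketch. It assumes, purely for illustration, that a poisoning resolver corrupts any lookup whose CNAME chain touches a name matching a blocked pattern. The `BLOCKED` patterns come from the description above, while the function names and the chain-following logic are hypothetical and say nothing certain about how the real system works.

```python
from fnmatch import fnmatch

# Patterns described above as intentionally blocked.
BLOCKED = ["*.wikipedia.org", "zh.wikinews.org"]


def chain_for(name, cnames):
    """Follow CNAMEs from `name`, returning every name visited."""
    chain = [name]
    while chain[-1] in cnames and cnames[chain[-1]] not in chain:
        chain.append(cnames[chain[-1]])
    return chain


def is_poisoned(name, cnames):
    # Poisoned if any name along the chain matches a blocked pattern.
    return any(
        fnmatch(n, pattern)
        for n in chain_for(name, cnames)
        for pattern in BLOCKED
    )


names = ["login.wikimedia.org", "en.wikiversity.org"]
via_www = {n: "www.wikipedia.org" for n in names}    # first production version
via_dyna = {n: "dyna.wikimedia.org" for n in names}  # followup fix

print([is_poisoned(n, via_www) for n in names])
print([is_poisoned(n, via_dyna) for n in names])
```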

Got it, thanks so much for this clarification, it's a very helpful summary of events.

Change 507400 merged by BBlack:
[operations/dns@master] Change CNAME->DYNA TTLs from 1H to 1D

https://gerrit.wikimedia.org/r/507400

BBlack claimed this task.

Scheme has been stable for ~1w now and seems to be working out fine. The net reduction in total authdns requests is ~32%. I suspect the drop in public requests for wiki hostnames is greater, as the total also includes all of our internal/infrastructure lookups, but either way we should be seeing far fewer DNS cache misses out there in the world, especially for longer-tail / less-popular project and language combinations.