Oh hi.
User Details
- User Since
- Dec 11 2018, 9:39 PM (251 w, 23 h)
- Availability
- Available
- IRC Nick
- sukhe
- LDAP User
- Unknown
- MediaWiki User
- SSingh (WMF)
Today
For posterity:
Thanks everyone for the discussion and feedback above! It seems two main points have come up:
Hi @DennisJJackson: Thanks for the question. We do plan to work on ECH and enable it for our sites, and we have had some discussions internally. There is no timeline yet, for a variety of reasons; limited browser support is one, though that has clearly changed over the past few weeks. There are other considerations as well, such as the lack of server-side options for turning it on, but we are hoping the DEfO project will provide the much-needed support for HAProxy, which is what we use for TLS termination.
Yesterday
We can and probably should have a backup static route for each of ns[01], but it can point to a single host instead of all three.
Mon, Oct 2
ntp.anycast.wmnet exists and the VIP 10.3.0.2/32 is being announced from all DNS hosts. The next step is to merge https://gerrit.wikimedia.org/r/961812 and attempt a reimage and ensure that it is fine.
Wed, Sep 27
Thanks for the task! Do we plan to consolidate https://netbox.wikimedia.org/ipam/prefixes/97/ip-addresses/? That is, if we remove the redundant backup static routes, 10.3.0.1/32 can be recdns, 10.3.0.2/32 will be syslog.anycast.wm and then 10.3.0.3/32 will be ntp.anycast.wmnet. I am asking this in the context of T347054 and the allocation of the IP for ntp.anycast.wmnet primarily.
For route 10.3.0.0/30 above, next-hop 208.80.153.77 is actually the old authdns host, so we are clearly not keeping the static routes updated. We should either update all of them or simply remove them so that we no longer have stale records.
sukhe@re0.cr2-codfw# show routing-options static
/* Anycast recdns - backup route */
route 10.3.0.0/30 {
    next-hop 208.80.153.77;
    readvertise;
    no-resolve;
}
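As a rough illustration (not part of the original task), the sketch below uses Python's standard `ipaddress` module to check which of the anycast addresses mentioned in this thread fall inside the 10.3.0.0/30 backup route. The address 10.3.0.4 and the helper name `covered_by` are hypothetical, added only to show an address outside the prefix.

```python
import ipaddress

# Prefix of the backup static route quoted in the router output.
backup_route = ipaddress.ip_network("10.3.0.0/30")

# Addresses discussed in the thread, plus 10.3.0.4 as a hypothetical
# example of an address that lies outside the /30.
candidates = ["10.3.0.1", "10.3.0.2", "10.3.0.3", "10.3.0.4"]


def covered_by(route, address):
    """Return True if the address lies inside the given route prefix."""
    return ipaddress.ip_address(address) in route


coverage = {addr: covered_by(backup_route, addr) for addr in candidates}

for addr, covered in coverage.items():
    print(addr, "covered" if covered else "not covered")
```

A /30 spans four addresses (10.3.0.0 through 10.3.0.3), so a single stale next-hop on that route affects every address allocated within it.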
Looking at 10.3.0.0/24 in Netbox:
Thanks for the feedback, everyone! I was waiting so that we could get most of the comments in before replying; responses inline:
Thu, Sep 21
Mon, Sep 18
Tue, Sep 12
Thanks @BCornwall, I think we can close this one as we have done some other reimages in eqsin and not observed this issue.
Thu, Sep 7
As per @BCornwall's comment above, it does seem like we have more issues with 9.2.1 that we should look into:
Wed, Sep 6
The above patch fixes the issue with CI passing. Once reviewed, we can merge and close this task.
Tue, Sep 5
https://meta.wikimedia.org/wiki/Wikimedia_DNS is a detailed introduction to the project, including an FAQ.
Not directly related to bookworm, but observed on the dnsdist 1.8.0 upgrade (part of bookworm): it results in a broken latency_bucket metric for the Wikimedia DNS hosts. Reported upstream in https://github.com/PowerDNS/pdns/issues/11239#issuecomment-1707046069.
The hosts were not provisioned in esams but I am fixing that by provisioning them so the durum ones should go away. Thanks for the task update!
Sep 1 2023
I have also tried upgrading the NIC firmware (from 21.40.21 to 22.31.6)
Aug 31 2023
Aug 30 2023
Aug 24 2023
Aug 22 2023
@taavi has rolled this out so this should be resolving shortly. Thanks for filing the task.
Aug 16 2023
Aug 15 2023
Thanks @Michael for filing this task. Restarting zuul helped a bit but only for a short period of time, after which the failed CI issue was back again.
Aug 14 2023
The requested records have been updated. Thanks!
Aug 9 2023
In discussion with @cmooney, we will be revisiting this task again when Traffic does some other authdns-related work, so removing it from the Traffic-Icebox.
Aug 8 2023
Aug 3 2023
Aug 2 2023
Jul 31 2023
Jul 28 2023
Jul 26 2023
@thcipriani and @odimitrijevic/@Milimetric this requires your approval for the deployment and analytics-privatedata groups respectively.
Hi @NHillard-WMF: This will require approval from your manager. Also adding @odimitrijevic and @Milimetric for approval on the Analytics side.
The link you shared is correct and no SSH keys are required. We will wait for @NHillard-WMF to confirm if he requires more access, otherwise this ticket itself is sufficient for SRE.
Jul 25 2023
||/ Name           Version         Architecture Description
+++-==============-===============-============-=================================
ii  pdns-recursor  4.8.4-1+wmf11u1 amd64        PowerDNS Recursor
Jul 21 2023
@Fabfur and I observed the same issue when trying to reimage lvs1016.
Jul 20 2023
pdns-recursor 4.8.4-1+wmf11u1 has been running in production on the following hosts for a while:
Jul 18 2023
Jul 11 2023
Affected hosts:
Jul 10 2023
Traffic has commissioned these boxes. Many thanks to dc-ops!
The hosts have been decommissioned and are ready for the hardware part.