noc.wikimedia.org with X-Wikimedia-Debug routes to mwdebug but host is not served there
Closed, Resolved · Public

Description

When a developer is debugging MediaWiki code in production, they cannot consult https://noc.wikimedia.org/. The domain is routed to mwdebug but not served there.

This is rather surprising/confusing. The workaround, once you realise that the problem is related to XWD (X-Wikimedia-Debug) being enabled, is to turn it off and on whenever you switch between the two browser tabs.

Event Timeline

Restricted Application added a subscriber: Aklapper.

This regressed last month as well and was fixed shortly after. Presumably something went wrong in the ATS routing logic again.

T233768: Enable mwdebug routes for noc.wikimedia.org

jbond triaged this task as Low priority. Feb 19 2020, 1:44 PM
Krinkle updated the task description.

Nothing is wrong in the routing logic. ATS is sending requests with X-Wikimedia-Debug: mwdebug1001.eqiad.wmnet to mwdebug1001.eqiad.wmnet as requested in T233768.

Krinkle renamed this task from "noc.wikimedia.org doesn't route to the docroot when WikimediaDebug browser extension is live" to "noc.wikimedia.org with X-Wikimedia-Debug routes to mwdebug but host is not served there". Apr 2 2020, 7:10 PM

Ah okay, so we're stuck between a rock and a hard place.

  • Before: We don't route XWD for NOC. This means NOC is usable while debugging MW in prod, but it means you can't debug NOC itself.
  • After: We route XWD for NOC to mwdebug, which doesn't serve that hostname. This means NOC is not accessible while debugging, and also isn't debuggable itself.
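To make the two states concrete, here is a minimal sketch of the situation as a toy model. It is not the actual ATS or Puppet configuration: the pool names, the `SERVED_VHOSTS` table, and the `route` and `is_served` helpers are all hypothetical and exist only for illustration.

```python
# Toy model of the "Before" and "After" routing states described above.
# All names here are hypothetical; this is not the production config.

# Which hostnames each backend pool is assumed to serve.
SERVED_VHOSTS = {
    "mwmaint": {"noc.wikimedia.org"},
    "mwdebug": {"en.wikipedia.org"},
}


def route(host: str, xwd_enabled: bool, route_noc_via_xwd: bool) -> str:
    """Return the backend pool the edge would pick for this request."""
    if host == "noc.wikimedia.org":
        # "After" state: NOC honours XWD and is sent to mwdebug.
        if xwd_enabled and route_noc_via_xwd:
            return "mwdebug"
        # "Before" state: NOC ignores XWD and stays on mwmaint.
        return "mwmaint"
    # Other hosts: XWD sends the request to the debug pool.
    return "mwdebug" if xwd_enabled else "appserver"


def is_served(host: str, backend: str) -> bool:
    """Return True if the chosen backend actually has a vhost for host."""
    return host in SERVED_VHOSTS.get(backend, set())


before = route("noc.wikimedia.org", xwd_enabled=True, route_noc_via_xwd=False)
after = route("noc.wikimedia.org", xwd_enabled=True, route_noc_via_xwd=True)
print(before, is_served("noc.wikimedia.org", before))
print(after, is_served("noc.wikimedia.org", after))
```

In this model the "After" state picks a backend that has no vhost for NOC, which is the breakage described in this task.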

If serving NOC from mwdebug is easy to enable, that'd be great. Otherwise, we may want to revert that change for now. Sorry for the confusion, I had not realised that the NOC vhost is only provisioned on the mwmaint app servers.

As it stands, NOC is broken with XWD. We need to choose one of these two, I think:

  1. Let NOC ignore XWD, like we do for other misc services such as Gerrit, Phab, Grafana, etc.
  2. Provision some or all of mwmaint on mwdebug, and actually support NOC there.

I think option 1 is simplest. Making NOC work on mwdebug might be easy to get started, but it could be counter-intuitive long-term since, AFAIK, mwdebug is not meant to be a place for testing other mwmaint responsibilities.
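As a rough illustration of option 1, the routing decision could look something like the sketch below. This is an assumption about the shape of the logic, not the real ATS implementation: `XWD_EXEMPT_HOSTS`, `pick_backend`, and the `default_backend` argument are hypothetical names, and the header value is taken as a bare backend hostname, as in the example quoted earlier in this task.

```python
from typing import Mapping

# Hypothetical exemption list: hosts for which X-Wikimedia-Debug is ignored.
XWD_EXEMPT_HOSTS = frozenset({"noc.wikimedia.org"})


def pick_backend(host: str, headers: Mapping[str, str], default_backend: str) -> str:
    """Pick a backend, honouring X-Wikimedia-Debug unless the host is exempt."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    debug_target = normalized.get("x-wikimedia-debug")
    if debug_target and host.lower() not in XWD_EXEMPT_HOSTS:
        return debug_target
    # Exempt host, or no debug header: use the normal backend.
    return default_backend


headers = {"X-Wikimedia-Debug": "mwdebug1001.eqiad.wmnet"}
print(pick_backend("noc.wikimedia.org", headers, "default-backend"))
print(pick_backend("en.wikipedia.org", headers, "default-backend"))
```

With an exemption like this, NOC stays reachable while XWD is enabled, at the cost of not being debuggable through mwdebug.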

I think this is just a mistake due to my confusion in T233768. Testing changes on NOC is easy since one can simply run scap pull on mwmaint1002, and if one is testing something that really can't be on production mwmaint, then one can also test NOC locally during development, which I've since made possible.

This is needed before we can release the new version of the WikimediaDebug extension, which reads the server list from noc.wikimedia.org. Otherwise, as soon as you enable a debug server, you can no longer fetch the server list.

Change 663156 had a related patch set uploaded (by Gilles; owner: Gilles):
[operations/puppet@production] Don’t apply X-Wikimedia-Debug routing to noc.wikimedia.org

https://gerrit.wikimedia.org/r/663156

Change 663156 merged by Effie Mouzeli:
[operations/puppet@production] Don’t apply X-Wikimedia-Debug routing to noc.wikimedia.org

https://gerrit.wikimedia.org/r/663156

Gilles claimed this task.
Gilles added a subscriber: jijiki.

Thanks @jijiki !