Enable "/*/mw-with-onhost-tier/" route for MediaWiki where safe
Open, Medium, Public

Description

As discussed previously, there are keys that we would like to serve through the WarmUpRoute, to relieve some of the network traffic between our app servers and our memcached cluster.

SRE has created and deployed an mcrouter route called /*/mw-with-onhost-tier, which is currently configured exactly like the /*/mw route. The configuration that makes /*/mw-with-onhost-tier use the WarmUpRoute can be enabled per MediaWiki server by setting profile::mediawiki::mcrouter_wancache::use_onhost_memcached to true.
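As a rough mental model of what an on-host tier does for reads, the sketch below checks a small local cache first and falls back to the shared cluster on a miss, keeping a local copy for a bounded time. This is a simplified illustration only, not mcrouter's WarmUpRoute implementation or the deployed configuration; the class name, the dict-backed stores, the write behaviour, and the 10-second TTL are all assumptions made for the example.

```python
import time


class OnHostReadThrough:
    """Toy model: a local (on-host) tier in front of a shared cluster tier."""

    def __init__(self, cluster, local_ttl=10.0, clock=time.monotonic):
        self.cluster = cluster        # shared tier, modelled as a plain dict
        self.local = {}               # on-host tier: key -> (value, expiry)
        self.local_ttl = local_ttl    # hypothetical lifetime cap for local copies
        self.clock = clock            # injectable clock, so tests can control time

    def get(self, key):
        entry = self.local.get(key)
        if entry is not None:
            value, expiry = entry
            if self.clock() < expiry:
                return value          # served locally, no network hop
            del self.local[key]       # local copy has expired
        value = self.cluster.get(key)
        if value is not None:
            # Keep a bounded-lifetime copy of the hit in the local tier.
            self.local[key] = (value, self.clock() + self.local_ttl)
        return value

    def set(self, key, value):
        # In this model writes only reach the shared tier;
        # local copies are never purged, they just age out.
        self.cluster[key] = value
```

In this model a host can keep serving an old value until its local copy expires, which is why the route is only suitable for keys that tolerate brief staleness.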

We now need MediaWiki to add the /*/mw-with-onhost-tier prefix to the keys we want to be read from the onhost memcached, for instance the WANObjectCache keys. Once this is deployed to MediaWiki, we can start gradually rolling it out to all clusters and testing it in production.
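To make the idea of prefixing keys concrete, here is a minimal sketch of selecting a route by prepending its prefix to the key sent to mcrouter. The two route names come from this task; the helper name and the example key are hypothetical, and this is not how MediaWiki implements it.

```python
DEFAULT_ROUTE = "/*/mw/"
ONHOST_ROUTE = "/*/mw-with-onhost-tier/"


def route_key(key, use_onhost):
    """Return the key with the routing prefix for the chosen route prepended."""
    prefix = ONHOST_ROUTE if use_onhost else DEFAULT_ROUTE
    return prefix + key


# Hypothetical key name, for illustration only.
print(route_key("examplewiki:pcache:123", use_onhost=True))
```

Choosing the route per key keeps the opt-in on the MediaWiki side, so keys that are not safe for the on-host tier stay on the default route.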

Event Timeline

Gilles moved this task from Inbox to Doing (old) on the Performance-Team board.
aaron triaged this task as Medium priority. Oct 5 2020, 6:36 PM
Aklapper renamed this task from MediaWiki to route spefic keys to /*/mw-with-onhost-tier/ to MediaWiki to route specific keys to /*/mw-with-onhost-tier/. Oct 5 2020, 7:36 PM

Change 636094 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[operations/mediawiki-config@master] Add "mcrouter-with-onhost-tier" entry to $wgObjectCaches

https://gerrit.wikimedia.org/r/636094

Change 636095 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[operations/mediawiki-config@master] Switch parser cache to using "mcrouter-with-onhost-tier"

https://gerrit.wikimedia.org/r/636095

@aaron is there a timeline as to when those patches will be merged?

I was hoping it would get some CR. Is SRE OK with the patch? I assume a standard mwdebug => test => everything deploy will suffice. Does someone from SRE want to do the final sanity testing before the full scap sync? I could do some ParserCache testing in eval.php and by browsing around.

@aaron mwdebug1001 has the mcrouter configuration we want to roll out when we merge the mediawiki patches, so you are free to test it there. Meanwhile I will check with serviceops to get some CRs for your patches. Thank you!

@aaron We will merge your patches on Monday and enable onhost memcached on an API canary host :)

Change 636094 merged by jenkins-bot:
[operations/mediawiki-config@master] Add "mcrouter-with-onhost-tier" entry to $wgObjectCaches

https://gerrit.wikimedia.org/r/636094

Mentioned in SAL (#wikimedia-operations) [2020-11-10T12:23:52Z] <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/mc.php: Config: [[gerrit:636094|Add "mcrouter-with-onhost-tier" entry to $wgObjectCaches (T264604)]] (duration: 00m 57s)

Change 636095 merged by jenkins-bot:
[operations/mediawiki-config@master] Switch parser cache to using "mcrouter-with-onhost-tier"

https://gerrit.wikimedia.org/r/636095

Mentioned in SAL (#wikimedia-operations) [2020-11-10T12:31:10Z] <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:636095|Switch parser cache to using "mcrouter-with-onhost-tier" (T264604)]] (duration: 00m 57s)

@aaron @Krinkle We have successfully rolled this out on all app and api servers, and I am planning to continue with the jobrunners and parsoid.

Please let me know how you would like to continue this work and how I can help!

[Graphs: memcached and memcached slabs for the api servers; memcached and memcached slabs for the app servers]
Thank you!

Can you point me to where (in Puppet?) the TTL enforcement and route/command filter for on-host reside? I'd like to link it from the docs and maybe edit it to also link back to the MW docs, so that it doesn't get accidentally changed in ways that might (subtly) break compatibility.

@Krinkle would it help if I pasted the generated config?

The puppet part is here: mcrouter_wancache.pp#L114

Krinkle renamed this task from MediaWiki to route specific keys to /*/mw-with-onhost-tier/ to Enable "/*/mw-with-onhost-tier/" route for MediaWiki where safe. Nov 25 2020, 1:01 AM

Change 643384 had a related patch set uploaded (by Krinkle; owner: Aaron Schulz):
[operations/mediawiki-config@master] Make "mcrouter-with-onhost-tier" cache use "$wmfDatacenter"

https://gerrit.wikimedia.org/r/643384

Change 643384 merged by jenkins-bot:
[operations/mediawiki-config@master] Make "mcrouter-with-onhost-tier" cache use "$wmfDatacenter"

https://gerrit.wikimedia.org/r/643384

@Krinkle @aaron do you think we are ready to move this forward?

Beyond the parser cache? Anything else is probably blocked on https://phabricator.wikimedia.org/T252564 (for simplicity).

@aaron now that T252564 has been unblocked, I think we should proceed with moving this task forward once I finish with T273115.

Next steps:

  1. Update mediawiki/WANObjectCache to implement a new config option that lets you route "value" keys differently with a different mcrouter prefix.
  2. Update mediawiki/WANObjectCache to store tombstones as sister keys (otherwise they will get trapped in the on-host tier as values).

I'll start with the first one of these.

Hm.. so the above poses a bit of a paradox. Implementing the onHostRoutingPrefix option is dependent on tombstones having their own sister-key. Implementing tombstones as their own sister-key, assuming we want that to be a feature flag for efficiency reasons, seems natural to do based on the onHostRoutingPrefix option.

So... I guess we could do it as one change then, I'll go with that for now.
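The dependency between the two steps can be sketched as follows: once tombstones live under their own sister key, only the "value" key needs the on-host prefix, and the tombstone stays on the default route where it cannot be trapped in a local tier. The key layout, function name, and kind labels below are hypothetical and are not WANObjectCache's actual key format.

```python
DEFAULT_PREFIX = "/*/mw/"
ONHOST_PREFIX = "/*/mw-with-onhost-tier/"

KINDS = ("value", "tombstone")


def sister_key(base_key, kind, onhost_enabled):
    """Build the stored key for one facet of base_key.

    Only "value" keys are ever sent through the on-host route;
    tombstones always use the default route.
    """
    if kind not in KINDS:
        raise ValueError(f"unknown sister key kind: {kind!r}")
    use_onhost = onhost_enabled and kind == "value"
    prefix = ONHOST_PREFIX if use_onhost else DEFAULT_PREFIX
    # Hypothetical layout: "<prefix><base key>|<kind>"
    return f"{prefix}{base_key}|{kind}"
```

With a single shared key for both values and tombstones, the routing decision could not depend on what is being stored, which is why the two changes are natural to ship together.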

Change 672514 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] objectcache: Implement 'onhostRoutingPrefix' option in WANObjectCache

https://gerrit.wikimedia.org/r/672514

Change 672514 merged by jenkins-bot:

[mediawiki/core@master] objectcache: Implement 'onHostRoutingPrefix' option in WANObjectCache

https://gerrit.wikimedia.org/r/672514

Change 682698 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/core@REL1_36] objectcache: Implement 'onHostRoutingPrefix' option in WANObjectCache

https://gerrit.wikimedia.org/r/682698

Change 682698 merged by jenkins-bot:

[mediawiki/core@REL1_36] objectcache: Implement 'onHostRoutingPrefix' option in WANObjectCache

https://gerrit.wikimedia.org/r/682698