
Storage solution for cross-datacenter tokens
Closed, Resolved (Public)

Description

Various extensions store tokens that can be set and claimed from different data centers.

Extensions include CentralAuth, OAuth, and ConfirmEdit.

I'd suggest using mcrouter or redis+envoy (see T277183 for redis/envoy plans to replace nutcracker). Essentially, token reads/writes would just go to the master DC via envoy/mcrouter prefix routes (using $wmfActiveDatacenter).
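
For illustration, a minimal sketch of what such a backend definition could look like in wmf-config. The backend name and parameters here (e.g. the routingPrefix option on MemcachedPeclBagOStuff) are assumptions for the sketch, not the deployed configuration:

// Hypothetical $wgObjectCaches entry: reads/writes go through the local
// mcrouter proxy, which forwards them to the active DC based on the key prefix.
$wgObjectCaches['mcrouter-primary-dc'] = [
	'class' => 'MemcachedPeclBagOStuff',
	'servers' => [ '127.0.0.1:11213' ], // local mcrouter proxy
	// Prefix route: mcrouter sends these keys to the active DC's pool
	'routingPrefix' => "/$wmfActiveDatacenter/mw/",
];

// Token-storing extensions would then point at this backend, e.g.:
$wgCentralAuthTokenCacheType = 'mcrouter-primary-dc';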

Event Timeline

Change 683022 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/mediawiki-config@master] Add "mcrouter-master-dc" to $wgObjectCaches

https://gerrit.wikimedia.org/r/683022

Note that the backing store can be moved again later on, making it easy to use mcrouter first.

Change 683465 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/mediawiki-config@master] Set $wgCentralAuthTokenCacheType to mcrouter-master-dc

https://gerrit.wikimedia.org/r/683465

aaron triaged this task as Medium priority. (Jan 7 2022, 1:43 AM)

For future reference: I was uncertain whether tokens are likely to remain in memcached for one minute, given that they are single-use and receive no reads until claimed, and given my general impression that our memcached cluster is, by default and by design, always under eviction pressure (that is, we knowingly store more data, and for longer, than we know can fit).

So, I wrote a little script to try to empirically verify this.

<?php
/*
[01:04 UTC] krinkle at mwmaint1002.eqiad.wmnet in ~
$ mwscript eval.php --wiki aawiki
> require '/home/krinkle/krinkle-tmp.php';
Done setting 6806 keys
Sleeping for 45 seconds...
Checking...
Checked 6806 keys.
Done!
*/

use MediaWiki\MediaWikiServices;

class KrinkleTmp {
	const TIME_STORED = 60;
	const TIME_SETTING = 5;
	const TIME_WAIT = 45;
	const MAX_KEYS = 10000;

	public $stored = [];
	public $wanCache;

	public function __construct() {
		$this->wanCache = MediaWikiServices::getInstance()->getMainWANObjectCache();
	}

	public function add( int $i ): void {
		$key = $this->wanCache->makeGlobalKey( 'krinkle-tmp', 'num' . $i );
		$val = 'some_interesting_data_here' . $i;
		$this->stored[] = [
			'key' => $key,
			'val' => $val,
			'res' => $this->wanCache->set( $key, $val, self::TIME_STORED ),
			'time' => microtime( true ),
		];
	}

	public function check( array $entry ): void {
		$actual = $this->wanCache->get( $entry['key'] );
		if ( $actual !== $entry['val'] ) {
			$now = microtime( true );
			$this->report( $entry, $now, $actual );
		}
	}

	public function report( array $entry, $now, $actual ): void {
		print sprintf( "Key %s was %s after %s seconds (res: %s)\n",
			$entry['key'],
			( $actual === false ? 'missing' : json_encode( $actual ) ),
			round( $now - $entry['time'] ),
			json_encode( $entry['res'] )
		);
	}

	public function execute(): void {
		$t1 = microtime( true );
		$i = 0;
		while (
			( microtime( true ) - $t1 ) < self::TIME_SETTING &&
			$i < self::MAX_KEYS
		) {
			$this->add( $i );
			$i++;
		}

		print "Done setting $i keys\n";
		print "Sleeping for " . self::TIME_WAIT . " seconds...\n";
		// Clear the in-process cache so that get() actually hits memcached
		$this->wanCache->clearProcessCache();
		sleep( self::TIME_WAIT );

		print "Checking...\n";
		$i = 0;
		foreach ( $this->stored as $entry ) {
			$this->check( $entry );
			$i++;
		}
		print "Checked $i keys.\n";
		print "Done!\n";
	}
}

$tmp = new KrinkleTmp();
$tmp->execute();

The script stores several thousand keys within a 5-second window, waits 45 seconds, and then checks that they're all still there. I ran it several times and never saw any loss.

It's not solid evidence, but at least anecdotally we know the cluster can hold up without loss. My guess is that 1) we're not under as much pressure as I thought, so our regular evictions tend toward the tail end, e.g. cutting short TTLs from N days to N hours if unused, but we generally don't push out things stored less than a minute ago; and 2) memcached's LRU eviction logic is quite good at evicting older unused items first, before touching new ones.

It doesn't feel great long-term, and I think this might bite us in terms of how gutter pools are used, and more generally in that we don't operationally treat a partial failure or packet loss on memcached as causing hard failures for edits or logins. But for the relatively small amount of data that CentralAuth needs here, I guess it's good enough for the initial transition. We could migrate it to a small dedicated memcached cluster at some point, perhaps the same one that will replace the dc-local Redis. That cluster would hold dc-local data, small in size, with (generally) no eviction happening, so we could monitor evictions like we do for Redis and treat non-zero evictions as a sign that something is wrong.

Background info:
https://github.com/memcached/memcached/blob/1.6.15/doc/new_lru.txt#L10-L24
https://github.com/memcached/memcached/wiki/UserInternals#when-are-items-evicted

Change 683022 merged by jenkins-bot:

[operations/mediawiki-config@master] mc.php: Add "mcrouter-primary-dc" to $wgObjectCaches

https://gerrit.wikimedia.org/r/683022

Mentioned in SAL (#wikimedia-operations) [2022-06-22T01:13:38Z] <tstarling@deploy1002> Synchronized wmf-config/mc.php: g 807158 T278392 (duration: 03m 35s)

Benchmark of cross-DC memcached shows 1 RTT (33ms) latency for get and set, but 2 RTT for incr:

[0124][tstarling@mwmaint2002:~]$ mwscript mctest.php --wiki=enwiki --i 100 --cache=mcrouter-primary-dc
Warming up connections to cache servers...done
Single and batched operation profiling/test results:
127.0.0.1:11213
 add: 100/100 3300ms   set: 100/100 3300ms   get: 100/100 (3297ms)   delete: 100/100 (3294ms)	incr: 100/100 (6589ms)
 setMulti (IB): ✓ 3296ms   getMulti (IB): 100/100 37ms   changeTTLMulti (IB): ✓ 3298ms   deleteMulti (IB): ✓ 3295ms
 setMulti (DB): ✓ 154ms   getMulti (DB): 100/100 155ms   changeTTLMulti (DB): ✓ 3299ms   deleteMulti (DB): ✓ 34ms

Further testing showed that for a non-existent key (as used by mctest.php), mcrouter maps incr to incr+add. Incrementing a key that exists only requires 1 RTT.
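
To make that concrete, here is a hypothetical snippet (the key name and TTL constant are illustrative, not from the task) showing why the first incrWithInit() on a missing key costs two cross-DC round trips while subsequent calls cost one:

$cache = ObjectCache::getInstance( 'mcrouter-primary-dc' );
$key = $cache->makeKey( 'incr-rtt-demo' ); // hypothetical key
$cache->delete( $key );

$t = microtime( true );
// Miss: mcrouter issues incr, sees the key doesn't exist, then issues add (~2 RTT)
$cache->incrWithInit( $key, $cache::TTL_MINUTE );
printf( "first incr: %.0f ms\n", ( microtime( true ) - $t ) * 1000 );

$t = microtime( true );
// Hit: a single incr suffices (~1 RTT)
$cache->incrWithInit( $key, $cache::TTL_MINUTE );
printf( "second incr: %.0f ms\n", ( microtime( true ) - $t ) * 1000 );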

Change 809326 had a related patch set uploaded (by Krinkle; author: Aaron Schulz):

[operations/mediawiki-config@master] [MultiDC] Switch $wgCentralAuthTokenCacheType to mcrouter-primary-dc

https://gerrit.wikimedia.org/r/809326

Change 683465 merged by jenkins-bot:

[operations/mediawiki-config@master] Move $wgCentralAuthTokenCacheType from redis_local to mcrouter

https://gerrit.wikimedia.org/r/683465

Mentioned in SAL (#wikimedia-operations) [2022-06-29T04:37:25Z] <tstarling@deploy1002> Synchronized wmf-config/InitialiseSettings.php: wgCentralAuthTokenCacheType -> mcrouter T278392 (duration: 03m 44s)

Change 809326 merged by jenkins-bot:

[operations/mediawiki-config@master] [MultiDC] Switch $wgCentralAuthTokenCacheType to mcrouter-primary-dc

https://gerrit.wikimedia.org/r/809326

I tested cross-DC failover handling.

I used a loop of incrWithInit on mwmaint2002:

$c = ObjectCache::getInstance( 'mcrouter-primary-dc' );
$c->set( 'test', 0 );
while ( true ) {
	// Print a timestamp and the new counter value (false on failure)
	printf( "%-18f %s\n", microtime( true ), $c->incrWithInit( 'test', 86400 ) );
	sleep( 1 );
}

I dropped outbound TLS traffic (port 11214) but allowed unencrypted traffic, which is what's used within the same DC:

iptables -v -A OUTPUT -p tcp --dport 11214 -j DROP

I left it like that for about 3 minutes, then deleted the rule.

As soon as the rule was applied, incrWithInit() started returning false. I had somehow missed the fact that cross-DC routes, e.g. /eqiad/mw/ on a codfw server, do not use FailoverWithExptimeRoute; they are routed directly to the remote pool. There is no gutter pool.
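
In other words, a remote-pool outage surfaces directly to callers as a false return. A hedged sketch of the defensive handling this implies (not code from the task; the key name is hypothetical):

$cache = ObjectCache::getInstance( 'mcrouter-primary-dc' );
$value = $cache->incrWithInit( $cache->makeKey( 'some-token-counter' ), $cache::TTL_DAY );
if ( $value === false ) {
	// The remote pool was unreachable and there is no gutter fallback,
	// so treat this as a hard failure rather than assuming the write landed.
}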

After I deleted the iptables rule, it took another 30 seconds before it reconnected. The mcrouter log showed connection attempts every 60-90 seconds. That seems like a long time. So I investigated that and found that --probe-timeout-initial was raised from 3s to 60s for T255511. The rationale was that it's fine to use the gutter pool for a while. Unfortunately this is a global configuration variable, it can't be tuned down for routes that don't have a gutter pool.