Storage solution for cross-datacenter tokens
Open, Medium, Public

Description

Various extensions store tokens that can be set and claimed from different data centers.

Extensions include CentralAuth, OAuth, and ConfirmEdit.

I'd suggest using mcrouter or redis+envoy (see T277183 for redis/envoy plans to replace nutcracker). Essentially, token reads/writes would just go to the master DC via envoy/mcrouter prefix routes (using $wmfActiveDatacenter).
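
A minimal sketch of what such a prefix route could look like in wmf-config, assuming the `routingPrefix` option of MediaWiki's memcached BagOStuff classes and a `/<dc>/mw/` route layout. The base `mcrouter` entry, the `eqiad` value, and the route layout are assumptions for illustration only. The cache name and `$wgCentralAuthTokenCacheType` come from the patches on this task, which may differ in detail.

```php
<?php
// Sketch only, not the merged patch. Values marked "assumed" are illustrative.
$wmfActiveDatacenter = 'eqiad'; // assumed value; the task only names the variable

// Assumed dc-local base entry, standing in for the existing mcrouter config.
$wgObjectCaches['mcrouter'] = [
    'class' => 'MemcachedPeclBagOStuff',
    'servers' => [ '127.0.0.1:11213' ],
];

// Same backend, but every key carries a route prefix that mcrouter
// resolves to the pool in the primary DC. The "/<dc>/mw/" layout is assumed.
$wgObjectCaches['mcrouter-primary-dc'] = array_merge(
    $wgObjectCaches['mcrouter'],
    [ 'routingPrefix' => "/$wmfActiveDatacenter/mw/" ]
);

// Point CentralAuth's token store at the new entry.
$wgCentralAuthTokenCacheType = 'mcrouter-primary-dc';
```

Because only the cache type name is referenced by the extension, the backing store behind that name can be swapped later without touching CentralAuth itself.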

Event Timeline

Change 683022 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/mediawiki-config@master] Add "mcrouter-master-dc" to $wgObjectCaches

https://gerrit.wikimedia.org/r/683022

Note that the backing store can be moved again later on, making it easy to use mcrouter first.

Change 683465 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/mediawiki-config@master] Set $wgCentralAuthTokenCacheType to mcrouter-master-dc

https://gerrit.wikimedia.org/r/683465

aaron triaged this task as Medium priority. Jan 7 2022, 1:43 AM

For future reference: I was uncertain whether tokens are likely to remain in mcrouter for 1 minute, given that they are single-use and receive no demand, and given my general impression that our memcached cluster is, by default and by design, always under eviction pressure (that is, we knowingly store more, and for longer, than we know can fit).

So, I wrote a little script to try to empirically verify this.

<?php
/*
[01:04 UTC] krinkle at mwmaint1002.eqiad.wmnet in ~
$ mwscript eval.php --wiki aawiki
> require '/home/krinkle/krinkle-tmp.php';
Done setting 6806 keys
Sleeping for 45 seconds...
Checking...
Checked 6806 keys.
Done!
*/

use MediaWiki\MediaWikiServices;

class KrinkleTmp {
    const TIME_STORED = 60;   // TTL of each key, in seconds
    const TIME_SETTING = 5;   // how long to keep writing keys, in seconds
    const TIME_WAIT = 45;     // pause between writing and checking, in seconds
    const MAX_KEYS = 10000;

    public $stored = [];
    public $wanCache;

    public function __construct() {
        $this->wanCache = MediaWikiServices::getInstance()->getMainWANObjectCache();
    }

    // Store one key and remember what was written and when.
    public function add(int $i): void {
        $key = $this->wanCache->makeGlobalKey('krinkle-tmp', 'num' . $i);
        $val = 'some_interesting_data_here' . $i;
        $this->stored[] = [
            'key' => $key,
            'val' => $val,
            'res' => $this->wanCache->set($key, $val, self::TIME_STORED),
            'time' => microtime(true),
        ];
    }

    // Read a key back and report it if the value is missing or different.
    public function check(array $entry): void {
        $actual = $this->wanCache->get($entry['key']);
        if ($actual !== $entry['val']) {
            $now = microtime(true);
            $this->report($entry, $now, $actual);
        }
    }

    public function report(array $entry, $now, $actual): void {
        print sprintf("Key %s was %s after %s seconds (res: %s)\n",
            $entry['key'],
            ($actual === false ? 'missing' : json_encode($actual)),
            round($now - $entry['time']),
            json_encode($entry['res'])
        );
    }

    public function execute(): void {
        // Write keys until the time window or the key limit is reached.
        $t1 = microtime(true);
        $i = 0;
        while (
            (microtime(true) - $t1) < self::TIME_SETTING &&
            $i < self::MAX_KEYS
        ) {
            $this->add($i);
            $i++;
        }

        print "Done setting $i keys\n";
        print "Sleeping for " . self::TIME_WAIT . " seconds...\n";
        // Drop the in-process copy so the checks below hit the backend.
        $this->wanCache->clearProcessCache();
        sleep(self::TIME_WAIT);

        print "Checking...\n";
        $i = 0;
        foreach ($this->stored as $entry) {
            $this->check($entry);
            $i++;
        }
        print "Checked $i keys.\n";
        print "Done!\n";
    }
}

$tmp = new KrinkleTmp();
$tmp->execute();

The script stores several thousand tokens within a 5-second window, waits 45 seconds, and then checks that they're all still there. I ran it several times and never saw any loss.

It's not solid evidence, but at least anecdotally we know it can hold up without loss. My guess is that 1) we're not under as much pressure as I thought, so our regular evictions probably fall towards the tail end (e.g. cutting TTLs short from N days to N hours if unused) and we're generally not pushing out things stored less than a minute ago; and 2) Memcached's LRU eviction logic is quite good at evicting older unused items before newer ones.

It doesn't feel great long-term, and I think this might bite us in terms of how gutter pools are used, and more generally that we don't operationally consider a partial failure or packet loss on Memcached as causing hard failures for edits or logins. But for the relatively small amount of data that CentralAuth needs here, I guess it's good enough for the initial transition. We could migrate it to a small dedicated mcrouter cluster at some point. Perhaps the same one that we'll replace the dc-local Redis with. That cluster would be for dc-local data, small in size, with (generally) no eviction happening, so we could monitor evictions like we do for Redis and consider non-zero evictions as something going wrong.

Background info:
https://github.com/memcached/memcached/blob/1.6.15/doc/new_lru.txt#L10-L24
https://github.com/memcached/memcached/wiki/UserInternals#when-are-items-evicted

Change 683022 merged by jenkins-bot:

[operations/mediawiki-config@master] mc.php: Add "mcrouter-primary-dc" to $wgObjectCaches

https://gerrit.wikimedia.org/r/683022

Mentioned in SAL (#wikimedia-operations) [2022-06-22T01:13:38Z] <tstarling@deploy1002> Synchronized wmf-config/mc.php: g 807158 T278392 (duration: 03m 35s)

Benchmark of cross-DC memcached shows 1 RTT (33ms) latency for get and set, but 2 RTT for incr:

[0124][tstarling@mwmaint2002:~]$ mwscript mctest.php --wiki=enwiki --i 100 --cache=mcrouter-primary-dc
Warming up connections to cache servers...done
Single and batched operation profiling/test results:
127.0.0.1:11213
 add: 100/100 3300ms   set: 100/100 3300ms   get: 100/100 (3297ms)   delete: 100/100 (3294ms)	incr: 100/100 (6589ms)
 setMulti (IB): ✓ 3296ms   getMulti (IB): 100/100 37ms   changeTTLMulti (IB): ✓ 3298ms   deleteMulti (IB): ✓ 3295ms
 setMulti (DB): ✓ 154ms   getMulti (DB): 100/100 155ms   changeTTLMulti (DB): ✓ 3299ms   deleteMulti (DB): ✓ 34ms

Further testing showed that for a non-existent key (as used by mctest.php), mcrouter maps incr to incr+add. Incrementing a key that exists only requires 1 RTT.
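
As a sanity check on the RTT reading, the totals above can be divided by the iteration count and compared to the 33ms round trip. This is only a rough sketch: the figures are copied from the mctest.php output, and the variable names are made up for illustration.

```php
<?php
// Totals (ms) for 100 sequential operations, copied from the benchmark output.
$iterations = 100;
$totalsMs = [
    'add' => 3300,
    'set' => 3300,
    'get' => 3297,
    'delete' => 3294,
    'incr' => 6589,
];
$rttMs = 33; // cross-DC round trip quoted in the benchmark summary

$roundTrips = [];
foreach ( $totalsMs as $op => $totalMs ) {
    $perOpMs = $totalMs / $iterations;
    // Nearest whole number of round trips per operation.
    $roundTrips[$op] = (int)round( $perOpMs / $rttMs );
    printf( "%-6s %6.2f ms/op  ~%d RTT\n", $op, $perOpMs, $roundTrips[$op] );
}
```

By this estimate every single-key operation costs about one round trip, except incr on a missing key, which costs about two.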