Investigate dynomite for WANObjectCache support
Closed, Resolved (Public)

Authored By

	aaron
	Feb 1 2017, 5:57 PM

Description

See https://github.com/Netflix/dynomite/ .

This may work as a simpler (to compile/configure) alternative to mcrouter. I need to see what kind of multi-DC support it has. The support mcrouter has is just best-effort sync operations (failures are logged, per host, to a file that nothing reads on its own) for all operations of a certain type. That should not be hard to match. DC-prefix routing (as long as the prefix doesn't show up in keys) or similar features in other systems could be supported by WAN cache if needed without much effort.

Related Objects

Mentioned In: T192370: Deploy mcrouter to production as a wancache backend
T97562: WANObjectCache relay daemon or mcrouter support
T151466: Performance Q2 2017/18 goal: Install and use mcrouter in deployment-prep

Event Timeline

aaron created this task. Feb 1 2017, 5:57 PM

Restricted Application added a subscriber: Aklapper. Feb 1 2017, 5:57 PM

aaron triaged this task as Medium priority. Feb 1 2017, 8:00 PM

aaron moved this task from Inbox, needs triage to Doing (old) on the Performance-Team board.

Krinkle mentioned this in T151466: Performance Q2 2017/18 goal: Install and use mcrouter in deployment-prep. Apr 5 2017, 7:37 PM

Krinkle added a project: Wikimedia-Multiple-active-datacenters. Apr 5 2017, 9:57 PM

Krinkle moved this task from Backlog to Next-up on the Wikimedia-Multiple-active-datacenters board.

Krinkle edited projects, added Multiple-active-datacenters archived; removed Wikimedia-Multiple-active-datacenters. May 3 2017, 7:37 PM

Krinkle edited projects, added Sustainability (MediaWiki-MultiDC); removed Multiple-active-datacenters archived. May 3 2017, 7:57 PM

elukey subscribed. May 4 2017, 12:56 PM

So, having looked at this for a while, I think it's doable, but less desirable than mcrouter. I don't really see it having any edge here (https://github.com/Netflix/dynomite/wiki/FAQ is not terribly convincing either).

mcrouter (pros):

Well maintained (solid commit rate on github)
Decently documented at this point (on github wiki)
Battle tested for performance/stability on a *very* high traffic website
Highly configurable. We can do things like a) only replicate purges, b) replicate purges to the target (hash) server or all servers in the pool for robustness, c) replicate everything, including cache-aside writes (CAS/ADD for WANCache), d) do warm-up logic, and e) do different things for different pools. I'd like for WAN cache to just replicate purges (SET/DELETE), but if we want something that replicates incr/add/cas we can do so. We can also have a BagOStuff class that replicates set() (actual SET in that case).
If we run it on the maintenance/hhvm hosts, it can replace twemproxy, keeping the stack simpler (when used with memcached)
We already have a wmf package

mcrouter (cons):

Code is complex with lots of dependencies (folly, wangle, etc.)
- OTOH: Already packaged and probably not a big deal as long as it is stable (which it appears to be, given its main user and the load it handles)
Cannot talk directly to redis
- OTOH: We can always have cache-server-local twemproxy (or even dynomite, technically) instances that act as a broker to mcrouter by speaking the memcached ASCII protocol and turning that into commands to the local redis server. This is not more complex than twemproxy => dynomite => redis. In any case, redis support is not even relevant to the WAN cache (memcached is good enough), but it could perhaps be useful for sessions.

dynomite (pros):

Natively talks to redis
Simpler codebase
Far fewer dependencies (libtool autoconf automake libssl-dev)

dynomite (cons):

Lower commit rate on github
Focused on redis (memcached isn't really used by Netflix, so support will be worse)
Not all basic redis commands work yet anyway (https://github.com/Netflix/dynomite/issues/49)
Worse documentation, even for basic things like PEM and entropy files (which segfault/error out the service unless configured properly). The conf/ dir in the repo provides some dummy PEM and tokens (which can be moved to /etc/ and configured for use via some undocumented settings in the YAML file)
No wmf package yet (not a huge deal though)
Runs locally on the cache server, so we'd still need twemproxy

mcrouter speeds seem comparable to twemproxy (within about 4% over 5000 sequential SETs and GETs) on the labs "tin" host:

> aaron@deployment-tin:~$ mwscript eval.php enwiki
> 

> $cmr = ObjectCache::newFromParams( [ 'class' => 'MemcachedPeclBagOStuff', 'servers' => [ '127.0.0.1:11213' ], 'persistent' => false ] );

> $ctp = ObjectCache::getLocalClusterInstance();

> $fs = function ( $c ) { $bad = 0; $t = microtime(true); for ( $i=0; $i<5000; ++$i ) { $bad += (int)!$c->set( "key$i", 1, 60 );} var_dump( microtime(true) - $t, $bad ); }

> $fg = function ( $c ) { $bad = 0; $t = microtime(true); for ( $i=0; $i<5000; ++$i ) { $bad += (int)!$c->get( "key$i" ); } var_dump( microtime(true) - $t, $bad ); }

> echo "mcrouter (SET) [sec, failures]\n"; $fs($cmr); // mcrouter => memcached
mcrouter (SET) [sec, failures]
float(6.9841389656067)
int(0)

> echo "twemproxy (SET) [sec, failures]\n";$fs($ctp); // twemproxy => memcached
twemproxy (SET) [sec, failures]
float(6.8755619525909)
int(0)

> echo "mcrouter (GET) [sec, failures]\n"; $fg($cmr); // mcrouter => memcached
mcrouter (GET) [sec, failures]
float(6.0539009571075)
int(0)

> echo "twemproxy (GET) [sec, failures]\n";$fg($ctp); // twemproxy => memcached
twemproxy (GET) [sec, failures]
float(6.2721688747406)
int(0)

The above test used the following mcrouter config:

{
    "pools": {
        "main": {
            "servers": [ "10.68.23.25:11211", "10.68.23.49:11211" ]
        }
    },
    "route": {
        "type": "OperationSelectorRoute",
        "default_policy": "PoolRoute|main",
        "operation_policies": {
            "set": {
                "type": "AllFastestRoute",
                "children": [ "PoolRoute|main" ]
            },
            "delete": {
                "type": "AllFastestRoute",
                "children": [ "PoolRoute|main" ]
            }
        }
    }
}

We'd probably use AllFastestRoute or AllAsyncRoute. The former is useful for making it unlikely that a later get() in the same request returns the old value (the local PoolRoute server will generally respond before a remote (35ms) PoolRoute one), whereas the latter might be slightly faster.
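
As a rough sketch of how that choice might look once a second datacenter is involved, the config below sends set and delete to both a local and a remote pool, while every other operation stays on the local pool. The pool names "local" and "remote" and the 192.0.2.10 address are placeholders for illustration only; they were not part of the test above. Replacing "AllFastestRoute" with "AllAsyncRoute" in both policies would give the variant that does not wait for any reply.

```json
{
    "pools": {
        "local": {
            "servers": [ "10.68.23.25:11211", "10.68.23.49:11211" ]
        },
        "remote": {
            "servers": [ "192.0.2.10:11211" ]
        }
    },
    "route": {
        "type": "OperationSelectorRoute",
        "default_policy": "PoolRoute|local",
        "operation_policies": {
            "set": {
                "type": "AllFastestRoute",
                "children": [ "PoolRoute|local", "PoolRoute|remote" ]
            },
            "delete": {
                "type": "AllFastestRoute",
                "children": [ "PoolRoute|local", "PoolRoute|remote" ]
            }
        }
    }
}
```

With a layout like this, a get() right after a set() would normally hit the local pool, which is the one expected to have answered first.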

Basically, mcrouter seems like it can do whatever we'd want to do with dynomite and more, and is more "ready to go" at this point.

aaron mentioned this in T97562: WANObjectCache relay daemon or mcrouter support. Feb 27 2018, 11:28 PM

Joe mentioned this in T192370: Deploy mcrouter to production as a wancache backend. Apr 17 2018, 3:15 PM
