
Evaluate folding msg_resource and msg_resource_links tables into objectcache
Closed, Duplicate · Public

Description

Per AaronSchulz, it should be possible to phase out these tables and use a prefixed set of rows in the objectcache table instead.

We can't use the "main" or "anything" object cache directly (e.g. $wgMainCacheType via wfGetMainCache(), or wfGetCache( CACHE_ANYTHING )) because, last I checked, we need these rows to be treated as one cohesive unit whose individual rows won't disappear based on LRU or some other eviction algorithm.

Conceptually each of these tables is like a big blob that one could store in one cache key. It's built out into a table to allow fast querying of individual pieces (presumably because loading it all into memory is unacceptable), as well as selective updating of multiple rows in one atomic write action.

We could either use CACHE_DB (forced to use SqlBagOStuff, which persists in the main database by default like the old tables and doesn't get purged by anything other than the provided TTL), or make it its own cache type (like the message cache, parser cache, and session cache) that defaults to CACHE_DB but allows users aware of the required contract to use a different backend.
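To make that concrete, here's a minimal sketch of the first option (not actual MediaWiki code; the key layout, TTL, and the regenerateBlob() helper are illustrative assumptions):

<?php
// Minimal sketch: one fully resolved JSON blob per (module, language),
// stored as prefixed rows in objectcache via the CACHE_DB backend.
// Key layout, TTL and regenerateBlob() are illustrative, not existing APIs.
$cache = wfGetCache( CACHE_DB ); // SqlBagOStuff: persists in the main DB
$key = wfMemcKey( 'resourceloader', 'messageblob', $moduleName, $lang );
$ttl = 30 * 24 * 3600; // purged only by this TTL, not by LRU eviction

$blob = $cache->get( $key );
if ( $blob === false ) {
	// Cache miss: rebuild from MessageCache/LocalisationCache and store.
	$blob = regenerateBlob( $moduleName, $lang ); // hypothetical helper
	$cache->set( $key, $blob, $ttl );
}

The same layout would work behind a dedicated cache type; CACHE_DB is just the conservative default that matches the persistence of the current tables.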


Related:
T28398: ResourceLoader preloads blob metadata, then does another query for blob contents

Event Timeline

Krinkle created this task. Feb 19 2015, 8:44 PM
Krinkle raised the priority of this task from to Needs Triage.
Krinkle updated the task description.
Krinkle added subscribers: Krinkle, Catrope, ori and 2 others.
Restricted Application added a subscriber: Aklapper. Feb 19 2015, 8:44 PM
aaron added a comment. Feb 19 2015, 8:47 PM

Namespacing of keys and mass delete/update can be done atomically using "touch keys" (a per-namespace last-touched key that each value is checked against on get()) or "prefix keys" (a key holding a random prefix for a namespace, which determines the prefix of the actual value keys in that namespace). This is a generic problem.
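As an illustration of the touch-key variant, a minimal sketch on top of BagOStuff (none of these helper names exist in MediaWiki):

<?php
// Generic sketch of the "touch key" idea: every value key in a namespace is
// checked against a per-namespace last-touched timestamp on get(), and the
// whole namespace is invalidated at once by bumping that timestamp.
function nsGet( BagOStuff $cache, $ns, $key ) {
	$touched = $cache->get( "$ns:touched" );
	$val = $cache->get( "$ns:value:$key" );
	if ( $val === false || ( $touched !== false && $val['asOf'] < $touched ) ) {
		return false; // missing, or older than the last mass invalidation
	}
	return $val['data'];
}

function nsSet( BagOStuff $cache, $ns, $key, $data, $ttl = 0 ) {
	$cache->set( "$ns:value:$key", [ 'asOf' => microtime( true ), 'data' => $data ], $ttl );
}

function nsInvalidateAll( BagOStuff $cache, $ns ) {
	// One write logically invalidates every value key in the namespace.
	$cache->set( "$ns:touched", microtime( true ) );
}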

Aklapper triaged this task as Normal priority. Feb 20 2015, 10:30 AM
Krinkle updated the task description. Feb 26 2015, 7:59 PM
Krinkle set Security to None.
Krinkle claimed this task. Jun 9 2015, 1:40 AM
Krinkle added a project: Performance.
Krinkle added a project: Performance-Team.
Krinkle moved this task from Tag to Next-up on the Performance board.
Krinkle moved this task from Accepted: Enhancement to Assigned on the MediaWiki-ResourceLoader board.

@aaron and I looked over this today. Here are our notes.

We've got two tables:

  • msg_resource:
    • Stores JSON blobs containing key/value pairs of interface messages (fully resolved: contains both local DB overrides and canonical values from the software).
    • Keyed by language and module name.
    • Also tracks timestamps.
  • msg_resource_links:
    • Stores message keys and module names that are associated with one another. It is queried in both dimensions (by message key when fetching rows, and by module name when replacing rows).

These tables are accessed via the MessageBlobStore class.

When building the load.php "startup" module, ResourceLoader collects version hashes of all modules. The versions are computed based on the module's scripts and styles content (or their metadata, see T98087). While messages must be included in this (so that changes invalidate the module cache), the way ResourceLoader fetches the message content is behind its own caching layer: the MessageBlobStore.
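For illustration, a content-based version that folds messages in could look roughly like this (a sketch, not the actual ResourceLoader code; getMessageBlob() stands in for whatever MessageBlobStore exposes):

<?php
// Sketch: mix the resolved message blob into the module version hash, so a
// changed message changes the version just like changed scripts or styles.
$summary = [
	'scripts'  => $module->getScript( $context ),
	'styles'   => $module->getStyles( $context ),
	'messages' => $blobStore->getMessageBlob( $module, $context->getLanguage() ),
];
// A short truncated hash is enough for a cache-busting version token.
$version = substr( md5( json_encode( $summary ) ), 0, 8 );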

MediaWiki also has LocalisationCache, for the canonical translations from the software. This cache does not currently support retrieving multiple messages at once, and it is stored in a CDB file.

MediaWiki also has MessageCache, for message overrides from the local wiki. This cache also doesn't support multi-key retrieval, and it is stored in MySQL.

In order to detect changes and refresh the message blobs for ResourceLoader, MessageBlobStore currently hooks into MessageCache::replace(). It then makes use of msg_resource_links to find which modules use the message (as iterating over all modules in PHP was presumably too slow). Then, with those module names, it fetches the blobs from msg_resource, decodes them, replaces the updated key in each, and writes the blobs back to the database.

The "startup" module, knowing the module names and message keys, calls ResourceLoader::preloadModuleInfo() which makes a single DB query to msg_resource to retrieve all relevant message blobs at once.

Aaron and I considered removing the MessageBlobStore entirely instead of replacing it with something outside the database, because we already have two different caches for localisation. However, the benefits of querying everything at once do seem significant. Having to make 3034 calls to wfMessage() in a user-facing "startup" HTTP request (run every 5 minutes) seems like a significant overhead that warrants caching.

Data points

en.wikipedia.org has 1232 modules. They use a total of 3034 different messages.

Next steps

1: Get rid of the msg_resource_links table. We don't need to replace this with something else. Iterating over the registered modules and calling getMessages() doesn't have much overhead. Besides, this only happens in POST requests, when MediaWiki-namespace pages are edited on the wiki.

2: Drop msg_resource table.
Verify that making 3000+ separate wfMessage()->plain() calls in the startup HTTP request is too slow. If this is not too slow, we can drop MessageBlobStore and go straight to content hashing for messages. If this is too slow (which I suspect is the case) then we should look into ways to fetch multiple keys at once, like the current system does.
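One cheap way to verify this (a sketch to run from e.g. eval.php; $messageKeys and the log group are placeholders):

<?php
// Quick timing check: how long do ~3000 wfMessage()->plain() calls take?
// $messageKeys is whatever list of message keys we collect by iterating the
// registered modules and calling getMessages().
$start = microtime( true );
foreach ( $messageKeys as $msgKey ) {
	wfMessage( $msgKey )->inLanguage( $lang )->plain();
}
$ms = ( microtime( true ) - $start ) * 1000;
wfDebugLog( 'resourceloader', 'Resolved ' . count( $messageKeys ) . " messages in {$ms}ms" );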

It looks like CDB doesn't have a built-in way to load the entire array into memory. And while "all JS messages" is a lot, it's still significantly less than "all messages in JS and PHP". So it'd be nice to avoid loading the entire thing. We can implement a getMulti in the CDB\Reader class though. It'd do a single nextkey pass from start to finish and look for a known set of keys (getting smaller as it finds more keys). We can expose this in MessageCache for preloading purposes (best effort), which would warm up any in-process caches.
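Roughly like this (a sketch of the proposed method, which doesn't exist yet; it assumes the reader exposes firstkey()/nextkey() iteration as described):

<?php
// Sketch of the proposed getMulti: one sequential pass over the CDB file,
// collecting only the wanted keys and stopping once all have been found.
function cdbGetMulti( \Cdb\Reader $reader, array $keys ) {
	$wanted = array_flip( $keys );
	$found = [];
	for ( $key = $reader->firstkey(); $key !== false; $key = $reader->nextkey() ) {
		if ( isset( $wanted[$key] ) ) {
			$found[$key] = $reader->get( $key );
			unset( $wanted[$key] );
			if ( !$wanted ) {
				break; // everything requested has been found
			}
		}
	}
	return $found;
}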

Gains

  • Drop two database tables and one class. Reduces maintenance overhead and complexity.
  • No more (slave) database reads for module message keys. We'll use cheap functional logic in PHP instead.
  • No more (master) database writes on GET requests to update stale message keys lists.
  • No more cross-datacenter replication for a cache (msg_resource[_links]) that should remain local to a single data centre.

Data points

en.wikipedia.org has 1232 modules. They use a total of 3034 different messages.

Holy crap. I didn't believe this, but it looks like there really are that many modules:

>>> Object.keys(mw.loader.moduleRegistry).length
1196

Next steps

1: Get rid of the msg_resource_links table. We don't need to replace this with something else. Iterating over the registered modules and calling getMessages() doesn't have much overhead. Besides, this only happens in POST requests, when MediaWiki-namespace pages are edited on the wiki.

Sounds reasonable.

2: Drop msg_resource table.
Verify that making 3000+ separate wfMessage()->plain() calls in the startup HTTP request is too slow. If this is not too slow, we can drop MessageBlobStore and go straight to content hashing for messages. If this is too slow (which I suspect is the case) then we should look into ways to fetch multiple keys at once, like the current system does.

Note that calling wfMessage()->plain() 3000+ times (are these 3034 messages unique messages, or were duplicates counted?) isn't actually (or even effectively) what the startup module does right now; it just queries the msg_resource table for the blobs' timestamps. The startup module doesn't necessarily need the content of the messages. Of course, if you were also planning on replacing timestamp-based invalidation for messages with hash-based invalidation, then yes, you would need the contents of every message.

It looks like CDB doesn't have a built-in way to load the entire array into memory. And while "all JS messages" is a lot, it's still significantly less than "all messages in JS and PHP". So it'd be nice to avoid loading the entire thing. We can implement a getMulti in the CDB\Reader class though. It'd do a single nextkey pass from start to finish and look for a known set of keys (getting smaller as it finds more keys). We can expose this in MessageCache for preloading purposes (best effort), which would warm up any in-process caches.

The message cache in CDB (LocalisationCache) is insufficient. As you noted, the message blob data is fully resolved and includes on-wiki customizations from the MediaWiki namespace. The message cache in CDB does not. I believe the message cache in memcached (MessageCache) does, though. But that's in memcached so you can't rely on it always being populated (although you can rely on it not being out of date), so for every message that's missing from memcached you have to query the DB to see if there's a MediaWiki:$key page (or a MediaWiki:$key/$langcode page!).

So preloading 3k messages from CDB may not actually help you, because you still need to hit memcached for every single one of them (ideally in a way that doesn't require thousands of memcached hits; doesn't a memcached request take about 1ms?), and then only for the ones that are missing from there do you need to hit CDB (and also MySQL). You could batch that, though: hit memcached first, make a list of which messages are missing, then use the readMulti thing you described with that list of keys (and also run a batch query against MySQL, or maybe split into a few queries if the number of misses is very high).
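In code, that batching could look something like this (a sketch only; the key layout and the lookupCanonicalMulti()/lookupOverridesMulti() helpers are hypothetical):

<?php
// Sketch of the layered batch lookup: one memcached multi-get for all keys,
// then CDB and a single batched MySQL query only for the misses.
function fetchMessagesBatched( BagOStuff $memc, array $keys, $lang ) {
	$cacheKeys = [];
	foreach ( $keys as $msgKey ) {
		$cacheKeys[wfMemcKey( 'message', $lang, $msgKey )] = $msgKey;
	}

	// 1) One multi-get instead of thousands of ~1ms single requests.
	$fromMemc = $memc->getMulti( array_keys( $cacheKeys ) );

	$messages = [];
	$misses = [];
	foreach ( $cacheKeys as $cacheKey => $msgKey ) {
		if ( isset( $fromMemc[$cacheKey] ) ) {
			$messages[$msgKey] = $fromMemc[$cacheKey];
		} else {
			$misses[] = $msgKey;
		}
	}

	if ( $misses ) {
		// 2) One nextkey pass over CDB for canonical values (see the getMulti
		//    sketch above), then 3) one batched query for MediaWiki:$key overrides.
		$messages += lookupCanonicalMulti( $misses, $lang );            // hypothetical
		$messages = lookupOverridesMulti( $misses, $lang ) + $messages; // hypothetical; overrides win
	}
	return $messages;
}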

Storing messages by module (one memcached key per module) seems reasonable on the surface (it makes building module responses straightforward) but it doesn't really help reduce the number of memcached requests that much: if 3k memcached requests is too much, ~1200 is probably also too much.

Also, to make all of these problems worse, the message contents vary by language (that's kind of the point of having messages). So if you use hash-based invalidation you clearly have to compute your hash over different contents for every language; even with timestamp-based invalidation like we have now, you need different timestamps for different languages because of MediaWiki:$key/$langcode pages.

Don't get me wrong, I want the msg_resource tables to die in a fire, for all sorts of reasons. But if you multiply this stuff out, you get some pretty big numbers. Right now, we theoretically need to store one message blob (or at least know its timestamp for the startup module) for each of 1232 modules in each of 408 languages (that's ~500k blobs per wiki) on each of 889 wikis (~446 million total). Of course that's the theoretical maximum, but some of the bigger wikis do in fact have over 100k blobs in their msg_resource tables (Commons has 290k). Having to deal with multiple languages doesn't make anything slower (a single startup module run is only for a single language) but it does increase the amount of storage / cache space you need by quite a lot.

So I'm not quite sure how we would compute all these message versions/hashes in a reasonable amount of time. One thing that helps the msg_resource approach not be slow is that missing blobs/timestamps aren't backfilled by the startup module, only when the relevant module is requested; the downside is that invalidations can be missed when those tables are cleared.

I guess we could store the entire blob of all RL messages in one memcached key (3k messages is probably "only" about 200KB?) per language, have MessageCache::replace() keep that up to date, and figure out how to regenerate that without it being all too slow for when that key drops out of memcached (or when someone requests a language that hasn't been requested before on that wiki). That's a "when" not an "if", because the amount of data we'd be trying to cache is a quarter of a megabyte for at least one language per wiki, which is ~200MB, and that's being optimistic, I'd expect it to be more like half a gigabyte.

That's why I think it might be better to stick with timestamp-based versioning (even if that timestamp is then used as part of a hash computation), because then you don't have to store that much data. You could maintain a single memcached key that contains a big array of timestamps, one for each module, that you update from MessageCache::replace() (note that you don't have to segment this by language, and it's probably better not to). Then you add that timestamp to the module hash, as well as max(filemtime(i18n/*.json)) or something similar (maybe LocalisationCache::recache() could include the current timestamp when it writes to the cache so you know how old the current LocalisationCache data is). Then you'd still have to figure out what happens if you lose that memcached key, but you have the option of sacrificing accuracy for speed: repopulating a missing key with wfTimestampNow() is very fast but also invalidates everything; repopulating with the timestamp of the latest edit to the MediaWiki namespace is a bit better but still causes a lot of invalidation; actually recomputing all the timestamps is fully accurate but probably too slow. However, this array of timestamps is pretty small (~20KB) so I suppose you could store it in a more reliable place like CACHE_DB.
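For instance (a sketch with illustrative names, not an existing API), the per-module timestamp map could be maintained like this and then mixed into each module's version hash:

<?php
// Sketch: one small cache entry maps module name => timestamp of the last
// change to any of its messages (shared across languages). Bumped from a
// MessageCache::replace() hook.
function bumpModulesForMessage( BagOStuff $cache, $msgKey, ResourceLoader $rl ) {
	$cacheKey = wfMemcKey( 'resourceloader', 'msg-timestamps' );
	$timestamps = $cache->get( $cacheKey ) ?: [];
	foreach ( $rl->getModuleNames() as $name ) {
		if ( in_array( $msgKey, $rl->getModule( $name )->getMessages() ) ) {
			$timestamps[$name] = wfTimestampNow();
		}
	}
	// ~20KB total, so a more reliable store like CACHE_DB is affordable here.
	$cache->set( $cacheKey, $timestamps );
}

Each module's version would then hash this timestamp together with its content and something like max(filemtime(i18n/*.json)); if the key is lost, repopulating it with wfTimestampNow() is the fast-but-blunt option described above.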

Gains
  • Drop two database tables and one class. Reduces maintenance overhead and complexity.
  • No more (slave) database reads for module message keys. We'll use cheap functional logic in PHP instead.
  • No more (master) database writes on GET requests to update stale message keys lists.
  • No more cross-datacenter replication for a cache (msg_resource[_links]) that should remain local to a single data centre.

+ we get to kill extensions/WikimediaMaintenance/clearMessageBlobs.php, which I've always hated.

To clarify: the only problem I see is with replacing the invalidation function of msg_resource (making sure the module is invalidated when one of its messages changes). I don't think its caching function (storing a ready-to-go message blob in the database) is very important, because module responses are cached aggressively, so they're generated infrequently, and regenerating the message blob every time is probably not that slow (although that's an assumption worth validating).

Krinkle added a comment (edited). Jun 29 2015, 8:03 PM

Note that the timestamps provided by MessageBlobStore are not "real". We don't have per-message timestamps. We have a timestamp for the entire localisation cache (canonical), but that rolls over once a day. This is made worse by the nightly clearMessageBlobs.php run, I imagine. I'm waiting for data to confirm this, but I suspect it means any module that contains messages gets its cache invalidated every single day for no good reason.

Per T102578, timestamps are inherently leaky and poisonous in terms of unwanted cache invalidation.

I wasn't specifically aware of the relation to this task, but we've actually started to bypass message timestamps for some modules. See https://gerrit.wikimedia.org/r/210942 and https://gerrit.wikimedia.org/r/215364 in which we compute versions for a subset of modules based on the module content.

Pending better instrumentation, I intend to enable "module content version" for all modules pretty soon. See https://gerrit.wikimedia.org/r/#/c/221052/ for the work in progress.

Note that the content version option is still backed by MessageBlobStore for messages. While the timestamps no longer influence the generated blob or version in any way, the MessageBlobStore itself still uses them to decide how and when to update itself.

Given we now use it purely as a cache, I'd like to phase it out per the previous comments.

2: Drop msg_resource table.
Verify that making 3000+ separate wfMessage()->plain() calls in the startup HTTP request is too slow. [..]

Note that calling wfMessage()->plain() 3000+ times [..] isn't actually (or even effectively) what the startup module does right now; it just queries the msg_resource table

I'm aware. That's why this point says to verify whether it would be too slow – before we drop the table.

And as you point out, memcached and MySQL are also involved for a potentially large number of them, and a ~1ms round trip per memcached key won't scale.

[..] regenerating the message blob every time is probably not that slow (although that's an assumption worth validating).

Exactly. Except I assume it will be too slow, which is why I suggested we block this task on efficient readMulti support for all involved layers (including CDB). But if 3000 fetches are fast enough in the startup module, we won't have that blocker.

Krinkle added a comment (edited). Jun 29 2015, 8:18 PM

See https://phabricator.wikimedia.org/T98087#1412712 for initial results from experimenting with content versioning.

Krinkle removed Krinkle as the assignee of this task. Aug 6 2015, 6:10 AM
Krinkle lowered the priority of this task from Normal to Low. Sep 4 2015, 2:37 AM