Page MenuHomePhabricator

Revise the design of ResourceLoader's MessageBlobStore
Closed, ResolvedPublic

Description

ResourceLoader's MessageBlobStore suffers from the following defects:

It was not designed from the ground up to be a distributed system, so there are subtle race conditions that are difficult to isolate and fix.

  • It duplicates a lot of logic from MessageCache.
  • It uses timestamps to manage cache entries.

See also previous discussion and brain storming at T90001: Evaluate folding msg_resource and msg_resource_links tables into objectcache, which discusses one possible solution to this.

Plan:

  • Remove use of msg_resource_links table. Use runtime ResourceLoaderModule::getMessages() instead.
  • Remove use of msg_resource table. Convert MessageBlobStore to using BagOStuff instead of a dedicated table. Probably main cache / WANCache. Requires use of getMulti() for performance reasons.
  • Evaluate if we can phase out MessageBlobStore altogether. This will require MessageCache to grow an efficient getMulti() method. Note that MessageCache is only for MediaWiki-namespace overrides. The underlying LocalisationCache effectively has a getMulti() already by having all messages in a single storage key.

Related Objects

Event Timeline

ori created this task.Sep 18 2015, 8:31 PM
ori raised the priority of this task from to Needs Triage.
ori updated the task description. (Show Details)
ori added a project: MediaWiki-ResourceLoader.
ori added a subscriber: ori.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 18 2015, 8:31 PM
ori updated the task description. (Show Details)Sep 18 2015, 8:33 PM
ori set Security to None.
ori added a subscriber: Krinkle.
Krinkle triaged this task as High priority.Sep 18 2015, 8:46 PM
Krinkle moved this task from Inbox to Backlog on the MediaWiki-ResourceLoader board.
eranroz added a subscriber: eranroz.
Krinkle claimed this task.Oct 27 2015, 5:45 AM
Krinkle added a project: Performance-Team.
Krinkle moved this task from Inbox to Next In This Quarter on the Performance-Team board.

Change 251656 had a related patch set uploaded (by Krinkle):
[WIP] resourceloader: Remove msg_resource_links

https://gerrit.wikimedia.org/r/251656

Change 251660 had a related patch set uploaded (by Krinkle):
resourceloader: Remove obsolete msg_resource_links table

https://gerrit.wikimedia.org/r/251660

Change 251656 merged by jenkins-bot:
resourceloader: Remove use of msg_resource_links table

https://gerrit.wikimedia.org/r/251656

Change 252880 had a related patch set uploaded (by Krinkle):
[WIP] MessageBlobStore: Convert from msg_resource table to object cache

https://gerrit.wikimedia.org/r/252880

The MessageBlobStore store is in a good shape now, but I'd like to try folding it into MessageCache if it's not too much work

MessageCache (used by wfMessage) has an in-process cache that would allow easy preloading/cache warmup whilst still allowing the "real" callers to keep using wfMessage(). We just need ResourceLoader to know which message keys will be used and preload them (Logic for this already exists, currently fed into MessageBlobStore; Though right now it ends up doing a loop fetching one by one in case of cache miss).

If we can implement getMulti() in MessageCache, we can obsolete MessageBlobStore altogether. ResourceLoader's preloader would just warm up MessageCache (by passing message keys used by the modules) instead of warming up MessageBlobStore (by passing module names). The message keys used are already available at run-time in ResourceLoader at no cost.

Change 253656 had a related patch set uploaded (by Krinkle):
[WIP] Remove use of msg_resource table in favour of using MessageCache directly

https://gerrit.wikimedia.org/r/253656

Change 252880 abandoned by Krinkle:
[WIP] MessageBlobStore: Convert from msg_resource table to object cache

Reason:
Superseded by Id8c26f41a82597. Turns out we don't need caching for MessageBlobStore. MessageCache already has great cache layers, and it preloads in-process as well.

https://gerrit.wikimedia.org/r/252880

Reedy added a subscriber: Reedy.

As an aside... I added a script to purge the crap out of msg_resource to WikimediaMaintenance in https://gerrit.wikimedia.org/r/#/c/256756/ for the time being... It reduced the table by 25% on enwiki

Change 253656 merged by jenkins-bot:
resourceloader: Migrate from msg_resource table to object cache

https://gerrit.wikimedia.org/r/253656

Change 257966 had a related patch set uploaded (by Krinkle):
resourceloader: Migrate from msg_resource table to object cache

https://gerrit.wikimedia.org/r/257966

Before and after comparison of forceprofile from mw1017 using test2.wikipedia.org.

Startup request
https://test2.wikipedia.org/w/load.php?debug=false&lang=en&modules=startup&only=scripts&skin=vector&forceprofile=1&rand=before2

# Before
100.00% 935.733      1 - main()
 20.45% 191.397     12 - ResourceLoaderModule::getModuleContent
 18.73% 175.277    119 - DatabaseBase::select
 18.65% 174.474      2 - ResourceLoader::preloadModuleInfo
..
  1.21% 11.352     12 - MessageBlobStore::get
  1.21% 11.332     12 - MessageBlobStore::getBlobs
  1.20% 11.192     12 - MessageBlobStore::getFromDB

Total main() varied between 935 and 980ms.

https://test2.wikipedia.org/w/load.php?debug=false&lang=en&modules=startup&only=scripts&skin=vector&forceprofile=1&rand=after2

# After
100.00% 940.811      1 - main()
 19.34% 181.910      2 - ResourceLoader::preloadModuleInfo
 19.03% 179.074     12 - ResourceLoaderModule::getModuleContent
 16.56% 155.792    109 - DatabaseBase::select
..
  3.45% 32.429      2 - MessageBlobStore::getBlobs
  1.13% 10.659    221 - MessageBlobStore::makeCacheKey

Total main varied between 940ms and 990ms.

Module request
Using two modules that contain a few messages.
https://test2.wikipedia.org/w/load.php?debug=false&lang=en&modules=jquery.accessKeyLabel%7Cmediawiki.feedback&skin=vector&version=6a811252fbcc&forceprofile=1&before2

# Before
100.00% 53.051      1 - main()
 19.17% 10.170      1 - ResourceLoader::preloadModuleInfo
  5.38% 2.855      1 - ResourceLoader::makeModuleResponse
...
  2.32% 1.231      3 - MessageBlobStore::get
  2.31% 1.226      3 - MessageBlobStore::getBlobs
  2.25% 1.193      1 - MessageBlobStore::getFromDB

https://test2.wikipedia.org/w/load.php?debug=false&lang=en&modules=jquery.accessKeyLabel%7Cmediawiki.feedback&skin=vector&version=6a811252fbcc&forceprofile=1&after2

# After
100.00% 53.157      1 - main()
 21.79% 11.582      1 - ResourceLoader::preloadModuleInfo
  2.66% 1.415      1 - ResourceLoader::makeModuleResponse
...
  2.18% 1.157      1 - MessageBlobStore::getBlobs

No big wins, but then again the main motivation for this redesign wasn't a big gain in response time for regular load.php requests. Rather the main expected benefits are:

  • Don't require master DB connection for GET requests.
  • No contention on DB writes when modules have updated.
  • Multi-DC compatible.
  • Less exceptions in logs by being more reliable.
  • Less time spent queries data from MessageBlobStore (Memc faster than SQL select).
  • Use hashes instead of timestamps, produces more stable version hashes and more cache hits for users (T102578).
Krinkle closed this task as Resolved.Dec 9 2015, 8:50 PM
ori awarded a token.Dec 9 2015, 8:55 PM

Change 257966 merged by jenkins-bot:
resourceloader: Migrate from msg_resource table to object cache

https://gerrit.wikimedia.org/r/257966

What should happen to all the msg_resource references? What do system administrators have to do? According to $wgMainWANCache false (default) means the cache defaults to $wgMainCacheType i.e. CACHE_NONE: so this means that by default MessageBlobStore will not use any cache at all, correct?

$ ack msg_resource
maintenance/tables.sql
1500:CREATE TABLE /*_*/msg_resource (
1510:CREATE UNIQUE INDEX /*i*/mr_resource_lang ON /*_*/msg_resource (mr_resource, mr_lang);
1513:CREATE TABLE /*_*/msg_resource_links (
1518:CREATE UNIQUE INDEX /*i*/mrl_message_resource ON /*_*/msg_resource_links (mrl_message, mrl_resource);

maintenance/oracle/tables.sql
662:CREATE TABLE &mw_prefix.msg_resource (
668:CREATE UNIQUE INDEX &mw_prefix.msg_resource_u01 ON &mw_prefix.msg_resource (mr_resource, mr_lang);
670:CREATE TABLE &mw_prefix.msg_resource_links (
674:CREATE UNIQUE INDEX &mw_prefix.msg_resource_links_u01 ON &mw_prefix.msg_resource_links (mrl_message, mrl_resource);

maintenance/oracle/archives/patch_16_17_schema_changes.sql
59:CREATE TABLE &mw_prefix.msg_resource_links (
63:CREATE UNIQUE INDEX &mw_prefix.msg_resource_links_u01 ON &mw_prefix.msg_resource_links (mrl_message, mrl_resource);
65:CREATE TABLE &mw_prefix.msg_resource (
71:CREATE UNIQUE INDEX &mw_prefix.msg_resource_u01 ON &mw_prefix.msg_resource (mr_resource, mr_lang);

maintenance/mssql/tables.sql
1228:CREATE TABLE /*_*/msg_resource (
1238:CREATE UNIQUE INDEX /*i*/mr_resource_lang ON /*_*/msg_resource (mr_resource, mr_lang);
1241:CREATE TABLE /*_*/msg_resource_links (
1246:CREATE UNIQUE INDEX /*i*/mrl_message_resource ON /*_*/msg_resource_links (mrl_message, mrl_resource);

maintenance/archives/patch-msg_resource.sql
2:CREATE TABLE /*_*/msg_resource (
12:CREATE UNIQUE INDEX /*i*/mr_resource_lang ON /*_*/msg_resource(mr_resource, mr_lang);
15:CREATE TABLE /*_*/msg_resource_links (
20:CREATE UNIQUE INDEX /*i*/mrl_message_resource ON /*_*/msg_resource_links (mrl_message, mrl_resource);

maintenance/postgres/tables.sql
693:CREATE TABLE msg_resource (
699:CREATE UNIQUE INDEX mr_resource_lang ON msg_resource (mr_resource, mr_lang);
701:CREATE TABLE msg_resource_links (
705:CREATE UNIQUE INDEX mrl_message_resource ON msg_resource_links (mrl_message, mrl_resource);

maintenance/postgres/archives/patch-msg_resource.sql
1:CREATE TABLE msg_resource (
8:CREATE UNIQUE INDEX mr_resource_lang ON msg_resource (mr_resource, mr_lang);

maintenance/postgres/archives/patch-msg_resource_links.sql
1:CREATE TABLE msg_resource_links (
6:CREATE UNIQUE INDEX mrl_message_resource ON msg_resource_links (mrl_message, mrl_resource);

tests/phpunit/data/db/sqlite/tables-1.18.sql
512:CREATE TABLE /*_*/msg_resource (
518:CREATE UNIQUE INDEX /*i*/mr_resource_lang ON /*_*/msg_resource (mr_resource, mr_lang);
519:CREATE TABLE /*_*/msg_resource_links (
523:CREATE UNIQUE INDEX /*i*/mrl_message_resource ON /*_*/msg_resource_links (mrl_message, mrl_resource);

tests/phpunit/data/db/sqlite/tables-1.17.sql
494:CREATE TABLE /*_*/msg_resource (
500:CREATE UNIQUE INDEX /*i*/mr_resource_lang ON /*_*/msg_resource (mr_resource, mr_lang);
501:CREATE TABLE /*_*/msg_resource_links (
505:CREATE UNIQUE INDEX /*i*/mrl_message_resource ON /*_*/msg_resource_links (mrl_message, mrl_resource);

tests/parser/parserTest.inc
967:                    'archive', 'user_groups', 'page_props', 'category', 'msg_resource', 'msg_resource_links'

includes/installer/SqliteUpdater.php
69:                     array( 'addTable', 'msg_resource', 'patch-msg_resource.sql' ),

includes/installer/PostgresUpdater.php
88:                     array( 'addTable', 'msg_resource', 'patch-msg_resource.sql' ),
89:                     array( 'addTable', 'msg_resource_links', 'patch-msg_resource_links.sql' ),

includes/installer/MysqlUpdater.php
186:                    array( 'addTable', 'msg_resource', 'patch-msg_resource.sql' ),
Krinkle added a comment.EditedDec 10 2015, 3:26 PM

@Nemo_bis: There is a follow-up patch to drop the msg_resource and msg_resource_links tables. Currently deprioritised since keeping the tables is harmless. But that will land before we cut 1.27.

As for default behaviour, we should really change wgMainCacheType to be CACHE_DB by default. I don't understand why it isn't. You can't responsibly run a public MediaWiki install for purposes other than debugging or development without caching of any kind. See also T115890.

As for default behaviour, we should really change wgMainCacheType to be CACHE_DB by default. I don't understand why it isn't.

I don't know either, presumably to avoid killing databases and because sqlite might be slower than no cache at all. But that's not the point, we can't release this change without knowing what the effect is *and* without a plan. Can you file a followup task to change the $wgMainCacheType, please, so that this is tracked?

As for default behaviour, we should really change wgMainCacheType to be CACHE_DB by default. I don't understand why it isn't.

That was -2'd by Tim: https://gerrit.wikimedia.org/r/#/c/117091/