
Redesign ResourceLoader's file dependency tracking (module_deps)
Open, MediumPublic

Description

Status quo

Following T90001, we spawned T113092 for the msg_resource table. This task is for the module_deps table.

The population of the module_deps table is deterministic. It is currently stored in the main DB because we want high persistence, due to the high cost of regeneration.

The data is queried for thousands of modules at once from the "startup" module. Generating it all at once would be impossible within our desired HTTP response time (would take tens of seconds).

It's typically populated in a distributed fashion, e.g. from separate on-demand module requests for load.php.

We deal with absence by generating a temporary placeholder version hash. Then, once a user actually needs the module and requests it with the temporary version hash, that request does the in-depth computation and stores the result in the database. From then onwards, the "startup" module will contain the correct version hash.

This means that after a deployment, modules for which version hash computation is expensive will first get invalidated to a temporary hash, and then invalidated again a few minutes later to the eventual one. This is a bit wasteful, but it is an intentional design decision for ResourceLoader. Improving or avoiding this aspect is outside the scope of this task.
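To make the two-step invalidation concrete, here is a minimal sketch in Python rather than MediaWiki's PHP. Everything in it is hypothetical: `DepStore`, `startup_version`, `serve_module` and the hashing scheme are illustrative stand-ins, not ResourceLoader's actual classes or hash inputs. It only shows the shape of the flow: the manifest path never computes dependencies, and the first on-demand request fills them in.

```python
import hashlib


class DepStore:
    """Hypothetical in-memory stand-in for the module_deps table."""

    def __init__(self):
        self._deps = {}

    def get(self, module):
        return self._deps.get(module)

    def set(self, module, paths):
        self._deps[module] = sorted(paths)


def _digest(*parts):
    """Short, stable hash over the given strings (illustrative only)."""
    h = hashlib.sha1()
    for part in parts:
        h.update(part.encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()[:8]


def startup_version(store, module, definition):
    """Cheap path for the "startup" manifest: never discovers dependencies."""
    deps = store.get(module)
    if deps is None:
        # Dependencies unknown: emit a temporary placeholder version.
        return _digest("placeholder", module, definition)
    return _digest("final", module, definition, *deps)


def serve_module(store, module, definition, discover_deps):
    """Expensive path for an on-demand request for a single module."""
    if store.get(module) is None:
        # First real request does the in-depth computation and persists it.
        store.set(module, discover_deps(module))
    return startup_version(store, module, definition)
```

With this sketch, a module's version changes once when the placeholder is first issued and once more after the first `serve_module()` call, which mirrors the double invalidation described above.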

Problem statement

Because this data is stored in the main MySQL databases, load.php GET requests must make DB-master write queries to change these rows. This is a performance and availability anti-pattern.

It has been mitigated to some extent:

  • Concurrent writes for the same thing are avoided via Memc locks (non-blocking).
  • These writes only happen on HTTP cache misses. The responses have a long TTL, and the CDN cache is shared by logged-in and logged-out users alike.
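The non-blocking lock in the first mitigation can be sketched roughly as follows. This is an assumption-laden illustration in Python: `LocalCache` is a hypothetical in-process stand-in for a Memcached-like cache whose `add()` succeeds only when the key is absent, and `write_deps_once` and the `deps-lock:` key prefix are made-up names, not MediaWiki code.

```python
import time


class LocalCache:
    """Hypothetical stand-in for a Memcached-like cache with add() semantics."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def add(self, key, value, ttl):
        """Store value only if key is absent or expired; True on success."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def delete(self, key):
        self._data.pop(key, None)


def write_deps_once(cache, module, write):
    """Run write() only if nobody else holds the lock; never wait for it."""
    lock_key = f"deps-lock:{module}"
    if not cache.add(lock_key, 1, ttl=30):
        # Another request is already writing: skip instead of blocking.
        return False
    try:
        write()
        return True
    finally:
        cache.delete(lock_key)
```

The request that loses the race simply skips the write. That is acceptable here because the data is deterministic, so the winner stores the same value the loser would have.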

The objective is to store this data elsewhere, outside the databases, ideally in a way that still upholds the persistence as much as possible.

Ideas

@Catrope and I had a brain-storm session last week (in context of T102578) and came up with the following known issues:

  • The table stores absolute paths, which means that when a wmf branch rolls over, it loses track of some files, thus causing a needless cache invalidation. Since old wmf branches are not immediately removed (in part because we have multiple versions in deployment at any one time), the old file paths are not obviously wrong. As such, the table can even end up including both old and new versions of the same file. This and more is tracked under T111481.
  • Lots of old data is left in module_deps from modules that no longer exist in recent versions of MediaWiki core and extensions, because there is no TTL and no garbage collection.
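As an illustration of the absolute-path issue in the first bullet, the sketch below stores paths relative to an install base and expands them against whichever base is current. It is one possible approach, not necessarily what T111481 settled on. The helper names and the `/srv/mw/branch-a` style directories in the usage note are hypothetical.

```python
from pathlib import PurePosixPath


def to_relative(path, base):
    """Strip the install base before storing; keep outside paths unchanged."""
    try:
        return str(PurePosixPath(path).relative_to(PurePosixPath(base)))
    except ValueError:
        # Not under the base directory: store the path as-is.
        return str(PurePosixPath(path))


def to_absolute(stored, base):
    """Expand a stored path against the install base of the running branch."""
    p = PurePosixPath(stored)
    return str(p if p.is_absolute() else PurePosixPath(base) / p)
```

For example, `to_relative("/srv/mw/branch-a/resources/x.less", "/srv/mw/branch-a")` yields `resources/x.less`, and expanding that against `/srv/mw/branch-b` points at the same file in the new branch, so a branch rollover alone does not change the tracked set.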

Also, since the values are deterministic, we do not need a store that is replicated across data centres. A dc-local store is sufficient.

Event Timeline

Krinkle created this task.Sep 27 2015, 11:25 PM
Krinkle raised the priority of this task from to Needs Triage.
Krinkle updated the task description.
Krinkle added subscribers: Krinkle, Catrope.
Restricted Application added a subscriber: Aklapper. Sep 27 2015, 11:25 PM
Krinkle triaged this task as Medium priority.Sep 30 2015, 12:14 AM
Krinkle set Security to None.
Krinkle moved this task from Inbox to Backlog: Small & Maintenance on the Performance-Team board.
Krinkle moved this task from Inbox to Backlog on the MediaWiki-ResourceLoader board.
Krinkle added a subscriber: ori.EditedNov 24 2015, 6:52 AM

The current aim is to continue making module building faster, with the intent of eventually enabling run-time "module content versioning" for all modules, in which case this tracking system becomes obsolete.

The recently added caching layer for LESS compilation (thanks to @ori) has brought us much closer to making it possible to compute the versions of all 3000+ modules ad hoc in the "startup" module. This progress makes me hopeful we'll be able to do this within a quarter or two.

Krinkle claimed this task.Dec 6 2016, 12:52 AM

(From Offsite) Using BagOStuff could work for this. It has an Sql subclass we can keep as default, but users can configure it to something else.

Krinkle removed Krinkle as the assignee of this task.Feb 8 2017, 6:08 PM
Krinkle updated the task description. Mar 3 2019, 4:10 PM
Krinkle assigned this task to aaron.Jun 10 2019, 1:45 PM

Change 519741 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] [WIP] resourceloader: move indirect module dependency path tracking to BagOStuff

https://gerrit.wikimedia.org/r/519741

Change 519746 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] objectcache: optmize lock() and unlock() for SqlBagOStuff and clean up base method

https://gerrit.wikimedia.org/r/519746

Change 519766 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] bagostuff: optimize SqlBagOStuff and fix failing segmentation tests

https://gerrit.wikimedia.org/r/519766

Change 520148 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] objectcache: clean up RedisBagOStuff and optimize changeTTLMulti()

https://gerrit.wikimedia.org/r/520148

Change 519746 merged by jenkins-bot:
[mediawiki/core@master] objectcache: optimize lock() and unlock() methods in SqlBagOStuff

https://gerrit.wikimedia.org/r/519746

Change 519766 merged by jenkins-bot:
[mediawiki/core@master] bagostuff: optimize SqlBagOStuff and fix failing segmentation tests

https://gerrit.wikimedia.org/r/519766

Change 520148 merged by jenkins-bot:
[mediawiki/core@master] objectcache: clean up RedisBagOStuff and optimize changeTTLMulti()

https://gerrit.wikimedia.org/r/520148

Change 519741 merged by jenkins-bot:
[mediawiki/core@master] resourceloader: support tracking indirect module dependency paths via BagOStuff

https://gerrit.wikimedia.org/r/519741

Change 591388 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[operations/mediawiki-config@master] Enable $wgResourceLoaderUseObjectCacheForDeps for testwiki/test2wiki

https://gerrit.wikimedia.org/r/591388

Change 596696 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] Enable $wgResourceLoaderUseObjectCacheForDeps for Beta Cluster

https://gerrit.wikimedia.org/r/596696

Change 596696 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable $wgResourceLoaderUseObjectCacheForDeps for Beta Cluster

https://gerrit.wikimedia.org/r/596696

Change 591388 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable $wgResourceLoaderUseObjectCacheForDeps for testwiki/test2wiki

https://gerrit.wikimedia.org/r/591388

Mentioned in SAL (#wikimedia-operations) [2020-05-16T17:49:25Z] <Krinkle> krinkle@mwmaint1002: Running cleanupRemovedModules.php to prune old module_deps rows T113916

Krinkle added a comment.EditedSat, May 16, 10:37 PM

With T252945 in fresh memory, I figured I'd measure the size of the data.

$ foreachwiki mysql.php -- -e 'SELECT SUM(LENGTH(md_deps)) FROM module_deps LIMIT 1;' | tee sum_len_md_deps.log
$ cat sum_len_md_deps.log | grep -E ' \d' | cut -d':' -f2 | sort -nr
  54030580
  41387066
  26092947
  21512320
  19007285
  14817439
  14417541
  14262574
  13003985
  12375295
  11551918
  9890952
  9761510
  9370569
  9069648
  8866616
  7975923
  7893188
  7750407
  7344726
  7131004
  6983012
  6577260
  6547808
  6242845
  6058073
  5948446
  5749254
  5697395
  5555347
  5447465
  5430159
  5267795
  5167557
  5134440
  5083149
  5055733
  4947383
  4927049
  4915445
  4912533
  4729311
  4543028
  4465091
  4406999
  4372184
  4349138
  4263134
  4216692
  4210804
  4166047
  4158580
  4116897
  4106736
  4056818
  3967823
  3775432
  3730321
  3677109
  3652283
  3651328
  3580369
  3578615
  3564137
  3541948
  3523754
  3500043
  3461331
  3412785
  3392040
  3387307
  3348026
  3340344
  3324441
  3273413
  3242790
  3226985
  3181596
  3127941
  3114936
  3082504
  3030944
  2980344
  2899500
  2879007
  2826075
  2815272
  2753153
  2749836
  2665111
  2626762
  2624422
  2615105
  2606131
  2598139
  2515757
  2493693
  2483965
  2477834
  2464640
  2444317
  2441889
  2414784
  2396959
  2388864
  2387132
  2351250
  2332838
  2301259
  2299767
  2298792
  2281616
  2272844
  2269252
  2266683
  2234966
  2219285
  2204627
  2200973
  2184963
  2161081
  2127392
  2093915
  2050962
  2047276
  2046763
  2040154
  2023501
  2013946
  2006284
  2000641
  1993902
  1987436
  1964954
  1959928
  1954852
  1952979
  1932661
  1922577
  1916785
  1899417
  1884378
  1873953
  1867376
  1833996
  1816899
  1808457
  1802558
  1798640
  1796262
  1792219
  1787891
  1779001
  1768398
  1767936
  1766950
  1765383
  1763840
  1752484
  1740087
  1732507
  1711417
  1704139
  1693397
  1692638
  1686655
  1682247
  1649658
  1624668
  1620208
  1612343
  1610470
  1608322
  1604476
  1604226
  1594274
  1589801
  1570156
  1569972
  1568991
  1567056
  1561489
  1557338
  1556598
  1554962
  1551363
  1550619
  1548643
  1547665
  1540183
  1526539
  1510320
  1503840
  1485509
  1481388
  1480921
  1470280
  1463026
  1447926
  1444144
  1437966
  1426704
  1423287
  1417154
  1416835
  1416200
  1415640
  1403681
  1399462
  1395759
  1394471
  1393515
  1392426
  1388374
  1383247
  1382824
  1377784
  1375962
  1372432
  1365178
  1364719
  1358817
  1355744
  1354526
  1351039
  1342600
  1342328
  1338006
  1328197
  1325801
  1320852
  1312123
  1309040
  1298778
  1297149
  1293863
  1292979
  1289978
  1287889
  1286522
  1277319
  1262836
  1260241
  1253840
  1252850
  1244344
  1243070
  1241882
  1239901
  1237965
  1237508
  1236748
  1235781
  1234854
  1231493
  1230979
  1222300
  1222088
  1218140
  1216332
  1210594
  1206158
  1201048
  1201013
  1199985
  1198886
  1193018
  1189563
  1187780
  1187162
  1182977
  1182036
  1179768
  1178324
  1170206
  1164213
  1160034
  1156215
  1150302
  1145231
  1140021
  1139432
  1139347
  1138385
  1134398
  1130283
  1128631
  1127824
  1124797
  1123231
  1120810
  1115669
  1114562
  1110217
  1110002
  1108924
  1103999
  1102941
  1095932
  1090253
  1089075
  1088017
  1087903
  1087872
  1086003
  1085640
  1085525
  1083610
  1083118
  1079765
  1079605
  1079076
  1077349
  1072605
  1070564
  1069346
  1066797
  1064510
  1063599
  1062973
  1062777
  1060218
  1059663
  1057861
  1057670
  1057643
  1055321
  1054789
  1053830
  1049245
  1047916
  1044945
  1044357
  1042859
  1041388
  1040976
  1037012
  1030441
  1027953
  1026453
  1025255
  1022724
  1019494
  1016157
  1015961
  1009648
  1008502
  1005196
  1004934
  1004541
  1002830
  1001569
  1001003
  998793
  998753
  995261
  992575
  989787
  986962
  986413
  985465
  983432
  982969
  980423
  980378
  980303
  977538
  976784
  973155
  972247
  972058
  969058
  968686
  965769
  959118
  957501
  955978
  950515
  950232
  949234
  949046
  948185
  947364
  947292
  946765
  943035
  941139
  940121
  936960
  933393
  931979
  929705
  929302
  927696
  922878
  921030
  920692
  919752
  918550
  915763
  915498
  915460
  911978
  910305
  910118
  907912
  906935
  906631
  906366
  905280
  901330
  900347
  900156
  895076
  894679
  893257
  892336
  889230
  888246
  888202
  884384
  881125
  879907
  872995
  868760
  867096
  866133
  863329
  863115
  862437
  862075
  861824
  860562
  859177
  858541
  854226
  853994
  850439
  848767
  848550
  844075
  842388
  840748
  839967
  837776
  833966
  833795
  833545
  830796
  829110
  829063
  827324
  824966
  824504
  823879
  822499
  817101
  816908
  816309
  815758
  811279
  810751
  809409
  807099
  806931
  804293
  801235
  793603
  790705
  790012
  789342
  785871
  785263
  785187
  783574
  780183
  779736
  779522
  774615
  773765
  772866
  772737
  767916
  767301
  767141
  763655
  762829
  758640
  757817
  753913
  752269
  746906
  746829
  746732
  744990
  743593
  743299
  742686
  742594
  741691
  735576
  733914
  732153
  730995
  730981
  730796
  729238
  727155
  724594
  723201
  723117
  722284
  718991
  718399
  717919
  715261
  714570
  714482
  714305
  713984
  713230
  712758
  711017
  710689
  707257
  706978
  706877
  704923
  704815
  703958
  702847
  702800
  696033
  693136
  690478
  689115
  686893
  686603
  685630
  685457
  683166
  681563
  679556
  679271
  678288
  677745
  676320
  675425
  675263
  675256
  673063
  671268
  670489
  667594
  666979
  665006
  664130
  661004
  659811
  659517
  658137
  652214
  650917
  650461
  648866
  647953
  647865
  647862
  647405
  646747
  644282
  641924
  641568
  637923
  637156
  636632
  635751
  635143
  634508
  634434
  634224
  633558
  633419
  632558
  632492
  629683
  629352
  629077
  628144
  627119
  625835
  625305
  625269
  624918
  624468
  622528
  619748
  619023
  618942
  618427
  617762
  617304
  616754
  613487
  613259
  612836
  612210
  611810
  610374
  610129
  609758
  609507
  608889
  606710
  605907
  603806
  601824
  601354
  600631
  597403
  597358
  597181
  594748
  590063
  588545
  587104
  586038
  584849
  584047
  583342
  582152
  581983
  580187
  579498
  575551
  574883
  573175
  571914
  570763
  570331
  568870
  568463
  568080
  566274
  561554
  561028
  559571
  558655
  558023
  557186
  557167
  555704
  553714
  552749
  552508
  548315
  547792
  546931
  545011
  544432
  543561
  541373
  541257
  539956
  534993
  533377
  532115
  530818
  529655
  525827
  525799
  524587
  524452
  524144
  522777
  522086
  521839
  521770
  521536
  520589
  519692
  518641
  517830
  516336
  515985
  514732
  514723
  512049
  511631
  510914
  509919
  506100
  505977
  504366
  502692
  499329
  499206
  496935
  491171
  489929
  489667
  487271
  486315
  484932
  482385
  480273
  479775
  477718
  476821
  474127
  471841
  471284
  469951
  469910
  469798
  469643
  467528
  467085
  466494
  465670
  465178
  462984
  461388
  460770
  460463
  456440
  456156
  453214
  453098
  452319
  452239
  450028
  449850
  444655
  442508
  441467
  440997
  437935
  437195
  436510
  434517
  434100
  431500
  429648
  428836
  428587
  428513
  426349
  426150
  424309
  421098
  420250
  418167
  417283
  416982
  414086
  413740
  413182
  412194
  410006
  409880
  406115
  405015
  404525
  404490
  403591
  402200
  401424
  400957
  399250
  399003
  398342
  396454
  395664
  394403
  389988
  389448
  388241
  387660
  387200
  386379
  385176
  383516
  383508
  382161
  381219
  378690
  374807
  374506
  374461
  371894
  370001
  369658
  366659
  365517
  363595
  362775
  356684
  353466
  352043
  350902
  350848
  349886
  348986
  348569
  347844
  346958
  345309
  344862
  344564
  343746
  342050
  340248
  337455
  336756
  330751
  330097
  329547
  323797
  320542
  319914
  319867
  315646
  313530
  311897
  311110
  309028
  307965
  306177
  306008
  304878
  303848
  301930
  301405
  300928
  300476
  298823
  292457
  287598
  287302
  286719
  280513
  277871
  275494
  273312
  268750
  267027
  265063
  263242
  256153
  254118
  249167
  247938
  241584
  236462
  235601
  234407
  233163
  230372
  227906
  225215
  224196
  224157
  224110
  223930
  221581
  221260
  220848
  220199
  220143
  218356
  216603
  214663
  214056
  213809
  213038
  212293
  211577
  211199
  210738
  209415
  208088
  205886
  205581
  204813
  203195
  202386
  199960
  199928
  198934
  197651
  196968
  195998
  193811
  189896
  188689
  185319
  183651
  182239
  181986
  178762
  178703
  176952
  176357
  175494
  171531
  170543
  162227
  160994
  152880
  149729
  149719
  149675
  148638
  135813
  134114
  132898
  128033
  126254
  117906
  117407
  111727
  104949
  103688
  100197
  89060
  84832
  83697
  80691
  72932
  72005
  66608
  65884
  62520
  61386
  60714
  59521
  56760
  56151
  56066
  51622
  48819
  42964
  37032
$ cat sum_len_md_deps.log | grep -E ' \d' | cut -d':' -f2 | awk '{s+=$1}END{print s}'
1330942793

On the most active and skin/language-diverse wikis it typically holds about 20MB of data over 150K rows. On the least-active/closed wikis it holds almost no data: about 37KB over a hundred rows or so. The total for all wikis combined is currently about 1.2GB. The theoretical max for current usage is about 20GB, e.g. if every wiki lazy-computes ~20MB of data, or 46GB if every wiki does what commons/wikidata do and needs ~50MB of data.
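The estimates above can be re-derived with a few lines of arithmetic. This sketch assumes that the 944 per-wiki values in the listing correspond to one value per wiki, and that the GB/MB figures are meant as binary units (GiB/MiB). Both are my reading of the numbers, not something stated in the comment.

```python
TOTAL_BYTES = 1330942793  # sum printed by the awk command above
WIKI_COUNT = 944          # number of per-wiki values in the listing (assumed one per wiki)
MIB = 1024 ** 2
GIB = 1024 ** 3


def projected_gib(per_wiki_mib, wikis=WIKI_COUNT):
    """Total size in GiB if every wiki held per_wiki_mib MiB of data."""
    return per_wiki_mib * MIB * wikis / GIB


current_gib = TOTAL_BYTES / GIB
print(f"current total:        {current_gib:.2f} GiB")
print(f"every wiki at 20 MiB: {projected_gib(20):.1f} GiB")
print(f"every wiki at 50 MiB: {projected_gib(50):.1f} GiB")
```

Under those assumptions the current total comes out a little over 1.2 GiB, the 20 MiB scenario at roughly 18 GiB (rounded up to "about 20GB" in the comment), and the 50 MiB scenario at roughly 46 GiB.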

Change 597171 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] resourceloader: Make DepStore write lock specific to the current wiki

https://gerrit.wikimedia.org/r/597171

Change 597171 merged by jenkins-bot:
[mediawiki/core@master] resourceloader: Make DepStore write lock specific to the current wiki

https://gerrit.wikimedia.org/r/597171