Page MenuHomePhabricator

Changes to module dependencies are not propagating to load.php startup manifest
Closed, ResolvedPublic

Description

During set up of TemplateWizzard as a gadget at nowiki an entry in MediaWiki:Gadgets-definition was defined wit empty dependencies. The actual entry was like this, which created an error report in the console, as it should be like this. That is without dependecies. Still, even after fixing this the error message persist

Error: "Unknown dependency: "
  sort Dependencies https://no.wikipedia.org/w/load.php?debug=false&lang=nb&modules=startup&only=scripts&skin=vector:91:815
  sort Dependencies https://no.wikipedia.org/w/load.php?debug=false&lang=nb&modules=startup&only=scripts&skin=vector:92:402
  resolve Stubbornly https://no.wikipedia.org/w/load.php?debug=false&lang=nb&modules=startup&only=scripts&skin=vector:92:720
  load https://no.wikipedia.org/w/load.php?debug=false&lang=nb&modules=startup&only=scripts&skin=vector:103:675
  <anonymous> https://no.wikipedia.org/w/index.php?title=Hamarkameratene&action=edit:9:4
  start Up https://no.wikipedia.org/w/load.php?debug=false&lang=nb&modules=startup&only=scripts&skin=vector:85:42
  onload https://no.wikipedia.org/w/load.php?debug=false&lang=nb&modules=startup&only=scripts&skin=vector:108:570
  load.php:86:202

I have checked if this creates the same error at sewiki, but it does not.

If a page is edited with debug=1 then the error goes away.

My guess is that the error is thrown because of an earlier cached value, and it happens before the load script can refresh the same cache. If the debug flag is set, then the cache is not used, and no error is reported.


The actual error as reported

Event Timeline

jeblad created this task.Aug 10 2018, 2:09 PM
Restricted Application added a project: Performance-Team. · View Herald TranscriptAug 10 2018, 2:09 PM
Restricted Application added subscribers: Danmichaelo, Aklapper. · View Herald Transcript
jeblad triaged this task as High priority.Aug 10 2018, 2:12 PM

Setting this as high as it could (?) interfere with other scripts loaded at the same time.

jeblad updated the task description. (Show Details)Aug 10 2018, 2:17 PM
Tgr added a subscriber: Tgr.Aug 10 2018, 4:48 PM

I don't see that error, can you provide more specific reproduction steps?

Given that I wipe all gadgets on w:no:Spesial:Innstillinger#mw-prefsection-gadgets
and add the gadget "TemplateWizzard: legger til et verktøy for enkel innsetting av maler."
When I open for example w:no:Hamarkameratene for editing
Then the toolbar should have a puzzle piece to open the TemplateWizzard

What actually happen is that there are no puzzle piece, but there is an error entry in the console.

Now try to access the link w:no:Hamarkameratene for editing in debug mode, then the console error goes away and the puzzle piece appears in the toolbar.

jeblad updated the task description. (Show Details)Aug 10 2018, 6:18 PM
Krinkle added a subscriber: Krinkle.EditedAug 10 2018, 7:55 PM

@jeblad When you say "wipe", do you mean "disable all check boxes and save preferences", or do you mean "edit MediaWiki:Gadgets-definition and delete all gadgets"?

(The same applies to "add the gadget" – creating the gadget, or enabling the gadget.)

"disable all check boxes and save preferences", and "enabling the gadget"

But just try it, it is faster than asking me.

jeblad lowered the priority of this task from High to Normal.Aug 11 2018, 9:28 PM

The error itself has gone away, but I believe there is a real bug here, so I will not close the report. I have altough lowered the priority to "normal".

I suspect this may've been due to T199416 which is now fixed.

@jeblad If you see it again, do let us know. Otherwise, w'll close this by Monday.

jeblad added a comment.EditedAug 17 2018, 2:20 PM

The error isn't there now.

I had this error about 10 hours after I created this task, but it was gone at about 25 hours later. I'm pretty sure I checked it when I wrote T201686#4495861, but if I didn't then it could have been T199416. It could also have gone away because the change triggered a reload.

Not quite sure, could be the same, could also be something else. I don't want to recreate the error state on a live wiki just to check if the bug is still there. ;)

Can you try it on https://test.wikipedia.org? I've just granted you sysop there for 1 year.

Wondered why the heck I was admin there… ;)

jeblad added a comment.EditedAug 18 2018, 2:06 PM

Created the same entry as this change at MediaWiki:Gadgets-definition, which triggered the bug, and 20 minutes later it was still there while at 45 minutes later it was gone. It should although be gone when the entry was updated.

@jeblad So to clarify, the issue is not that the warning happens (given "" is an invalid dependency). The issue is that, when fixing the issue on Gadgets-definition, the change does not happen on normal pages until 1 hour later, instead of being updated within the normal 5-minute update cycle. Correct?

If so, that is indeed a bug, and sounds very related to T199416. I can confirm that the 5-minute update cycle does generally work correctly, but there might be something specific to either ResourceLoaderWikiModule or the Gadget subclass for it that is caching it for too long, or not getting purged correctly.

In short; yes.

Exact sequence to trigger the bug

Given I have added a faulty entry like * template-wizzard[ResourceLoader|rights=edit|type=general|dependencies=]|template-wizzard.js
and the JS-page exists (and so forth…)
When I fix the faulty entry like * template-wizzard[ResourceLoader|rights=edit|type=general]|template-wizzard.js
Then I expect any error to go away
and the page to load as usual

What happen is that the error goes away 20-45 minutes after the entry is corrected.

My guess; something catches the bug before the normal purge is reached.

Imarlier assigned this task to Krinkle.Aug 20 2018, 8:07 PM
Imarlier moved this task from Inbox to Doing on the Performance-Team board.
Before
Gadgets-definition
metadata[ResourceLoader|dependencies=mediawiki.util]|metadata.js|metadata.css
startup
... ["ext.gadget.metadata","1qjy88h",[146],"site"] ... 
//> 146 correlates with "mediawiki.util"
server-console
$ mwscript eval.php --wiki test2wiki
> $gr = GadgetRepo::singleton() var_dump($gr->getGadget('metadata')->getDependencies());
array(1) {
  [0]=>
  string(14) "mediawiki.util"
} 
> echo get_class(ObjectCache::getLocalServerInstance('hash'));
APCBagOStuff

After

After editing MediaWiki:Gadgets-definition to remove [dependencies=mediawiki.util]. I find that the server-side immediately reflects the update.

$ mwscript eval.php --wiki test2wiki
> $gr = GadgetRepo::singleton(); var_dump($gr->getGadget('metadata')->getDependencies());
array(0) {
}

This shows that GadgetHooks::onPageContentSaveComplete is working as intended. It touches the WANObjectCache checkKey, and both the per-cluster cache (Memcached) and per-server cache (APC) are correctly being validated and updated or bypassed automatically.

However, over HTTP in the browser it keeps showing the stale version in the startup module. It wasn't until I manually bypassed the cache (with a random query parameter) that I got the updated result.

startup&_random4
... ["ext.gadget.metadata","1qjy88h",[],"site"] ...

Normally I would stop here. It shows everything seems to be working fine. The caches should roll over automatically every 5 minutes for ResourceLoader, and I've tested this often enough to know that this is working fine today in production. Except when I checked back half an hour later, it was still stale in Varnish.

I tried to reproduce the same issue again, and indeed it stayed stale again. I found the culprit when looking at the headers

Before: startup
200 OK
...
etag: W/"1ffgsq9"
...
... ["ext.gadget.metadata","1qjy88h",[146],"site"] ...
After: startup
200 OK
...
etag: W/"1ffgsq9"
...
... ["ext.gadget.metadata","1qjy88h",[],"site"] ...

It seems that while the registry contents have changed, the ETag computed by the startup module remained the unchanged. This a bug and this bug is not limited to Gadgets, and it is not limited to dependencies being set to empty.

This bug affects any change to dependencies, for any module in the system.

Krinkle renamed this task from Cache issue in load.php to Changes to module dependencies are not propagating to load.php startup manifest .Aug 29 2018, 12:51 AM

Change 456326 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] resourceloader: Remove selective build optimisation from getModuleContent()

https://gerrit.wikimedia.org/r/456326

Change 456327 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] resourceloader: Use 'enableModuleContentVersion' for startup module

https://gerrit.wikimedia.org/r/456327

Change 456326 merged by jenkins-bot:
[mediawiki/core@master] resourceloader: Remove selective build optimisation from getModuleContent()

https://gerrit.wikimedia.org/r/456326

@Krinkle Nice investigation! =)

Change 456327 merged by jenkins-bot:
[mediawiki/core@master] resourceloader: Use 'enableModuleContentVersion' for startup module

https://gerrit.wikimedia.org/r/456327

Change 458059 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@wmf/1.32.0-wmf.20] resourceloader: Use 'enableModuleContentVersion' for startup module

https://gerrit.wikimedia.org/r/458059

Change 458059 merged by jenkins-bot:
[mediawiki/core@wmf/1.32.0-wmf.20] resourceloader: Use 'enableModuleContentVersion' for startup module

https://gerrit.wikimedia.org/r/458059

Mentioned in SAL (#wikimedia-operations) [2018-09-05T01:42:52Z] <krinkle@deploy1001> Synchronized php-1.32.0-wmf.20/includes/resourceloader/: I8e8d3a2cd2cc - T201686 (duration: 01m 31s)

Krinkle closed this task as Resolved.Sep 5 2018, 1:44 AM
Krinkle removed a project: Patch-For-Review.
Krinkle moved this task from Blocked or Needs-CR to Doing on the Performance-Team board.

This change will be deployed this week as part of 1.32.0-wmf.20. See also T191066 and https://www.mediawiki.org/wiki/MediaWiki_1.32/Roadmap.