Page MenuHomePhabricator

The error "module already implemented: (name)" error should not be logged in client side error handling
Closed, ResolvedPublic

Description

NOTE: previously T262493 but the scope of that has grown.

The volume of this error is extremely problematic and happening too often to be ignored. It happens at levels that in the past we have considered UBN if it occurred for newly deployed code and adds unnecessary noise to our client-side error logging instrumentation. It seems to occur when mw.loader.implement is called twice on the same module name in a user script because of users loading modules from different sources.

Too many unique IP addresses are triggering this bug for us to put the blame for this on the end-user and this is going to increase considerably when we roll out to English Wikipedia. At the time of writing in a 24hr period, there were 9,782 instances out of 32,079 unresolved errors meaning this accounts for 30% of the errors we track (this excludes errors from browser extensions and non-wikimedia domains).

Replication steps

  • I enable the cat a lot gadget on https://commons.wikimedia.org/wiki/Special:Preferences#mw-prefsection-gadgets
  • I add the line mw.loader.load('//commons.wikimedia.org/w/load.php?modules=ext.gadget.Cat-a-lot'); to my global JS at meta.wikimedia.org/wiki/User:Jdlrobson/global.js as I want the gadget on every wiki I use and some wikis do not have it.
  • Every page view triggers a JS error

Developer notes

One possible solution, if global gadgets are not really supported, would be to update ResourceLoader to include wgSiteName in its call to mw.loader.implement and refuse to register such modules from another domain. If that's too controversial, it should at least not throw an Error in such a case or tag the error in such a way that the client-side error logging can ignore it.

Please determine the best way to remove these errors from the logs or if you think the errors need to be fixed, provide an alternative strategy for doing that.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Jdlrobson updated the task description. (Show Details)

Copying in relevant snippets from the other task:

[…]

If RL's state machine were to break and fetch or execute a module twice or if client storage corrupts in a way we don't detect, that's a serious issue and applies to all modules equally, gadgets or otherwise. Gadgets if anything are more likely to excercise such bugs due to our own code being relatively consistent and simple compared to the unaudited/excotic code out there in the wild.

Like all errors, the eror should be impossible. And in general when using dependencies, or using(), or load(), fetches are naturally coellesced, re-used and de-duplicated. Thus if these standard interfaces are used and have no bugs, the error does not happen. Catching when that goes wrong for whatever reason is hugely important.

In this case, a gadget is fetching source code directly from a load.php endpoint, and then fetching a similar payload from another load.php endpoint, both of which bypass caching and de-duplication, at which point the second one is an error as it can't execute a different payload under the same name. There isn't a way to distinguish that from real errors. And this is in fact a real error. The second payload ended up being rejected and not actually execute, despite the user asking for it. If there was a tolerant way to perform this operation, I don't think it would take off in practice since whover writes the original code would want to have error detection etc if they're a more experienced gadget developer. […]

[…]

[…]
The replication steps are straightforward.

  • I enable the cat a lot gadget on https://commons.wikimedia.org/wiki/Special:Preferences#mw-prefsection-gadgets
  • I add the line mw.loader.load('//commons.wikimedia.org/w/load.php?modules=ext.gadget.Cat-a-lot'); to my global JS at meta.wikimedia.org/wiki/User:Jdlrobson/global.js as I want the gadget on every wiki I use and some wikis do not have it.
  • Now every page view on Commons results in the error Uncaught Error: module already implemented: ext.gadget.Cat-a-lot

Now is this an issue with global gadgets or gadgets, or ResourceLoader I don't know, […]

[…]

In a nut shell, the line of code you show above is invalid. It's not meant to work and doesn't demonstrate any kind of issue with Gadgets or ResourceLoader.

If you want to know unofficially how one could try to load gadgets across wikis, here's a few ways I've seen: […]

@Jdlrobson wrote in the T266720 task description:

[…] update ResourceLoader to include wgSiteName in its call to mw.loader.implement and refuse to register such modules from another domain. If that's too controversial, it should at least not throw an Error in such a case or tag the error in such a way that the client-side error logging can ignore it.

This wouldn't work. There is only one browsing and JS context in any given tab (apart from workers/iframes). Scripts loaded from another domain execute in the tab they are loaded into. There isn't some other hidden context from the origin domain in which it executes, and is thus why XSS concerns exist.

@Jdlrobson wrote in the T266720 task description:

Please determine the best way to remove these errors from the logs or if you think the errors need to be fixed, provide an alternative strategy for doing that.

The steps to reproduce deterministically cause an error because they are invalid. Requesting the same thing twice whilst bypassing the module loader would be a serious bug in any other cirumstance. In terms of how we receive it as Logstash triagers, this should imho be nofrom e.g. many different scripts calling an undefined method or referencing an undefind variable. It needs to either be fixed at the source, or be categorically excluded in some magic way if we can detect the origin of a stack to be user/gadget-related which is the topic of T262493.

I've already laid out four ways to do this in a way that isn't broken, at T262493#6455650.

I've already laid out four ways to do this in a way that isn't broken, at T262493#6455650.

I don't see how this is helpful. How do you plan to get community to act on these guidelines? It's not me that needs to fix it but hundreds of users and it doesn't sound like a user notice is going to help with that approach...

Change 638132 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/WikimediaEvents@master] Exclude the module already implemented error

https://gerrit.wikimedia.org/r/638132

I am seeing a pattern in how this happens. It happens when a gadget loads code from meta.wikimedia.org.

I was able to replicate the errror using the following steps but was unable to reproduce once it had happened suggesting caching might be at play:

This pattern appears to be prevalent across the wikipedias, for example consider https://nl.wikipedia.org/wiki/Gebruiker:Daniuu/common.js combined with https://nl.wikipedia.org/wiki/MediaWiki:Gadget-RTRC.js

Reading through that code, it looks like a JS error should never occur (if a gadget is defined on the wiki mw.loader.getState('ext.gadget.rtrc') will always be non null) so I'm a little suspicious there could be a race condition of some kind happening here. @Krinkle the snippet being used appears to your code, any ideas what might be happening?

The gadget is defined on mediawiki.org as rtrc. The snippet on nl.wikipedia.org as registered as RTRC (uppercase). If the nl.wikipedia.org gadget actually had the same name, then the getState check would work exactly as you describe, and thus no error occur. But, if that was the case, the gadget would also cease to work for nl.wikipedia.org as that would mean the proxy gadget will be checking its own state and ask the browser to ensure that itself is being loaded, which it already is. Thus it would do nothing, instead of actually loading the gadget from the other wiki. I guess that's why they named it differently.

Users can and always will take shortcuts to achieve what they want and nothing more. Production quality doesn't apply to user scripts. If someone's script works when they want it to work, then they likely won't have interest or motiviation to invest endlessly in coding against every way it can fail only to make it fail the same way it was already failing, but with more code complexity and less noise in a debug channel they don't care much about (window.onerror). The browser already ensures these errors fail gracefully and in isolation from anything else on the page. It's only us as third-party observes to the user's private JS console errors that are affected by this.

Users will install browser extensions and user scripts that don't always work. And they will take short cuts or widely spread a broken pattern that only works some of the time but enough of the time for that.

As such, I don't think we should spend any further time on this issue nor any other gadget/userscript-specific issue unless it is with the intention to improve a particular gadget's code base for some (other) reason. Let's focus instead on the thresholds, and on how to more generally ignore gadget-induced errors, at T262493.

(Unsupported: ) The behaviour we're seeing is consistent with that would happen if you didn't have the getState-check at all, or if you simply repeated this instruction twice in the same user script, or imported two non-gadget user scripts containing the same code. One of them will fail in that case, as expected, since it bypasses the broser's and RL state management and thus starts two real HTTP script fetches for a same-named module, one of which will fail on arrival, as expected.

(Unsupported: ) The reason I use a getState-check for my gadgets is so that my snippet in Meta-Wiki global.js, which applies to all wikis, also works when browsing the wiki where the script is first uploaded by me (mediawiki.org) without causing a "module already implemented" error in my error console. Because I generally do look for errors there and want to avoid being distracted by this harmless false negative. However if, unlike me, someone else copies my snippet and pastes it twice in their global.js or commons.js or otherwise makes choices causing it to execute twice, then it is likely that it will cause an error. As I described above, I too am user that makes compromises to write something minimal that works for me. In this case, I re-created a very tiny portion of RL's state model (the initial getState) check to get it to do the right thing no matter which wiki I am on. What I didn't bother to do, was inform the browser before I started the HTTPS fetch that I did so, so that a potential duplicate copy of the same snippet would see that it doesn't need to start a fetch.

(Unsupported: ) If I cared for this scenario, or if someone else wants to be able to have two copies enabled and also have a error-free console, they could address it like so:

  mw.loader.getState(…)
    ? mw.loader.load(…);
-   : mw.loader.load('https://elsewhere.example/?module=…');
+   : (mw.loader.state('…':'loading'}) && mw.loader.load('https://elsewhere.example/?module=…'));

Change 638132 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] Exclude the module already implemented error for gadgets

https://gerrit.wikimedia.org/r/638132

Jdlrobson claimed this task.