
Sentry running into infinite loop on beta.wmflabs.org
Open, High, Public

Description

When opening a page such as http://de.wikipedia.beta.wmflabs.org/wiki/Module:DateTime, something contacts http://sentry-beta.wmflabs.org/api/4/store/ ........ in an endless loop. The script blocks execution and the page never finishes loading. Debugging is not possible because of the loop; the only way to stop it is to close the browser tab. No further details are available.

Event Timeline

PerfektesChaos raised the priority of this task to Needs Triage.
PerfektesChaos updated the task description.
PerfektesChaos added a project: Sentry.
PerfektesChaos added a subscriber: PerfektesChaos.
Restricted Application added a subscriber: Aklapper. Aug 6 2015, 8:27 AM

FF 39.0, btw. IE10 same story.

Also on regular wikitext pages, but not on special pages.

The FF "stop script" prompt does not help: after stopping execution once, the same loop restarts a few milliseconds later and goes back to playing ping pong with sentry-beta.wmflabs.org. It looks like two small scripts on sentry are loaded alternately, each calling and loading the other (or something like that), again and again.

Caught one truncated message as the script was killed:
http://de.wikipedia.beta.wmfla…version=xkG2WRoS line 4 > eval:1

Tgr added a subscriber: Tgr. Aug 6 2015, 5:15 PM

Can't reproduce this on FF 39 on Linux (nor Chrome 44), I get no error whatsoever.

Either something is generating an endless sequence of errors, or the error reporting code is throwing an error itself. The first one is T526, the second should be easy to fix.
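
The second case (the reporting code itself throwing) is the kind of loop a re-entrancy guard prevents. A minimal sketch, assuming a hypothetical `send` callback that stands in for whatever transmits the report; this is not the actual Sentry integration code:

```javascript
// Hypothetical wrapper; `send` is an assumed callback that transmits one report.
function makeGuardedReporter(send) {
  let reporting = false; // true while a report is being sent

  return function report(error) {
    if (reporting) {
      // An error raised while reporting is dropped instead of reported again.
      return false;
    }
    reporting = true;
    try {
      send(error);
      return true;
    } catch (e) {
      // A failing reporter must not trigger another report.
      return false;
    } finally {
      reporting = false;
    }
  };
}
```

With this guard, a `send` that throws, or that calls the reporter again, results in at most one attempt per original error.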

Tgr added a comment. Aug 6 2015, 10:26 PM

I can reproduce with your user scripts loaded. The error is "Uncaught Error: module already implemented: user". Sentry works normally when I generate an error as an anonymous user, so probably one of your user scripts interferes with it somehow. Unfortunately the error does not occur in debug mode (not surprising, since user script loading works completely differently there), and in normal mode the debugger is fairly useless in both FF and Chrome.

Change 229986 had a related patch set uploaded (by Gergő Tisza):
Limit Sentry calls to 5 per page load

https://gerrit.wikimedia.org/r/229986
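
The idea of capping reports per page load can be sketched as a counting wrapper. This is an illustration only, not the code from change 229986; `limitCalls` and `reportError` are hypothetical names:

```javascript
// Hypothetical helper: lets `fn` run at most `maxCalls` times, then drops calls.
function limitCalls(fn, maxCalls) {
  let count = 0;
  return function (...args) {
    if (count >= maxCalls) {
      return undefined; // over budget: drop silently
    }
    count += 1;
    return fn(...args);
  };
}

// Usage: at most 5 reports for the lifetime of this page.
const sent = [];
const reportError = limitCalls((message) => sent.push(message), 5);
for (let i = 0; i < 100; i++) {
  reportError("error " + i);
}
console.log(sent.length); // 5
```

A cap like this bounds the damage of a loop but does not remove its cause.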

Thank you for putting a limit on the infinite loop.

However, the reason still needs to be explored.

  • There have been no changes in site JS or my user scripts in recent weeks.
  • T107399 has changed the loading and execution of the user module.
  • Many of the gadgets and site scripts involved include a clause which ensures that preferences and complex preconditions set by standard user scripts are respected:
    • mw.loader.using( "user", go );
    • If executed in advance, that changes the state of the user module, but does not implement its script part.
    • The error in mw.loader.implement() would require a script component in addition to the state; see // Check for duplicate implementation
  • I swear that no site script implements a user module, let alone one written in the last few days.
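
The distinction drawn above between a module's state and its script part can be illustrated with a toy registry. This is a simplified model written for this comment, not the real mw.loader code; `createRegistry` and its state names are assumptions:

```javascript
// Simplified model for illustration only; not the real mw.loader implementation.
function createRegistry() {
  const modules = new Map();

  function entry(name) {
    if (!modules.has(name)) {
      modules.set(name, { state: "registered", script: undefined });
    }
    return modules.get(name);
  }

  return {
    // Requesting a module only changes its state; no script is attached.
    using(name) {
      const mod = entry(name);
      if (mod.state === "registered") {
        mod.state = "loading";
      }
    },
    // Implementing attaches the script; a second attempt is a duplicate.
    implement(name, script) {
      const mod = entry(name);
      if (mod.script !== undefined) {
        throw new Error("module already implemented: " + name);
      }
      mod.script = script;
      mod.state = "loaded";
    },
    state(name) {
      return entry(name).state;
    },
  };
}
```

In this model, calling `using("user")` before `implement("user", ...)` is harmless; only a second `implement` for the same name throws.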

I have the impression that the changed async loading procedure is calling itself multiple times, but I have no idea how or why.

Just an idea:

Could it be that mw.loader.using( "user", go ); within a site script or gadget triggers mw.loader.implement( "user" ) immediately, and the regular startup procedure later unconditionally calls mw.loader.implement( "user" ) once again?

Should be resolved quite soon.

@Krinkle: Any idea where Uncaught Error: module already implemented: user came from?

Should be investigated and avoided. Race condition?

The "Uncaught Error: module already implemented: user" was an intermittent error for some users last week (mostly on August 6). It was short-lived and has since been fixed and deployed (T108275, https://gerrit.wikimedia.org/r/230018).

Change 229986 merged by jenkins-bot:
Limit Sentry calls to 5 per page load

https://gerrit.wikimedia.org/r/229986

Tgr added a comment. Sep 1 2015, 11:50 PM

This does not seem to be related to RL (ResourceLoader) module loading errors; I just ran into this behavior on beta enwiki.

Tgr triaged this task as High priority. Sep 1 2015, 11:50 PM
Tgr removed a project: Patch-For-Review.
Tgr set Security to None.