bzimport set Reference to bz65424.
He7d3r created this task. Via Legacy, May 16 2014, 10:47 PM
gerritbot added a comment. Via Conduit, May 16 2014, 11:44 PM

content hidden as private in Bugzilla

gerritbot added a comment. Via Conduit, May 16 2014, 11:46 PM

Change 133871 had a related patch set uploaded by Krinkle:
[wmf debug] resourceloader: Output servedBy when load.php has an error

https://gerrit.wikimedia.org/r/133871

gerritbot added a comment. Via Conduit, May 16 2014, 11:48 PM

content hidden as private in Bugzilla

matmarex added a comment. Via Conduit, May 16 2014, 11:49 PM

[23:19] <greg-g> now I just need to figure out what is causing: https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Skin_and_gadget_issues_16_May_2014
[23:24] <MatmaRex> huh, i reproduced it
[23:24] <MatmaRex> and looking at this URL: https://bits.wikimedia.org/en.wikipedia.org/load.php?debug=false&lang=en&modules=ext.echo.badge%7Cext.gadget.BugStatusUpdate%2CDRN-wizard%2CReferenceTooltips%2CWatchlistChangesBold%2Ccharinsert%2Cedittop%2CmySandbox%2CrefToolbar%2Csearch-new-tab%2Cteahouse%7Cext.geshi.language.css%2Chtml4strict%2Cjavascript%2Ctext%7Cext.geshi.local%7Cext.uls.nojs%7Cext.visualEditor.viewPageTarget.noscript%7Cext.wikihiero%7Cmediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.skinning.interface%7Cmediawiki.ui.button%7Cmw.PopUpMediaTransform%7Cskins.vector.styles%7Cwikibase.client.init&only=styles&skin=vector&*
[23:24] <MatmaRex> i see a debug comment
[23:24] <MatmaRex> Problematic modules: {"ext.gadget.BugStatusUpdate":"missing","ext.gadget.DRN-wizard":"missing","ext.gadget.ReferenceTooltips":"missing","ext.gadget.WatchlistChangesBold":"missing","ext.gadget.charinsert":"missing","ext.gadget.edittop":"missing","ext.gadget.mySandbox":"missing","ext.gadget.refToolbar":"missing","ext.gadget.search-new-tab":"missing","ext.gadget.teahouse":"missing"}
[23:24] <MatmaRex> so, gadgets have magically disappeared

[23:26] <ori> they're in the startup module
[23:26] <ori> mw.loader.getState('ext.gadget.teahouse')
[23:26] <ori> > "ready"
[23:27] <ori> nothing missing for me
[23:27] <ori> in that url
[23:28] <MatmaRex> ori: took a few page refreshes for me
[23:28] <ori> oh yeah
[23:28] <MatmaRex> i refreshed that URL now and it loaded right
[23:28] <ori> i got it now

[00:49] <greg-g> ori: MatmaRex odder can still repro
[00:50] greg-g just asked Krinkle to take a look now that he's back online
[00:51] <MatmaRex> yeah, i can reproduce still too
[00:52] <MatmaRex> seems to happen randomly, like 20% of time when i load an uncached URL
[00:53] <MatmaRex> all gadget modules are "missing"

matmarex added a comment. Via Conduit, May 16 2014, 11:50 PM

[01:32] <MatmaRex> Krinkle: unless you have better ideas, i'd check if Gadget::loadStructuredList is sometimes returning null when it shouldn't be, and if yes, why is it doing that
[01:33] <Krinkle> MatmaRex: yeah, I'm mw-evalling now
[01:35] <Krinkle> Gadget::loadStructuredList() and the underlying memcached object is fine
[01:35] <Krinkle> at least not critical, let me inspect it

matmarex added a comment. Via Conduit, May 16 2014, 11:53 PM

The summary as of right now is that Gadgets' generated modules are sometimes missing in load.php (and thus gadgets are not loaded) and we don't know why, but top people are on it. :P

It doesn't seem to affect any other modules nor any of Gadget's UI (like the special page or preferences).

gerritbot added a comment. Via Conduit, May 16 2014, 11:55 PM

Change 133871 merged by jenkins-bot:
[wmf debug] resourceloader: Output servedBy when load.php has an error

https://gerrit.wikimedia.org/r/133871

Krinkle added a comment. Via Conduit, May 17 2014, 12:20 AM

Of the 4 application servers for bits in eqiad (mw1149, mw1150, mw1151, mw1152), I've identified mw1151 as the problematic one.

Using requests like the following https://bits.wikimedia.org/en.wikipedia.org/load.php?debug=false&lang=en&modules=ext.echo.badge%7Cext.gadget.BugStatusUpdate%2CDRN-wizard%2CReferenceTooltips%2CWatchlistChangesBold%2Ccharinsert%2Cedittop%2CmySandbox%2CrefToolbar%2Csearch-new-tab%2Cteahouse%7Cext.geshi.language.css%2Chtml4strict%2Cjavascript%2Ctext%7Cext.geshi.local%7Cext.uls.nojs%7Cext.visualEditor.viewPageTarget.noscript%7Cext.wikihiero%7Cmediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.skinning.interface%7Cmediawiki.ui.button%7Cmw.PopUpMediaTransform%7Cskins.vector.styles%7Cwikibase.client.init&only=styles&skin=vector&*bust123

(Keep changing the trailing "bust" string in the query so that each request is a fresh cache miss.)
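
The probing approach above can be sketched in Python. This is only an illustration, not the script that was actually used: the helper names `busted_urls` and `missing_modules` are hypothetical, the shortened example URL is a stand-in for the full one, and the parser assumes the debug comment looks like the "Problematic modules: {...}" line quoted earlier in this task.

```python
import json
import re

# Matches the flat JSON object that follows "Problematic modules:" in the
# load.php debug comment (format assumed from the IRC excerpt in this task).
PROBLEM_RE = re.compile(r"Problematic modules:\s*(\{.*?\})")


def busted_urls(base_url, count, prefix="bust"):
    """Yield `count` URLs that differ only in a trailing cache-busting suffix."""
    for i in range(count):
        yield f"{base_url}{prefix}{i}"


def missing_modules(body):
    """Return the sorted names of modules reported as "missing" in a response body."""
    match = PROBLEM_RE.search(body)
    if match is None:
        return []
    states = json.loads(match.group(1))
    return sorted(name for name, state in states.items() if state == "missing")


# Hypothetical, shortened base URL; the real request lists many more modules.
example_base = "https://bits.wikimedia.org/en.wikipedia.org/load.php?modules=ext.gadget.teahouse&only=styles&skin=vector&*"
example_urls = list(busted_urls(example_base, 3))
```

Fetching each generated URL and passing the response text to `missing_modules` would show which requests came back without the gadget modules. Grouping those results by the servedBy value is left out here because its exact output format is not shown in this task.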

Ori noticed that mw1151 shows a CPU spike in Ganglia and has disk issues.

I confirmed via mwscript on the local Apache that its memcached is unable to retrieve or store a value for the cache key 'enwiki:gadgets-definition:7', which is used by Gadget::loadStructuredList().
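
A store-then-fetch check of this kind can be sketched as follows. This is an illustrative Python sketch rather than the mwscript session itself: `cache_round_trip_ok`, `DictCache`, and `DroppingCache` are hypothetical names, and the `get`/`set` interface is an assumption standing in for whatever cache client is available.

```python
def cache_round_trip_ok(cache, key, probe="probe-value"):
    """Return True if a value written under `key` can be read back unchanged.

    Use a dedicated probe key (an assumption of this sketch) so that real
    cached data is not overwritten.
    """
    try:
        cache.set(key, probe)
        return cache.get(key) == probe
    except Exception:
        # Any client error counts as a failed round trip.
        return False


class DictCache:
    """Healthy in-memory stand-in with a get/set interface."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DroppingCache(DictCache):
    """Stand-in for a faulty node: writes are silently discarded."""

    def set(self, key, value):
        pass


healthy = cache_round_trip_ok(DictCache(), "probe:gadgets")
faulty = cache_round_trip_ok(DroppingCache(), "probe:gadgets")
```

Here `healthy` is True and `faulty` is False, mirroring the difference between a working node and one that can neither store nor return the value.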

We should depool that node and have ops look into it.

ori added a comment. Via Conduit, May 17 2014, 1:19 AM

Depooled. There is no indication that this was caused by a software fault, so I'm closing this bug as resolved. Once we have the full story on what happened to that server, a postmortem will be posted to https://wikitech.wikimedia.org/wiki/Incident_documentation.

He7d3r added a comment. Via Conduit, Jun 4 2014, 5:50 PM

Thanks! :-)
