Frivolous Jenkins failures for Selenium due to DB error
Closed, Resolved, Public

Description

22:23:44       Query: SHOW MASTER STATUS
22:23:44       Function: DatabaseMysqlBase::getMasterPos
22:23:44       Error: 1227 Access denied; you need (at least one of) the SUPER,REPLICATION CLIENT privilege(s) for this operation (127.0.0.1:3306)

https://integration.wikimedia.org/ci/job/mwext-mw-selenium/9571/console

(The error is buried in a parse failure: the database error comes back wrapped in HTML, and the client attempts to parse that HTML as JSON.)
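A minimal Python sketch of how a client could surface the wrapped error text instead of a bare JSON parse failure. The function name `decode_api_response` is hypothetical and is not taken from the job's actual code; it only illustrates the idea of falling back to the readable body when JSON decoding fails.

```python
import json
import re


def decode_api_response(body):
    """Return the parsed JSON, or raise ValueError carrying the readable body text.

    Hypothetical helper: assumes `body` is the response text as a str.
    """
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # Strip HTML tags so the underlying database error is readable in the log.
        text = re.sub(r"<[^>]+>", " ", body)
        # Collapse the whitespace left behind by the removed tags.
        text = " ".join(text.split())
        raise ValueError("non-JSON response: " + text[:500]) from None
```

With something like this, the console output would show the `Error: 1227 Access denied` text directly rather than a JSON syntax error.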

Event Timeline

Catrope created this task. Aug 29 2016, 10:34 PM
Restricted Application added a subscriber: Aklapper. Aug 29 2016, 10:34 PM
Ladsgroup triaged this task as Unbreak Now! priority. Aug 30 2016, 2:26 PM
Restricted Application added subscribers: Jay8g, Luke081515, TerraCodes. Aug 30 2016, 2:26 PM

All of the Jenkins jobs for Wikidata are failing; we can't merge anything there.

hoo added a subscriber: hoo. Aug 30 2016, 3:24 PM

Note: a94fe6c634780cd203ea79287b61966bacfbfdae should be reverted once this is fixed.

greg added a subscriber: greg. Aug 31 2016, 3:48 PM

Does that mean that you see things working again?

Each example ran on a different slave. It looks like the database permissions are off on the CI slaves.

It looks like the user we create does not have permission to execute SHOW MASTER STATUS. I cannot remember how the privileges are set; maybe via a slave script in integration/jenkins.git or via MediaWiki itself.
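As a rough illustration, the privileges MySQL asks for can be read straight out of the 1227 message quoted in the description. The Python sketch below does that and builds a matching `GRANT` statement. The function names and the `ci_user`/`localhost` account are hypothetical placeholders, not the account the CI slaves actually use, and this is not a confirmed fix for this task.

```python
import re


def required_privileges(error_message):
    """Extract the privilege names listed in a MySQL error 1227 message."""
    match = re.search(
        r"need \(at least one of\) the (.+?) privilege\(s\)", error_message
    )
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",")]


def grant_statement(privilege, user, host):
    """Build a global GRANT statement; user and host here are placeholders."""
    return "GRANT {} ON *.* TO '{}'@'{}';".format(privilege, user, host)


message = (
    "Error: 1227 Access denied; you need (at least one of) the "
    "SUPER,REPLICATION CLIENT privilege(s) for this operation (127.0.0.1:3306)"
)
privileges = required_privileges(message)
# REPLICATION CLIENT is the narrower of the two privileges MySQL lists.
statement = grant_statement("REPLICATION CLIENT", "ci_user", "localhost")
```

Either privilege would satisfy SHOW MASTER STATUS according to the error text; REPLICATION CLIENT is used in the sketch only because it is far less broad than SUPER.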

I highly suspect that something has changed in MediaWiki which causes some code path to look for a database slave even when no database slaves are configured.

Another possibility is that MySQL got magically upgraded on the CI slaves (via unattended upgrade) and whatever permission is needed got dropped when the .deb package postinst upgraded the database.

Jenkins doesn't fail on Wikibase tests now, so maybe we can lower the priority now.

Wikibase got "fixed" by skipping all the impacted tests.

The job running on the CI slaves failed at least twice for Echo on https://gerrit.wikimedia.org/r/#/c/307442/ .

https://integration.wikimedia.org/ci/job/mwext-mw-selenium/ and https://integration.wikimedia.org/ci/job/mwext-mw-selenium/buildTimeTrend show that the job is apparently passing fine now. The last failure was probably build 9609 at 22:30 UTC yesterday. Maybe the fault was in MediaWiki?

So yes, we can probably lower the priority.

If we could find a way to reproduce the failure, or a patch that highlights the issue and/or narrows down its source, that would be quite nice.

greg lowered the priority of this task from Unbreak Now! to High. Aug 31 2016, 4:39 PM
hashar closed this task as Resolved. Nov 17 2016, 9:38 AM
hashar claimed this task.

Wikibase change a94fe6c634780cd203ea79287b61966bacfbfdae has been reverted, and the revert eventually passed: https://gerrit.wikimedia.org/r/#/c/307696/

I have checked the tip of the master branch: the scenarios are still flagged with the integration tag and thus selected by the mwext-mw-selenium* jobs.

I have no idea what went wrong. Let's claim it was a one-off failure that magically solved itself.