
selenium-daily-beta-WikibaseLexeme broken since mid feb 2021
Open, Needs Triage, Public, 13 Estimated Story Points

Description

Initially spotted as T277859; this was then fixed, but some issues remained.

Most recent failure can be seen here https://integration.wikimedia.org/ci/job/selenium-daily-beta-WikibaseLexeme/

Acceptance criteria🏕️🌟x2:

  • The daily test should be green
  • If the tests are not serving a good purpose we should get rid of them

Event Timeline

Addshore set the point value for this task to 13. Apr 28 2021, 10:59 AM

I superficially looked at all tests in question, and I think they all provide value. The only test that is a little redundant is "Lexeme:Lemma can be edited" (which is mostly redundant with "can be edited multiple times").

Many of the failures seem to be about editing and then immediately trying to fetch the entity from wbgetentities, which might fail due to replication lag on beta. I'll add a workaround for this.
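
To illustrate the kind of workaround meant here, a minimal sketch follows. It is not the code from the patch; `readWhenFresh`, `read`, and `isFresh` are hypothetical names, and the caller would supply its own function that fetches the entity (for example via wbgetentities) and its own check for whether the edit is visible yet.

```javascript
// Hypothetical helper: poll an async read until it reflects a prior write.
// `read` and `isFresh` are supplied by the caller; neither is an existing
// WikibaseLexeme or wdio-mediawiki function.
async function readWhenFresh( read, isFresh, { attempts = 5, delayMs = 500 } = {} ) {
	for ( let i = 0; i < attempts; i++ ) {
		const result = await read();
		if ( isFresh( result ) ) {
			return result;
		}
		// Give the replicas a moment to catch up, except after the final attempt.
		if ( i < attempts - 1 ) {
			await new Promise( ( resolve ) => setTimeout( resolve, delayMs ) );
		}
	}
	throw new Error( `Read still stale after ${ attempts } attempts` );
}
```

A bounded number of attempts keeps a genuinely broken test failing quickly instead of hanging until the job times out.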

Some failures will be very hard or impossible to work around, like today's "The database has been automatically locked while the replica database servers catch up to the master", so attempting to get these tests consistently green will be futile (but I suppose we can get them somewhat reliable at least).

Change 688430 had a related patch set uploaded (by Hoo man; author: Hoo man):

[mediawiki/extensions/WikibaseLexeme@master] Selenium: Try to wait for replication before reading from API

https://gerrit.wikimedia.org/r/688430

Change 688430 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexeme@master] Selenium: Try to wait for replication before reading from API

https://gerrit.wikimedia.org/r/688430

The lexeme tests seem to be green now, although another issue is now causing failures:

https://integration.wikimedia.org/ci/job/selenium-daily-beta-WikibaseLexeme/1010/console

10:54:13 [0-14] RUNNING in chrome - /tests/selenium/specs/special/recentchanges.js
10:54:14 [0-14] 2021-05-11T09:54:14.492Z ERROR @wdio/runner: Error: Unable to load spec files quite likely because they rely on `browser` object that is not fully initialised.
10:54:14 `browser` object has only `capabilities` and some flags like `isMobile`.
10:54:14 Helper files that use other `browser` commands have to be moved to `before` hook.
10:54:14 Spec file(s): /src/tests/selenium/specs/special/recentchanges.js
10:54:14 Error: Error: Cannot find module '../../../../../../tests/selenium/pageobjects/recentchanges.page'
10:54:14     at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
10:54:14     at Function.Module._load (internal/modules/cjs/loader.js:562:25)
10:54:14     at Module.require (internal/modules/cjs/loader.js:692:17)
10:54:14     at require (internal/modules/cjs/helpers.js:25:18)
10:54:14     at Object.<anonymous> (/src/tests/selenium/specs/special/recentchanges.js:5:22)
10:54:14     at Module._compile (internal/modules/cjs/loader.js:778:30)
10:54:14     at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
10:54:14     at Module.load (internal/modules/cjs/loader.js:653:32)
10:54:14     at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
10:54:14     at Function.Module._load (internal/modules/cjs/loader.js:585:3)
10:54:14     at Module.require (internal/modules/cjs/loader.js:692:17)
10:54:14     at require (internal/modules/cjs/helpers.js:25:18)
10:54:14     at Object.exports.requireOrImport (/src/node_modules/mocha/lib/esm-utils.js:20:12)
10:54:14     at Object.exports.loadFilesAsync (/src/node_modules/mocha/lib/esm-utils.js:33:34)
10:54:14     at Mocha.loadFilesAsync (/src/node_modules/mocha/lib/mocha.js:431:19)
10:54:14     at MochaAdapter._loadFiles (/src/node_modules/@wdio/mocha-framework/build/index.js:52:31)
10:54:14     at MochaAdapter._loadFiles (/src/node_modules/@wdio/mocha-framework/build/index.js:66:35)
10:54:14     at process._tickCallback (internal/process/next_tick.js:68:7)
10:54:14 [0-14]  Error:  Unable to load spec files quite likely because they rely on `browser` object that is not fully initialised.
10:54:14 `browser` object has only `capabilities` and some flags like `isMobile`.
10:54:14 Helper files that use other `browser` commands have to be moved to `before` hook.
10:54:14 Spec file(s): /src/tests/selenium/specs/special/recentchanges.js
10:54:14 Error: Error: Cannot find module '../../../../../../tests/selenium/pageobjects/recentchanges.page'
10:54:14     at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
10:54:14     at Function.Module._load (internal/modules/cjs/loader.js:562:25)
10:54:14     at Module.require (internal/modules/cjs/loader.js:692:17)
10:54:14     at require (internal/modules/cjs/helpers.js:25:18)
10:54:14     at Object.<anonymous> (/src/tests/selenium/specs/special/recentchanges.js:5:22)
10:54:14     at Module._compile (internal/modules/cjs/loader.js:778:30)
10:54:14     at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
10:54:14     at Module.load (internal/modules/cjs/loader.js:653:32)
10:54:14     at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
10:54:14     at Function.Module._load (internal/modules/cjs/loader.js:585:3)
10:54:14     at Module.require (internal/modules/cjs/loader.js:692:17)
10:54:14     at require (internal/modules/cjs/helpers.js:25:18)
10:54:14     at Object.exports.requireOrImport (/src/node_modules/mocha/lib/esm-utils.js:20:12)
10:54:14     at Object.exports.loadFilesAsync (/src/node_modules/mocha/lib/esm-utils.js:33:34)
10:54:14     at Mocha.loadFilesAsync (/src/node_modules/mocha/lib/mocha.js:431:19)
10:54:14     at MochaAdapter._loadFiles (/src/node_modules/@wdio/mocha-framework/build/index.js:52:31)
10:54:14 [0-14] FAILED in chrome - /tests/selenium/specs/special/recentchanges.js
10:54:16 [0-15] RUNNING in chrome - /tests/selenium/specs/special/watchlist.js
10:54:17 [0-15] 2021-05-11T09:54:17.859Z ERROR @wdio/runner: Error: Unable to load spec files quite likely because they rely on `browser` object that is not fully initialised.
10:54:17 `browser` object has only `capabilities` and some flags like `isMobile`.
10:54:17 Helper files that use other `browser` commands have to be moved to `before` hook.
10:54:17 Spec file(s): /src/tests/selenium/specs/special/watchlist.js
10:54:17 Error: Error: Cannot find module '../../../../../../tests/selenium/pageobjects/watchlist.page'
10:54:17     at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
10:54:17     at Function.Module._load (internal/modules/cjs/loader.js:562:25)
10:54:17     at Module.require (internal/modules/cjs/loader.js:692:17)
10:54:17     at require (internal/modules/cjs/helpers.js:25:18)
10:54:17     at Object.<anonymous> (/src/tests/selenium/specs/special/watchlist.js:8:18)
10:54:17     at Module._compile (internal/modules/cjs/loader.js:778:30)
10:54:17     at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
10:54:17     at Module.load (internal/modules/cjs/loader.js:653:32)
10:54:17     at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
10:54:17     at Function.Module._load (internal/modules/cjs/loader.js:585:3)
10:54:17     at Module.require (internal/modules/cjs/loader.js:692:17)
10:54:17     at require (internal/modules/cjs/helpers.js:25:18)
10:54:17     at Object.exports.requireOrImport (/src/node_modules/mocha/lib/esm-utils.js:20:12)
10:54:17     at Object.exports.loadFilesAsync (/src/node_modules/mocha/lib/esm-utils.js:33:34)
10:54:17     at Mocha.loadFilesAsync (/src/node_modules/mocha/lib/mocha.js:431:19)
10:54:17     at MochaAdapter._loadFiles (/src/node_modules/@wdio/mocha-framework/build/index.js:52:31)
10:54:17     at MochaAdapter._loadFiles (/src/node_modules/@wdio/mocha-framework/build/index.js:66:35)
10:54:17     at process._tickCallback (internal/process/next_tick.js:68:7)
10:54:17 [0-15]  Error:  Unable to load spec files quite likely because they rely on `browser` object that is not fully initialised.
10:54:17 `browser` object has only `capabilities` and some flags like `isMobile`.
10:54:17 Helper files that use other `browser` commands have to be moved to `before` hook.
10:54:17 Spec file(s): /src/tests/selenium/specs/special/watchlist.js
10:54:17 Error: Error: Cannot find module '../../../../../../tests/selenium/pageobjects/watchlist.page'
10:54:17     at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
10:54:17     at Function.Module._load (internal/modules/cjs/loader.js:562:25)
10:54:17     at Module.require (internal/modules/cjs/loader.js:692:17)
10:54:17     at require (internal/modules/cjs/helpers.js:25:18)
10:54:17     at Object.<anonymous> (/src/tests/selenium/specs/special/watchlist.js:8:18)
10:54:17     at Module._compile (internal/modules/cjs/loader.js:778:30)
10:54:17     at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
10:54:17     at Module.load (internal/modules/cjs/loader.js:653:32)
10:54:17     at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
10:54:17     at Function.Module._load (internal/modules/cjs/loader.js:585:3)
10:54:17     at Module.require (internal/modules/cjs/loader.js:692:17)
10:54:17     at require (internal/modules/cjs/helpers.js:25:18)
10:54:17     at Object.exports.requireOrImport (/src/node_modules/mocha/lib/esm-utils.js:20:12)
10:54:17     at Object.exports.loadFilesAsync (/src/node_modules/mocha/lib/esm-utils.js:33:34)
10:54:17     at Mocha.loadFilesAsync (/src/node_modules/mocha/lib/mocha.js:431:19)
10:54:17     at MochaAdapter._loadFiles (/src/node_modules/@wdio/mocha-framework/build/index.js:52:31)
10:54:18 [0-15] FAILED in chrome - /tests/selenium/specs/special/watchlist.js

Change 688993 had a related patch set uploaded (by Hoo man; author: Hoo man):

[mediawiki/extensions/WikibaseLexeme@master] Selenium: Copy MediaWiki core pageobjects

https://gerrit.wikimedia.org/r/688993

Moving this back into TODO to give someone else the chance to pick this up (as I won't work much more this week).

The current state is that we still occasionally(?) see:

14:26:31 [Chrome 90.0.4430.85 linux #0-7] Spec: /src/tests/selenium/specs/lemma.edit.js
14:26:31 [Chrome 90.0.4430.85 linux #0-7] Running: Chrome (v90.0.4430.85) on linux
14:26:31 [Chrome 90.0.4430.85 linux #0-7] Session ID: 9b7c0148-0264-45c0-b0af-25e2ad3b9f75
14:26:31 [Chrome 90.0.4430.85 linux #0-7]
14:26:31 [Chrome 90.0.4430.85 linux #0-7] Lexeme:Lemma
14:26:31 [Chrome 90.0.4430.85 linux #0-7]    ✓ can be edited
14:26:31 [Chrome 90.0.4430.85 linux #0-7]    ✓ can be edited multiple times
14:26:31 [Chrome 90.0.4430.85 linux #0-7]    ✖ can not save lemmas with redundant languages
14:26:31 [Chrome 90.0.4430.85 linux #0-7]
14:26:31 [Chrome 90.0.4430.85 linux #0-7] 2 passing (1m 33.9s)
14:26:31 [Chrome 90.0.4430.85 linux #0-7] 1 failing
14:26:31 [Chrome 90.0.4430.85 linux #0-7]
14:26:31 [Chrome 90.0.4430.85 linux #0-7] 1) Lexeme:Lemma can not save lemmas with redundant languages
14:26:31 [Chrome 90.0.4430.85 linux #0-7] Can't call elementClick on element with selector "#wpLoginAttempt" because element wasn't found
14:26:31 [Chrome 90.0.4430.85 linux #0-7] Error: Can't call elementClick on element with selector "#wpLoginAttempt" because element wasn't found
14:26:31 [Chrome 90.0.4430.85 linux #0-7]     at LoginPage.login (/src/node_modules/wdio-mediawiki/LoginPage.js:17:20)
14:26:31 [Chrome 90.0.4430.85 linux #0-7]     at LoginPage.loginAdmin (/src/node_modules/wdio-mediawiki/LoginPage.js:21:8)
14:26:31 [Chrome 90.0.4430.85 linux #0-7]     at Context.it (/src/tests/selenium/specs/lemma.edit.js:65:13)

and

14:26:31 [Chrome 90.0.4430.85 linux #0-14] Spec: /src/tests/selenium/specs/special/recentchanges.js
14:26:31 [Chrome 90.0.4430.85 linux #0-14] Running: Chrome (v90.0.4430.85) on linux
14:26:31 [Chrome 90.0.4430.85 linux #0-14] Session ID: 0b0314ac-ab8d-4ede-88f7-d9f3b98525c3
14:26:31 [Chrome 90.0.4430.85 linux #0-14]
14:26:31 [Chrome 90.0.4430.85 linux #0-14] Special:RecentChanges
14:26:31 [Chrome 90.0.4430.85 linux #0-14]    ✖ shows lemmas in title links to lexemes on Special:RecentChanges
14:26:31 [Chrome 90.0.4430.85 linux #0-14]
14:26:31 [Chrome 90.0.4430.85 linux #0-14] 1 failing (26s)
14:26:31 [Chrome 90.0.4430.85 linux #0-14]
14:26:31 [Chrome 90.0.4430.85 linux #0-14] 1) Special:RecentChanges shows lemmas in title links to lexemes on Special:RecentChanges
14:26:31 [Chrome 90.0.4430.85 linux #0-14] The expression evaluated to a falsy value:
14:26:31 
14:26:31   assert( title.includes( 'entrôpi' ) )
14:26:31 
14:26:31 [Chrome 90.0.4430.85 linux #0-14] AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
14:26:31 [Chrome 90.0.4430.85 linux #0-14]
14:26:31 [Chrome 90.0.4430.85 linux #0-14]   assert( title.includes( 'entrôpi' ) )
14:26:31 [Chrome 90.0.4430.85 linux #0-14]
14:26:31 [Chrome 90.0.4430.85 linux #0-14]     at Context.it (/src/tests/selenium/specs/special/recentchanges.js:31:3)

Also, tests sometimes fail with "The database has been automatically locked while the replica database servers catch up to the master", but that's an issue with beta (which we can hardly address). The above problems could also be related to beta infrastructure issues, but that remains to be investigated.

For reference, last job run: https://integration.wikimedia.org/ci/job/selenium-daily-beta-WikibaseLexeme/1011/console

hoo removed hoo as the assignee of this task. May 11 2021, 1:44 PM
hoo added a subscriber: hoo.

Change 688993 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexeme@master] Selenium: Copy MediaWiki core pageobjects

https://gerrit.wikimedia.org/r/688993

Still red after running overnight. I think the next things I will investigate are:

  • Summarise the load on the beta cluster; see if it's obviously overstressed
  • Consider whether MWAPI should be using the maxlag parameter and whether it should respond nicely to the 503/Retry-After/X-Database-Lag headers
  • Try to improve the upstream "login" functionality from wdio core so that it waits if the box hasn't fully loaded
  • Try to improve the recent changes test so that it optionally waits until the page has actually loaded before attempting to parse the title
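
As a rough illustration of the maxlag idea in the list above, here is a sketch of how a client might decide how long to back off. The `response` shape (`status`, lower-cased `headers`, parsed `body`) and the name `lagBackoffMs` are assumptions made for this example, not part of any existing API client, and only the numeric form of Retry-After is handled.

```javascript
// Sketch: decide how long to back off after a lagged response.
// `response` is a hypothetical plain object { status, headers, body } with
// lower-cased header names; adapt it to whatever HTTP client is in use.
function lagBackoffMs( response, fallbackSeconds = 5 ) {
	const body = response.body || {};
	const lagged = response.status === 503 ||
		Boolean( body.error && body.error.code === 'maxlag' );
	if ( !lagged ) {
		return null; // no lag reported, no need to retry
	}
	const headers = response.headers || {};
	// Retry-After may also be an HTTP date; only the seconds form is handled here.
	const retryAfter = Number( headers[ 'retry-after' ] );
	const seconds = Number.isFinite( retryAfter ) && retryAfter > 0 ?
		retryAfter :
		fallbackSeconds;
	return seconds * 1000;
}
```

A caller would sleep for the returned number of milliseconds and then repeat the request, ideally with an upper bound on the number of retries.
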
Tarrow added a subscriber: Tarrow.

And stuck back after licking the cookie.

tl;dr on the beta cluster: I didn't see anything on grafana that looked like excessive load to me

I checked the binlogs for master and it's mostly updating module dependency graph:

enwikiINSERT /* Wikimedia\DependencyStore\SqlModuleDependencyStore::storeMulti  */ INTO `module_deps` (md_module,md_skin,md_deps) VALUES ('mediawiki.ui','minerva|en','[\"resources/src/mediawiki.less/mediawiki.mixins.less\",\"resources/src/mediawiki.less/mediawiki.skin.defaults.less\",\"resources/src/mediawiki.less/mediawiki.ui/variables.less\",\"resources/src/mediawiki.ui/components/forms.less\",\"resources/src/mediawiki.ui/components/utilities.less\",\"skins/MinervaNeue/resources/mediawiki.less/mediawiki.skin.variables.less\"]') ON DUPLICATE KEY UPDATE md_deps = '[\"resources/src/mediawiki.less/mediawiki.mixins.less\",\"resources/src/mediawiki.less/mediawiki.skin.defaults.less\",\"resources/src/mediawiki.less/mediawiki.ui/variables.less\",\"resources/src/mediawiki.ui/components/forms.less\",\"resources/src/mediawiki.ui/components/utilities.less\",\"skins/MinervaNeue/resources/mediawiki.less/mediawiki.skin.variables.less\"]'
enwikiINSERT /* Wikimedia\DependencyStore\SqlModuleDependencyStore::storeMulti  */ INTO `module_deps` (md_module,md_skin,md_deps) VALUES ('mediawiki.special.search.styles','minerva|en','[\"resources/src/mediawiki.less/mediawiki.mixins.animation.less\",\"resources/src/mediawiki.less/mediawiki.mixins.less\",\"resources/src/mediawiki.less/mediawiki.skin.defaults.less\",\"resources/src/mediawiki.less/mediawiki.ui/variables.less\",\"skins/MinervaNeue/minerva.less/minerva.mixins.less\",\"skins/MinervaNeue/minerva.less/minerva.variables.less\",\"skins/MinervaNeue/resources/mediawiki.less/mediawiki.skin.variables.less\"]') ON DUPLICATE KEY UPDATE md_deps = '[\"resources/src/mediawiki.less/mediawiki.mixins.animation.less\",\"resources/src/mediawiki.less/mediawiki.mixins.less\",\"resources/src/mediawiki.less/mediawiki.skin.defaults.less\",\"resources/src/mediawiki.less/mediawiki.ui/variables.less\",\"skins/MinervaNeue/minerva.less/minerva.mixins.less\",\"skins/MinervaNeue/minerva.less/minerva.variables.less\",\"skins/MinervaNeue/resources/mediawiki.less/mediawiki.skin.variables.less\"]'

I'll take a look at why it's not cached.

Change 691734 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[mediawiki/core@master] resourceloader: Read from replica before reading from primary

https://gerrit.wikimedia.org/r/691734

"Special:RecentChanges shows lemmas in title links to lexemes on Special:RecentChanges" consistently fails. We can disable it for now, I assume (or fix it?). With that, some tests might become green (maybe one third of them?)

The rest vary really widely and basically at random, from db issues to failing to start the tests altogether, to all sorts of random issues. I want to take a step back. Maybe this should run somewhere but not beta cluster? I honestly don't see much value in it since it has been red for three years now.

Change 692247 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[mediawiki/extensions/WikibaseLexeme@master] Wait for two seconds before opening RC so replication catches up

https://gerrit.wikimedia.org/r/692247

Can we robustly detect replication lag via the API? It might make sense to skip tests (or assertions) in those cases, rather than accepting them to flake because that will train us to ignore the real failures as well.

Change 692247 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexeme@master] Wait for two seconds before opening RC so replication catches up

https://gerrit.wikimedia.org/r/692247

Can we robustly detect replication lag via the API? It might make sense to skip tests (or assertions) in those cases, rather than accepting them to flake because that will train us to ignore the real failures as well.

Up to a second, yes… but making the tests truly failure proof will be quite some work (catch database autolocked, abort if replag > N seconds, ...).
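
For what the "abort if replag > N seconds" part could look like, here is a sketch built on the `siprop=dbrepllag` siteinfo query of the MediaWiki action API. The function names, the example API URL, and the threshold are made up for illustration, and the response shape (`query.dbrepllag` as a list of `{ host, lag }` entries) is assumed from the API documentation rather than checked against beta.

```javascript
// Sketch: build the siteinfo query and interpret its result.
// Function names and the threshold are hypothetical; the response is assumed to
// look like { query: { dbrepllag: [ { host: '...', lag: 0 }, ... ] } }.
function replagQueryUrl( apiUrl ) {
	const params = new URLSearchParams( {
		action: 'query',
		meta: 'siteinfo',
		siprop: 'dbrepllag',
		sishowalldb: '1',
		format: 'json'
	} );
	return `${ apiUrl }?${ params }`;
}

function maxLagSeconds( body ) {
	const entries = ( body.query && body.query.dbrepllag ) || [];
	// Non-numeric lag values are treated as 0 here; a stricter check may be wanted.
	return entries.reduce( ( max, entry ) => Math.max( max, Number( entry.lag ) || 0 ), 0 );
}

function shouldSkipForLag( body, thresholdSeconds ) {
	return maxLagSeconds( body ) > thresholdSeconds;
}
```

In a mocha spec, a `before` hook could fetch that URL and call `this.skip()` when `shouldSkipForLag` returns true, so that a lagged run shows up as skipped rather than failed.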

Moving this to stalled to see if the patch above can make it at least green sometimes.

Change 691734 abandoned by Ladsgroup:

[mediawiki/core@master] resourceloader: Read from replica before reading from primary

Reason:

Let's not do it. Superseded by I9cc658d5282b3c and I6f618d396d64

https://gerrit.wikimedia.org/r/691734

So it went green again on May 30 and is now back to red, but I assume it's intermittent issues we can't fix. I think as long as it's not consistently red, it's an infrastructure problem that should be handled by QTE (steward of beta cluster).

Change 697787 had a related patch set uploaded (by Zfilipin; author: Zfilipin):

[mediawiki/extensions/WikibaseLexeme@master] selenium: Update wdio-mediawiki

https://gerrit.wikimedia.org/r/697787

So it went green again on May 30 and is now back to red, but I assume it's intermittent issues we can't fix.

I've introduced video recording in 697787. Please review and merge if there are no complaints. That should help with debugging.

I think as long as it's not consistently red, it's an infrastructure problem that should be handled by QTE (steward of beta cluster).

Quality-and-Test-Engineering-Team (QTE) is formally the steward of the beta cluster, but the team doesn't have anybody who is able to help. So, don't hold your breath.

There might be problems with the beta cluster, but other repositories have pretty stable daily jobs. See selenium-daily.

That might be a function of the number of tests. Only TwoColConflict and WikibaseLexeme have jobs that are around 20 minutes. All other jobs are just a few minutes.

To move from general comments to action, if anybody wants to pair with me on making the WikibaseLexeme job green, let me know. Most of us are in the same time zone, so it should be very easy to find the time. (Both e-mail and zeljkof at libera.chat are fine.)

Change 697787 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexeme@master] selenium: Update wdio-mediawiki

https://gerrit.wikimedia.org/r/697787

The June 6 build had lots of errors, all of them seemingly related to login. That seems to be a one-off.

The previous days, there were far fewer errors, e.g. only three on June 3. One is the wiki becoming read-only due to replication lag, but two others seem to be about replication issues – @hoo’s change added a wait for replication between making a change on the page and then reading from the API, but in the two tests affected here (“Lexeme:Forms has statement list” and “Lexeme:Forms trims whitespace from representation”), it seems to me like the issue is between writing via the API and then reading the change on the page.

Apparently there is an HTTP request header, MediaWiki-Chronology-Client-Id, that can be used to set the ChronologyProtector client ID (originally introduced in T212550). I wonder if setting that header on all our requests (both in the browser and with mwbot) would be enough to make ChronologyProtector work across all our requests? (Currently, I assume browser and API requests get different client IDs – the client ID is derived from the IP address and user agent, and while both requests probably come from the same IP, they won’t have the same user agent.)

Wait, but how are we going to set that header for all the requests made by the browser…

I guess we could get the cpPosIndex cookie from wdio, parse the client ID out of it, and send that as the MediaWiki-Chronology-Client-Id only with the mwbot requests? (Though at that point we might as well pass the whole cookie…)
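
A sketch of the parsing step, under the assumption that the cpPosIndex cookie value has the form `<index>@<timestamp>#<clientId>` (possibly URL-encoded); that format is an assumption here and should be checked against MediaWiki core before relying on it. `parseCpPosIndex` and `chronologyHeaders` are hypothetical names.

```javascript
// Sketch: extract the ChronologyProtector client ID from a cpPosIndex cookie value.
// ASSUMPTION: the value looks like "<index>@<timestamp>#<clientId>".
function parseCpPosIndex( value ) {
	let decoded;
	try {
		decoded = decodeURIComponent( value );
	} catch ( e ) {
		return null; // malformed percent-encoding
	}
	const match = /^(\d+)@(\d+)#(.+)$/.exec( decoded );
	if ( !match ) {
		return null;
	}
	return {
		index: Number( match[ 1 ] ),
		timestamp: Number( match[ 2 ] ),
		clientId: match[ 3 ]
	};
}

// Build the extra request headers for the non-browser (mwbot) requests.
function chronologyHeaders( cookieValue ) {
	const parsed = parseCpPosIndex( cookieValue );
	return parsed ? { 'MediaWiki-Chronology-Client-Id': parsed.clientId } : {};
}
```

Returning an empty object when the cookie is missing or unparseable means the API requests simply go out without the header, which is the current behaviour.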

Change 698517 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikibaseLexeme@master] selenium: Use ChronologyProtector cookie in mwbot

https://gerrit.wikimedia.org/r/698517

Change 698517 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexeme@master] selenium: Use ChronologyProtector cookie in mwbot

https://gerrit.wikimedia.org/r/698517

I mean it's still failing. Can't say if it's the same error or we introduced a new error and it's something else.

Looking at today's failures, 6 of the 7 have the error message

Can't call elementClick on element with selector "#wpLoginAttempt" because element wasn't found

So, I'll focus on that for now.

There might be problems with the beta cluster, but other repositories have pretty stable daily jobs. See selenium-daily.

But of those jobs, it seems only the WikibaseLexeme tests are running on beta Wikidata, right?

To move from general comments to action, if anybody wants to pair with me on making the WikibaseLexeme job green, let me know. Most of us are in the same time zone, so it should be very easy to find the time. (Both e-mail and zeljkof at libera.chat are fine.)

I'd be happy to take up that offer in principle, but of the jobs that failed today, 6 of the 7 claim to have failed during the Login action.

Most of the tests do this in the before-hook, so we don't have video or stdout logs. But the "Lexeme:Lemma can be edited multiple times" test is interesting, because there we have stdout logs and video:

Video: https://integration.wikimedia.org/ci/job/selenium-daily-beta-WikibaseLexeme/1049/artifact/log/Lexeme%253ALemma-can-be-edited-multiple-times-2021-06-14T18-10-03-840Z.mp4
Logs: https://integration.wikimedia.org/ci/job/selenium-daily-beta-WikibaseLexeme/1049/testReport/chrome.90_0_4430_212.linux/Lexeme_Lemma/can_be_edited_multiple_times/
Error message: Can't call click on element with selector "#wpLoginAttempt" because element wasn't found

And I'm honestly a bit flabbergasted by what I'm seeing. Contrary to what the error message seems to imply, the Login button actually did get clicked and the Login did succeed.
It just seems that webdriver.io somehow missed the fact that it successfully clicked the button and now tries to keep clicking a button that of course no longer exists on the main page ?!?

I'd be more than happy to hear your take on this.

My only course of action would be to try upgrading to wdio 7, hoping that we might get a bugfix along the way that takes care of this. What version of nodejs are these daily tests running on?

Apologies for the late reply. We have an internal virtual conference this week, so I'm busy with that.

But of those jobs, it seems only the WikibaseLexeme tests are running on beta Wikidata, right?

Good point, I've missed that. I didn't think any beta cluster site would be more or less stable than others. Do you think wikidata might be less stable?

https://gerrit.wikimedia.org/r/plugins/gitiles/integration/config/+/master/jjb/mediawiki-extensions.yaml#233

Most of the tests do this in the before-hook, so we don't have video or stdout logs.

Good point, again. I don't think there's a reason not to record actions in hooks. I'll create a task to test that idea, so I don't forget about it.

Contrary to what the error message seems to imply, the Login button actually did get clicked and the Login did succeed.
It just seems that webdriver.io somehow missed the fact that it successfully clicked the button and now tries to keep clicking a button that of course no longer exists on the main page ?!?

That is strange. It's hard to tell what the problem is. Can you reproduce it locally?

My only course of action would be to try upgrading to wdio 7, hoping that we might get a bugfix along the way that takes care of this.

We have an intern (@Sahilgrewalhere) working on this. See T274579. I'll ask him to update the repo.

What version of nodejs are these daily tests running on?

According to Jenkins, node v12.

https://integration.wikimedia.org/ci/view/selenium-daily/job/selenium-daily-beta-WikibaseLexeme/1051/console

+ node --version
v12.21.0

I don't think there's a reason not to record actions in hooks. I'll create a task to test that idea, so I don't forget about it.

T285078: Record videos in hooks.

Apologies for the late reply. We have an internal virtual conference this week, so I'm busy with that.

But of those jobs, it seems only the WikibaseLexeme tests are running on beta Wikidata, right?

Good point, I've missed that. I didn't think any beta cluster site would be more or less stable than others. Do you think wikidata might be less stable?

No, I'm just grasping at straws here as to what might cause this strange behavior. Maybe some config/functionality is enabled on beta wikidata that somehow interacts with page unload handlers and causes the click on the button to not register? Or some click handlers on the button itself?

Most of the tests do this in the before-hook, so we don't have video or stdout logs.

Good point, again. I don't think there's a reason not to record actions in hooks. I'll create a task to test that idea, so I don't forget about it.

Thanks, I'm looking forward to seeing those videos. Though I suspect that they might show the same unusual behavior.

Contrary to what the error message seems to imply, the Login button actually did get clicked and the Login did succeed.
It just seems that webdriver.io somehow missed the fact that it successfully clicked the button and now tries to keep clicking a button that of course no longer exists on the main page ?!?

That is strange. It's hard to tell what the problem is. Can you reproduce it locally?

No, locally it is fine, and these tests also run without issues in CI.

My only course of action would be to try upgrading to wdio 7, hoping that we might get a bugfix along the way that takes care of this.

We have an intern (@Sahilgrewalhere) working on this. See T274579. I'll ask him to update the repo.

Thank you. Let's wait and see if the situation changes after these updates have been merged.