Page MenuHomePhabricator

Selenium "User should be able to change preferences" test flaky
Closed, DuplicatePublicPRODUCTION ERROR

Description

Status

🏝 Possible cause: 441191 (mediawiki/core): Browser tests: extract commonly used getTestString function
🏝 Possibly related: T198384: UploadWizard: Selenium tests failing
⚰️ 442343 ( mediawiki/core , abandoned): WIP selenium: recent changes test top revision

💣 #1

mediawiki/core/+/441931
mediawiki-quibble-vendor-mysql-php70-docker/1295

mediawiki-quibble-vendor-mysql-php70-docker/1295
11 passing (44.60s)
1 failing

1) User should be able to change preferences:
'0.6' === '0.6644125369929821-Iñtërnâtiônàlizætiøn☃'
running chrome
AssertionError: '0.6' === '0.6644125369929821-Iñtërnâtiônàlizætiøn☃'
    at Context.<anonymous> (/workspace/src/tests/selenium/specs/user.js:60:10)
    at Promise.F (/workspace/src/node_modules/core-js/library/modules/_export.js:36:28)

💣 #2

mediawiki/core/+/442008
mediawiki-quibble-composer-mysql-php70-docker/1297

mediawiki-quibble-composer-mysql-php70-docker/1297
11 passing (48.50s)
1 failing

1) User should be able to change preferences:
'0.5' === '0.5216016600512554-Iñtërnâtiônàlizætiøn☃'
running chrome
AssertionError: '0.5' === '0.5216016600512554-Iñtërnâtiônàlizætiøn☃'
    at Context.<anonymous> (/workspace/src/tests/selenium/specs/user.js:60:10)
    at Promise.F (/workspace/src/node_modules/core-js/library/modules/_export.js:36:28)

💣 #3

mediawiki/extensions/PageTriage/+/440018
quibble-vendor-mysql-hhvm-docker/6069

quibble-vendor-mysql-hhvm-docker/6069
11 passing (59.40s)
1 failing

1) Special:RecentChanges shows page creation:
'BeforeEach-name-0.9564462824203286-Iñtërnâtiônàlizætiøn☃' === '0.01864498162646111-Iñtërnâtiônàlizætiøn☃'
running chrome
AssertionError: 'BeforeEach-name-0.9564462824203286-Iñtërnâtiônàlizætiøn☃' === '0.01864498162646111-Iñtërnâtiônàlizætiøn☃'
    at Context.<anonymous> (/workspace/src/tests/selenium/specs/specialrecentchanges.js:23:10)
    at Promise.F (/workspace/src/node_modules/core-js/library/modules/_export.js:36:28)

💣 #4

mediawiki/core/+/443138
mediawiki-quibble-composer-mysql-php70-docker/1646

mediawiki-quibble-composer-mysql-php70-docker/1646
2 passing (10s)
1 failing

1) User should be able to change preferences:
'' === '0.018736901285118623-Iñtërnâtiônàlizætiøn'
AssertionError: '' === '0.018736901285118623-Iñtërnâtiônàlizætiøn'
    at Context.<anonymous> (/workspace/src/tests/selenium/specs/user.js:60:10)
    at Promise.F (/workspace/src/node_modules/core-js/library/modules/_export.js:36:28)

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Krinkle triaged this task as Unbreak Now! priority.Jun 25 2018, 7:12 PM
greg added subscribers: zeljkofilipin, greg.

Looks like it was mostly committed by @zeljkofilipin

zeljkofilipin added a subscriber: hashar.

@hashar all three failed jobs are Quibble. Do you have any ideas what went wrong?

This seems related, but not the same as #1 and #2. In those cases failed test is User should be able to change preferences, in this case it's Special:RecentChanges shows page creation.

tests/selenium/specs/user.js:45-61

 it( 'should be able to change preferences', function () {
   var realName = Util.getTestString();

   // create
   browser.call( function () {
     return Api.createAccount( username, password );
   } );

   // log in
   UserLoginPage.login( username, password );

   // change
   PreferencesPage.changeRealName( realName );

   // check
   assert.strictEqual( PreferencesPage.realName.getValue(), realName );
} );

There isn't much in file's history:

$ git log -- tests/selenium/specs/user.js 
commit d96ca0d6a9262045d9a0ea56a866df794eff51dc
Author: Jakob Warkotsch <j.warkotsch@gmail.com>
Date:   Wed Jun 20 13:06:45 2018 +0200

    Browser tests: extract commonly used getTestString function
    
    Extracting this function into a Util module reduces redundancy for
    existing and future tests in core and extensions.
    
    Change-Id: I5c5e9b22af5406f347636b22e68657b2674db6c9

commit 4ffbfabf162dbfe6d510a3c99a557968b944d12b
Author: Ed Sanders <esanders@wikimedia.org>
Date:   Wed Jun 6 11:49:33 2018 +0100

    Selenium tests: Use assert.strictEqual instead of assert.equal
    
    assert.equal is deprecated
    
    Change-Id: I2f83d596db2ad475de5e005baa849782261d433e

commit 16a4d9da7bb810e04ee3e7f3d7f4b0b485a27019
Author: Timo Tijhof <krinklemail@gmail.com>
Date:   Wed May 2 18:48:24 2018 +0100

    selenium: Initial version of wdio-mediawiki package
...

441191 looks like it could have caused the problem. (Merged June 25.)

@Jakob_WMDE, @Pablo-WMDE, @WMDE-leszek, @WMDE-Fisch: you were reviewers of the patch. Do you know if it could cause the problem?

Well, that patch only moved code around, so although could be that it is causing some issues, I cannot think of how it would happen :)

Do we have screenshots/videos from failing runs? #1 and #2 really look suspicious. The code the said patch change could influence the right-hand side of the strictEquals I think. It does look odd that the user name is just 0.5 or 0.6, not the full-blown string.

Do we have screenshots/videos from failing runs? #1 and #2 really look suspicious. The code the said patch change could influence the right-hand side of the strictEquals I think. It does look odd that the user name is just 0.5 or 0.6, not the full-blown string.

#1 mediawiki-quibble-vendor-mysql-php70-docker/1295/should-be-able-to-change-preferences.png

#2 mediawiki-quibble-composer-mysql-php70-docker/1297/should-be-able-to-change-preferences.png

#3 quibble-vendor-mysql-hhvm-docker/6069/shows-page-creation.png

Regarding #3 my working hypothesis is that either the change made in the test is not visible in RecentChanges (is it really possible? maybe edit simple fails?), or there is another edit coming kind of simultaneously that comes before the one made by the RC test. selenium test are not run in parallel or something?

AFAICT 'BeforeEach-name-0.9564462824203286-Iñtërnâtiônàlizætiøn☃' is coming from tests/selenium/specs/page.js, while '0.01864498162646111-Iñtërnâtiônàlizætiøn☃' from tests/selenium/specs/specialrecentchanges.js

#3 quibble-vendor-mysql-hhvm-docker/6069/shows-page-creation.png

So this shows the edit made be special recent changes test is either 1) not done at all, 2) didn't make it to RC yet.
Would slowing down tests help here, or is it the silly idea?

Regarding screen caps #1 and #2, it would be interesting to see if what is send to server is just "0.5" (i.e. save clicked too fast), or is the full long string input and sent (that would mean that server-side does some weird interpreting string as number stuff)?

#3 quibble-vendor-mysql-hhvm-docker/6069/shows-page-creation.png

Assuming that concurrent edits are performed (can this be ruled out?) even slowing down the test at this point (a smell in itself) will not help in guaranteeing that the edit will be the first entry in the list of recent changes. Api.edit() yields newtimestamp (e.g. "2018-06-27T15:39:01Z") that could be used to narrow down the RecentChanges result via ?from but only in one direction and probably with insufficient precision.

Change 442343 had a related patch set uploaded (by Pablo Grass (WMDE); owner: Pablo Grass (WMDE)):
[mediawiki/core@master] WIP selenium: recent changes test top revision

https://gerrit.wikimedia.org/r/442343

I took a look at mediawiki-quibble-vendor-mysql-php70-docker jobs in the last week. There was only one User should be able to change preferences failure, 1295.

@Pablo-WMDE and me tried a bit more to make some sense from the "Special:RecentChanges shows page creation" failure.
If we read MW code correctly, make an edit results in calling RecentChange::notifyEdit, which in turn saves the RC object as a "deferred update".
We guess this is why the problems might be coming from.

Forgive us our lack of fluency in how deferred updates (and possibly job queues - I am still not sure if RC deferred update happens via the job or not) work in MW, but we didn't find a setting to flip that would make CI wiki always put the RC entry into DB before the edit is "finished".
One guess that we also had is that possibly opening any page would cause cleaning up the jobs etc (thus saving the RC entry for the edit test makes through the API), so going then to Special:RecentChanges should make the page already include the entry for the test edit.
But this is a very weak theory, especially that apparently for 90% of cases test works without opening any page between doing the test API edit, and opening Special:RecentChanges. Failure ratio is also small enough for it to be difficult to prove that such a "fix" really helps.

We are quite helpless at this point. I really doubt that changes in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/441191 could really have brought more instability to the test. Not testing crucial functionality like recent changes also does not seem a good idea. Would we gladly help solving this, but we are really out of ideas now.

Change 442343 abandoned by Pablo Grass (WMDE):
WIP selenium: recent changes test top revision

https://gerrit.wikimedia.org/r/442343

zeljkofilipin lowered the priority of this task from Unbreak Now! to High.Jun 28 2018, 11:03 AM

I took a look at mediawiki-quibble-vendor-mysql-php70-docker jobs in the last week. There was only one User should be able to change preferences failure, 1295.

Since the failures are rare (one per job per week, as far as I can see), decreasing priority from UBN to high. @Krinkle if you think this is UBN, let's revert 441191 and do more research in how many failures there are in other jobs.

Added #4, which involves an empty string as the expected value.

Can we please get #3 fixed? We're being forced to bypass Jenkins for every patch we need to merge.

Mooeypoo raised the priority of this task from High to Unbreak Now!.Jul 11 2018, 5:56 PM

I'm changing this to unbreak now, since we can't merge anything into PageTriage. It consistently fails, due to #3. Please help on this.

I took a look at mediawiki-quibble-vendor-mysql-php70-docker jobs in the last week. There was only one User should be able to change preferences failure, 1295.

Since the failures are rare (one per job per week, as far as I can see), decreasing priority from UBN to high. @Krinkle if you think this is UBN, let's revert 441191 and do more research in how many failures there are in other jobs.

^^^

My theory is that the RecentChangesUpdateJob does not have a chance to run by the time the test Special:RecentChanges shows page creation is executed. RecentChanges is thus out of date.

That test looks for the first link in rc to check whether it matches the title that just has been created. So we create a page 0.2524668712533231-Iñtërnâtiônàlizætiøn but it is not shown in the Recent changes:

shows-page-creation.png (899×1 px, 135 KB)

Hence the assertion failure. Working as expected.

That issue is very similar to T191537 which was related to some jobs running unexpectedly between tests thus carrying a state between tests. In short, the jobs are being run on web request and MediaWiki defaults to only run 1 per session ($wgJobRunRate = 1).

There are two ways I can think of to potentially fix it:

  1. from T191537#4267202 add $wgJobRunRate = PHP_INT_MAX; in includes/DevelopmentSettings.php
  2. get the suite to emit an API call to run the jobs, if that is still supported.

Change 445214 had a related patch set uploaded (by Hashar; owner: Hashar):
[mediawiki/extensions/PageTriage@master] (DO NOT SUBMIT) test change for T198137

https://gerrit.wikimedia.org/r/445214

Change 445222 had a related patch set uploaded (by Hashar; owner: Hashar):
[mediawiki/core@master] In development, run all jobs on each requests

https://gerrit.wikimedia.org/r/445222

https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/445222/ sets $wgJobRunRate = PHP_INT_MAX;

https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/PageTriage/+/445214/ depends on that mediawiki/core patch. I triggered a bunch of hhvm and php7 builds. Just one failed. So there is still a race condition related to the job :-(

In short, I have no idea why one of the build fails. Maybe the browser get redirected while the job is still being run in background by the previous request. Ie post-send.

Change 445394 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/core@master] Let 'shows page creation' test be skipped

https://gerrit.wikimedia.org/r/445394

Change 445401 had a related patch set uploaded (by Pablo Grass (WMDE); owner: Pablo Grass (WMDE)):
[mediawiki/core@master] DNM Debug flaky selenium tests for Special:RecentChanges

https://gerrit.wikimedia.org/r/445401

@Pablo-WMDE for #1 and #2 that is User should be able to change preferences: which got solved via T191537 . A previous test enqueue a job for a user Foo, then in the next test for the user Bar, the job for Foo was running and emitting a cookie which logged out the Bar user. That got solved by preventing the jobs from emitting headers / cookies.

Lets follow up on T199446 which is brand new / cleaner.

Change 445394 merged by jenkins-bot:
[mediawiki/core@master] Let 'shows page creation' test be skipped

https://gerrit.wikimedia.org/r/445394

Change 445401 abandoned by Pablo Grass (WMDE):
DNM Debug flaky selenium tests for Special:RecentChanges

https://gerrit.wikimedia.org/r/445401

Change 445214 abandoned by Hashar:
(DO NOT SUBMIT) test change for T198137

https://gerrit.wikimedia.org/r/445214

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:08 PM