
[INVESTIGATION] Investigate flaky Mismatch Finder browser tests
Closed, Resolved · Public · 1 Estimated Story Point

Description

Mismatch Finder browser tests fail on CI very often. The failures are not consistent: the same tests sometimes pass and sometimes fail.

The following error, which appears to be a timeout, keeps reappearing in different tests (examples 1, 2, 3, 4):

Actual path [/_dusk/login/2] does not equal expected path [/results].
Failed asserting that '/_dusk/login/2' matches PCRE pattern "/^\/results$/u".

Event Timeline

ItamarWMDE renamed this task from Fix flacky Mismatch Finder browser tests to Fix flaky Mismatch Finder browser tests. Jul 1 2022, 11:21 AM
ItamarWMDE renamed this task from Fix flaky Mismatch Finder browser tests to [INVESTIGATION] Investigate flaky Mismatch Finder browser tests. (Edited) Jul 1 2022, 11:23 AM
ItamarWMDE updated the task description.

Prio Notes:

  • Affects development efforts
  • Doesn't affect external stakeholders
  • Doesn't affect end users
  • Doesn't affect onboarding efforts

From backlog refinement: a timebox is required (1/2 to 1 day).

The listed examples have in common that they fail when visiting a page after logging in the user:

$browser->loginAs(User::factory()->create())
        ->visit(new ResultsPage($mismatch->item_id))

It seems like the redirect has not happened by the time visit() is called, or is not happening at all. The screenshots of the failed tests show a blank page.
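
A minimal sketch of one possible way to make the timing explicit (not something that is already in the tests): navigate to the page's URL first, wait for the browser to actually land on it, and only then attach the page object so its path assertion runs after navigation has settled. waitForLocation() and on() are standard Dusk helpers; the /results path is taken from the error message above.

    $page = new ResultsPage($mismatch->item_id);

    $browser->loginAs(User::factory()->create())
            ->visit($page->url())          // navigate without running the page object's assertions yet
            ->waitForLocation('/results')  // wait (5 seconds by default) for the navigation to settle
            ->on($page);                   // now run the ResultsPage assertions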

In all the examples of using loginAs that I have seen, the user is always created outside of the loginAs call. I don't have a solid way of proving that this is the cause, but maybe the database operation that creates the user occasionally isn't finished by the time loginAs is called on CI?
If that is the case, then changing create() to make(), if we still want to call the factory from inside the loginAs call, could possibly solve the issue.
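
For illustration, the change being considered would look roughly like this (a sketch only; as the update below notes, it turned out not to be viable):

    // Sketch of the idea: build the user in memory instead of persisting it first.
    $browser->loginAs(User::factory()->make())
            ->visit(new ResultsPage($mismatch->item_id));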

From reading the docs, I gather that create() is meant for cases where we need to use the persisted data after it is created, which is not how we are using it in these tests.

Update: with make() the user is not persisted for the duration of the test, so we need to stick with create().

I am creating this test to see whether this might be the reason the tests sometimes fail: https://github.com/wmde/wikidata-mismatch-finder/pull/440


This seems to be relevant: https://laravel.com/docs/8.x/dusk#authentication

After using the loginAs method, the user session will be maintained for all tests within the file.
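
If that sticky session is what occasionally leaves the browser in an unexpected state, one option to try (a sketch, not something that was implemented here) is resetting authentication explicitly before logging in again, using Dusk's logout() helper; $user and $mismatch are assumed to be set up by the surrounding test:

    $this->browse(function (Browser $browser) use ($user, $mismatch) {
        $browser->logout()                                   // clear any session left over from a previous test in this file
                ->loginAs($user)
                ->visit(new ResultsPage($mismatch->item_id));
    });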


There is also this GitHub issue that is similar to the one we are having; the suggestion there is to check the SESSION_DOMAIN env variable: https://github.com/laravel/dusk/issues/408

Though I don't see how this could be affecting our tests in particular, because they are failing inconsistently.


Another possibility is that there is a database issue because we are using create() instead of make() to create the users. Why do we need to store the user in the database if we are using DatabaseMigrations and a fresh database is used for every test? The Laravel documentation is not very clear about this.

Update: we need to use create() instead of make() because loginAs checks the database for the user, and make() doesn't create an entry in the database, which causes the tests to fail.
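
To make the difference concrete (a minimal sketch, not project code): create() persists a row whose id the /_dusk/login/{id} route can resolve, while make() only builds the model in memory.

    $persisted = User::factory()->create();   // inserts a row; loginAs() can look this user up by id
    $inMemory  = User::factory()->make();     // builds the model only; no database row, no usable id

    $browser->loginAs($persisted);            // works: the Dusk login route finds the user
    // $browser->loginAs($inMemory);          // would fail: there is no record for the login route to load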

https://laracasts.com/discuss/channels/laravel/please-explain-dusk-databasemigrations-like-im-a-dummy

After the PoC PR has been merged for a while and more PRs have been added to the project, we can see that the tests are still failing randomly with the same issue, so we have to do another round of investigation. Examples 1, 2, 3

Unassigning myself, someone else might have a better idea.

I think we probably exceeded the original timebox for this, no?

Not yet :D but if we continue on it, yes.

https://github.com/wmde/wikidata-mismatch-finder/pull/459 was merged as a possible way to improve the situation. We will have to wait and see if this changes anything.

Unfortunately, this does not seem to have helped. Looking at the list of failed Test workflow runs on GitHub Actions, it seems like 42 failures occurred last month due to browser tests (although I didn't take the time to confirm that each one of them was due to the same flakiness error). As the allotted timebox for this investigation has been exceeded, I suggest we shelve this issue for now and get back to it at a later date.

Mh, maybe I'm doing something wrong, but when looking at CI actions that ran for the main branch (and thus excluding genuine bugs in PRs, unfinished PRs, and such), I see only four failed CI runs since my PR was merged: https://github.com/wmde/wikidata-mismatch-finder/actions?query=is%3Afailure+branch%3Amain

Three of them failed because of problems with Toolforge, and only one failed due to a browser test being flaky. That one flaky browser test is not the one I modified in #459 (ResultsTest.php), but AppTest.php.

Still, I agree with resolving this task and looking into the remaining flakiness another time.

Well, the flakiness didn't happen only on main. I looked at the total of failed runs for the Test workflow and sampled them to see what failed (predominantly browser tests).