Page MenuHomePhabricator

Failing test BounceHandler/UnSubscribeUserTest::testUnSubscribeUser due to WANObjectCache::delete
Closed, ResolvedPublicPRODUCTION ERROR

Description

20:34:25 1) UnSubscribeUserTest::testUnSubscribeUser
20:34:25 MWException: CAS update failed on user_touched for user ID '3' (read from slave); the version of the user to be saved is older than the current version.
20:34:25 
20:34:25 /mnt/jenkins-workspace/workspace/mwext-testextension-php55/src/includes/user/User.php:3984
20:34:25 /mnt/jenkins-workspace/workspace/mwext-testextension-php55/src/extensions/BounceHandler/includes/BounceHandlerActions.php:160
20:34:25 /mnt/jenkins-workspace/workspace/mwext-testextension-php55/src/extensions/BounceHandler/includes/BounceHandlerActions.php:86
20:34:25 /mnt/jenkins-workspace/workspace/mwext-testextension-php55/src/extensions/BounceHandler/includes/ProcessBounceEmails.php:91
20:34:25 /mnt/jenkins-workspace/workspace/mwext-testextension-php55/src/extensions/BounceHandler/tests/phpunit/UnSubscribeUserTest.php:58
20:34:25 /mnt/jenkins-workspace/workspace/mwext-testextension-php55/src/tests/phpunit/MediaWikiTestCase.php:389

Event Timeline

Is BounceHandler really to blame here, or do we blame MW?

https://github.com/wikimedia/mediawiki-extensions-BounceHandler/blob/master/includes/BounceHandlerActions.php#L138-L168

It's not like it's doing any of it's own DB queries, so it's MW being crappy

When I locally run it, I get this one:

There was 1 error:

1) UnSubscribeUserTest::testUnSubscribeUser
Declaration of EchoBounceHandlerFormatter::getLinkParams() should be compatible with EchoBasicFormatter::getLinkParams($event, $user, $destination)

/var/www/core/core/extensions/BounceHandler/includes/EchoBounceHandlerFormatter.php:3
/var/www/core/core/includes/AutoLoader.php:81
/var/www/core/core/extensions/Echo/includes/formatters/NotificationFormatter.php:101
/var/www/core/core/extensions/Echo/includes/controller/NotificationController.php:388
/var/www/core/core/extensions/Echo/includes/Notifier.php:101
/var/www/core/core/extensions/Echo/includes/controller/NotificationController.php:264
/var/www/core/core/extensions/Echo/includes/controller/NotificationController.php:104
/var/www/core/core/extensions/Echo/includes/model/Event.php:155
/var/www/core/core/extensions/BounceHandler/includes/BounceHandlerActions.php:107
/var/www/core/core/extensions/BounceHandler/includes/BounceHandlerActions.php:161
/var/www/core/core/extensions/BounceHandler/includes/BounceHandlerActions.php:86
/var/www/core/core/extensions/BounceHandler/includes/ProcessBounceEmails.php:91
/var/www/core/core/extensions/BounceHandler/tests/UnSubscribeUserTest.php:57
/var/www/core/core/tests/phpunit/MediaWikiTestCase.php:390

Well. it was due to my not-updated Echo. Now the error message is the same.

1) UnSubscribeUserTest::testUnSubscribeUser
MWException: CAS update failed on user_touched for user ID '3' (read from slave); the version of the user to be saved is older than the current version.

Change 303345 had a related patch set uploaded (by 01tonythomas):
Check if the user is subscribed initially, before unsubscribing

https://gerrit.wikimedia.org/r/303345

Alright. So I took a look at the tests and the functions involved, and here are my findings.

  1. We are actually sending in 4 bounces and the limit is 3, now for the 4th one, the code at https://github.com/wikimedia/mediawiki-extensions-BounceHandler/blob/master/includes/BounceHandlerActions.php#L86 is actually trying to unsubscribe a user without checking if the user has their email account authenticated. The above patch does the additional checking.
  2. The tests are still failing, as I see that the user is getting unsubscribed in BounceHandlerActions::unSubscribeUser() and the authetnticationTimeStamp is still NULL. The funny thing is, when coming back to the test, trying to replicate this use again in https://github.com/wikimedia/mediawiki-extensions-BounceHandler/blob/master/tests/UnSubscribeUserTest.php -- I see that value of authenticationTimeStamp is set. Something wrong with the test framework, or what ?

Can someone take a look into #2, and help me get the code merged ?

Probably due to b23e0b0a835fa84031442b8b3b313860a6affadf. In hindsight, that change is pretty scary.

Change 325883 had a related patch set uploaded (by Gergő Tisza):
Fix tests

https://gerrit.wikimedia.org/r/325883

The tests are probably still broken; the cause seemed to be WANObjectCache::delete() failing to work properly in unit tests.

Krinkle renamed this task from BounceHandler UnSubscribeUserTest::testUnSubscribeUser test failing to Failing test BounceHandler/UnSubscribeUserTest::testUnSubscribeUser due to WANObjectCache::delete.Apr 8 2017, 12:08 AM
Krinkle moved this task from General to libs/objectcache on the MediaWiki-libs-BagOStuff board.

The tests are probably still broken; the cause seemed to be WANObjectCache::delete() failing to work properly in unit tests.

In what way? Is it the write-hole logic?

I don't remember the details, beyond that User::save did not seem to clear the cache: loading the same user again resulted in the old data. Maybe it was some sort of in-process caching, or consistent reads resulting in pre-update DB data - I can't recall if I looked at those possibilities.

Is there any repro case of delete() not working? Seems more likely to be the use of pcTTL and READ_NORMAL (which is expected not to work). If you want to reload a User and see changes from 1ms ago, you need READ_LATEST. For unit test, clearing the WAN cache process cache via clearProcessCache() can also be an option if needed.

READ_LATEST should not be reading the user from cache at all. Wouldn't that be a bug in the User class if it was?

aaron removed aaron as the assignee of this task.May 5 2017, 10:57 PM
aaron triaged this task as Low priority.May 26 2017, 5:07 PM

Change 420261 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/BounceHandler@master] Restore broken unit test

https://gerrit.wikimedia.org/r/420261

Change 420261 merged by jenkins-bot:
[mediawiki/extensions/BounceHandler@master] Restore previously broken unit test

https://gerrit.wikimedia.org/r/420261

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:11 PM