Page MenuHomePhabricator

Error in "User.should be able to create account"/"User.should be able to log in @daily"
Closed, ResolvedPublic

Description

Seen on two MediaWiki-extensions-StopForumSpam patches

00:26:05 [0-2] Error in "User.should be able to create account"
00:26:05 Error: Timeout of 60000ms exceeded. The execution in the test "User should be able to create account" took too long. Try to reduce the run time or increase your timeout for test specs (https://webdriver.io/docs/timeouts). (/workspace/src/tests/selenium/specs/user.js)
00:26:05     at createTimeoutError (/workspace/src/node_modules/mocha/lib/errors.js:498:15)
00:26:05     at Test.Runnable._timeoutError (/workspace/src/node_modules/mocha/lib/runnable.js:431:10)
00:26:05     at Timeout.<anonymous> (/workspace/src/node_modules/mocha/lib/runnable.js:246:24)
00:26:05     at listOnTimeout (internal/timers.js:557:17)
00:26:05     at processTimers (internal/timers.js:500:7)
00:26:06 [0-2] Error in "User.should be able to log in @daily"
00:26:06 Error: invalidjson: No valid JSON response
00:26:06     at /workspace/src/tests/selenium/wdio-mediawiki/node_modules/mwbot/src/index.js:254:31
00:26:06     at tryCatcher (/workspace/src/node_modules/bluebird/js/release/util.js:16:23)
00:26:06     at Promise._settlePromiseFromHandler (/workspace/src/node_modules/bluebird/js/release/promise.js:547:31)
00:26:06     at Promise._settlePromise (/workspace/src/node_modules/bluebird/js/release/promise.js:604:18)
00:26:06     at Promise._settlePromise0 (/workspace/src/node_modules/bluebird/js/release/promise.js:649:10)
00:26:06     at Promise._settlePromises (/workspace/src/node_modules/bluebird/js/release/promise.js:729:18)
00:26:06     at _drainQueueStep (/workspace/src/node_modules/bluebird/js/release/async.js:93:12)
00:26:06     at _drainQueue (/workspace/src/node_modules/bluebird/js/release/async.js:86:9)
00:26:06     at Async._drainQueues (/workspace/src/node_modules/bluebird/js/release/async.js:102:5)
00:26:06     at Immediate.Async.drainQueues [as _onImmediate] (/workspace/src/node_modules/bluebird/js/release/async.js:15:14)
00:26:06     at processImmediate (internal/timers.js:464:21)
00:26:06 [0-2] RETRYING in chrome - /tests/selenium/specs/user.js
00:26:07 [0-2] RUNNING in chrome - /tests/selenium/specs/user.js
00:26:14 [0-2] Error in "User.should be able to create account"
00:26:14 Error: Can't call setValue on element with selector "#wpName2" because element wasn't found
00:26:14     at implicitWait (/workspace/src/node_modules/@wdio/runner/node_modules/webdriverio/build/utils/implicitWait.js:34:19)
00:26:14     at async Element.elementErrorHandlerCallbackFn (/workspace/src/node_modules/@wdio/runner/node_modules/webdriverio/build/middlewares.js:20:29)
00:26:14     at async Element.wrapCommandFn (/workspace/src/node_modules/@wdio/runner/node_modules/@wdio/utils/build/shim.js:131:29)
00:26:14     at async CreateAccountPage.createAccount (/workspace/src/tests/selenium/wdio-mediawiki/CreateAccountPage.js:18:3)
00:26:14     at async Context.<anonymous> (/workspace/src/tests/selenium/specs/user.js:24:3)

Event Timeline

Heh, so I was looking at "Error: Can't call setValue on element with selector '#wpName2' because element wasn't found at implicitWait" and not the more meaningful "User.should be able to create account". SFS does block any non-read action when it's enforcing, including account creations. I would hope that it wasn't doing that in a default/zero config mode, but given these errors selenium-docker is encountering, I suspect that might be the case.

Maybe it's worth defaulting SFSReportOnly to true instead of false?

Maybe it's worth defaulting SFSReportOnly to true instead of false?

Would be easy enough to do and see if this is indeed the problem with selenium-docker's tests. And if it doesn't solve the problem, we can always revert, as I think the current default for SFSReportOnly makes more sense as an out-of-the-box config value.

Maybe it's worth defaulting SFSReportOnly to true instead of false?

Which makes absolutely no apparent difference...

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/759280

Testing locally, the selenium-docker tests run fine in a default mediawiki master install with mediawiki-docker serving the app. When I install SFS with its default config, on any edit pages, I get a PHP white screen of doom (hence the failing selenium-docker tests for Special:CreateAccount) and the error logs tell me that IPSet is just barely OOM'ing here. So if this is the same issue happening in selenium-docker (I have no other indications to believe that it isn't) then we probably need to slightly bump PHP's memory_limit from the likely default of 128M to maybe 148M or so.

Testing locally, the selenium-docker tests run fine in a default mediawiki master install with mediawiki-docker serving the app. When I install SFS with its default config, on any edit pages, I get a PHP white screen of doom (hence the failing selenium-docker tests for Special:CreateAccount) and the error logs tell me that IPSet is just barely OOM'ing here. So if this is the same issue happening in selenium-docker (I have no other indications to believe that it isn't) then we probably need to slightly bump PHP's memory_limit from the likely default of 128M to maybe 148M or so.

Yes it looks like that is happening in CI https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php72-selenium-docker/99863/artifact/log/mw-error.log :

2022-02-02 17:25:05 3887ac8272d8 wikidb: [2f36808b3472f6ba79658488] /index.php?title=Special%3ACreateAccount   PHP Fatal Error from line 157 of /workspace/src/vendor/wikimedia/ip-set/src/IPSet.php: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes)
#0 [internal function]: MWExceptionHandler::handleFatalError()
#1 {main}
2022-02-02 17:25:12 3887ac8272d8 wikidb: [32a7cb95394e2417c1ed0b15] /api.php?format=json   PHP Fatal Error from line 177 of /workspace/src/vendor/wikimedia/ip-set/src/IPSet.php: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes)
#0 [internal function]: MWExceptionHandler::handleFatalError()
#1 {main}
2022-02-02 17:25:15 3887ac8272d8 wikidb: [3735dfe5169b63ece1cc7239] /index.php?title=Special%3ACreateAccount   PHP Fatal Error from line 188 of /workspace/src/vendor/wikimedia/ip-set/src/IPSet.php: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes)
#0 [internal function]: MWExceptionHandler::handleFatalError()
#1 {main}
2022-02-02 17:25:20 3887ac8272d8 wikidb: [d5e7650bb04d45085f2ca5b4] /api.php?format=json   PHP Fatal Error from line 188 of /workspace/src/vendor/wikimedia/ip-set/src/IPSet.php: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes)
#0 [internal function]: MWExceptionHandler::handleFatalError()
#1 {main}

PHP's memory limit in Quibble should be -1, though:

docker run --rm -it --entrypoint="php" docker-registry.wikimedia.org/releng/quibble-buster-php72:1.3.0 "-i" | grep memory_limit
memory_limit => -1 => -1

PHP's memory limit in Quibble should be -1, though:

Is it maybe a hard limit with a hardware/vm allocation then? I don't really know enough about our setup to test/confirm that.

sbassett triaged this task as Medium priority.Feb 7 2022, 8:10 PM

So setting wgSFSIPListLocation to use the 30-day SFS file instead of the 90-day SFS file gets https://gerrit.wikimedia.org/r/759280 to pass CI. Perhaps that's the path of least resistance for now? And we just know that in Wikimedia prod we'll have to set it to the 90-day file in config, as I believe we'd still want to use a higher duration.

sbassett claimed this task.