Maintenance script appservers running code from wmf.20 when version not live
Closed, Declined | Public | PRODUCTION ERROR

Description

Error
normalized_message
[{reqId}] {exception_url}   PHP Fatal Error from line 222 of /srv/mediawiki/php-1.41.0-wmf.20/includes/AutoLoader.php: require(): Failed opening required '/srv/mediawiki/php-1.41.0-wmf.20/includes/libs/rdbms/exception/DBConnectionError.php' (include_path=
exception.trace
from /srv/mediawiki/php-1.41.0-wmf.20/includes/AutoLoader.php(222)
#0 [internal function]: MWExceptionHandler::handleFatalError()
#1 {main}
Impact
Notes

Event Timeline

A grep for php on mwmaint1002, looking for long-running processes, gives me only:

Jul11   0:00 /bin/bash /usr/local/bin/mwscript eval.php --wiki=commonswiki

The others are all from Aug 22 or 23, just FYI.

I was happy to find an item in a log entry from 2023-08-22T10:34:21.414Z that shows what script invocation triggered this error:

"cli_argv": "/srv/mediawiki-staging/multiversion/MWScript.php extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki=viwiki --current --all",

There are two similar processes running at this moment on mwmaint1002 that look to have been started at 2023-08-23T00:00 by @Urbanecm (personal crontab maybe?).

Urbanecm added a subscriber: matmarex.

I've started both scripts manually as part of T315510: Start maintenance script to backfill talk page comment database. The script is long-running, but it was started manually and I am supervising its activity (the developer point of contact is @matmarex). I think this error happened because promoting the MW version dropped the wmf.20 files (wmf.20 being the version the script was started under, given how long it had been running). Once the wmf.20 files were gone, the script could no longer load any class it had not already loaded. Once I restarted the script, it continued its job with no issues at all.
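To illustrate the failure mode described above, here is a minimal sketch in Python (not the MediaWiki AutoLoader itself). It assumes a loader that opens a class file only on first use, and that the release directory is removed while the process is still running. The names `make_release`, `lazy_load` and `simulate` are hypothetical.

```python
import importlib.util
import shutil
import tempfile
from pathlib import Path


def make_release(root: Path, version: str) -> Path:
    """Create a fake release directory holding one lazily loaded module."""
    release = root / f"php-{version}"
    release.mkdir(parents=True)
    (release / "db_connection_error.py").write_text(
        "class DBConnectionError(Exception):\n    pass\n"
    )
    return release


def lazy_load(release: Path, module_name: str):
    """Open and execute the module file only at the moment it is needed."""
    path = release / f"{module_name}.py"
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # raises OSError if the file is gone
    return module


def simulate(remove_before_first_use: bool) -> str:
    """Start a 'script', optionally drop its release, then need the class."""
    root = Path(tempfile.mkdtemp())
    try:
        release = make_release(root, "1.41.0-wmf.20")
        if remove_before_first_use:
            # Stand-in for a deployment cleaning up the old version.
            shutil.rmtree(release)
        try:
            lazy_load(release, "db_connection_error")
        except OSError:
            return "failed opening required file"
        return "loaded"
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

Calling `simulate(False)` loads the class, while `simulate(True)` fails at the point of first use rather than at startup, which would be consistent with the script running for a long time before hitting the fatal error.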

I don't think there is any reason to consider this a train blocker, or even a bug. It is far from ideal to have long-running scripts on mwmaint, and this task proves it. If @matmarex has any thoughts on making the script's runs shorter (no more than a couple of days, so that a run does not outlive the source code that is dropped when a new version is deployed), that'd be great, but since this is a manual, one-off activity, I don't think it needs to block the train in any way. If this rewrite is reasonably easy to do, I think we should track it in a separate task.
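One way to keep individual runs short is to give each run a time budget and have it report where the next run should resume. The following is only a sketch of that idea, not how persistRevisionThreadItems.php works. `run_bounded`, `process` and `max_seconds` are hypothetical names, and `start` is modelled loosely on the `--start` option seen in the invocations on this task.

```python
import time
from typing import Callable, Iterable, Optional


def run_bounded(
    ids: Iterable[int],
    process: Callable[[int], None],
    start: int = 0,
    max_seconds: float = 3600.0,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[int]:
    """Process ids >= start until the time budget is spent.

    Returns the id to resume from, or None when everything was processed.
    """
    deadline = clock() + max_seconds
    for item_id in ids:
        if item_id < start:
            continue  # already handled by an earlier run
        if clock() >= deadline:
            return item_id  # checkpoint: pass this as the next run's start
        process(item_id)
    return None
```

A wrapper could call this repeatedly, starting a fresh process for each run so that every run picks up whatever version is currently deployed. The `clock` parameter exists only to make the time budget testable.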

I'm boldly closing this task as Declined / Wontfix, and I hope the remainder of the script will finish soon. If anyone disagrees, feel free to reopen and we can think further about this one.

Thanks for the ping on this one!

There are two similar processes running at this moment on mwmaint1002 that look to have been started at 2023-08-23T00:00 by @Urbanecm (personal crontab maybe?).

I was misreading the ps output. Here is better info on the currently running scripts:

$ ps ax -o user,pid,%cpu,%mem,vsz,rss,tty,stat,lstart,cmd |grep "[D]iscussionTools"
urbanecm  8250  0.0  0.0   6772  3312 pts/6    S+   Wed Aug 23 07:24:48 2023 /bin/bash /usr/local/bin/mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki=enwiki --current --all --start ["36498379"]
root      8254  0.0  0.0  10192  4052 pts/6    S+   Wed Aug 23 07:24:48 2023 sudo -u www-data php /srv/mediawiki-staging/multiversion/MWScript.php extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki=enwiki --current --all --start ["36498379"]
www-data  8255 60.9  1.9 1727092 1287732 pts/6 Rl+  Wed Aug 23 07:24:48 2023 php /srv/mediawiki-staging/multiversion/MWScript.php extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki=enwiki --current --all --start ["36498379"]
urbanecm  8825  0.0  0.0   6772  3372 pts/5    S+   Wed Aug 23 07:25:02 2023 /bin/bash /usr/local/bin/mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki=viwiki --current --all --start ["13883909"]
root      8829  0.0  0.0  10192  4024 pts/5    S+   Wed Aug 23 07:25:02 2023 sudo -u www-data php /srv/mediawiki-staging/multiversion/MWScript.php extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki=viwiki --current --all --start ["13883909"]
www-data  8830 77.9  1.3 1252540 908752 pts/5  Rl+  Wed Aug 23 07:25:02 2023 php /srv/mediawiki-staging/multiversion/MWScript.php extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki=viwiki --current --all --start ["13883909"]
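Since the start time is what matters here, a small parser for this exact `ps` column layout (`user,pid,%cpu,%mem,vsz,rss,tty,stat,lstart,cmd`) can flag processes that started before a given moment, such as a deployment. This is a sketch under the assumption that `lstart` is printed with English day and month names. `parse_ps_line` and `started_before` are hypothetical helpers.

```python
from datetime import datetime
from typing import Iterable, List, Tuple

LSTART_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Wed Aug 23 07:24:48 2023"


def parse_ps_line(line: str) -> Tuple[str, int, datetime, str]:
    """Split one ps line into (user, pid, start time, command)."""
    # Fields 0-7 are single tokens, lstart spans tokens 8-12,
    # and everything after that is the command line.
    fields = line.split(None, 13)
    user, pid = fields[0], int(fields[1])
    started = datetime.strptime(" ".join(fields[8:13]), LSTART_FORMAT)
    return user, pid, started, fields[13]


def started_before(lines: Iterable[str], cutoff: datetime) -> List[int]:
    """Return the pids of processes that started before the cutoff."""
    pids = []
    for line in lines:
        if not line.strip():
            continue
        _, pid, started, _ = parse_ps_line(line)
        if started < cutoff:
            pids.append(pid)
    return pids
```

Feeding it the output of the `ps` command above together with the time a version was promoted would list the processes that may still be running code from an older version.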

I have filed the same stack trace, this time from a background job, as T348614. There the AutoLoader fails to open /srv/mediawiki/php-1.41.0-wmf.29/includes/libs/rdbms/exception/DBConnectionError.php even though the file is present on disk (1.41.0-wmf.29 is still on group2).