Page MenuHomePhabricator

MysqlUpdater::doWatchlistUpdate is very slow
Closed, ResolvedPublic

Description

MysqlUpdater::doWatchlistUpdate takes over two minutes to first check if there are any pages.

Then it seems to fire an update query anyway (To me it looks like as if MediaWiki is creating new broken entries, because on second run it completes quickly) that also takes minutes. This unnecessarily slows down database updates. As both the SELECT and INSERT queries seem to be doing a table scan, the SELECT seems redundant, only doubling the execution time.

SlowTimer [151948ms] at runtime/ext_mysql: slow query: SELECT /* MysqlUpdater::doWatchlistUpdate nike@web1.trans... */  count(*)  FROM `bw_watchlist`    WHERE wl_namespace & 1  LIMIT 1  
Adding missing watchlist talk page rows... SlowTimer [211876ms] at runtime/ext_mysql: slow query: INSERT /* MysqlUpdater::doWatchlistUpdate nike@web1.trans... */ IGNORE INTO `bw_watchlist` (wl_user,wl_namespace,wl_title,wl_notificationtimestamp) SELECT  wl_user,wl_namespace | 1,wl_title,wl_notificationtimestamp FROM `bw_watchlist`    WHERE (NOT (wl_namespace & 1)) 
done.
Adding missing watchlist subject page rows... SlowTimer [117847ms] at runtime/ext_mysql: slow query: INSERT /* MysqlUpdater::doWatchlistUpdate nike@web1.trans... */ IGNORE INTO `bw_watchlist` (wl_user,wl_namespace,wl_title,wl_notificationtimestamp) SELECT  wl_user,wl_namespace & ~1,wl_title,wl_notificationtimestamp FROM `bw_watchlist`    WHERE (wl_namespace & 1) 
done.
MariaDB [translatewiki_net]> explain SELECT /* MysqlUpdater::doWatchlistUpdate nike@web1.trans... */  count(*)  FROM `bw_watchlist`    WHERE wl_namespace & 1  LIMIT 1;
+------+-------------+--------------+-------+---------------+-----------------+---------+------+---------+--------------------------+
| id   | select_type | table        | type  | possible_keys | key             | key_len | ref  | rows    | Extra                    |
+------+-------------+--------------+-------+---------------+-----------------+---------+------+---------+--------------------------+
|    1 | SIMPLE      | bw_watchlist | index | NULL          | namespace_title | 261     | NULL | 4641661 | Using where; Using index |
+------+-------------+--------------+-------+---------------+-----------------+---------+------+---------+--------------------------+
MariaDB [translatewiki_net]> explain INSERT /* MysqlUpdater::doWatchlistUpdate nike@web1.trans... */ IGNORE INTO `bw_watchlist` (wl_user,wl_namespace,wl_title,wl_notificationtimestamp) SELECT  wl_user,wl_namespace | 1,wl_title,wl_notificationtimestamp FROM `bw_watchlist`    WHERE (NOT (wl_namespace & 1)) ;
+------+-------------+--------------+------+---------------+------+---------+------+---------+------------------------------+
| id   | select_type | table        | type | possible_keys | key  | key_len | ref  | rows    | Extra                        |
+------+-------------+--------------+------+---------------+------+---------+------+---------+------------------------------+
|    1 | SIMPLE      | bw_watchlist | ALL  | NULL          | NULL | NULL    | NULL | 4641663 | Using where; Using temporary |
+------+-------------+--------------+------+---------------+------+---------+------+---------+------------------------------+

Event Timeline

Nikerabbit triaged this task as Unbreak Now! priority.Feb 8 2018, 4:00 PM
Nikerabbit added subscribers: Anomie, Legoktm.

tl;dr: I am unable to update translatewiki.net

I am bumping the priority of this. I haven't run update.php in a while, and now it is consistently failing because the insert query is so large that (I suppose) MySQL just simply kills the connection. The insert query doesn't even fit into my scroll buffer in my terminal. Here is a snippet from the end:

h-scope/el',NULL),('22708','1207','Wikipedia-ios-places-search-your-current-location/el',NULL),('22708','1207','Wikipedia-ios-places-show-filter-list-accessibility-label/el',NULL),('22708','1207','Wikipedia-ios-places-title/el',NULL),('22708','1207','Wikipedia-ios-places-unknown-distance/el',NULL),('22708','1207','Wikipedia-ios-potd-description-prefix/el',NULL),('22708','1207','Wikipedia-ios-rank-accessibility-label/el',NULL),('22708','1207','Wikipedia-ios-saved-pages-clear-cancel/el',NULL),('22708','1207','Wikipedia-ios-saved-pages-clear-delete-all/el',NULL),('22708','1207','Wikipedia-ios-saved-title/el',NULL),('22708','1207','Wikipedia-ios-search-result-redirected-from/el',NULL),('22708','1207','Wikipedia-ios-settings-clear-cache-cancel/el',NULL),('22708','1207','Wikipedia-ios-settings-my-languages/el',NULL),('22708','1207','Wikipedia-ios-settings-notifications/el',NULL),('5909','1207','Xtools-error-rate-limit/fr',NULL),('5909','1207','Xtools-error-service-overload/fr',NULL),('17279','9','Codemirror-beta-title/fr',NULL),('13505','9','Smw-property-predefined-errc/ru',NULL),('13505','9','Smw-limitreport-intext-postproctime-value/ru',NULL),('13505','9','Smw-format-datatable-processing/ru',NULL),('13505','9','Smw-format-datatable-zerorecords/ru',NULL),('13505','9','Smw-property-predefined-label-skey/ru',NULL),('13505','9','Smw-processing/ru',NULL),('13505','9','Advancedsearch-optgroup-files/ru',NULL),('13505','9','Advancedsearch-field-filew/ru',NULL),('13505','9','Advancedsearch-field-fileh/ru',NULL),('13505','9','Advancedsearch-filetype-section-image/ru',NULL),('13505','9','AdvancedSearch-namespaces-articles/ru',NULL),('23757','3','Duilio',NULL),('22693','1207','Wikipedia-ios-relative-date-minutes-ago/sr-ec',NULL),('22693','9','Cx-save-draft-save-success/sr-ec',NULL)
Function: MysqlUpdater::doWatchlistUpdate
Error: 2006 MySQL server has gone away (127.0.0.1:3306)

Backtrace:
#0 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(998): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#1 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(1628): Wikimedia\Rdbms\Database->query(string, string)
#2 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(2504): Wikimedia\Rdbms\Database->insert(string, array, string, string)
#3 /srv/mediawiki/workdir/includes/libs/rdbms/database/DatabaseMysqlBase.php(544): Wikimedia\Rdbms\Database->nonNativeInsertSelect(string, string, array, array, string, string, array, array)
#4 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(2450): Wikimedia\Rdbms\DatabaseMysqlBase->nativeInsertSelect(string, string, array, array, string, string, array, array)
#5 /srv/mediawiki/workdir/includes/installer/MysqlUpdater.php(482): Wikimedia\Rdbms\Database->insertSelect(string, string, array, array, string, string)
#6 /srv/mediawiki/workdir/includes/installer/DatabaseUpdater.php(482): MysqlUpdater->doWatchlistUpdate()
#7 /srv/mediawiki/workdir/includes/installer/DatabaseUpdater.php(446): DatabaseUpdater->runUpdates(array, boolean)
#8 /srv/mediawiki/workdir/maintenance/update.php(200): DatabaseUpdater->doUpdates(array)
#9 /srv/mediawiki/workdir/maintenance/doMaintenance.php(102): UpdateMediaWiki->execute()
#10 /srv/mediawiki/workdir/maintenance/update.php(245): include(string)

Pinging @Legoktm and @Anomie because https://gerrit.wikimedia.org/r/c/406308/ might be related.

		$talk = $this->db->selectField( 'watchlist', 'count(*)', 'wl_namespace & 1', __METHOD__ );
		$nontalk = $this->db->selectField(
			'watchlist',
			'count(*)',
			'NOT (wl_namespace & 1)',
			__METHOD__
		);
		if ( $talk == $nontalk ) {
			$this->output( "...watchlist talk page rows already present.\n" );

			return;
		}

What do those queries manually say (ie their numbers)? I'm kinda amused why it thinks it needs to run an update from 1.4 still

Making the queries batched would work around it, but like I say, I want to know why it thinks it needs to insert all those rows....

Pinging @Legoktm and @Anomie because https://gerrit.wikimedia.org/r/c/406308/ might be related.

Only in that it stopped the updater from issuing a query that's not safe for replication.

Change 409120 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] Database: Add batching to non-native insertSelect()

https://gerrit.wikimedia.org/r/409120

Change 409121 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] Database: Allow callers to hint that native insertSelect() is safe

https://gerrit.wikimedia.org/r/409121

What do those queries manually say (ie their numbers)? I'm kinda amused why it thinks it needs to run an update from 1.4 still

MariaDB [translatewiki_net]> select count(*) from bw_watchlist where wl_namespace & 1;
+----------+
| count(*) |
+----------+
|  2664452 |
+----------+
1 row in set (1.14 sec)

MariaDB [translatewiki_net]> select count(*) from bw_watchlist where not (wl_namespace & 1);
+----------+
| count(*) |
+----------+
|  2664410 |
+----------+
1 row in set (1.13 sec)

Perhaps more interesting would be the output of

SELECT DISTINCT w1.wl_namespace, w1.wl_title FROM bw_watchlist w1 LEFT JOIN bw_watchlist w2 ON ((w1.wl_namespace ^ 1) = w2.wl_namespace AND w1.wl_title = w2.wl_title) WHERE w2.wl_id IS NULL;

That should tell us which entries, specifically, are missing their pair from the database.

MariaDB [translatewiki_net]> select wl_namespace, count(*) from bw_watchlist group by wl_namespace;
+--------------+----------+
| wl_namespace | count(*) |
+--------------+----------+
...
|         1264 |     6370 | <--------
|         1265 |     6412 | <--------
|         1266 |     2788 |
|         1267 |     2788 |
|         1268 |     9827 |
|         1269 |     9827 |
|         1270 |    10362 |
|         1271 |    10362 |
|         1272 |     6611 |
|         1273 |     6611 |
|         1274 |     7433 |
|         1275 |     7433 |
+--------------+----------+
104 rows in set (1.29 sec)
MariaDB [translatewiki_net]> SELECT DISTINCT w1.wl_namespace, w1.wl_title FROM bw_watchlist w1 LEFT JOIN bw_watchlist w2 ON ((w1.wl_namespace ^ 1) = w2.wl_namespace AND w1.wl_title = w2.wl_title) WHERE w2.wl_id IS NULL;
+--------------+-------------------------------------+
| wl_namespace | wl_title                            |
+--------------+-------------------------------------+
|         1265 | create-addressLabel/bs              |
|         1265 | create-artifactTypeRequest/bs       |
|         1265 | create-browseLabel/bs               |
|         1265 | create-clearManuallySelectedType/bs |
|         1265 | create-createLink/bs                |
|         1265 | create-createText/bs                |
|         1265 | create-createTypeLabel/bs           |
|         1265 | create-fileLabel/bs                 |
|         1265 | create-nameLabel/bs                 |
|         1265 | create-uploadFile/bs                |
|         1265 | details-accessText/bs               |
|         1265 | details-createdText/bs              |
|         1265 | details-creatorText/bs              |
|         1265 | details-edit/bs                     |
|         1265 | details-flash/bs                    |
|         1265 | details-image/bs                    |
|         1265 | details-loadingImage/bs             |
|         1265 | details-modifiedText/bs             |
|         1265 | details-pdf/bs                      |
|         1265 | details-privateAccess/bs            |
|         1265 | details-publicAccess/bs             |
|         1265 | details-referentsLabel/bs           |
|         1265 | details-webpage/bs                  |
|         1265 | editor-dialogCancelLabel/bs         |
|         1265 | editor-dialogDoneBusyLabel/bs       |
|         1265 | editor-dialogDoneLabel/bs           |
|         1265 | editor-mandatoryLabel/bs            |
|         1265 | editor-optionalLabel/bs             |
|         1265 | editor-recommendedLabel/bs          |
|         1265 | list-add/bs                         |
|         1265 | list-comment/bs                     |
|         1265 | list-convert/bs                     |
|         1265 | list-copy/bs                        |
|         1265 | list-cut/bs                         |
|         1265 | list-details/bs                     |
|         1265 | list-edit/bs                        |
|         1265 | list-empty/bs                       |
|         1265 | list-items/bs                       |
|         1265 | list-remove/bs                      |
|         1265 | list-rename/bs                      |
|         1265 | list-rights/bs                      |
|         1265 | list-sortLabel/bs                   |
+--------------+-------------------------------------+
42 rows in set (15.18 sec)

From our config:

# No longer in use
wfAddNamespace( 1264, 'EntryScape' );

e78a5bd77e1c5411b7133d094dd0c43fb8228c26 explains why those titles start with small letter... it seems it was a mistake to remove the capital links setting. It does not, however, explain the missing base pages.

Nikerabbit lowered the priority of this task from Unbreak Now! to Needs Triage.Feb 13 2018, 9:36 AM

I got over the blocker with Anomie's patch.

For now the update.php also runs quickly. Hoping it won't be getting slow again, but who knows what caused it.

Change 409120 merged by jenkins-bot:
[mediawiki/core@master] Database: Add batching to non-native insertSelect()

https://gerrit.wikimedia.org/r/409120

Change 409121 merged by jenkins-bot:
[mediawiki/core@master] rdbms: allow callers to hint that native insertSelect() is safe

https://gerrit.wikimedia.org/r/409121

Is there anything actionable here?

Is there anything actionable here?

Both patches merged; also wondering what's left in this task?

tstarling claimed this task.
tstarling subscribed.

For now the update.php also runs quickly. Hoping it won't be getting slow again, but who knows what caused it.

So, the task as described is presumed fixed. If anyone can reproduce it in the future, it can be reopened.

This has gone fishy again. I ran update.php yesterday, and this morning again and it again taking a quite a bit time in this section.

One example query:

INSERT /* MysqlUpdater::doWatchlistUpdate nike@web1.trans... */ IGNORE INTO bw_watchlist (wl_user,wl_namespace,wl_title,wl_notificationtimestamp) VALUES ('1','0','Annotated_list_of_most_used_messages_for_MediaWiki',NULL),('1','0','CLDR26',NULL),('1','0','FAQ',NULL)

That is clearly my user, and those pages I have not visited or edited in a looong looong time:

What is causing rows to go missing for old watched pages, in less than 24 hours since last update.php run?

Change 499733 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] [WIP] Tweak MysqlUpdater::doWatchlistUpdate query

https://gerrit.wikimedia.org/r/499733

JTannerWMF subscribed.

This task appears to not have had movement on it for a while, and is not in scope of the Growth-Team's priorities.

Krinkle added a project: Platform Engineering.
Krinkle subscribed.

Tagging CPT for potential CR via their clinic duty (based on Tim's earlier interest).

[mediawiki/core@master] Tweak MysqlUpdater::doWatchlistUpdate query
https://gerrit.wikimedia.org/r/499733

Background: There's likely not much that can be done to speed up the insertion beyond what was already done. A significant part of the problem raised in this task is not that the method is itself slow, but that update.php on translatewiki seems to be finding need to re-run the slow update unexpectedly often.

This probably needs some investigation by those who can reproduce the problem to answer the question asked in T160993#4632886. Based on the information there I doubt the patch in T160993#5065086 will help as it doesn't sound like the potential race is what's actually happening on translatewiki, although the patch could be reviewed for its own sake.

@aaron wrote on Gerrit:

The idea is to avoid locking the whole table via INSERT SELECT and to only insert rows that are needed, rather than trying to INSERT IGNORE every corresponding row for titles found.

It looks like Aaron responded and addressed Brad's CR in Oct 2019. The patch is awaiting re-review.

Change 499733 merged by jenkins-bot:
[mediawiki/core@master] installer: Tweak MysqlUpdater::doWatchlistUpdate query

https://gerrit.wikimedia.org/r/499733

Resolving again following the fix merged in May.