Page MenuHomePhabricator

Investigate slow transcludedin query
Closed, ResolvedPublic

Description

https://en.wiktionary.org/w/api.php?action=query&prop=transcludedin&titles=Module%3Alanguages%2Fdata2&tilimit=500 takes at least 30 seconds to complete, which is not acceptable. We need to investigate why is that.

Event Timeline

mobrovac created this task.Sep 8 2016, 2:13 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 8 2016, 2:13 PM

One debug run shows that most time is spent waiting on MySQL to actually return the content. The exact query executed was:

$ curl -v -H 'X-Wikimedia-Debug: backend=mw1099.eqiad.wmnet; profile' -H 'Accept: application/json' 'https://en.wiktionary.org/w/api.php?action=query&prop=transcludedin&titles=Module%3Alanguages%2Fdata2&tilimit=500&format=json&formatversion=2'
Anomie added a subscriber: Anomie.

The query is simple enough:

SELECT tl_from,tl_namespace AS `bl_namespace`,tl_title AS `bl_title`,page_id,page_title,page_namespace,page_is_redirect  FROM `templatelinks` FORCE INDEX (tl_namespace),`page`   WHERE (tl_from = page_id) AND ((tl_namespace = '828' AND tl_title = 'languages/data2'))  ORDER BY tl_from LIMIT 501;

The intention here is for it to pull the relevant entries in order from the templatelinks table, then join with page to get a few fields needed for the output.

Normally, this works fine:

mysql:wikiadmin@db1054 [enwiktionary]> explain SELECT tl_from,tl_namespace AS `bl_namespace`,tl_title AS `bl_title`,page_id,page_title,page_namespace,page_is_redirect  FROM `templatelinks` FORCE INDEX (tl_namespace),`page`   WHERE (tl_from = page_id) AND ((tl_namespace = '828' AND tl_title = 'languages'))  ORDER BY tl_from LIMIT 501;
+------+-------------+---------------+--------+---------------+--------------+---------+------------------------------------+---------+--------------------------+
| id   | select_type | table         | type   | possible_keys | key          | key_len | ref                                | rows    | Extra                    |
+------+-------------+---------------+--------+---------------+--------------+---------+------------------------------------+---------+--------------------------+
|    1 | SIMPLE      | templatelinks | ref    | tl_namespace  | tl_namespace | 261     | const,const                        | 8534620 | Using where; Using index |
|    1 | SIMPLE      | page          | eq_ref | PRIMARY       | PRIMARY      | 4       | enwiktionary.templatelinks.tl_from |       1 |                          |
+------+-------------+---------------+--------+---------------+--------------+---------+------------------------------------+---------+--------------------------+
2 rows in set (0.00 sec)

mysql:wikiadmin@db1054 [enwiktionary]> SELECT tl_from,tl_namespace AS `bl_namespace`,tl_title AS `bl_title`,page_id,page_title,page_namespace,page_is_redirect  FROM `templatelinks` FORCE INDEX (tl_namespace),`page`   WHERE (tl_from = page_id) AND ((tl_namespace = '828' AND tl_title = 'languages'))  ORDER BY tl_from LIMIT 501;
[...]
501 rows in set (0.00 sec)

mysql:wikiadmin@db1054 [enwiktionary]> SHOW SESSION STATUS like 'Hand%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 0     |
| Handler_icp_attempts       | 0     |
| Handler_icp_match          | 0     |
| Handler_mrr_init           | 0     |
| Handler_mrr_key_refills    | 0     |
| Handler_mrr_rowid_refills  | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 502   |
| Handler_read_last          | 0     |
| Handler_read_next          | 500   |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_deleted   | 0     |
| Handler_read_rnd_next      | 130   |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_tmp_update         | 0     |
| Handler_tmp_write          | 128   |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
25 rows in set (0.00 sec)

But for some reason for this title it's using a much worse plan:

mysql:wikiadmin@db1054 [enwiktionary]> explain SELECT tl_from,tl_namespace AS `bl_namespace`,tl_title AS `bl_title`,page_id,page_title,page_namespace,page_is_redirect  FROM `templatelinks` FORCE INDEX (tl_namespace),`page`   WHERE (tl_from = page_id) AND ((tl_namespace = '828' AND tl_title = 'languages/data2'))  ORDER BY tl_from LIMIT 501;
+------+-------------+---------------+------+---------------+--------------+---------+---------------------------------------+---------+---------------------------------+
| id   | select_type | table         | type | possible_keys | key          | key_len | ref                                   | rows    | Extra                           |
+------+-------------+---------------+------+---------------+--------------+---------+---------------------------------------+---------+---------------------------------+
|    1 | SIMPLE      | page          | ALL  | PRIMARY       | NULL         | NULL    | NULL                                  | 5490747 | Using temporary; Using filesort |
|    1 | SIMPLE      | templatelinks | ref  | tl_namespace  | tl_namespace | 265     | const,const,enwiktionary.page.page_id |       1 | Using where; Using index        |
+------+-------------+---------------+------+---------------+--------------+---------+---------------------------------------+---------+---------------------------------+
2 rows in set (0.00 sec)

mysql:wikiadmin@db1054 [enwiktionary]> SELECT tl_from,tl_namespace AS `bl_namespace`,tl_title AS `bl_title`,page_id,page_title,page_namespace,page_is_redirect  FROM `templatelinks` FORCE INDEX (tl_namespace),`page`   WHERE (tl_from = page_id) AND ((tl_namespace = '828' AND tl_title = 'languages/data2'))  ORDER BY tl_from LIMIT 501;
[...]
501 rows in set (50.69 sec)

mysql:wikiadmin@db1054 [enwiktionary]> SHOW SESSION STATUS like 'Hand%';
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| Handler_commit             | 1        |
| Handler_delete             | 0        |
| Handler_discover           | 0        |
| Handler_external_lock      | 0        |
| Handler_icp_attempts       | 0        |
| Handler_icp_match          | 0        |
| Handler_mrr_init           | 0        |
| Handler_mrr_key_refills    | 0        |
| Handler_mrr_rowid_refills  | 0        |
| Handler_prepare            | 0        |
| Handler_read_first         | 0        |
| Handler_read_key           | 5375735  |
| Handler_read_last          | 0        |
| Handler_read_next          | 4864453  |
| Handler_read_prev          | 0        |
| Handler_read_rnd           | 0        |
| Handler_read_rnd_deleted   | 0        |
| Handler_read_rnd_next      | 10365694 |
| Handler_rollback           | 0        |
| Handler_savepoint          | 0        |
| Handler_savepoint_rollback | 0        |
| Handler_tmp_update         | 0        |
| Handler_tmp_write          | 4989955  |
| Handler_update             | 0        |
| Handler_write              | 0        |
+----------------------------+----------+
25 rows in set (0.00 sec)

Yes, for some reason it's scanning the whole page table, then using templatelinks as a filter, then filesorting so it can pull off the 501 needed rows.

We can force it to use the right plan with a STRAIGHT_JOIN, and that makes it return with the expected rapidity:

mysql:wikiadmin@db1054 [enwiktionary]> explain SELECT /*! STRAIGHT_JOIN */ tl_from,tl_namespace AS `bl_namespace`,tl_title AS `bl_title`,page_id,page_title,page_namespace,page_is_redirect  FROM `templatelinks` FORCE INDEX (tl_namespace),`page`   WHERE (tl_from = page_id) AND ((tl_namespace = '828' AND tl_title = 'languages/data2'))  ORDER BY tl_from LIMIT 501;
+------+-------------+---------------+--------+---------------+--------------+---------+------------------------------------+---------+--------------------------+
| id   | select_type | table         | type   | possible_keys | key          | key_len | ref                                | rows    | Extra                    |
+------+-------------+---------------+--------+---------------+--------------+---------+------------------------------------+---------+--------------------------+
|    1 | SIMPLE      | templatelinks | ref    | tl_namespace  | tl_namespace | 261     | const,const                        | 9677820 | Using where; Using index |
|    1 | SIMPLE      | page          | eq_ref | PRIMARY       | PRIMARY      | 4       | enwiktionary.templatelinks.tl_from |       1 |                          |
+------+-------------+---------------+--------+---------------+--------------+---------+------------------------------------+---------+--------------------------+
2 rows in set (0.00 sec)

mysql:wikiadmin@db1054 [enwiktionary]> SELECT /*! STRAIGHT_JOIN */ tl_from,tl_namespace AS `bl_namespace`,tl_title AS `bl_title`,page_id,page_title,page_namespace,page_is_redirect  FROM `templatelinks` FORCE INDEX (tl_namespace),`page`   WHERE (tl_from = page_id) AND ((tl_namespace = '828' AND tl_title = 'languages/data2'))  ORDER BY tl_from LIMIT 501;
[...]
501 rows in set (0.00 sec)

mysql:wikiadmin@db1054 [enwiktionary]> SHOW SESSION STATUS like 'Hand%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 0     |
| Handler_icp_attempts       | 0     |
| Handler_icp_match          | 0     |
| Handler_mrr_init           | 0     |
| Handler_mrr_key_refills    | 0     |
| Handler_mrr_rowid_refills  | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 502   |
| Handler_read_last          | 0     |
| Handler_read_next          | 500   |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_deleted   | 0     |
| Handler_read_rnd_next      | 130   |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_tmp_update         | 0     |
| Handler_tmp_write          | 128   |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
25 rows in set (0.00 sec)

My only guess as to why it's choosing such a bad plan is that it for some reason thinks the good plan will have to scan 9677820 rows when really it needs only 501 to fill the limit. I have no idea whether there's a better fix than STRAIGHT_JOIN to force it, that's a question for our DBAs.

Volans added a comment.Sep 8 2016, 3:06 PM

Just to add some more info, in this specific templatelinks table there are 4.8M rows with those conditions:

MariaDB PRODUCTION s2 localhost enwiktionary > explain select count(*) from templatelinks where tl_namespace = 828 AND tl_title = 'languages/data2';
+------+-------------+---------------+------+---------------+--------------+---------+------------+---------+----------
| id   | select_type | table         | type | possible_keys | key          | key_len | ref         | rows    | Extra                    |
+------+-------------+---------------+------+---------------+--------------+---------+-------------+---------+----
|    1 | SIMPLE      | templatelinks | ref  | tl_namespace  | tl_namespace | 261     | const,const | 9585198 | Using where; Using index |
+------+-------------+---------------+------+---------------+--------------+---------+-------------+---------+----

MariaDB PRODUCTION s2 localhost enwiktionary > select count(*) from templatelinks  where tl_namespace = 828 AND tl_title = 'languages/data2';
+----------+
| count(*) |
+----------+
|  4864454 |
+----------+

And in the page table there are 5.3M rows. I guess this is the reason MySQL decides to ignore the index.

My 2 cents

The problem with the straight join (or hints in general) is that you fix things now and break it elsewhere on other wikis, or for another set of conditions, or with different row distribution in one year. Let's try different aproaches first, such as innodb statistics or MariaDB histograms for finer query plans.

jcrespo moved this task from Triage to In progress on the DBA board.Sep 8 2016, 5:47 PM

Just to add some more info, in this specific templatelinks table there are 4.8M rows with those conditions:

MariaDB PRODUCTION s2 localhost enwiktionary > explain select count(*) from templatelinks where tl_namespace = 828 AND tl_title = 'languages/data2';
+------+-------------+---------------+------+---------------+--------------+---------+------------+---------+----------
| id   | select_type | table         | type | possible_keys | key          | key_len | ref         | rows    | Extra                    |
+------+-------------+---------------+------+---------------+--------------+---------+-------------+---------+----
|    1 | SIMPLE      | templatelinks | ref  | tl_namespace  | tl_namespace | 261     | const,const | 9585198 | Using where; Using index |
+------+-------------+---------------+------+---------------+--------------+---------+-------------+---------+----
MariaDB PRODUCTION s2 localhost enwiktionary > select count(*) from templatelinks  where tl_namespace = 828 AND tl_title = 'languages/data2';
+----------+
| count(*) |
+----------+
|  4864454 |
+----------+

And in the page table there are 5.3M rows. I guess this is the reason MySQL decides to ignore the index.
My 2 cents

I think this is a known bug, that Anomie and me have encountered in the past. Again, let me 1 day to figure out if there is an alternative that does not involve an index hint.

Volans added a comment.Sep 8 2016, 6:26 PM

@jcrespo out of curiosity and learning, is this related to the fact that templatelinks is partitioned? Also if the partition key is fixed here being tl_namespace?

Anomie added a comment.Sep 8 2016, 7:37 PM

@jcrespo out of curiosity and learning, is this related to the fact that templatelinks is partitioned? Also if the partition key is fixed here being tl_namespace?

It does not appear to be related, since I see the same behavior on e.g. db1074 which has no partitioning.

Anomie moved this task from Unsorted to Non-Code on the MediaWiki-API board.Sep 9 2016, 1:23 PM
jcrespo claimed this task.Sep 13 2016, 11:15 AM

I regenerated the table statistics without sucess, I will try to tune the query planner next forcing the usage of histograms.

MariaDB  db1024.eqiad.wmnet enwiktionary > ANALYZE TABLE page;
+-------------------+---------+----------+-----------------------------------------+
| Table             | Op      | Msg_type | Msg_text                                |
+-------------------+---------+----------+-----------------------------------------+
| enwiktionary.page | analyze | status   | Engine-independent statistics collected |
| enwiktionary.page | analyze | status   | OK                                      |
+-------------------+---------+----------+-----------------------------------------+
2 rows in set (4 min 22.95 sec)

MariaDB  db1024.eqiad.wmnet enwiktionary > ANALYZE TABLE templatelinks;
+----------------------------+---------+----------+-----------------------------------------+
| Table                      | Op      | Msg_type | Msg_text                                |
+----------------------------+---------+----------+-----------------------------------------+
| enwiktionary.templatelinks | analyze | status   | Engine-independent statistics collected |
| enwiktionary.templatelinks | analyze | status   | OK                                      |
+----------------------------+---------+----------+-----------------------------------------+
2 rows in set (42 min 3.26 sec)

I tried it, the only thing I took from here is that MariaDB -at least the version we use- is dumb, and I am 99% convinced this issue would be fixed on MySQL 5.7. Forcing or ignoring histograms, or changing optimizer_switch parameters did not work.

The reason being that if you run explain extended, filtered does not move away from 100%:

MariaDB  db1024.eqiad.wmnet enwiktionary > EXPLAIN EXTENDED SELECT tl_from,tl_namespace AS `bl_namespace`,tl_title AS `bl_title`,page_id,page_title,page_namespace,page_is_redirect  FROM `templatelinks` FORCE INDEX (tl_namespace),`page`   WHERE (tl_from = page_id) AND ((tl_namespace = '828' AND tl_title = 'languages/data2'))  ORDER BY tl_from LIMIT 501\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5250137
     filtered: 100.00
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: templatelinks
         type: ref
possible_keys: tl_namespace
          key: tl_namespace
      key_len: 265
          ref: const,const,enwiktionary.page.page_id
         rows: 1
     filtered: 100.00
        Extra: Using where; Using index
2 rows in set, 1 warning (0.02 sec)

Let's go with the straight_join solution, and I will setup a place to document this hack until we upgrade to a later version.

jcrespo removed jcrespo as the assignee of this task.EditedSep 14 2016, 12:00 PM

This is a one line patch, I can do it if you have faith in me; I personally don't. I would like to focus on the documentation part, though.

One more optimizer hint will not hurt:

1jynus@seul:~/core$ git grep 'USE INDEX' | cat
2HISTORY:* (bug 3056) MySQL 3 compatibility fix: USE INDEX instead of FORCE INDEX
3HISTORY:* Specify USE INDEX on Allpages chunk queries, sometimes gets lost
4HISTORY:* (bug 3056) MySQL 3 compatibility fix: USE INDEX instead of FORCE INDEX
5includes/CategoryViewer.php: 'USE INDEX' => [ 'categorylinks' => 'cl_sortkey' ],
6includes/Linker.php: 'USE INDEX' => [ 'revision' => 'page_timestamp' ],
7includes/actions/HistoryAction.php: 'USE INDEX' => 'page_timestamp', 'LIMIT' => $limit ]
8includes/actions/HistoryAction.php: 'options' => [ 'USE INDEX' => [ 'revision' => 'page_timestamp' ] ],
9includes/api/ApiQueryAllImages.php: $this->addOption( 'USE INDEX', [ 'image' => 'img_usertext_timestamp' ] );
10includes/api/ApiQueryAllImages.php: $this->addOption( 'USE INDEX', [ 'image' => 'img_timestamp' ] );
11includes/api/ApiQueryAllLinks.php: $this->addOption( 'USE INDEX', $this->useIndex );
12includes/api/ApiQueryAllPages.php: $this->addOption( 'USE INDEX', 'name_title' );
13includes/api/ApiQueryBacklinks.php: $this->addOption( 'USE INDEX', [ 'page' => 'PRIMARY' ] );
14includes/api/ApiQueryBacklinksprop.php: $this->addOption( 'USE INDEX', [ $settings['linktable'] => $idxWithFromNS ] );
15includes/api/ApiQueryBacklinksprop.php: $this->addOption( 'USE INDEX', [ $settings['linktable'] => $idxNoFromNS ] );
16includes/api/ApiQueryBase.php: * Add an option such as LIMIT or USE INDEX. If an option was set
17includes/api/ApiQueryCategoryMembers.php: $this->addTables( [ 'page', 'categorylinks' ] ); // must be in this order for 'USE INDEX'
18includes/api/ApiQueryCategoryMembers.php: $this->addOption( 'USE INDEX', 'cl_timestamp' );
19includes/api/ApiQueryCategoryMembers.php: $this->addOption( 'USE INDEX', 'cl_sortkey' );
20includes/api/ApiQueryDeletedrevs.php: 'USE INDEX',
21includes/api/ApiQueryExtLinksUsage.php: $this->addTables( [ 'page', 'externallinks' ] ); // must be in this order for 'USE INDEX'
22includes/api/ApiQueryExtLinksUsage.php: $this->addOption( 'USE INDEX', 'el_index' );
23includes/api/ApiQueryLinks.php: $this->addOption( 'USE INDEX', $this->prefix . '_from' );
24includes/api/ApiQueryUserContributions.php: $this->addOption( 'USE INDEX', $index );
25includes/db/Database.php: * Returns an optional USE INDEX clause to go after the table, and a
26includes/db/Database.php: if ( isset( $options['USE INDEX'] ) && is_string( $options['USE INDEX'] ) ) {
27includes/db/Database.php: $useIndex = $this->useIndexClause( $options['USE INDEX'] );
28includes/db/Database.php: $useIndexes = ( isset( $options['USE INDEX'] ) && is_array( $options['USE INDEX'] ) )
29includes/db/Database.php: ? $options['USE INDEX']
30includes/db/Database.php: * which might have a JOIN and/or USE INDEX or IGNORE INDEX clause
31includes/db/Database.php: if ( isset( $use_index[$alias] ) ) { // has USE INDEX?
32includes/db/Database.php: * USE INDEX clause. Unlikely to be useful for anything but MySQL. This
33includes/db/DatabaseOracle.php: * Returns an optional USE INDEX clause to go after the table, and a
34includes/db/DatabaseOracle.php: if ( isset( $options['USE INDEX'] ) && !is_array( $options['USE INDEX'] ) ) {
35includes/db/DatabaseOracle.php: $useIndex = $this->useIndexClause( $options['USE INDEX'] );
36includes/db/IDatabase.php: * - USE INDEX: This may be either a string giving the index name to use
37includes/export/WikiExporter.php: [ 'ORDER BY' => 'log_id', 'USE INDEX' => [ 'logging' => 'PRIMARY' ] ]
38includes/export/WikiExporter.php: $opts['USE INDEX'] = [];
39includes/export/WikiExporter.php: $opts['USE INDEX']['revision'] = 'rev_page_id';
40includes/export/WikiExporter.php: $opts['USE INDEX']['page'] = 'PRIMARY';
41includes/filerepo/file/LocalFile.php: $opts['USE INDEX'] = [ 'oldimage' => 'oi_name_timestamp' ];
42includes/logging/LogPager.php: $options['USE INDEX'] = $index;
43includes/page/Article.php: [ 'USE INDEX' => 'rc_timestamp' ]
44includes/page/WikiPage.php: [ 'USE INDEX' => 'page_timestamp',
45includes/revisiondelete/RevDelRevisionList.php: 'USE INDEX' => [ 'revision' => 'PRIMARY' ] // workaround for MySQL bug (T104313)
46includes/specials/SpecialAllPages.php: 'USE INDEX' => 'name_title',
47includes/specials/SpecialLinkSearch.php: 'options' => [ 'USE INDEX' => $clause ]
48includes/specials/SpecialPrefixindex.php: 'USE INDEX' => 'name_title',
49includes/specials/SpecialShortpages.php: $options = [ 'USE INDEX' => [ 'page' => 'page_redirect_namespace_len' ] ];
50includes/specials/pagers/AllMessagesTablePager.php: [ 'USE INDEX' => 'name_title' ]
51includes/specials/pagers/CategoryPager.php: 'options' => [ 'USE INDEX' => 'cat_title' ],
52includes/specials/pagers/ContribsPager.php: $options['USE INDEX'] = [ 'revision' => $index ];
53includes/specials/pagers/DeletedContribsPager.php: 'options' => [ 'USE INDEX' => $index ]
54includes/specials/pagers/NewPagesPager.php: $options = [ 'USE INDEX' => [ 'recentchanges' => $rcIndexes ] ];
55maintenance/rebuildFileCache.php: [ 'ORDER BY' => 'page_id ASC', 'USE INDEX' => 'PRIMARY' ]
56jynus@seul:~/core$ git grep 'STRAIGHT' | cat
57includes/api/ApiQueryAllPages.php: $this->addOption( 'STRAIGHT_JOIN' );
58includes/api/ApiQueryAllRevisions.php: $this->addOption( 'STRAIGHT_JOIN' );
59includes/api/ApiQueryAllRevisions.php: $this->addOption( 'STRAIGHT_JOIN' );
60includes/api/ApiQueryBacklinks.php: $this->addOption( 'STRAIGHT_JOIN' );
61includes/api/ApiQueryCategories.php: $this->addOption( 'STRAIGHT_JOIN' );
62includes/api/ApiQueryUserContributions.php: // STRAIGHT_JOIN
63includes/api/ApiQueryUserContributions.php: $this->addOption( 'STRAIGHT_JOIN' );
64includes/cache/BacklinkCache.php: array_merge( [ 'STRAIGHT_JOIN' ], $options )
65includes/db/Database.php: if ( isset( $noKeyOptions['STRAIGHT_JOIN'] ) ) {
66includes/db/Database.php: $startOpts .= ' /*! STRAIGHT_JOIN */';
67includes/db/IDatabase.php: * - STRAIGHT_JOIN
68includes/export/WikiExporter.php: $opts[] = 'STRAIGHT_JOIN';
69includes/export/WikiExporter.php: $opts[] = 'STRAIGHT_JOIN';
70includes/specials/SpecialWhatlinkshere.php: [ 'ORDER BY' => $fromCol, 'LIMIT' => 2 * $queryLimit, 'STRAIGHT_JOIN' ],
71includes/specials/pagers/NewFilesPager.php: $options[] = 'STRAIGHT_JOIN';
72maintenance/updateCollation.php: 'STRAIGHT_JOIN' // per T58041
73jynus@seul:~/core$ git grep 'FORCE INDEX' | cat
74HISTORY:* (bug 3056) MySQL 3 compatibility fix: USE INDEX instead of FORCE INDEX
75HISTORY:* (bug 3056) MySQL 3 compatibility fix: USE INDEX instead of FORCE INDEX
76includes/db/DatabaseMysqlBase.php: return "FORCE INDEX (" . $this->indexName( $index ) . ")";
77maintenance/postgres/compare_schemas.pl: if (/FORCE INDEX/ and $file !~ /Database\w*\.php/) {
78maintenance/postgres/compare_schemas.pl: warn "Found FORCE INDEX string at line $. of $file\n";
79jynus@seul:~/core$ git grep 'IGNORE INDEX' | cat
80includes/db/Database.php: if ( isset( $options['IGNORE INDEX'] ) && is_string( $options['IGNORE INDEX'] ) ) {
81includes/db/Database.php: $ignoreIndex = $this->ignoreIndexClause( $options['IGNORE INDEX'] );
82includes/db/Database.php: $ignoreIndexes = ( isset( $options['IGNORE INDEX'] ) && is_array( $options['IGNORE INDEX'] ) )
83includes/db/Database.php: ? $options['IGNORE INDEX']
84includes/db/Database.php: * which might have a JOIN and/or USE INDEX or IGNORE INDEX clause
85includes/db/Database.php: if ( isset( $ignore_index[$alias] ) ) { // has IGNORE INDEX?
86includes/db/Database.php: * IGNORE INDEX clause. Unlikely to be useful for anything but MySQL. This
87includes/db/DatabaseMysqlBase.php: return "IGNORE INDEX (" . $this->indexName( $index ) . ")";
88includes/db/DatabaseOracle.php: if ( isset( $options['IGNORE INDEX'] ) && !is_array( $options['IGNORE INDEX'] ) ) {
89includes/db/DatabaseOracle.php: $ignoreIndex = $this->ignoreIndexClause( $options['IGNORE INDEX'] );

Since ApiQueryBacklinks is using a straight join, it might be safe enough to do it for templatelinks as well?

Since ApiQueryBacklinks is using a straight join, it might be safe enough to do it for templatelinks as well?

Yes.

Anomie claimed this task.Sep 14 2016, 2:05 PM

Thanks for looking into alternative solutions, @jcrespo. I'll write the patch for straight_join.

I am sorry, Anomie, for this and all other issues. We are already looking at next versions of the server that will fix this and other issues (like the large IN values).

Change 310552 had a related patch set uploaded (by Anomie):
API: Force straight join for prop=linkshere|transcludedin|fileusage

https://gerrit.wikimedia.org/r/310552

Anomie moved this task from Non-Code to Needs Review on the MediaWiki-API board.Sep 14 2016, 2:21 PM

Change 310552 merged by jenkins-bot:
API: Force straight join for prop=linkshere|transcludedin|fileusage

https://gerrit.wikimedia.org/r/310552

Anomie closed this task as Resolved.Sep 16 2016, 1:33 PM

Investigation complete, and a fix applied.

Thank you, @jcrespo and @Anomie for the detailed investigation and fix. @Anomie, would you be ok with me SWATting the fix onto wmf.19 on monday?

I don't have any problem with you SWATting it, although I note that at the moment wmf.19 has been rolled back due to T145819.

Change 311405 had a related patch set uploaded (by Mobrovac):
API: Force straight join for prop=linkshere|transcludedin|fileusage

https://gerrit.wikimedia.org/r/311405

Change 311406 had a related patch set uploaded (by Mobrovac):
API: Force straight join for prop=linkshere|transcludedin|fileusage

https://gerrit.wikimedia.org/r/311406

Change 311406 merged by jenkins-bot:
API: Force straight join for prop=linkshere|transcludedin|fileusage

https://gerrit.wikimedia.org/r/311406

Change 311405 merged by jenkins-bot:
API: Force straight join for prop=linkshere|transcludedin|fileusage

https://gerrit.wikimedia.org/r/311405

Mentioned in SAL (#wikimedia-operations) [2016-09-19T13:18:03Z] <hashar@tin> Synchronized php-1.28.0-wmf.18/includes/api/ApiQueryBacklinksprop.php: API: Force straight join for prop=linkshere|transcludedin|fileusage T145079 (duration: 00m 50s)

Mentioned in SAL (#wikimedia-operations) [2016-09-19T13:18:57Z] <hashar@tin> Synchronized php-1.28.0-wmf.19/includes/api/ApiQueryBacklinksprop.php: API: Force straight join for prop=linkshere|transcludedin|fileusage T145079 (duration: 00m 47s)

hashar added a subscriber: hashar.Sep 19 2016, 1:19 PM

Deployed on current wmf.18 as well as next wmf.19.