plwiki API request is excessively slow when including a badrevid
Open, Needs TriagePublic

Description

The paste below is a demo of a few requests to the API from within the prod cluster. Note that when the badrevid of 21914234230 is included in the request, it takes ~20 seconds to get a response, but requests without it take less than 1/10th of a second

halfak@scb1001:/srv/log/ores$ time wget 'https://pl.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=ids|user|timestamp|userid|comment|size|contentmodel|content&revids=21914234230|3243242|234324' -qO- > /dev/null

real	0m21.043s
user	0m0.016s
sys	0m0.000s
halfak@scb1001:/srv/log/ores$ time wget 'https://pl.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=ids|user|timestamp|userid|comment|size|contentmodel|content&revids=3243242|234324' -qO- > /dev/null

real	0m0.086s
user	0m0.024s
sys	0m0.000s
halfak@scb1001:/srv/log/ores$ time wget 'https://pl.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=ids|user|timestamp|userid|comment|size|contentmodel|content&revids=21914234230|3243242' -qO- > /dev/null

real	0m21.520s
user	0m0.024s
sys	0m0.000s
halfak@scb1001:/srv/log/ores$ time wget 'https://pl.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=ids|user|timestamp|userid|comment|size|contentmodel|content&revids=21914234230|234324' -qO- > /dev/null

real	0m22.002s
user	0m0.024s
sys	0m0.000s
Halfak created this task.Jul 13 2016, 8:35 PM
Restricted Application added subscribers: Zppix, Aklapper. · View Herald TranscriptJul 13 2016, 8:35 PM

I just tried a similar query against enwiki. https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=ids|user|timestamp|userid|comment|size|contentmodel|content&revids=21914234230|3243242|234324

I got:

{
    "servedby": "mw1277",
    "error": {
        "code": "internal_api_error_DBQueryError",
        "info": "[V4amHQpAAEgAAGpfqmwAAAAO] Database query error"
    }
}
Reedy added a subscriber: Reedy.Jul 13 2016, 8:44 PM
	ApiPageSet::initFromRevIDs	10.64.48.20	2013	Lost connection to MySQL server during query (10.64.48.20)	SELECT  rev_id,rev_page  FROM `revision`,`page`   WHERE rev_id IN ('21914234230','3243242','234324')  AND (rev_page = page_id)
Reedy added a comment.Jul 13 2016, 8:47 PM
mysql:wikiadmin@db1080 [enwiki]> explain SELECT  rev_id,rev_page  FROM `revision`,`page`   WHERE rev_id IN ('21914234230','3243242','234324')  AND (rev_page = page_id);
+------+-------------+----------+-------+---------------------------------------------------+----------------+---------+---------------------+----------+--------------------------+
| id   | select_type | table    | type  | possible_keys                                     | key            | key_len | ref                 | rows     | Extra                    |
+------+-------------+----------+-------+---------------------------------------------------+----------------+---------+---------------------+----------+--------------------------+
|    1 | SIMPLE      | page     | index | PRIMARY                                           | page_len       | 4       | NULL                | 37701621 | Using index              |
|    1 | SIMPLE      | revision | ref   | PRIMARY,rev_id,page_timestamp,page_user_timestamp | page_timestamp | 4       | enwiki.page.page_id |        9 | Using where; Using index |
+------+-------------+----------+-------+---------------------------------------------------+----------------+---------+---------------------+----------+--------------------------+
2 rows in set (0.00 sec)

mysql:wikiadmin@db1080 [enwiki]> explain SELECT  rev_id,rev_page  FROM `revision`,`page`   WHERE rev_id IN (21914234230,3243242,234324)  AND (rev_page = page_id);
+------+-------------+----------+--------+---------------------------------------------------+---------+---------+--------------------------+------+--------------------------+
| id   | select_type | table    | type   | possible_keys                                     | key     | key_len | ref                      | rows | Extra                    |
+------+-------------+----------+--------+---------------------------------------------------+---------+---------+--------------------------+------+--------------------------+
|    1 | SIMPLE      | revision | range  | PRIMARY,rev_id,page_timestamp,page_user_timestamp | rev_id  | 4       | NULL                     |    3 | Using where; Using index |
|    1 | SIMPLE      | page     | eq_ref | PRIMARY                                           | PRIMARY | 4       | enwiki.revision.rev_page |    1 | Using index              |
+------+-------------+----------+--------+---------------------------------------------------+---------+---------+--------------------------+------+--------------------------+
2 rows in set (0.00 sec)

mysql:wikiadmin@db1080 [enwiki]>
Anomie added a subscriber: Anomie.Jul 13 2016, 11:43 PM

I've seen a bug like this before... There it is: T131026#2187446. Apparently having an int-as-string that exceeds the size of the database field makes MySQL decide it can't use the index, so it makes a stupid query plan.

I think I have nothing to do here, you all solved it before I could have a look at it.

Now, I am unsure about the perfect solution- ids are unsigned ints, if you cast it as a long, how do you handle errors? Would it be worth to run min(rev_id), max(rev_id) within the transaction (which are instant returns) to check the values? What is the right reaction, fail or transparently eat malformed parameter? I would support to send (sanitized) ints whenever possible.

Is it a malformed parameter? Are there some docs that specify that the int must be less than 2^32 = 4,294,967,296? It seems that, from the MediaWiki API's point of view, an int is an Integer -- not some digital formalization limited to 32 bits.

It seems that, if you can somehow provide the value as a number in the MySQL query (rather than a string), MySQL handles it just fine. It seems that this is probably the right solution to the problem.

Change 299018 had a related patch set uploaded (by Anomie):
API: Filter lists of IDs before sending them to the database

https://gerrit.wikimedia.org/r/299018

jcrespo added a comment.EditedJul 14 2016, 10:07 PM

Is it a malformed parameter? Are there some docs that specify that the int must be less than 2^32 = 4,294,967,296? It seems that, from the MediaWiki API's point of view, an int is an Integer -- not some digital formalization limited to 32 bits.

At this moment, revision ids cannot be larger than 2^32, because right now they are defined as rev_id int unsigned on the database. Whether that is correct or will change in the future, I do not know, I was only stating a fact. If they are larger than that, at this moment we know they cannot exist. Notice I suggested comparing them to max(id) not to maxint (unsigned) knowing that can change in the future.

My question was what to return if a non-existent id was queried? Error or ignore it? While ignore it may be the obvious solution (maybe even the documented one), there is an argument for the opposite: https://en.wikipedia.org/wiki/Fail-fast or to cut options to avoid excessive, useless processing.

Re. "ignore it", that'd be fine if so long as the response includes the "badrevids" parameter for it.

My question was what to return if a non-existent id was queried? Error or ignore it?

The thing to do is to do the same thing the API module does if given an in-range but invalid ID. Sometimes that means "ignore it", but for revids it means to include it in a "badrevids" list in the response.

https://gerrit.wikimedia.org/r/299018 takes the approach of specifically filtering the bad IDs out of the database query, leaving it to the existing code to handle it in the same way it handles any other ID that doesn't result in a row being returned from the database.

Thank you for your answer!

1978Gage2001 moved this task from Triage to In progress on the DBA board.Dec 11 2017, 9:45 AM
Marostegui moved this task from In progress to Triage on the DBA board.Dec 11 2017, 10:57 AM