Page MenuHomePhabricator

ApiQueryRevisions::run has bad performance across API hosts
Open, NormalPublic

Description

On working the task T203565 The root cause of different query plans were solved, but now all the hosts have bad query plans for ApiQueryRevisions::run query across S1 API hosts.
After looking it closely it seems if the optimizer got USE INDEX (page_timestamp) or IGNORE INDEX (rev_timestamp) hint, then it will came with a good query plan. See details here: T203565#4600938

Event Timeline

Banyek triaged this task as Normal priority.Sep 20 2018, 9:33 AM
Banyek created this task.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 20 2018, 9:33 AM
Anomie moved this task from Unsorted to Needs Code on the MediaWiki-API board.EditedSep 20 2018, 2:22 PM

This seems like it may be related to T121333, T197486, and/or T88084.

Adding IGNORE INDEX (rev_timestamp) should be pretty straightforward, since no code path through ApiQueryRevisions should be using that index. Given that you seem to be looking at a query that would be better done using page_timestamp, somewhere in this block of code seems likely although it could be put elsewhere in that method too.

Forcing page_timestamp would be less straightforward: depending on the parameters supplied, the queries in that highlighted block[1] might be best served using page_user_timestamp, page_actor_timestamp, or maybe even usertext_timestamp instead. The biggest difficulty is in the one code path where the PHP code can't really choose between page_timestamp and usertext_timestamp.[2]

[1]: And other code paths, not through that block, want PRIMARY or rev_page_id.
[2]: Ideally it'd use a "page_usertext_timestamp" index on (rev_page, rev_user_text, rev_timestamp), but that index doesn't exist. The actor work will eventually eliminate that code path, but getting to that point is still a ways off.

Indeed, I think it is related to those.
However in kibana the errors were dropped: https://logstash.wikimedia.org/app/kibana#/dashboard/DBQuery?_g=h@114715d&_a=h@90b3273

Indeed, I think it is related to those.
However in kibana the errors were dropped: https://logstash.wikimedia.org/app/kibana#/dashboard/DBQuery?_g=h@114715d&_a=h@90b3273

That link doesn't work unfortunately :-(. Use the "share" button on Kibana
Thanks!