Page MenuHomePhabricator

Long running queries from pltools unlikely to finish
Open, Needs TriagePublic

Description

There are several queries that are unlikely to finish, ever, on analytics wikireplicas:

While those servers are there to go long-running queries, there is no reason to run them if they are not going to finish ever but they create issues for other users- the longer they go on, the slower the servers become for other users:

labsdb1010	4134749	s51591	dbproxy1010	wikidatawiki_p	2d
SELECT ips_item_id, count(*) AS cnt FROM wb_items_per_site WHERE CONCAT("Q", ips_item_id) IN (SELECT page_title FROM page JOIN pagelinks ON page_id=pl_from WHERE pl_title="Q937857") AND ips_item_id NOT IN (SELECT ips_item_id FROM wb_items_per_site WHERE ips_site_id="enwiki") GROUP BY ips_item_id ORDER BY cnt DESC LIMIT 100

labsdb1010	4729760	s51591	dbproxy1010	wikidatawiki_p	1d
SELECT page_title FROM page WHERE page_namespace = 0 AND page_title NOT IN (SELECT term_full_entity_id FROM wb_terms) AND page_title NOT IN (SELECT CONCAT("Q", ips_item_id) FROM wb_items_per_site) AND page_is_redirect = 0

labsdb1010	5067887	s51591	dbproxy1010	wikidatawiki_p	6h
SELECT DISTINCT a.term_entity_id, b.term_entity_id FROM wb_terms AS a JOIN wb_terms AS b ON a.term_text=b.term_text WHERE a.term_type="label" AND b.term_type="label" AND a.term_entity_type="item" AND b.term_entity_type="item" AND a.term_entity_id <> b.term_entity_id AND a.term_language = "en" AND b.term_language <> "en" AND a.term_full_entity_id IN (SELECT page_title FROM page JOIN pagelinks ON page_id = pl_from WHERE pl_title="Q4167410") AND b.term_entity_id IN (SELECT page_title FROM page JOIN pagelinks ON page_id = pl_from WHERE pl_title="Q4167410") LIMIT 2000

You should use EXPLAIN on those queries to understand why that is happening and how they could be made in smaller chunks so they can finish. Those selects without where seem highly inefficient.

See also T114904: Migrate wb_items_per_site to using prefixed entity IDs instead of numeric IDs

Event Timeline

The user did not respond but the queries kept retrying, probably unattended, I am going to kill them until I get his/her attention.

I can not use EXPLAIN:

ERROR 1345 (HY000): EXPLAIN/SHOW can not be issued; lacking privileges for underlying table

The first query I could rewrite to SPARQL and execute now on query.wikidata.org.
The second query I tried to rewrite with NOT EXISTS instead of NOT IN but did not speed up. I need to have T114904 been resolved. Till then, I deactivated the script which calls this query.

I can not use EXPLAIN:

How to execute EXPLAIN is described in T50875#2845764. Alternatively Quarry can provide an easy interface to EXPLAIN queries.

@jcrespo see my explanation at T114904 . It used to be a clean join, but with the full entity_id available the only option is a very slow and dirty CONCAT("Q", ips_item_id) .

The second query can be expressed as:

SELECT itemid FROM (SELECT SUBSTRING(page_title, 1) AS itemid FROM page WHERE page_namespace = 0 AND page_title NOT IN (SELECT term_full_entity_id FROM wb_terms) AND page_is_redirect = 0) AS woterm LEFT JOIN wb_items_per_site ON ips_item_id = itemid WHERE ips_item_id IS NULL LIMIT 10;

That should make it much more efficient… the main culprit is the NOT IN wb_terms then, not the wbs_items_per_site one. The old query was probably faster nevertheless as it (more efficiently) filtered against wb_items_per_site first and then filtered via wb_terms.