Split from T18112#201167 as this is a problem in the script itself.
(Reedy said in T18112#201188)
The original queries take an age, and the script isn't going to attempt to load it all.
mysql> explain select DISTINCT pl_from from pagelinks LEFT JOIN page ON pl_from=page_id;
+----+-------------+-----------+--------+---------------+---------+---------+--------------------------+-----------+------------------------------+
| id | select_type | table     | type   | possible_keys | key     | key_len | ref                      | rows      | Extra                        |
+----+-------------+-----------+--------+---------------+---------+---------+--------------------------+-----------+------------------------------+
|  1 | SIMPLE      | pagelinks | index  | NULL          | pl_from | 265     | NULL                     | 624327870 | Using index; Using temporary |
|  1 | SIMPLE      | page      | eq_ref | PRIMARY       | PRIMARY | 4       | enwiki.pagelinks.pl_from |         1 | Using index; Distinct        |
+----+-------------+-----------+--------+---------------+---------+---------+--------------------------+-----------+------------------------------+
2 rows in set (0.01 sec)

Removing the DISTINCT would make things simpler. If we kept a client-side count and removed the DISTINCT, would this work for us?
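
A rough sketch of the client-side counting idea, not the script's actual code: it assumes the rows can be read back ordered by pl_from (the EXPLAIN output shows the pl_from index being scanned), so duplicates arrive next to each other and can be skipped without DISTINCT or a temporary table on the server. The helper name count_distinct_sorted, the in-memory SQLite table, and the sample rows are all made up for illustration, and the join against page is left out.

```python
import sqlite3


def count_distinct_sorted(rows):
    """Count distinct values in single-column rows that are already sorted."""
    count = 0
    previous = object()  # sentinel that never equals a real pl_from value
    for (value,) in rows:
        if value != previous:
            count += 1
            previous = value
    return count


# Hypothetical stand-in for the real pagelinks table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pagelinks (pl_from INTEGER, pl_title TEXT)")
conn.execute("CREATE INDEX pl_from ON pagelinks (pl_from, pl_title)")
conn.executemany(
    "INSERT INTO pagelinks VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (3, "C"), (3, "D"), (3, "E")],
)

# No DISTINCT: the rows are streamed in index order and de-duplicated here.
cursor = conn.execute("SELECT pl_from FROM pagelinks ORDER BY pl_from")
distinct_sources = count_distinct_sorted(cursor)
print(distinct_sources)
```

Only the previous value is held in memory, so the client never needs the whole result set at once. Whether this is actually faster on the real table would still need measuring.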
Version: unspecified
Severity: normal