From logstash:
Expectation (readQueryTime <= 5) by MediaWiki::main not met (actual: 5.7761619091034):
query-m: SELECT DISTINCT mvi_sha1 AS `value` FROM `machine_vision_image` INNER JOIN `machine_vision_label` ON ((mvi_id = mvl_mvi_id)) WHERE mvl_review = N AND (mvi_rand > N.N) ORDER BY mvi_rand ASC LIMIT N [TRX#f71cbb]
#0 /srv/mediawiki/php-1.35.0-wmf.16/includes/libs/rdbms/TransactionProfiler.php(252): Wikimedia\Rdbms\TransactionProfiler->reportExpectationViolated('readQueryTime', Object(Wikimedia\Rdbms\GeneralizedSql), 5.7761619091034)
#1 /srv/mediawiki/php-1.35.0-wmf.16/includes/libs/rdbms/database/Database.php(1344): Wikimedia\Rdbms\TransactionProfiler->recordQueryCompletion(Object(Wikimedia\Rdbms\GeneralizedSql), 1580247583.2197, false, 10)
#2 /srv/mediawiki/php-1.35.0-wmf.16/includes/libs/rdbms/database/Database.php(1226): Wikimedia\Rdbms\Database->executeQueryAttempt('SELECT DISTINCT...', 'SELECT /* Media...', false, 'MediaWiki\\Exten...', 0)
#3 /srv/mediawiki/php-1.35.0-wmf.16/includes/libs/rdbms/database/Database.php(1162): Wikimedia\Rdbms\Database->executeQuery('SELECT DISTINCT...', 'MediaWiki\\Exten...', 0)
#4 /srv/mediawiki/php-1.35.0-wmf.16/includes/libs/rdbms/database/Database.php(1828): Wikimedia\Rdbms\Database->query('SELECT DISTINCT...', 'MediaWiki\\Exten...')
#5 /srv/mediawiki/php-1.35.0-wmf.16/includes/libs/rdbms/database/Database.php(1691): Wikimedia\Rdbms\Database->select(Array, Array, Array, 'MediaWiki\\Exten...', Array, Array)
#6 /srv/mediawiki/php-1.35.0-wmf.16/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\Database->selectFieldValues(Array, 'mvi_sha1', Array, 'MediaWiki\\Exten...', Array, Array)
#7 /srv/mediawiki/php-1.35.0-wmf.16/includes/libs/rdbms/database/DBConnRef.php(311): Wikimedia\Rdbms\DBConnRef->__call('selectFieldValu...', Array)
#8 /srv/mediawiki/php-1.35.0-wmf.16/extensions/MachineVision/src/Repository.php(292): Wikimedia\Rdbms\DBConnRef->selectFieldValues(Array, 'mvi_sha1', Array, 'MediaWiki\\Exten...', Array, Array)
#9 /srv/mediawiki/php-1.35.0-wmf.16/extensions/MachineVision/src/Repository.php(296): MediaWiki\Extension\MachineVision\Repository->MediaWiki\Extension\MachineVision\{closure}(true, 10, Array)
#10 /srv/mediawiki/php-1.35.0-wmf.16/extensions/MachineVision/src/Special/SpecialSuggestedTags.php(70): MediaWiki\Extension\MachineVision\Repository->getTitlesWithUnreviewedLabels(10)
#11 /srv/mediawiki/php-1.35.0-wmf.16/extensions/MachineVision/src/Special/SpecialSuggestedTags.php(46): MediaWiki\Extension\MachineVision\Special\SpecialSuggestedTags->getInitialSuggestedTagsData()
#12 /srv/mediawiki/php-1.35.0-wmf.16/includes/specialpage/SpecialPage.php(575): MediaWiki\Extension\MachineVision\Special\SpecialSuggestedTags->execute(NULL)
#13 /srv/mediawiki/php-1.35.0-wmf.16/includes/specialpage/SpecialPageFactory.php(611): SpecialPage->run(NULL)
#14 /srv/mediawiki/php-1.35.0-wmf.16/includes/MediaWiki.php(298): MediaWiki\Special\SpecialPageFactory->executePath(Object(Title), Object(RequestContext))
#15 /srv/mediawiki/php-1.35.0-wmf.16/includes/MediaWiki.php(967): MediaWiki->performRequest()
#16 /srv/mediawiki/php-1.35.0-wmf.16/includes/MediaWiki.php(530): MediaWiki->main()
#17 /srv/mediawiki/php-1.35.0-wmf.16/index.php(46): MediaWiki->run()
#18 /srv/mediawiki/w/index.php(3): require('/srv/mediawiki/...')
#19 {main}
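The problem seems to be the DISTINCT. With it, the plan for machine_vision_image adds "Using temporary; Using filesort" over an estimated 134357 rows; without it, the same index range scan needs neither: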
> EXPLAIN SELECT DISTINCT(mvi_sha1) AS `value` FROM `machine_vision_image` INNER JOIN `machine_vision_label` ON ((mvi_id = mvl_mvi_id)) WHERE mvl_review = 0 AND (mvi_rand > 0.5) ORDER BY mvi_rand ASC LIMIT 100;;
stdClass Object
(
    [id] => 1
    [select_type] => SIMPLE
    [table] => machine_vision_image
    [type] => range
    [possible_keys] => PRIMARY,mvi_rand
    [key] => mvi_rand
    [key_len] => 4
    [ref] =>
    [rows] => 134357
    [Extra] => Using index condition; Using temporary; Using filesort
)
stdClass Object
(
    [id] => 1
    [select_type] => SIMPLE
    [table] => machine_vision_label
    [type] => ref
    [possible_keys] => mvl_mvi_wikidata
    [key] => mvl_mvi_wikidata
    [key_len] => 4
    [ref] => commonswiki.machine_vision_image.mvi_id
    [rows] => 4
    [Extra] => Using where; Distinct
)
> EXPLAIN SELECT mvi_sha1 AS `value` FROM `machine_vision_image` INNER JOIN `machine_vision_label` ON ((mvi_id = mvl_mvi_id)) WHERE mvl_review = 0 AND (mvi_rand > 0.5) ORDER BY mvi_rand ASC LIMIT 10;
stdClass Object
(
    [id] => 1
    [select_type] => SIMPLE
    [table] => machine_vision_image
    [type] => range
    [possible_keys] => PRIMARY,mvi_rand
    [key] => mvi_rand
    [key_len] => 4
    [ref] =>
    [rows] => 134357
    [Extra] => Using index condition
)
stdClass Object
(
    [id] => 1
    [select_type] => SIMPLE
    [table] => machine_vision_label
    [type] => ref
    [possible_keys] => mvl_mvi_wikidata
    [key] => mvl_mvi_wikidata
    [key_len] => 4
    [ref] => commonswiki.machine_vision_image.mvi_id
    [rows] => 4
    [Extra] => Using where
)
> select count(*) from machine_vision_label where mvl_review = 0 limit 1000;;
stdClass Object
(
    [count(*)] => 2086405
)
> select count(*) from machine_vision_label where mvl_review = 1;
stdClass Object
(
    [count(*)] => 6625
)
How many duplicate entries are there, roughly, for each SHA-1? If the number is small, the DISTINCT could be removed and the LIMIT set to the number of desired results times a multiplier. Duplicates and any excess rows would then be filtered out in PHP rather than in MySQL.
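To make the proposal concrete, here is a minimal sketch of the over-fetch-and-filter idea. It is written in Python purely for illustration; the real change would be in PHP in Repository.php. The names `fetch_distinct_sha1s` and `run_query` are hypothetical, and the default multiplier of 4 is only borrowed from the optimizer's row estimate for machine_vision_label in the EXPLAIN output above, not from a measured duplicate count.

```python
from typing import Callable, Iterable, List


def fetch_distinct_sha1s(
    run_query: Callable[[int], Iterable[str]],
    wanted: int,
    multiplier: int = 4,
) -> List[str]:
    """Over-fetch without DISTINCT, then de-duplicate in application code.

    run_query(limit) is a hypothetical callback that runs the query
    WITHOUT DISTINCT using the given LIMIT and yields mvi_sha1 values
    in mvi_rand order.
    """
    seen = set()
    result: List[str] = []
    # Ask the database for wanted * multiplier rows to leave room for duplicates.
    for sha1 in run_query(wanted * multiplier):
        if sha1 in seen:
            continue  # duplicate caused by several label rows for one image
        seen.add(sha1)
        result.append(sha1)
        if len(result) == wanted:
            break  # excess rows are discarded here rather than in the database
    return result


# Stand-in for the database: rows as the join would return them, with repeats.
rows = ["a", "a", "b", "b", "b", "c", "d", "d"]


def fake_query(limit: int) -> List[str]:
    return rows[:limit]


print(fetch_distinct_sha1s(fake_query, wanted=3, multiplier=2))
```

One caveat of this approach: if the multiplier is too small for the actual duplicate rate, the caller gets fewer than the desired number of results, so it would either have to tolerate a short list or issue a follow-up query.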