Avoid joining three tables just to get the latest revision ID
Closed, Resolved · Public

Description

WikibaseQualityConstraints (WBQC) uses WikiPageEntityMetaDataLookup to get the latest revision IDs for a list of entity IDs, but that lookup fetches much more information than we need and joins three large tables (page, revision, text) to do so. For the revision IDs alone, page.page_latest should be enough.

Part of https://wikitech.wikimedia.org/wiki/Incident_documentation/20180226-WikibaseQualityConstraints.
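
To illustrate the description above, here is a minimal sketch of the two query shapes, using PDO with an in-memory SQLite database in place of MediaWiki's database layer. The schema is a heavily simplified assumption (only the columns needed here), the function names fetchLatestViaJoin and fetchLatestViaPage are hypothetical, and the sketch assumes the pdo_sqlite extension is available. It is not the actual Wikibase query.

```php
<?php
// Simplified stand-ins for MediaWiki's page, revision and text tables (assumed schema).
$db = new PDO( 'sqlite::memory:' );
$db->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
$db->exec( 'CREATE TABLE page ( page_id INTEGER PRIMARY KEY, page_title TEXT,'
	. ' page_is_redirect INTEGER, page_latest INTEGER )' );
$db->exec( 'CREATE TABLE revision ( rev_id INTEGER PRIMARY KEY, rev_page INTEGER,'
	. ' rev_text_id INTEGER )' );
$db->exec( 'CREATE TABLE text ( old_id INTEGER PRIMARY KEY, old_text TEXT )' );
$db->exec( "INSERT INTO page VALUES ( 1, 'Q1', 0, 11 ), ( 2, 'Q2', 0, 22 )" );
$db->exec( 'INSERT INTO revision VALUES ( 10, 1, 100 ), ( 11, 1, 101 ), ( 22, 2, 102 )' );
$db->exec( "INSERT INTO text VALUES ( 100, 'old Q1' ), ( 101, 'new Q1' ), ( 102, 'Q2' )" );

/** Builds "?,?,?" with one placeholder per title. */
function placeholders( array $titles ): string {
	return implode( ',', array_fill( 0, count( $titles ), '?' ) );
}

/** Hypothetical: latest revision IDs via the three-table join. */
function fetchLatestViaJoin( PDO $db, array $titles ): array {
	if ( $titles === [] ) {
		return [];
	}
	$stmt = $db->prepare(
		'SELECT page_title, rev_id FROM page'
		. ' JOIN revision ON rev_id = page_latest'
		. ' JOIN text ON old_id = rev_text_id'
		. ' WHERE page_title IN ( ' . placeholders( $titles ) . ' )'
		. ' ORDER BY page_title'
	);
	$stmt->execute( array_values( $titles ) );
	// FETCH_KEY_PAIR maps the first column to the second: title => revision ID.
	return array_map( 'intval', $stmt->fetchAll( PDO::FETCH_KEY_PAIR ) );
}

/** Hypothetical: the same revision IDs, read from the page table alone. */
function fetchLatestViaPage( PDO $db, array $titles ): array {
	if ( $titles === [] ) {
		return [];
	}
	$stmt = $db->prepare(
		'SELECT page_title, page_latest FROM page'
		. ' WHERE page_title IN ( ' . placeholders( $titles ) . ' )'
		. ' ORDER BY page_title'
	);
	$stmt->execute( array_values( $titles ) );
	return array_map( 'intval', $stmt->fetchAll( PDO::FETCH_KEY_PAIR ) );
}

var_dump( fetchLatestViaJoin( $db, [ 'Q1', 'Q2' ] ) );
var_dump( fetchLatestViaPage( $db, [ 'Q1', 'Q2' ] ) );
```

For the sample rows both functions return the same title-to-revision mapping. The second one never touches revision or text, which is the point of this task.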

Restricted Application added a subscriber: Aklapper. Feb 27 2018, 10:14 AM

(Note: EntityRevisionLookup::getLatestRevisionId can also be used to get the latest revision ID for a single entity ID, and we actually used that prior to T182994: Get entity revision IDs in bulk. However, @daniel said that this is probably a legacy part of that interface, and the proper way to fix this task is to add a new method to WikiPageEntityMetaDataAccessor + …Lookup, not to extend EntityRevisionLookup.)

Change 415258 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Add WikiPageEntityMetaDataAccessor::loadPageLatest

https://gerrit.wikimedia.org/r/415258

Change 415277 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Extract two utility functions

https://gerrit.wikimedia.org/r/415277

Change 415278 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Optimize WikiPageEntityMetaDataLookup::loadPageLatest

https://gerrit.wikimedia.org/r/415278

Change 415258 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Add WikiPageEntityMetaDataAccessor::loadLatestRevisionIds

https://gerrit.wikimedia.org/r/415258

Change 415277 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Extract two utility functions

https://gerrit.wikimedia.org/r/415277

thiemowmde triaged this task as Low priority. Mar 2 2018, 3:14 PM
thiemowmde added a project: Technical-Debt.

Change 415865 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Use WikiPageEntityMetaDataAcessor::loadLatestRevisionIds

https://gerrit.wikimedia.org/r/415865

Change 415901 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] WIP: Optimize PrefetchingWPEMDA::loadLatestRevisionIds

https://gerrit.wikimedia.org/r/415901

I just saw this in WikiPageEntityRevisionLookup:

public function getLatestRevisionId( EntityId $entityId, $mode = self::LATEST_FROM_REPLICA ) {
    $rows = $this->entityMetaDataAccessor->loadRevisionInformation( [ $entityId ], $mode );
    $row = $rows[$entityId->getSerialization()];

    if ( $row && $row->page_latest && !$row->page_is_redirect ) {
        return (int)$row->page_latest;
    }

    return false;
}

Note that it’s only using page_latest if the page is not a redirect. We’re not currently doing that in loadLatestRevisionIds… is that a problem?
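
To make the question concrete, here is a minimal sketch of what a redirect-aware bulk variant could look like if it mirrored the check in getLatestRevisionId above. The function name latestRevisionIdsIgnoringRedirects and the input shape (entity ID serialization mapped to a row object or null) are assumptions for illustration. This is not the Wikibase implementation, and whether redirects should yield false here is exactly the open question.

```php
<?php
/**
 * Hypothetical helper: maps page rows to latest revision IDs, reporting
 * false for missing rows and for redirects, like getLatestRevisionId does.
 *
 * @param array $rows entity ID serialization => row object (or null if not found)
 * @return array entity ID serialization => int revision ID, or false
 */
function latestRevisionIdsIgnoringRedirects( array $rows ): array {
	$revisionIds = [];
	foreach ( $rows as $serialization => $row ) {
		if ( $row && $row->page_latest && !$row->page_is_redirect ) {
			$revisionIds[$serialization] = (int)$row->page_latest;
		} else {
			// Missing page, empty page_latest, or redirect.
			$revisionIds[$serialization] = false;
		}
	}
	return $revisionIds;
}

// Sample rows as a database layer might return them (string column values).
$rows = [
	'Q1' => (object)[ 'page_latest' => '11', 'page_is_redirect' => '0' ],
	'Q2' => (object)[ 'page_latest' => '22', 'page_is_redirect' => '1' ],
	'Q3' => null,
];
var_dump( latestRevisionIdsIgnoringRedirects( $rows ) );
```

With these sample rows, only Q1 gets a revision ID. Q2 (a redirect) and Q3 (no row) both map to false, whereas a version that reads page_latest without checking page_is_redirect would return 22 for Q2.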

Change 416410 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/Wikibase@master] Rearrange PrefetchingWPEMDA::loadLatestRevisionIds

https://gerrit.wikimedia.org/r/416410

Change 415278 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Optimize WikiPageEntityMetaDataLookup::loadLatestRevisionIds

https://gerrit.wikimedia.org/r/415278

Change 415865 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Use WikiPageEntityMetaDataAcessor::loadLatestRevisionIds

https://gerrit.wikimedia.org/r/415865

Change 416997 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Handle redirects correctly in loadLatestRevisionIds

https://gerrit.wikimedia.org/r/416997

Change 415901 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Optimize PrefetchingWPEMDA::loadLatestRevisionIds

https://gerrit.wikimedia.org/r/415901

Change 416410 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Rearrange PrefetchingWPEMDA::loadLatestRevisionIds

https://gerrit.wikimedia.org/r/416410

Change 416997 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Handle redirects correctly in loadLatestRevisionIds

https://gerrit.wikimedia.org/r/416997

Lucas_Werkmeister_WMDE closed this task as Resolved.