
When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache
Closed, Resolved · Public

Description

When updating a page, CirrusSearch attempts to get ParserOutput from ParserCache.

private function getContentAndParserOutput( $page ) {
    $content = $page->getContent();
    $parserOptions = $page->makeParserOptions( 'canonical' );
    $parserOutput = ParserCache::singleton()->get( $page, $parserOptions );
    if ( !$parserOutput ) {
        // We specify the revision ID here. There might be a newer revision,
        // but we don't care because (a) we've already got a job somewhere
        // in the queue to index it, and (b) we want magic words like
        // {{REVISIONUSER}} to be accurate
        $revId = $page->getRevision()->getId();
        $parserOutput = $content->getParserOutput( $page->getTitle(), $revId );
    }
    return array( $content, $parserOutput );
}

On a wiki where GeoData has only just been enabled, a ParserOutput obtained from the cache was generated before GeoData could record coordinates, so it lacks them and the indexed document ends up without coordinates.

Either there needs to be a way to force a parse or, if the coordinates are already in geo_tags, perhaps a way for the coordinates to come from there.
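To illustrate the first option, here is a minimal, self-contained sketch of a "force parse" switch. It does not use the real MediaWiki or CirrusSearch classes: FakeParserCache, getParserOutput(), the $forceParse parameter and the sample data are all hypothetical stand-ins, and the actual patch may be structured differently.

```php
<?php
// Hypothetical stand-in for a parser cache; NOT MediaWiki's ParserCache.
class FakeParserCache {
    private $entries = array();

    public function get( $pageId ) {
        return isset( $this->entries[$pageId] ) ? $this->entries[$pageId] : false;
    }

    public function save( $pageId, $output ) {
        $this->entries[$pageId] = $output;
    }
}

// Returns the cached output unless $forceParse is set (an assumed flag),
// in which case the cache is skipped and the page is always reparsed.
function getParserOutput( FakeParserCache $cache, $pageId, $parse, $forceParse = false ) {
    $output = $forceParse ? false : $cache->get( $pageId );
    if ( !$output ) {
        $output = call_user_func( $parse, $pageId );
    }
    return $output;
}

$cache = new FakeParserCache();
// Entry cached before GeoData was enabled: no coordinates recorded.
$cache->save( 1, array( 'text' => 'Berlin', 'coordinates' => array() ) );

// A fresh parse, now that GeoData is enabled, yields coordinates.
$parse = function ( $pageId ) {
    return array( 'text' => 'Berlin', 'coordinates' => array( array( 52.52, 13.405 ) ) );
};

$stale = getParserOutput( $cache, 1, $parse );        // served from cache
$fresh = getParserOutput( $cache, 1, $parse, true );  // cache bypassed

echo count( $stale['coordinates'] ), "\n"; // prints 0
echo count( $fresh['coordinates'] ), "\n"; // prints 1
```

The default stays on the cached path, so normal page updates keep their current cost; only an explicit reindex run would pay for reparsing every page.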

Details

Related Gerrit Patches:
mediawiki/extensions/CirrusSearch : wmf/1.27.0-wmf.3 : Add --forceParse UpdaterFlag and option in forceSearchIndex script
mediawiki/extensions/CirrusSearch : master : Add --forceParse UpdaterFlag and option in forceSearchIndex script

Event Timeline

aude created this task. Oct 23 2015, 2:21 PM
aude raised the priority of this task from to Medium.
aude updated the task description.
aude added projects: Wikidata, CirrusSearch, GeoData.
aude added a subscriber: aude.
Restricted Application added a project: Discovery. Oct 23 2015, 2:21 PM
Restricted Application added a subscriber: Aklapper.
aude added a comment. Oct 23 2015, 2:35 PM

Even forceSearchIndex.php does not result in coordinates being added for a cached page :(

Change 248345 had a related patch set (by Aude) published:
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/248345

aude claimed this task. Oct 23 2015, 2:47 PM
aude set Security to None.

Change 248345 merged by jenkins-bot:
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/248345

aude closed this task as Resolved. Oct 23 2015, 4:21 PM
aude removed a project: Patch-For-Review.
aude moved this task from Review to Done on the Wikidata-Sprint-2015-10-13 board.

Change 249036 had a related patch set uploaded (by Aude):
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/249036

Change 249036 merged by jenkins-bot:
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/249036