When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache
Closed, Resolved · Public

Description

When updating a page, CirrusSearch attempts to get ParserOutput from ParserCache.

private function getContentAndParserOutput( $page ) {
    $content = $page->getContent();
    $parserOptions = $page->makeParserOptions( 'canonical' );
    $parserOutput = ParserCache::singleton()->get( $page, $parserOptions );
    if ( !$parserOutput ) {
        // We specify the revision ID here. There might be a newer revision,
        // but we don't care because (a) we've already got a job somewhere
        // in the queue to index it, and (b) we want magic words like
        // {{REVISIONUSER}} to be accurate
        $revId = $page->getRevision()->getId();
        $parserOutput = $content->getParserOutput( $page->getTitle(), $revId );
    }
    return array( $content, $parserOutput );
}

When adding coordinates on a wiki where GeoData has just been enabled, a ParserOutput obtained from the cache lacks the coordinates.

Either there needs to be a way to force a parse or, if the coordinates are already in geo_tags, perhaps a way for the coordinates to come from there.
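To illustrate the first option, here is a minimal, self-contained sketch of a force-parse flag that skips the cache lookup. It is not the CirrusSearch code: SketchParserCache, sketchParse, sketchGetParserOutput, the $forceParse parameter and the sample coordinates are all hypothetical stand-ins so that the example runs outside MediaWiki.

```php
<?php
// Hypothetical stand-in for a parser cache; not a real MediaWiki class.
final class SketchParserCache {
    /** @var array<string, array> cached output keyed by page name */
    private $entries = [];

    public function get( string $pageName ): ?array {
        return $this->entries[$pageName] ?? null;
    }

    public function save( string $pageName, array $output ): void {
        $this->entries[$pageName] = $output;
    }
}

/**
 * Stand-in for a real parse: the output carries coordinates only when
 * GeoData is enabled at parse time. The coordinate values are made up.
 */
function sketchParse( string $pageName, bool $geoDataEnabled ): array {
    return [
        'page' => $pageName,
        'coordinates' => $geoDataEnabled ? [ 'lat' => 52.5, 'lon' => 13.4 ] : null,
    ];
}

/**
 * Return parser output for indexing. When $forceParse is true the cache
 * is not consulted at all, so a stale entry cannot be returned.
 */
function sketchGetParserOutput(
    SketchParserCache $cache,
    string $pageName,
    bool $geoDataEnabled,
    bool $forceParse = false
): array {
    $output = $forceParse ? null : $cache->get( $pageName );
    if ( $output === null ) {
        $output = sketchParse( $pageName, $geoDataEnabled );
    }
    return $output;
}

// The page was cached before GeoData was enabled.
$cache = new SketchParserCache();
$cache->save( 'Berlin', sketchParse( 'Berlin', false ) );

$stale = sketchGetParserOutput( $cache, 'Berlin', true );
$fresh = sketchGetParserOutput( $cache, 'Berlin', true, true );

var_dump( $stale['coordinates'] === null ); // bool(true): cached output has no coordinates
var_dump( $fresh['coordinates'] !== null ); // bool(true): forced parse picks them up
```

The sketch deliberately leaves the stale cache entry in place after a forced parse; whether a real implementation should also refresh the cache is a separate decision.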

Event Timeline

aude raised the priority of this task from to Medium.
aude updated the task description.
aude added projects: Wikidata, CirrusSearch, GeoData.
aude subscribed.
Restricted Application added a subscriber: Aklapper.

Even forceSearchIndex.php does not result in coordinates being added for a cached page :(

Change 248345 had a related patch set (by Aude) published:
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/248345

aude set Security to None.

Change 248345 merged by jenkins-bot:
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/248345

aude removed a project: Patch-For-Review.
aude moved this task from Review to Done on the Wikidata-Sprint-2015-10-13 board.

Change 249036 had a related patch set uploaded (by Aude):
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/249036

Change 249036 merged by jenkins-bot:
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/249036