Commons search seems to have stopped indexing statements since 30 October 2019
Closed, Resolved · Public

Details

Related Gerrit Patches:
mediawiki/extensions/CirrusSearch : wmf/1.35.0-wmf.5 · Restore CirrusSearchBuildDocumentParse hook
mediawiki/extensions/CirrusSearch : master · Restore CirrusSearchBuildDocumentParse hook

Event Timeline

Restricted Application added a subscriber: Aklapper. · Sun, Nov 10, 11:25 AM
EBernhardson added a comment. · Edited · Tue, Nov 12, 10:48 PM

Hmm, for all of the linked pages the version number in Elasticsearch matches the ID of the page's latest revision, so updates are making it through.
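A minimal sketch of that check, for illustration only. It assumes the `action=cirrusdump` output is a JSON list of hits that each carry `_id` and `_version` fields, and that the latest revision ID has been looked up separately; the function name and sample values are hypothetical.

```python
def stale_documents(dump, latest_revid):
    """Return the ids of indexed documents whose version is not the latest revision.

    `dump` is assumed to be the decoded JSON from action=cirrusdump:
    a list of hits, each with "_id" and "_version" keys (an assumption).
    """
    return [
        hit.get("_id")
        for hit in dump
        if hit.get("_version") != latest_revid
    ]


# Hypothetical sample data, not taken from the real index.
sample_dump = [{"_id": "123", "_version": 1000, "_source": {}}]
print(stale_documents(sample_dump, 1000))  # an empty list means the index is current
```

An empty result only shows that the update reached the index; it says nothing about whether every field was built correctly.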

Using one of the example pages: https://commons.wikimedia.org/wiki/File:Group_of_Hetepheres_II_and_Meresankh_III-30.1456-IMG_4559-gradient.jpg

We can ask Cirrus to build a new document without writing it to Elasticsearch (essentially a test of the pipeline): https://commons.wikimedia.org/w/api.php?action=query&prop=cirrusbuilddoc&titles=File:Group_of_Hetepheres_II_and_Meresankh_III-30.1456-IMG_4559-gradient.jpg

Indeed, this does not contain statement_keywords, even though the file has P180=Q74377458. Will look closer into this and see where it has gone haywire.
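The same check could be scripted roughly as follows. This is a sketch under assumptions: the response shape (`query.pages.<id>.cirrusbuilddoc`) is assumed from the default JSON output of the API, and the helper names are hypothetical. It makes no network call; it only builds the URL and inspects an already decoded response.

```python
from urllib.parse import urlencode

API_URL = "https://commons.wikimedia.org/w/api.php"


def build_doc_url(title):
    """Build the cirrusbuilddoc query URL for one page title."""
    params = {
        "action": "query",
        "prop": "cirrusbuilddoc",
        "titles": title,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def missing_statement_keywords(response):
    """Return titles whose built document has no statement_keywords.

    Assumes the response nests each built document under
    query.pages.<pageid>.cirrusbuilddoc (an assumption, not verified here).
    """
    pages = response.get("query", {}).get("pages", {})
    missing = []
    for page in pages.values():
        doc = page.get("cirrusbuilddoc", {})
        if not doc.get("statement_keywords"):
            missing.append(page.get("title", "<unknown>"))
    return missing
```

Fetching the URL from `build_doc_url` and passing the decoded JSON to `missing_statement_keywords` would list the pages affected by the missing field.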

Change 550578 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/extensions/CirrusSearch@master] Restore CirrusSearchBuildDocumentParser hook

https://gerrit.wikimedia.org/r/550578

Change 550578 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Restore CirrusSearchBuildDocumentParse hook

https://gerrit.wikimedia.org/r/550578

Change 550769 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/extensions/CirrusSearch@wmf/1.35.0-wmf.5] Restore CirrusSearchBuildDocumentParse hook

https://gerrit.wikimedia.org/r/550769

Change 550769 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@wmf/1.35.0-wmf.5] Restore CirrusSearchBuildDocumentParse hook

https://gerrit.wikimedia.org/r/550769

Mentioned in SAL (#wikimedia-operations) [2019-11-14T00:36:47Z] <ebernhardson@deploy1001> Synchronized php-1.35.0-wmf.5/extensions/CirrusSearch/includes/BuildDocument/BuildDocument.php: T237849: Restore CirrusSearchBuildDocumentParse hook (duration: 00m 54s)

Mentioned in SAL (#wikimedia-operations) [2019-11-14T00:41:06Z] <ebernhardson> T237849 Start CirrusSearch forceSearchIndex.php commonswiki 2019-10-20T00:00:00 - 2019-11-14T01:00:00 pushing into jobqueue

The backfill has completed; this should be resolved.

I can confirm this is resolved. I just made some edits at https://commons.wikimedia.org/w/index.php?title=File:20120922-Collse_Watermolen-042.jpg&action=history and I can see the statements in https://commons.wikimedia.org/w/index.php?title=File:20120922-Collse_Watermolen-042.jpg&action=cirrusdump :

statement_keywords
0 "P180=Q2117023"
1 "P7482=Q66458942"
2 "P6216=Q50423863"
3 "P275=Q18195572"
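For anyone scripting this verification, the entries above can be split into property and value pairs. A small sketch, assuming each entry has the `property=value` form shown here; the function names are hypothetical.

```python
def parse_statement_keywords(keywords):
    """Split "P180=Q2117023"-style entries into (property, value) tuples."""
    pairs = []
    for keyword in keywords:
        prop, sep, value = keyword.partition("=")
        if not sep:
            raise ValueError(f"not a property=value entry: {keyword!r}")
        pairs.append((prop, value))
    return pairs


def has_statement(keywords, prop, value=None):
    """True if any entry uses `prop` (and `value`, when one is given)."""
    return any(
        p == prop and (value is None or v == value)
        for p, v in parse_statement_keywords(keywords)
    )


# The values listed in the comment above.
keywords = [
    "P180=Q2117023",
    "P7482=Q66458942",
    "P6216=Q50423863",
    "P275=Q18195572",
]
print(has_statement(keywords, "P180", "Q2117023"))
```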

This task can probably be closed. Thanks for fixing.

Ramsey-WMF closed this task as Resolved. · Fri, Nov 15, 5:39 PM

Tested, works.