dumpIndex produces partial dumps
Closed, Resolved · Public

Description

Since the elastic5 upgrade, the dumpIndex script seems to produce partial dumps:

dcausse@wasat:~$ mwscript extensions/CirrusSearch/maintenance/dumpIndex.php --wiki enwiktionary --indexType content | pv -i5 -l | wc -l
Dumping 5102492 documents (5102492 in the index)
1.02M 0:38:38 [ 440 /s] [                                            <=>                  ]
1020500

5102492 docs should be dumped, but the script produces only 1020500 lines (about 1M). Given that we have 2 lines per doc, that is only 510250 docs (about 500k), roughly 10% of the index.
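
The arithmetic above can be checked with a short sketch. It assumes, as stated in the report, that each document takes exactly 2 lines in the dump. The function name `dump_coverage` and the constant `LINES_PER_DOC` are hypothetical and are not part of CirrusSearch.

```python
# Assumption taken from the report: every dumped document occupies two lines.
LINES_PER_DOC = 2


def dump_coverage(lines_dumped, docs_expected):
    """Return (documents actually dumped, fraction of the expected documents)."""
    docs_dumped = lines_dumped // LINES_PER_DOC
    return docs_dumped, docs_dumped / docs_expected


# Figures from the run above: `wc -l` reported 1020500 lines,
# while dumpIndex.php announced 5102492 documents.
docs, fraction = dump_coverage(1020500, 5102492)
print(f"{docs} of 5102492 docs dumped ({fraction:.1%})")
# prints "510250 of 5102492 docs dumped (10.0%)"
```

A complete dump under the same assumption would contain 2 × 5102492 = 10204984 lines, so comparing the `wc -l` result against that figure is a quick way to spot a truncated dump.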

Event Timeline

dcausse created this task. Apr 11 2017, 8:53 AM
Restricted Application added a subscriber: Aklapper. Apr 11 2017, 8:53 AM
dcausse claimed this task. Apr 11 2017, 12:13 PM
dcausse moved this task from needs triage to Current work on the Discovery-Search board.

Change 347601 had a related patch set uploaded (by DCausse):
[mediawiki/extensions/CirrusSearch@master] Fix dumpIndex.php

https://gerrit.wikimedia.org/r/347601

Change 347601 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Fix dumpIndex.php

https://gerrit.wikimedia.org/r/347601

debt closed this task as Resolved. May 30 2017, 5:28 PM
debt added a subscriber: debt.

All fixed now.