
Reindex elasticsearch indices that haven't been re-created with 2.x
Closed, Resolved, Public

Description

Indices created before v2.0.0 must be reindexed with the Reindex Helper.

Combined list, across the eqiad and codfw clusters, of wikis whose indices were not created with v2.0.0 or later:

  • azbwiki
  • bowiki
  • bowikibooks
  • bowiktionary
  • bugwiki
  • cdowiki
  • cnwikimedia
  • crwiki
  • crwikiquote
  • crwiktionary
  • dzwiki
  • dzwiktionary
  • ganwiki
  • hakwiki
  • jawiki
  • jawikibooks
  • jawikinews
  • jawikiquote
  • jawikisource
  • jawikiversity
  • jawiktionary
  • jvwiki
  • jvwiktionary
  • kmwiki
  • kmwikibooks
  • kmwiktionary
  • lowiki
  • lowiktionary
  • mw_cirrus_versions
  • mywiki
  • mywikibooks
  • mywiktionary
  • thwiki
  • thwikibooks
  • thwikinews
  • thwikiquote
  • thwikisource
  • thwiktionary
  • ttmserver
  • ttmserver-test
  • vewikimedia
  • wuuwiki
  • zh_classicalwiki
  • zh_min_nanwiki
  • zh_min_nanwikibooks
  • zh_min_nanwikiquote
  • zh_min_nanwikisource
  • zh_min_nanwiktionary
  • zhwiki
  • zhwikibooks
  • zhwikinews
  • zhwikiquote
  • zhwikisource
  • zhwikivoyage
  • zhwiktionary
  • zh_yuewiki

Event Timeline

EBernhardson removed a project: Epic.
EBernhardson updated the task description.

Would it make sense to run a big reindex of all the wikis?
The benefit would be that it covers the reindex needed for ICU folding on en, fr, el, he. We could perhaps also fix some deprecated settings on the master branch that could cause warnings (I've seen some discussions about norms).

I think it would. I can start that up today if there are no gotchas waiting for us.

We needed to ship out the ICU config change before reindexing. dcausse did that this morning and will double-check that the completion search builds correctly this time around.

We also need to ship a patch to rename the default similarity to BM25, as the upgrade from 2.x to 5.x will rename anything with a similarity named default to classic, reverting us to tf-idf.
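To see which fields would be affected by that rename, one could walk an index mapping and list every field whose similarity is set to the name `default`. This is only a minimal sketch: the function name and the example mapping are hypothetical, and it assumes the mapping's `properties` section has already been fetched and parsed into a dict.

```python
def fields_using_similarity(properties, name="default", prefix=""):
    """Return dotted paths of fields whose mapping sets similarity == name.

    `properties` is the "properties" dict of a parsed index mapping.
    """
    found = []
    for field, spec in sorted(properties.items()):
        path = prefix + field
        if spec.get("similarity") == name:
            found.append(path)
        # Recurse into object sub-properties and multi-fields.
        for key in ("properties", "fields"):
            if key in spec:
                found.extend(fields_using_similarity(spec[key], name, path + "."))
    return found


# Hypothetical mapping fragment, for illustration only.
example_mapping = {
    "title": {
        "type": "string",
        "similarity": "default",
        "fields": {"plain": {"type": "string", "similarity": "BM25"}},
    },
    "redirect": {
        "properties": {"title": {"type": "string", "similarity": "default"}},
    },
}

affected = fields_using_similarity(example_mapping)
print(affected)
```

Any field reported here is one that, per the comment above, would end up on classic (tf-idf) after the 5.x upgrade unless the profile is renamed first.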

The patch to fix the similarity profile is deployed. I checked the titlesuggest indices and some of them are failing. After a quick look it does not seem to be related to ICU; some of these indices have been broken since Dec 14, 2016...
I uploaded a patch to try to work around the problem, but I think we can move forward with the reindexing.

I've started the reindex of all wikis for eqiad and codfw in parallel on terbium, in tmux sessions. No clue how long this will take; I'm going to throw out a week as a random guess.

A few indices not managed in the same way need to be recreated:

  • ttmserver
  • ttmserver-test
  • mw_cirrus_versions

Once this is done we can also merge https://gerrit.wikimedia.org/r/#/c/320788/, which was waiting for all indices to have a populated 'wiki' field.

Some indices are failing due to master timeout issues on both clusters. This does not have a negative effect on production usage. We will have to review the logs, delete the extra unused indices (e.g. the index was created but nothing was reindexed into it, or the reindex completed but the alias was never updated), and re-run the reindex for those wikis. We should probably do the reindex late at night San Francisco time / early morning European time, when the cluster is least busy, to reduce the chance of timeouts.
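The cleanup step could be sketched roughly as follows: given the full list of index names and the alias-to-index mapping, anything that no alias points at is a candidate leftover. This is an assumption-laden sketch, not the procedure actually used; `unused_indices` and the `ignore` parameter are hypothetical names, and the second index name in the example is made up.

```python
def unused_indices(all_indices, aliases, ignore=()):
    """Return index names that no alias points at.

    all_indices: iterable of index names on the cluster.
    aliases: dict mapping alias name -> list of index names it points at.
    ignore: names that are not managed through aliases and must be skipped.
    """
    live = {index for targets in aliases.values() for index in targets}
    skipped = set(ignore)
    return sorted(i for i in all_indices if i not in live and i not in skipped)


# Illustrative data: the *_1487000000 name is hypothetical.
indices = [
    "jvwiki_content_1414196979",
    "jvwiki_content_1487000000",
    "ttmserver",
]
aliases = {"jvwiki_content": ["jvwiki_content_1414196979"]}

candidates = unused_indices(indices, aliases, ignore=["ttmserver"])
print(candidates)
```

The result is only a list of candidates to review against the logs before deleting anything. Indices that are not managed through aliases, such as ttmserver, would otherwise show up as unused, which is why the sketch takes an explicit ignore list.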

I've rebuilt the ttmserver indices and mw_cirrus_versions on codfw; this cluster should be all good.
I've scheduled a maintenance update for the ttmserver indices in eqiad because I need to run the ttmserver-export.php script (it will run Wed 22 at 9am UTC and should run for 4 hours).

There are still some updateSearchIndexConfig runs in progress for eqiad, but the state at the time of writing this comment is:
eqiad

for i in $(curl -s 'elastic1020.eqiad.wmnet:9200/_cat/indices?h=i'); do curl -s "elastic1020.eqiad.wmnet:9200/$i/" | jq '.[].settings.index.version.created' | grep -v '"2' > /dev/null && echo "$i"; done
dzwiki_content_1415175411
dzwiktionary_general_1415175376
jvwiki_general_1414197366
vewikimedia_general_1415331150
ttmserver-test
jvwiktionary_general_1415244778
dzwiki_general_1415175436
jvwiki_content_1414196979
ttmserver
zhwiktionary_general_1415384982
jvwiktionary_content_1415244671
dzwiktionary_content_1415175351
azbwiki_general_first
vewikimedia_content_1415331110

codfw

for i in $(curl -s 'elastic2020.codfw.wmnet:9200/_cat/indices?h=i'); do curl -s "elastic2020.codfw.wmnet:9200/$i/" | jq '.[].settings.index.version.created' | grep -v '"2' > /dev/null && echo "$i"; done
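
For anyone who prefers to run the same check from a script, here is a minimal Python sketch of the test the one-liners above apply. It assumes the JSON body returned for an index has already been fetched and parsed; `created_before_v2` is a hypothetical name and the version values in the example are made up for illustration.

```python
def created_before_v2(index_response):
    """Return names of indices whose version.created does not start with "2".

    index_response: parsed JSON of the per-index response, shaped as
    {index_name: {"settings": {"index": {"version": {"created": "..."}}}}}.
    This mirrors the grep -v '"2' heuristic used in the shell loop.
    """
    stale = []
    for name, body in index_response.items():
        created = str(body["settings"]["index"]["version"]["created"])
        if not created.startswith("2"):
            stale.append(name)
    return sorted(stale)


# Illustrative responses; the version ids are hypothetical values.
responses = {
    "azbwiki_general_first": {
        "settings": {"index": {"version": {"created": "1030499"}}}
    },
    "jawiki_content_new": {
        "settings": {"index": {"version": {"created": "2030399"}}}
    },
}

print(created_before_v2(responses))
```

Like the shell version, this is a prefix check only, so it says nothing about whether the documents in a 2.x-created index are complete.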

The initial reindex completed, but unfortunately 50 indices in codfw and 100 indices in eqiad failed to reindex. I deleted the unused indices and restarted the reindex on only the failed indices, for codfw on Saturday and for eqiad on Sunday. Codfw looks to have completed and eqiad is still running.

The ttmserver indices are done. The remaining indices in eqiad are:

vewikimedia_general_1415331150
zhwiktionary_general_1415384982
azbwiki_general_first
vewikimedia_content_1415331110

This is only based on checking version.created in the index settings. It's possible that some reindex threads failed but the count check was good enough for the reindex to continue; this is in fact the worst scenario, as a full reindex would be needed to recover the missing docs.
When this happened to me for commons, I decided to import data from codfw to eqiad instead of running the saneitizer or forceSearchIndex.
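
One way to look for that scenario is to compare per-index document counts between the two clusters and flag anything that falls short by more than a chosen margin. This is a rough sketch under assumptions: the counts are taken to have been collected already, `count_mismatches` and `tolerance` are hypothetical names, and the numbers below are invented.

```python
def count_mismatches(reference, candidate, tolerance=0.0):
    """Return {name: (reference_count, candidate_count)} for every index
    where the candidate is short by more than tolerance * reference_count.

    reference, candidate: dicts mapping an index alias to its doc count.
    An index missing from candidate is treated as having 0 docs.
    """
    short = {}
    for name, ref_count in reference.items():
        have = candidate.get(name, 0)
        if ref_count - have > tolerance * ref_count:
            short[name] = (ref_count, have)
    return short


# Invented counts, for illustration only.
codfw_counts = {"zhwiki_content": 1000, "zhwiki_general": 1000}
eqiad_counts = {"zhwiki_content": 1000, "zhwiki_general": 950}

print(count_mismatches(codfw_counts, eqiad_counts, tolerance=0.01))
```

A looser tolerance hides exactly the kind of partial failure described above, so the margin is worth keeping small; live updates between the two reads will still cause some noise.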