BM25: initial limited release into production
Closed, Resolved (Public)

Description

Now that we've extensively tested BM25, the new query scoring method, we want to do an initial, limited release into production. We'll be doing this release for the top 10 languages, as follows:

  • English, German, Spanish, Russian, Portuguese, French, Italian, Polish, Dutch, Arabic

We are purposely not releasing BM25 onto wikis in languages that don't put spaces between words (such as Chinese, Japanese, Thai and Khmer, for starters). We have tickets to investigate how best to use BM25 for those languages: T147495 and T147501.
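For readers unfamiliar with BM25, the following is a minimal sketch of the textbook scoring function, not the CirrusSearch or Elasticsearch implementation. The function name `bm25_scores`, the toy documents, and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen for illustration; this task does not state which parameters were used in production. The last line shows that naive whitespace splitting yields a single "term" for text written without spaces. That hints at why term statistics for such languages depend on the word segmenter, though this task does not spell out the reason for the exclusion.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    Illustrative only: k1 and b are assumed values, not taken from this task.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # IDF variant that stays positive for very common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates; longer documents are penalised via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus, tokenized by whitespace (a stand-in for a real analyzer).
docs = [
    "the cat sat on the mat".split(),
    "cat cat cat cat cat cat".split(),
    "dogs chase cats in the park every single day of the week".split(),
]
scores = bm25_scores(["cat"], docs)

# Whitespace splitting cannot segment text written without spaces.
unsegmented_tokens = "維基百科是自由的百科全書".split()
```

In this toy corpus the second document scores higher than the first, but by far less than six times, because the term-frequency contribution saturates.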

Plan to enable BM25 on these wikis:

  • [config] Disable BM25 A/B test on enwiki and prepare an A/B test for ja, zh and th: patch
  • [cirrus] Add support for routing completion queries to a specific cluster: patch
  • [config] Add new vars in InitialiseSettings.php for BM25 but only activate the SimilarityConfig for these wikis: patch
  • [maint] Reindex codfw with BM25
  • [config] Switch default cluster to codfw for these wikis and keep completion queries to eqiad: patch
  • [maint] Reindex eqiad with BM25
  • [config] Switch back default cluster to eqiad for these wikis: patch
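The last four steps of the plan can be read as a sequence of routing states. The sketch below is a hypothetical model, not CirrusSearch configuration: the `Routing` class, its field names, and the `rollout` helper are invented for illustration. It assumes that the point of the ordering is to keep full-text queries off whichever cluster is being reindexed, which is an interpretation rather than something the plan states.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Routing:
    default_cluster: str        # cluster serving full-text queries
    completion_cluster: str     # cluster serving completion queries
    reindexing: Optional[str]   # cluster currently being reindexed, if any

def rollout(start):
    """Yield (step label, resulting state) for the reindex and switch steps."""
    steps = [
        ("reindex codfw with BM25", dict(reindexing="codfw")),
        ("switch default to codfw, keep completion on eqiad",
         dict(default_cluster="codfw", completion_cluster="eqiad", reindexing=None)),
        ("reindex eqiad with BM25", dict(reindexing="eqiad")),
        ("switch default back to eqiad",
         dict(default_cluster="eqiad", reindexing=None)),
    ]
    state = start
    for label, change in steps:
        # Frozen dataclass: each step produces a new state object.
        state = replace(state, **change)
        yield label, state

timeline = list(rollout(Routing("eqiad", "eqiad", None)))
for label, state in timeline:
    print(f"{label}: full-text={state.default_cluster}, "
          f"completion={state.completion_cluster}, reindexing={state.reindexing}")
```

Under this reading, the default cluster is never the one being reindexed, and completion queries stay on eqiad throughout.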

Event Timeline

Change 315250 had a related patch set uploaded (by DCausse):
[cirrus] remove cirrus BM25 A/B test config

https://gerrit.wikimedia.org/r/315250

Change 315297 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 1

https://gerrit.wikimedia.org/r/315297

Change 315298 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 2

https://gerrit.wikimedia.org/r/315298

Change 315299 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 3

https://gerrit.wikimedia.org/r/315299

I merged the patch that allows completion overrides and cherry-picked it to wmf.22, so it will go out in this week's train.

Overall the plan looks good to me.

Change 315250 merged by jenkins-bot:
[cirrus] switch cirrus BM25 A/B test config to ja, zh, th

https://gerrit.wikimedia.org/r/315250

Change 315297 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 1

https://gerrit.wikimedia.org/r/315297

Mentioned in SAL (#wikimedia-operations) [2016-10-12T18:10:22Z] <ebernhardson@mira> Synchronized wmf-config/CirrusSearch-common.php: SWAT T147508 Activate BM25 on top 10 wikis: Step 1 (duration: 00m 50s)

Mentioned in SAL (#wikimedia-operations) [2016-10-14T12:48:17Z] <dcausse> reindexing top 10 wikipedias with BM25 on elastic@codfw from terbium (logs in ~dcausse/bm25_reindex/cirrus_log/) (T147508)

Change 315298 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 2

https://gerrit.wikimedia.org/r/315298

Mentioned in SAL (#wikimedia-operations) [2016-10-20T18:30:04Z] <dereckson@mira> Synchronized wmf-config/InitialiseSettings.php: Activate Cirrus BM25 algo on top 10 wikis (step 2, T147508) (duration: 00m 50s)

Change 317159 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 2 (take 2)

https://gerrit.wikimedia.org/r/317159

The first attempt to activate BM25 failed due to errors; see T148840.

Change 317159 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 2 (take 2)

https://gerrit.wikimedia.org/r/317159

Mentioned in SAL (#wikimedia-operations) [2016-10-24T19:30:09Z] <thcipriani@mira> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:317159|[cirrus] Activate BM25 on top 10 wikis: Step 2 (take 2) (T147508)]] (duration: 00m 50s)

Mentioned in SAL (#wikimedia-operations) [2016-10-25T11:53:14Z] <dcausse> elastic@eqiad reindexing top10 wikis with BM25 from terbium T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)

Mentioned in SAL (#wikimedia-operations) [2016-10-25T19:41:07Z] <dcausse> elastic@eqiad reindexing enwiki with BM25 from terbium T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)

Mentioned in SAL (#wikimedia-operations) [2016-10-26T08:25:10Z] <dcausse> elastic@eqiad reindexing enwiki (take 3) with BM25 from wasat.codfw.wmnet T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)

Change 315299 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 3

https://gerrit.wikimedia.org/r/315299

Change 318356 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 3

https://gerrit.wikimedia.org/r/318356

Change 318356 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 3

https://gerrit.wikimedia.org/r/318356

Mentioned in SAL (#wikimedia-operations) [2016-11-09T14:41:59Z] <zfilipin@tin> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:318356|[cirrus] Activate BM25 on top 10 wikis: Step 3 (T147508)]] (duration: 00m 48s)

Deskana subscribed.

Added this to the status update this week because I forgot to add it to the one last week.