BM25: initial limited release into production
Closed, Resolved (Public)

Description

Now that we've done extensive testing on the new query scoring method called BM25, we want to do an initial and limited release into production. We'll be doing this release for the top 10 languages as follows:

  • English, German, Spanish, Russian, Portuguese, French, Italian, Polish, Dutch, Arabic

We are purposely not releasing BM25 on wikis whose languages don't put spaces between words (such as Chinese, Japanese, Thai and Khmer, for starters). We have tickets to investigate how best to use BM25 for those languages: T147495 and T147501
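For readers unfamiliar with BM25, here is a minimal Python sketch of the textbook Okapi BM25 formula. It is not the CirrusSearch or Elasticsearch implementation. The function name `bm25_scores`, the parameter values `k1=1.2` and `b=0.75`, and the toy documents are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with textbook Okapi BM25.

    `docs` is a non-empty list of token lists. `k1` and `b` are illustrative
    values, not settings taken from this task.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Naive whitespace tokenization, assumed here purely for the demo.
    texts = ["bm25 search ranking", "search search", "cat dog"]
    docs = [t.split() for t in texts]
    for text, score in zip(texts, bm25_scores(["search"], docs)):
        print(f"{score:.3f}  {text}")
```

The demo splits text on whitespace, which only works for languages that separate words with spaces. Text written without spaces would come out as a single token per document in this sketch, so it needs a different tokenization step before any term-based scoring applies.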

Plan to enable BM25 on these wikis:

  • [config] Disable BM25 A/B test on enwiki and prepare an A/B test for ja, zh and th: patch
  • [cirrus] Add support for routing completion queries to a specific cluster: patch
  • [config] Add new vars in InitialiseSettings.php for BM25 but only activate the SimilarityConfig for these wikis: patch
  • [maint] Reindex codfw with BM25
  • [config] Switch default cluster to codfw for these wikis and keep completion queries to eqiad: patch
  • [maint] Reindex eqiad with BM25
  • [config] Switch back default cluster to eqiad for these wikis: patch
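The cluster-switching part of the plan above can be sketched as a small state walk in Python. This is only an illustration: the names `RolloutState`, `PLAN` and `run_plan` are hypothetical, the starting default of eqiad is inferred from the "switch back" step, and the check that a cluster is never reindexed while it is the default is an assumed reading of the plan, not something stated in the task.

```python
from dataclasses import dataclass, field


@dataclass
class RolloutState:
    # Assumed starting point: eqiad serves both query types.
    default_cluster: str = "eqiad"
    completion_cluster: str = "eqiad"
    bm25_clusters: set = field(default_factory=set)


# The reindex and switch steps from the plan, in order.
PLAN = [
    ("reindex", "codfw"),
    ("switch_default", "codfw"),  # completion queries stay on eqiad
    ("reindex", "eqiad"),
    ("switch_default", "eqiad"),
]


def run_plan(plan):
    """Apply each step and record the resulting state after it."""
    state = RolloutState()
    history = []
    for action, cluster in plan:
        if action == "reindex":
            # Assumed invariant: never reindex the cluster serving default traffic.
            if cluster == state.default_cluster:
                raise RuntimeError(f"{cluster} is still the default cluster")
            state.bm25_clusters.add(cluster)
        elif action == "switch_default":
            state.default_cluster = cluster
        else:
            raise ValueError(f"unknown action: {action}")
        history.append(
            (action, cluster, state.default_cluster,
             state.completion_cluster, sorted(state.bm25_clusters))
        )
    return history


if __name__ == "__main__":
    for row in run_plan(PLAN):
        print(row)
```

In this model, completion queries stay on eqiad throughout, and the default cluster returns to eqiad only after both clusters have been reindexed with BM25.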

Event Timeline

TJones updated the task description. Oct 6 2016, 5:52 PM
dcausse updated the task description. Oct 11 2016, 12:51 PM

Change 315250 had a related patch set uploaded (by DCausse):
[cirrus] remove cirrus BM25 A/B test config

https://gerrit.wikimedia.org/r/315250

Change 315297 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 1

https://gerrit.wikimedia.org/r/315297

Change 315298 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 2

https://gerrit.wikimedia.org/r/315298

Change 315299 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 3

https://gerrit.wikimedia.org/r/315299

dcausse updated the task description. Oct 11 2016, 4:25 PM

I merged the patch for allowing completion overrides and cherry-picked it to wmf.22, so it will go out in this week's train.

Overall the plan looks good to me.

dcausse updated the task description. Oct 12 2016, 8:47 AM

Change 315250 merged by jenkins-bot:
[cirrus] switch cirrus BM25 A/B test config to ja, zh, th

https://gerrit.wikimedia.org/r/315250

Change 315297 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 1

https://gerrit.wikimedia.org/r/315297

Mentioned in SAL (#wikimedia-operations) [2016-10-12T18:10:22Z] <ebernhardson@mira> Synchronized wmf-config/CirrusSearch-common.php: SWAT T147508 Activate BM25 on top 10 wikis: Step 1 (duration: 00m 50s)

dcausse updated the task description. Oct 14 2016, 12:47 PM

Mentioned in SAL (#wikimedia-operations) [2016-10-14T12:48:17Z] <dcausse> reindexing top 10 wikipedias with BM25 on elastic@codfw from terbium (logs in ~dcausse/bm25_reindex/cirrus_log/) (T147508)

dcausse updated the task description. Oct 17 2016, 3:52 PM

Change 315298 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 2

https://gerrit.wikimedia.org/r/315298

Mentioned in SAL (#wikimedia-operations) [2016-10-20T18:30:04Z] <dereckson@mira> Synchronized wmf-config/InitialiseSettings.php: Activate Cirrus BM25 algo on top 10 wikis (step 2, T147508) (duration: 00m 50s)

Change 317159 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 2 (take 2)

https://gerrit.wikimedia.org/r/317159

dcausse updated the task description. Oct 21 2016, 2:57 PM

The first attempt to activate BM25 failed due to errors; see T148840.

Change 317159 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 2 (take 2)

https://gerrit.wikimedia.org/r/317159

Mentioned in SAL (#wikimedia-operations) [2016-10-24T19:30:09Z] <thcipriani@mira> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:317159|[cirrus] Activate BM25 on top 10 wikis: Step 2 (take 2) (T147508)]] (duration: 00m 50s)

dcausse updated the task description. Oct 25 2016, 9:12 AM

Mentioned in SAL (#wikimedia-operations) [2016-10-25T11:53:14Z] <dcausse> elastic@eqiad reindexing top10 wikis with BM25 from terbium T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)

Mentioned in SAL (#wikimedia-operations) [2016-10-25T19:41:07Z] <dcausse> elastic@eqiad reindexing enwiki with BM25 from terbium T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)

Mentioned in SAL (#wikimedia-operations) [2016-10-26T08:25:10Z] <dcausse> elastic@eqiad reindexing enwiki (take 3) with BM25 from wasat.codfw.wmnet T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)

dcausse updated the task description. Oct 26 2016, 4:58 PM

Change 315299 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 3

https://gerrit.wikimedia.org/r/315299

Change 318356 had a related patch set uploaded (by DCausse):
[cirrus] Activate BM25 on top 10 wikis: Step 3

https://gerrit.wikimedia.org/r/318356

dcausse updated the task description. Oct 27 2016, 7:41 PM

Change 318356 merged by jenkins-bot:
[cirrus] Activate BM25 on top 10 wikis: Step 3

https://gerrit.wikimedia.org/r/318356

Mentioned in SAL (#wikimedia-operations) [2016-11-09T14:41:59Z] <zfilipin@tin> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:318356|[cirrus] Activate BM25 on top 10 wikis: Step 3 (T147508)]] (duration: 00m 48s)

Deskana closed this task as Resolved. Nov 17 2016, 9:44 PM
Deskana added a subscriber: Deskana.

Added this to the status update this week because I forgot to add it to the one last week.