
A/B test using defaultsort with the completion suggester
Open, Needs Triage | Public | 5 Estimated Story Points

Description

Run an A/B test to see whether defaultsort can improve the user experience of search-as-you-type.
Context
The use of the defaultsort data in completion searches is believed to help in these cases:

  • searches by last name first (previously enabled on the Mongolian Wikipedia: T327878)
  • searches for concepts that are often used with popular prefixes (searching XYZ finds List of XYZ, see T386655)

The wikipedias where we could enable this feature are listed below. The list is filtered to wikipedias that have more than 50% of their pages with a defaultsort that matches the pattern we expect, and where the feature would increase recall on more than 20% of the pages.

wiki          recall improvement
enwiki        26%
dewiki        28%
frwiki        23%
eswiki        21%
plwiki        23%
fiwiki        28%
nowiki        23%
cswiki        23%
hewiki        28%
dawiki        26%
simplewiki    25%
bgwiki        20%
etwiki        21%
glwiki        20%
afwiki        33%
slwiki        20%
lvwiki        20%
vowiki        47%
fywiki        21%
mnwiki (x)    20%
gvwiki        25%
mtwiki        21%
biwiki        37%
iglwiki       26%

(x): already enabled in T327878

We could start an A/B test on 3 wikis first: enwiki, frwiki and hewiki.

AC:

Event Timeline


Change #1193091 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: test completion with default sort on simplewiki [1/3]

https://gerrit.wikimedia.org/r/1193091

Change #1193092 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: test completion with default sort on simplewiki [2/3]

https://gerrit.wikimedia.org/r/1193092

Change #1193093 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: test completion with default sort on simplewiki [3/3]

https://gerrit.wikimedia.org/r/1193093

Change #1193091 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: test completion with default sort on simplewiki [1/3]

https://gerrit.wikimedia.org/r/1193091

Mentioned in SAL (#wikimedia-operations) [2025-10-06T07:53:23Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1193502|Allow AbuseFilter to block on ganwiki (T406220)]], [[gerrit:1193091|cirrus: test completion with default sort on simplewiki [1/3] (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-10-06T08:00:14Z] <dcausse@deploy2002> hamishz, dcausse: Backport for [[gerrit:1193502|Allow AbuseFilter to block on ganwiki (T406220)]], [[gerrit:1193091|cirrus: test completion with default sort on simplewiki [1/3] (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-10-06T08:06:11Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1193502|Allow AbuseFilter to block on ganwiki (T406220)]], [[gerrit:1193091|cirrus: test completion with default sort on simplewiki [1/3] (T404858)]] (duration: 12m 48s)

Change #1193092 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: test completion with default sort on simplewiki [2/3]

https://gerrit.wikimedia.org/r/1193092

Mentioned in SAL (#wikimedia-operations) [2025-10-07T07:05:33Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1193052|cirrus: stop copying ores weighted_tags (T389053)]], [[gerrit:1193092|cirrus: test completion with default sort on simplewiki [2/3] (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-10-07T07:11:53Z] <dcausse@deploy2002> dcausse: Backport for [[gerrit:1193052|cirrus: stop copying ores weighted_tags (T389053)]], [[gerrit:1193092|cirrus: test completion with default sort on simplewiki [2/3] (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-10-07T07:21:05Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1193052|cirrus: stop copying ores weighted_tags (T389053)]], [[gerrit:1193092|cirrus: test completion with default sort on simplewiki [2/3] (T404858)]] (duration: 15m 32s)

Change #1193093 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: test completion with default sort on simplewiki [3/3]

https://gerrit.wikimedia.org/r/1193093

Mentioned in SAL (#wikimedia-operations) [2025-10-14T08:03:32Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1193093|cirrus: test completion with default sort on simplewiki [3/3] (T404858)]], [[gerrit:1195830|ext-EventLogging: Allowlist product_metrics.web_base_with_ip stream (T406332)]]

Mentioned in SAL (#wikimedia-operations) [2025-10-14T08:07:50Z] <dcausse@deploy2002> dcausse, phuedx: Backport for [[gerrit:1193093|cirrus: test completion with default sort on simplewiki [3/3] (T404858)]], [[gerrit:1195830|ext-EventLogging: Allowlist product_metrics.web_base_with_ip stream (T406332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-10-14T08:14:18Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1193093|cirrus: test completion with default sort on simplewiki [3/3] (T404858)]], [[gerrit:1195830|ext-EventLogging: Allowlist product_metrics.web_base_with_ip stream (T406332)]] (duration: 10m 46s)

Change #1196064 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: prepare completion search with defaultsort A/B test

https://gerrit.wikimedia.org/r/1196064

Change #1196064 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: prepare completion search with defaultsort A/B test

https://gerrit.wikimedia.org/r/1196064

Mentioned in SAL (#wikimedia-operations) [2025-10-21T07:32:38Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1196064|cirrus: prepare completion search with defaultsort A/B test (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-10-21T07:37:13Z] <dcausse@deploy2002> dcausse: Backport for [[gerrit:1196064|cirrus: prepare completion search with defaultsort A/B test (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-10-21T07:42:36Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1196064|cirrus: prepare completion search with defaultsort A/B test (T404858)]] (duration: 09m 58s)

Change #1197642 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: enable completion search with defaultsort A/B test

https://gerrit.wikimedia.org/r/1197642

Change #1197642 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: enable completion search with defaultsort A/B test

https://gerrit.wikimedia.org/r/1197642

Mentioned in SAL (#wikimedia-operations) [2025-10-23T07:13:27Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1197642|cirrus: enable completion search with defaultsort A/B test (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-10-23T07:18:02Z] <dcausse@deploy2002> dcausse: Backport for [[gerrit:1197642|cirrus: enable completion search with defaultsort A/B test (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Change #1198291 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/CirrusSearch@master] CompletionSuggester: fix index id format check

https://gerrit.wikimedia.org/r/1198291

Change #1198291 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] CompletionSuggester: fix index id format check

https://gerrit.wikimedia.org/r/1198291

Change #1198529 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/CirrusSearch@wmf/1.45.0-wmf.24] CompletionSuggester: fix index id format check

https://gerrit.wikimedia.org/r/1198529

Change #1198529 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@wmf/1.45.0-wmf.24] CompletionSuggester: fix index id format check

https://gerrit.wikimedia.org/r/1198529

Mentioned in SAL (#wikimedia-operations) [2025-10-27T07:15:04Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1198404|ext.xLab: Implement UnenrolledExperiment#setStream()]], [[gerrit:1198413|ext.xLab: Implement OverriddenExperiment#setStream()]], [[gerrit:1198529|CompletionSuggester: fix index id format check (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-10-27T07:39:58Z] <dcausse@deploy2002> cjming, dcausse: Backport for [[gerrit:1198404|ext.xLab: Implement UnenrolledExperiment#setStream()]], [[gerrit:1198413|ext.xLab: Implement OverriddenExperiment#setStream()]], [[gerrit:1198529|CompletionSuggester: fix index id format check (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-10-27T08:01:18Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1198404|ext.xLab: Implement UnenrolledExperiment#setStream()]], [[gerrit:1198413|ext.xLab: Implement OverriddenExperiment#setStream()]], [[gerrit:1198529|CompletionSuggester: fix index id format check (T404858)]] (duration: 46m 13s)

Change #1202086 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: enable default_sort on en, fr and he wikipedias

https://gerrit.wikimedia.org/r/1202086

Change #1202094 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: enable alt index with default_sort on a set of wikis

https://gerrit.wikimedia.org/r/1202094

Change #1202086 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: enable default_sort on en, fr and he wikipedias

https://gerrit.wikimedia.org/r/1202086

Change #1202094 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: enable alt index with default_sort on a set of wikis

https://gerrit.wikimedia.org/r/1202094

Mentioned in SAL (#wikimedia-operations) [2025-11-06T08:29:44Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1202375|"hide logged in users" is no longer working with "non-JavaScript interface" (T409157)]], [[gerrit:1202086|cirrus: enable default_sort on en, fr and he wikipedias (T404858)]], [[gerrit:1202094|cirrus: enable alt index with default_sort on a set of wikis (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-11-06T08:32:14Z] <dcausse@deploy2002> dcausse, tstarling: Backport for [[gerrit:1202375|"hide logged in users" is no longer working with "non-JavaScript interface" (T409157)]], [[gerrit:1202086|cirrus: enable default_sort on en, fr and he wikipedias (T404858)]], [[gerrit:1202094|cirrus: enable alt index with default_sort on a set of wikis (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes c

Mentioned in SAL (#wikimedia-operations) [2025-11-06T08:42:33Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1202375|"hide logged in users" is no longer working with "non-JavaScript interface" (T409157)]], [[gerrit:1202086|cirrus: enable default_sort on en, fr and he wikipedias (T404858)]], [[gerrit:1202094|cirrus: enable alt index with default_sort on a set of wikis (T404858)]] (duration: 12m 49s)

Change #1203046 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: start A/B test on completion with default_sort

https://gerrit.wikimedia.org/r/1203046

Change #1203046 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: start A/B test on completion with default_sort

https://gerrit.wikimedia.org/r/1203046

Mentioned in SAL (#wikimedia-operations) [2025-11-12T08:31:54Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1204099|Throttle exemption for Edit-a-thon in Hong Kong - 15 November 2025 (T409852)]], [[gerrit:1203046|cirrus: start A/B test on completion with default_sort (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-11-12T08:34:14Z] <dcausse@deploy2002> dcausse, superpes: Backport for [[gerrit:1204099|Throttle exemption for Edit-a-thon in Hong Kong - 15 November 2025 (T409852)]], [[gerrit:1203046|cirrus: start A/B test on completion with default_sort (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-11-12T08:44:52Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1204099|Throttle exemption for Edit-a-thon in Hong Kong - 15 November 2025 (T409852)]], [[gerrit:1203046|cirrus: start A/B test on completion with default_sort (T404858)]] (duration: 12m 57s)

Change #1207758 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: enable default_sort for completion on a set of wikis

https://gerrit.wikimedia.org/r/1207758

Change #1207744 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] Revert "cirrus: start A/B test on completion with default_sort"

https://gerrit.wikimedia.org/r/1207744

Change #1207744 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "cirrus: start A/B test on completion with default_sort"

https://gerrit.wikimedia.org/r/1207744

Mentioned in SAL (#wikimedia-operations) [2025-11-20T08:04:16Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1207744|Revert "cirrus: start A/B test on completion with default_sort" (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-11-20T08:09:10Z] <dcausse@deploy2002> dcausse: Backport for [[gerrit:1207744|Revert "cirrus: start A/B test on completion with default_sort" (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-11-20T08:17:10Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1207744|Revert "cirrus: start A/B test on completion with default_sort" (T404858)]] (duration: 12m 54s)

Looking at the en/fr/he report and the 20-wiki report.

Questions & comments:

  • [Oops. I didn't realize that the 20-wiki report did not include en/fr/he and was 20 additional wikis. Some of my comments make no sense now, so I will redact them.]
  • The 3-wiki sample is much larger than the 20-wiki sample over 7 days, so I assume the 20-wiki sample has a much lower sample rate for the A/B test. Can that be made explicit?
  • Because of the different sample sizes, the "Autocomplete submit rate" comments like "This represents approximately n more submits each day" don't make sense. 3-wiki en has a smaller diff percentage-wise (0.02% vs 0.03%), but a higher number of additional submits per day (300 vs 100). Dividing the number of additional submits by the test sample percentage will probably give the true(ish) number (I didn't look up the math behind the numbers that are there).
    • Oh, yeah, this is a bigger issue in the conclusion. In "Autocomplete Success Rate" section, in the 3-wiki report 0.27% increase == 4,500 more submissions, while in the 20-wiki report 0.34% increase == 1,400 more. That does not compute.
  • In the conclusion, I'd advocate for being more forceful in recommending we enable this. We 100% know that certain good suggestions cannot be made without this feature enabled. The report shows that overall it has no obvious detrimental effect, so it's worth it to help those people looking for the Spinozas & Gausses who aren't well-known enough to have earned explicit last name redirects.
    • I say "We may want to enable this feature on these wikis" ==> "We should enable this feature on these wikis." (Modulo my feature request for a breakdown by wiki. The Appendix relieves my concerns here.. more below.)
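
The sample-size point above can be illustrated with a small sketch. It assumes the reported "additional submits per day" are in-sample counts, as the comment suspects. Only the 300 and 100 figures come from the comment; the sampling rates and the function name are hypothetical placeholders, since the actual rates are not stated in this task.

```python
def extrapolate_daily_submits(observed_extra_per_day: float, sample_rate: float) -> float:
    """Scale an in-sample daily count up to the full population.

    sample_rate is the fraction of traffic enrolled in the test (0 < rate <= 1).
    """
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    return observed_extra_per_day / sample_rate


# Observed counts quoted in the comment; the rates are made-up placeholders.
three_wiki = extrapolate_daily_submits(300, sample_rate=0.10)
twenty_wiki = extrapolate_daily_submits(100, sample_rate=0.01)

print(f"3-wiki extrapolated extra submits/day:  {three_wiki:.0f}")
print(f"20-wiki extrapolated extra submits/day: {twenty_wiki:.0f}")
```

Under these placeholder rates the group with the smaller observed count ends up with the larger extrapolated one, which is why the raw per-day figures from the two reports cannot be compared directly.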

Minor questions & comments:

  • In the intro, does it make sense to mention the kind of default_sort data that is ignored? IIRC, flipping the default_sort at the comma has to be a prefix of the title, or it gets ignored, right? An example that gets ignored would be nice, too, but not required.
    • This also means that we don't need to know the words for "List of" in other languages, and other patterns we don't explicitly know about—like a hypothetical "Table of"—will also be handled.
  • I might argue in this case that the very slight decrease in click position might represent new results. Without default_sort, "Smith, John" isn't going to match "John Smith" at all.. so getting it into the top 10 (or more likely the top 3) is good. You'd have to see (normalized?) click counts rather than percentages at each position. Possibly more work than it is worth, since the diffs are very small.
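
A minimal sketch of the validation rule as recalled in the first bullet above: flip the default_sort at the comma and keep it only if the result is a prefix of the title. This is not the CirrusSearch implementation; the function name, splitting at the first comma, and the case-insensitive comparison are all assumptions made for illustration.

```python
from typing import Optional


def usable_default_sort(title: str, default_sort: str) -> Optional[str]:
    """Return default_sort if its comma-flipped form is a prefix of title, else None."""
    head, sep, tail = default_sort.partition(",")
    head, tail = head.strip(), tail.strip()
    if not sep or not head or not tail:
        return None  # nothing to flip
    flipped = f"{tail} {head}"  # "Smith, John" -> "John Smith"
    # Assumption: compare case-insensitively.
    if title.casefold().startswith(flipped.casefold()):
        return default_sort
    return None


examples = [
    ("John Smith", "Smith, John"),      # kept: flipped form equals the title
    ("List of XYZ", "XYZ, List of"),    # kept: no need to know "List of"
    ("Table of XYZ", "XYZ, Table of"),  # kept: hypothetical pattern also works
    ("John Smith", "Smith, Jonathan"),  # ignored: flipped form is not a prefix
]
for title, default_sort in examples:
    print(title, "|", default_sort, "|", usable_default_sort(title, default_sort))
```

Because the check only compares the flipped string against the title, it never needs a per-language list of prefixes.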

Trivial questions & comments:

  • Several sentences end in two periods.. I actually love to do that (I call it a "⅔ ellipsis"), but people tell me I shouldn't.. lol, like they can stop me.. not something that needs to be fixed at all..
  • In "Mean Characters Typed Per Successful Lookup", after the comment "An improved autocomplete should see the number of characters decline, or at least not increase." you could add something like "A change of less than 0.1 characters per lookup is probably not detectable by most users." We have talked about that, but we don't write it down.
    • Similar thing in "Click Position". It would be redundant to repeat it everywhere. Would it make sense to explain (and could we clearly explain) how a 0.1 change in mean is roughly equivalent to a change of 1 for 1/10 of users? (And why is the trivialities section going to end up the longest?)
  • One instance of snigficant.
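
The "0.1 change in mean is roughly a change of 1 for 1/10 of users" point can be worked through with made-up numbers. The ten lookups below are hypothetical and are not taken from the report.

```python
from statistics import mean

# Ten hypothetical successful lookups, each needing 5 typed characters.
baseline = [5] * 10
# The same lookups where exactly one user in ten needs one character fewer.
treated = [5] * 9 + [4]

delta = mean(treated) - mean(baseline)
changed = sum(b != t for b, t in zip(baseline, treated))

print(f"change in mean: {delta:+.2f} characters")
print(f"users whose experience changed: {changed} of {len(baseline)}")
```

Characters typed are whole numbers, so a mean shift of about 0.1 cannot be a small change for everyone; it has to come from a whole-character change for a minority of lookups, as in this example.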

Feature Requests (feel free to totally ignore):

  • I'd love to see any and all stats broken down by wiki in the 20-sample report. I know it's a lot, but I swear I'll look at every graph.
    • I'm curious if different practices or language/script details make a difference. (Like, ruwiki puts titles as lastname, firstname already... so maybe default_sort does nothing there)
      • I thought of a crazy hypothetical—what if some language uses, say, a different case for "List of" when it's at the end and screwed up the validation.. that'd be wild and unexpected, but languages are crazy sometimes.
    • Also, one of the smaller of the 20 could be doing poorly and it would be swamped by the others. The Appendix indirectly addresses this. If one or two wikis were actually getting clobbered, the difference would be statistically significant. I love the Appendix.
  • Do we have a central repo of report code? If not, making changes like deduping periods and adding extra comments here where it won't propagate to future reports probably isn't worth it.
  • A more generic idea... it would be interesting (to me) to see a smaller A/A and B/B test in the reports (or maybe as a different report entirely), where we split the A data in half and compared it to itself, and the same for the B data. It would give us at least some sense of how much variability there is in the samples and possibly for the kind of data we are collecting (is autocomplete more variable than fulltext, for example).
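
One way the A/A idea above could be sketched: randomly split a single bucket in half many times and look at how far apart the two halves land by chance alone. The function names and the toy data are hypothetical; real input would be per-session metric values from one bucket of the test.

```python
import random
from statistics import fmean


def split_half_difference(values, seed=0):
    """Randomly split one bucket in two and return the difference of the half means."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    first, second = shuffled[:mid], shuffled[mid:]
    if not first or not second:
        raise ValueError("need at least two observations")
    return fmean(first) - fmean(second)


def split_half_spread(values, repeats=200):
    """Repeat the split with different seeds; return (largest, average) absolute difference."""
    diffs = [abs(split_half_difference(values, seed=s)) for s in range(repeats)]
    return max(diffs), fmean(diffs)


# Toy bucket: 1 = session ended in a submit, 0 = it did not (made-up data).
bucket_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1] * 50
largest, average = split_half_spread(bucket_a)
print(f"largest A/A difference: {largest:.4f}, average: {average:.4f}")
```

If the A-versus-B difference in a report is no larger than the typical A/A difference from this kind of split, the observed effect is within ordinary sampling noise.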

dcausse merged https://gitlab.wikimedia.org/repos/search-platform/notebooks/-/merge_requests/13

A/B test report for default_sort in autocomplete on en, fr and he wikis

@TJones thanks for the feedback and suggestions!
We have a repo at https://gitlab.wikimedia.org/repos/search-platform/notebooks but even if some notebooks are reusable, you sometimes have to duplicate and adapt one to run your analysis. The double-period issue is, I think, me misusing some of the templates created by Erik.
I tried to update the last report to include more detailed info (per-wiki graphs), but unfortunately I waited too long and some of the data is already gone thanks to our retention policy...
An A/A test is a nice idea. I haven't had the chance to do it, but there's possibly a way to do it from the data we have? Some wikis have a really low volume (from what I remember bi, gv and igl had fewer than 100 observations in a week), so we might want to double-check https://foundation.wikimedia.org/wiki/Legal:Data_publication_guidelines and possibly start using thresholds rather than actual numbers.

Change #1207758 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: enable default_sort for completion on a set of wikis

https://gerrit.wikimedia.org/r/1207758

Mentioned in SAL (#wikimedia-operations) [2026-02-23T08:06:07Z] <mszwarc@deploy2002> Started scap sync-world: Backport for [[gerrit:1240892|Ensure that sysops don't have '(oathauth-recover-for-user)' right (T417877)]], [[gerrit:1207758|cirrus: enable default_sort for completion on a set of wikis (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2026-02-23T08:29:43Z] <mszwarc@deploy2002> mszwarc, dcausse: Backport for [[gerrit:1240892|Ensure that sysops don't have '(oathauth-recover-for-user)' right (T417877)]], [[gerrit:1207758|cirrus: enable default_sort for completion on a set of wikis (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-02-23T08:43:14Z] <mszwarc@deploy2002> Finished scap sync-world: Backport for [[gerrit:1240892|Ensure that sysops don't have '(oathauth-recover-for-user)' right (T417877)]], [[gerrit:1207758|cirrus: enable default_sort for completion on a set of wikis (T404858)]] (duration: 37m 07s)