We have a couple of metrics implemented in relforge that can analyze the results:
* MRR
* pFound / eFound
These metrics already know how to read the WikidataCompletionSearchClicks eventlogging schema, but that schema has no field for recording an A/B test bucket.
[ ] Add AB test bucket to eventlogging schema
[ ] Update JavaScript to sample users into testing buckets
[ ] Update relforge to split metrics by test bucket
[ ] Create backend test profiles
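
As a rough illustration of splitting a metric by test bucket, here is a minimal Python sketch of MRR computed per bucket. It is not relforge's actual code: the event field names `bucket` and `clickIndex` are hypothetical stand-ins for whatever the updated schema ends up providing, and counting searches without a click as a reciprocal rank of 0 is an assumption.

```python
from collections import defaultdict


def mrr_by_bucket(events):
    """Return {bucket: mean reciprocal rank} for an iterable of event dicts.

    Assumed (hypothetical) event shape:
      'bucket'     - test bucket name, or missing/None if the user was not bucketed
      'clickIndex' - 0-based position of the clicked result, or None if no click
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event in events:
        bucket = event.get("bucket") or "unbucketed"
        counts[bucket] += 1
        position = event.get("clickIndex")
        if position is not None:
            # Reciprocal rank uses 1-based positions.
            totals[bucket] += 1.0 / (position + 1)
        # Assumption: a search with no click contributes 0 to the sum.
    return {bucket: totals[bucket] / counts[bucket] for bucket in counts}


sample_events = [
    {"bucket": "control", "clickIndex": 0},
    {"bucket": "control", "clickIndex": 1},
    {"bucket": "test", "clickIndex": None},
    {"bucket": "test", "clickIndex": 3},
]
per_bucket = mrr_by_bucket(sample_events)
```

Comparing the resulting per-bucket values is then a matter of reading the returned dictionary; pFound / eFound could be split the same way by grouping on the bucket field first.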
The test should be set up this way:
* Config value that enables the whole test (in mediawiki-config)
* Front end decides whether to enable the test for a particular request and which bucket to use (so far we enable it only for the "en" language and for items)
* Front end adds parameters to request to set test profile - `cirrusWBProfile` and `cirrusRescoreProfile`.
* Backend just uses the test profiles (which need to be set up) to deliver results
* Front end logs the test bucket together with the results in `WikidataCompletionSearchClicks`
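
The flow above can be sketched in a few lines. The real front end would be JavaScript; this Python version only illustrates the logic. Everything except the parameter names `cirrusWBProfile` and `cirrusRescoreProfile` is an assumption: the bucket names, the profile values, the `TEST_ENABLED` flag standing in for the mediawiki-config value, and the choice of hashing a session id to pick a bucket.

```python
import hashlib

# Stand-in for the mediawiki-config value that enables the whole test.
TEST_ENABLED = True

# Hypothetical buckets; the profile names are placeholders, not real profiles.
BUCKET_PROFILES = {
    "control": {},
    "test": {
        "cirrusWBProfile": "example_wb_profile",
        "cirrusRescoreProfile": "example_rescore_profile",
    },
}


def choose_bucket(session_id, language, entity_type, enabled=TEST_ENABLED):
    """Return a bucket name, or None if this request is not part of the test."""
    if not enabled or language != "en" or entity_type != "item":
        return None
    # Hash the session id so the same session always lands in the same bucket.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    names = sorted(BUCKET_PROFILES)
    return names[digest[0] % len(names)]


def request_params(base_params, bucket):
    """Copy the search parameters and add the bucket's test profiles, if any."""
    params = dict(base_params)
    if bucket is not None:
        params.update(BUCKET_PROFILES[bucket])
    return params


bucket = choose_bucket("session-123", "en", "item")
params = request_params({"search": "douglas adams"}, bucket)
# The same bucket value would then be logged in WikidataCompletionSearchClicks.
```

Keeping the bucket decision in one function means the value sent to the backend and the value logged with the results cannot drift apart.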