Switch Portal EL & survey sampling algorithm to use seeded RNG
Closed, Resolved · Public

Description

The current algorithm converts the (randomly generated) event logging session ID to an integer and then checks whether it is divisible by N, where 1-in-N is the sampling rate we want. For example, a 0.5% rate is 1 in 200. Any integer divisible by N is also divisible by every factor of N, so a session that passes the 1-in-200 check automatically passes a 1-in-50 check as well. This effectively means that we can't use anything that is a factor of N (e.g. 50, 10, 100, 40 for 200) for subsequent sampling, as was the case in the recent survey banner situation.
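The overlap can be illustrated with a small sketch. The helper name `isSampledByDivisibility` and the integer IDs are hypothetical and exist only to mirror the divisibility check described above; this is not the portal's actual code.

```javascript
// Hypothetical helper mirroring the divisibility check described above.
function isSampledByDivisibility(sessionIdAsInt, n) {
  return sessionIdAsInt % n === 0;
}

// Made-up integer session IDs, for illustration only.
const exampleIds = [200, 250, 400, 1050, 123400];

// First stage: 1-in-200 sampling for event logging.
const selectedForLogging = exampleIds.filter((id) => isSampledByDivisibility(id, 200));

// Second stage: an intended 1-in-50 sample of the sessions chosen above.
const alsoSelectedForSurvey = selectedForLogging.filter((id) => isSampledByDivisibility(id, 50));

// Every multiple of 200 is a multiple of 50, so the second stage keeps all of them.
console.log(selectedForLogging.length === alsoSelectedForSurvey.length); // true
```

Because 50 divides 200, the second check never rejects a session that passed the first one, so it cannot act as an independent sample.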

We should try moving to a seeded random number generator (e.g. https://commons.wikimedia.org/wiki/MediaWiki:Gadget-math.seedrandom.js) that lets us set the seed to that same randomly generated session ID and then use the traditional method for getting random integers between 1 and N, which gives us very easily understood sampling code:

// assume the seed has been set to session ID
// Math.seededrandom() stands in for the seeded generator, returning a value in [0, 1)
function oneIn(N) {
  // returns an integer between 1 and N, inclusive
  return Math.floor(Math.seededrandom() * N) + 1;
}
if (oneIn(200) === 1) {
  // selected for event logging
  if (oneIn(10) === 1) {
    // selected for A/B testing
    if (oneIn(2) === 1) {
      // selected for the control bucket
    } else {
      // selected for the test bucket
    }
  } else {
    // rejected from A/B testing, but still enrolled in EL
  }
} else if (oneIn(50) === 1) {
  // rejected from EL but selected for survey banner
} else {
  // rejected from EL and survey banner
}

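The logic would play out the same every time the page is refreshed as long as the user has the same session ID, because the generator is re-seeded with that ID on each page load and the oneIn() calls happen in the same order.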
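A minimal, self-contained sketch of this idea follows. It does not use the gadget linked above; instead it swaps in an FNV-1a-style string hash and the small mulberry32 generator so that it runs without dependencies. The names `hashSeed`, `makeSeededRandom`, `assignBuckets`, the bucket labels, and the sample session ID are all assumptions made for illustration.

```javascript
// Hash a session ID string to an unsigned 32-bit integer (FNV-1a style).
// Assumption: any stable string-to-integer hash would serve here.
function hashSeed(sessionId) {
  let h = 0x811c9dc5;
  for (let i = 0; i < sessionId.length; i++) {
    h ^= sessionId.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: a small seeded PRNG that returns floats in [0, 1).
// Swapped in for the seedrandom gadget purely for illustration.
function makeSeededRandom(seed) {
  let state = seed >>> 0;
  return function () {
    state = (state + 0x6d2b79f5) | 0;
    let t = Math.imul(state ^ (state >>> 15), 1 | state);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Re-seed from the session ID, then walk the same decision tree as above.
// The returned labels are hypothetical bucket names.
function assignBuckets(sessionId) {
  const random = makeSeededRandom(hashSeed(sessionId));
  const oneIn = (n) => Math.floor(random() * n) + 1;
  if (oneIn(200) === 1) {
    if (oneIn(10) === 1) {
      return oneIn(2) === 1 ? 'el+ab-control' : 'el+ab-test';
    }
    return 'el-only';
  }
  return oneIn(50) === 1 ? 'survey' : 'none';
}

// Hypothetical session ID; two "page loads" yield the same assignment.
const firstLoad = assignBuckets('a1b2c3d4e5f6');
const secondLoad = assignBuckets('a1b2c3d4e5f6');
console.log(firstLoad === secondLoad); // true
```

Each draw comes from the generator's next value rather than from the session ID's divisibility, so the 1-in-10 and 1-in-50 checks no longer depend on the outcome of the 1-in-200 check in the way the divisibility test did.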

mpopov created this task. · May 17 2016, 8:48 PM
Restricted Application added subscribers: Zppix, Aklapper. · May 17 2016, 8:48 PM
debt triaged this task as Normal priority. · May 17 2016, 9:43 PM
debt edited projects, added Discovery-Portal-Sprint; removed Discovery-Portal-Backlog.
debt assigned this task to Jdrewniak.
debt added a subscriber: Jdrewniak.
debt moved this task from In Progress to Done on the Discovery-Portal-Sprint board. · Jun 9 2016, 9:04 PM
debt closed this task as Resolved. · Jun 14 2016, 11:25 PM
debt moved this task from Done to Completed on the Discovery-Portal-Sprint board. · Aug 12 2016, 8:02 PM