* **Control**: Uses the default phrase rescore boost of 10
* **Test**: Uses a phrase rescore boost of 1
* **Bucketing/sampling**: Satisfaction schema sampling increases from 0.5% to 1%. Half of that 1% goes into the Test bucket.
* **Measuring**:
** number of searches per session (Should we assume ↓ is good? – MP)
** clickthrough rate (↑ is good) (N.B. this is the complement of the session abandonment rate)
** position of first clicked result (I don't expect this to change, since the vast majority of clicks are already on the first result – MP)
** time to first clickthrough (↓ is good)
** search session duration (we should investigate the impact, but not assign a pos/neg judgement to either direction; can elaborate if needed – MP)
** number of results visited (as with search session duration, we should see how this is impacted, but conflicting goals make it hard to assign a pos/neg judgement – MP)
** time spent on visited pages ("hazard" in survival analysis; same concern as with search session duration and number of results visited)
* **Duration**: one week
More context for this task is available in T129593.
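
A few of the per-session metrics listed under Measuring could be computed along these lines. This is a minimal sketch under stated assumptions: the event tuples, the action names `search` and `click`, and the function `summarize_sessions` are hypothetical and not the actual schema, and clickthrough rate is taken here to mean the share of search sessions with at least one click.

```python
from collections import defaultdict
from statistics import median


def summarize_sessions(events):
    """Summarize (session_id, action, timestamp_seconds) tuples.

    `action` is assumed to be 'search' or 'click'; other values are ignored.
    Returns None when no session contains a search.
    """
    sessions = defaultdict(lambda: {"search": [], "click": []})
    for session_id, action, ts in events:
        if action in ("search", "click"):
            sessions[session_id][action].append(ts)

    # Only sessions with at least one search count as search sessions.
    searched = [s for s in sessions.values() if s["search"]]
    if not searched:
        return None

    clicked = [s for s in searched if s["click"]]
    ctr = len(clicked) / len(searched)
    # Seconds from the first search to the first click, per clicked session.
    first_click_delays = [min(s["click"]) - min(s["search"]) for s in clicked]

    return {
        "searches_per_session": sum(len(s["search"]) for s in searched) / len(searched),
        "clickthrough_rate": ctr,
        "abandonment_rate": 1 - ctr,
        "median_time_to_first_click": median(first_click_delays) if clicked else None,
    }
```

Running this once per bucket would give Control and Test values to compare; the directionless metrics (session duration, results visited, time on visited pages) are left out of the sketch.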