- Control: Has default phrase rescore boost of 10
- Test: Has phrase rescore boost of 1
- Bucketing/sampling: Satisfaction schema sampling increases from 0.5% to 1%; 50% of that 1% goes into the Test bucket.
- Measuring:
  - number of searches per session (Should we assume ↓ is good? – MP)
  - clickthrough rate (↑ is good) (P.S. this is the inverse of session abandonment)
  - position of first clicked result (I don't expect this to change, since the vast majority of clicks are already on the first result – MP)
  - time to first clickthrough (↓ is good)
  - search session duration (we should investigate the impact, but not assign a pos/neg judgement to either direction; can elaborate if needed – MP)
  - number of results visited (as with search session duration, we should see how this is impacted, but conflicting goals make it hard to assign a pos/neg judgement – MP)
  - time spent on visited pages ("hazard" in survival analysis; similar concern as with search session duration and number of results visited)
- Duration: one week
More context for this task is available in T129593.
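As a rough sketch of how some of the metrics above might be summarised per bucket: the `Session` record and its field names are hypothetical, and clickthrough rate is assumed here to mean the share of sessions with at least one click, so that it is one minus the abandonment rate. The duration-style metrics are left out, since they would need survival analysis rather than simple averages.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Dict, List, Optional


@dataclass
class Session:
    """One search session (hypothetical record layout)."""
    bucket: str                                    # "control" or "test"
    searches: int                                  # number of searches in the session
    first_click_position: Optional[int] = None     # None if nothing was clicked
    seconds_to_first_click: Optional[float] = None


def summarize(sessions: List[Session]) -> Dict[str, dict]:
    """Compute per-bucket summaries for the simpler metrics in the list."""
    summary = {}
    for bucket in sorted({s.bucket for s in sessions}):
        group = [s for s in sessions if s.bucket == bucket]
        positions = [s.first_click_position for s in group
                     if s.first_click_position is not None]
        times = [s.seconds_to_first_click for s in group
                 if s.seconds_to_first_click is not None]
        ctr = len(positions) / len(group)
        summary[bucket] = {
            "sessions": len(group),
            "searches_per_session": mean(s.searches for s in group),
            "clickthrough_rate": ctr,
            "abandonment_rate": 1 - ctr,
            # Medians are None when no session in the bucket had a click.
            "median_first_click_position": median(positions) if positions else None,
            "median_seconds_to_first_click": median(times) if times else None,
        }
    return summary
```

Comparing the two entries of the returned dict gives the direction of change for each metric; whether a difference is meaningful would still need a proper significance test.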