
Define success criteria for performing A/B tests
Closed, Declined (Public)

Description

Goal: Increase user engagement in Wikipedia Android app.

  • Time buckets: run the A/B test for 2 weeks (or longer if possible); use the week before (and the week after) as the control period
  • All results need to be checked for statistical significance
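One way to run the significance check for a rate metric such as CTR is a two-sided two-proportion z-test. The sketch below is only an illustration, not a test prescribed by this task: the function name and the click/impression counts are hypothetical, and it assumes impressions are independent and numerous enough for the normal approximation to hold.

```python
import math

def two_proportion_z_test(clicks_a, shown_a, clicks_b, shown_b):
    """Two-sided z-test for a difference between two click-through rates.

    Hypothetical helper: the arguments are click and impression counts
    for the control (a) and test (b) buckets.
    """
    p_a = clicks_a / shown_a
    p_b = clicks_b / shown_b
    # Pooled rate under the null hypothesis that both buckets share one CTR.
    pooled = (clicks_a + clicks_b) / (shown_a + shown_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up counts, for illustration only.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

For metrics that are not proportions (for example session length), a different test would be needed.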

Performance measures:

  • CTR per recommendation source
  • Average session length and number of pages, per recommendation source
  • Detailed evaluation: to investigate differences between the recommender systems in depth, article properties (length, category, inbound links, popularity, ...) should be evaluated as well.
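As a minimal sketch of the first measure, CTR per recommendation source can be computed by counting impressions and clicks per source. The record layout here (`source` and `clicked` keys, one record per shown recommendation) is an assumption made for illustration and is not taken from the EL schemas.

```python
from collections import defaultdict

def ctr_per_source(events):
    """Return {source: clicks / impressions} for a list of impression records.

    Assumed (hypothetical) record shape: {"source": str, "clicked": bool},
    one record per recommendation shown to a user.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for event in events:
        shown[event["source"]] += 1
        if event["clicked"]:
            clicked[event["source"]] += 1
    # Every source in `shown` has at least one impression, so no division by zero.
    return {source: clicked[source] / shown[source] for source in shown}

# Made-up records, for illustration only.
sample = [
    {"source": "A", "clicked": True},
    {"source": "A", "clicked": False},
    {"source": "B", "clicked": False},
]
print(ctr_per_source(sample))
```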

Use data from EL schemas: https://meta.wikimedia.org/wiki/Research:Schemas

  • Aside from schema-specific data (timeSpent, readMoreList, ...), the evaluation requires page data (pageTitle or pageId, for article-based evaluation) and user data (appInstallId, which can be anonymized).
  • Relevant schemas:
  • Combined schemas:
    • long-CTR: time spent on, or maxPercentViewed of, the recommended page (match on equal pageId and appInstallId)
    • recommended-shares: shares of pages that were recommended beforehand (within a 2-day time frame)
    • recommended-session: totalPages per session and session length for users who clicked on a recommendation
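The combined measures rely on joining events from different schemas on pageId and appInstallId. As a rough sketch of the recommended-shares measure, the function below keeps share events that follow a recommendation of the same page to the same install within 2 days. The dictionary layout and the `timestamp` field are assumptions for illustration; the actual schema fields may differ.

```python
from datetime import datetime, timedelta

def recommended_shares(recommendations, shares, window=timedelta(days=2)):
    """Return the share events that follow a matching recommendation.

    A share matches when the same appInstallId was recommended the same
    pageId at most `window` earlier. Records are assumed (hypothetically)
    to be dicts with "appInstallId", "pageId" and "timestamp" keys.
    """
    rec_times = {}
    for rec in recommendations:
        key = (rec["appInstallId"], rec["pageId"])
        rec_times.setdefault(key, []).append(rec["timestamp"])

    matched = []
    for share in shares:
        key = (share["appInstallId"], share["pageId"])
        # The recommendation must precede the share by no more than `window`.
        if any(timedelta(0) <= share["timestamp"] - t <= window
               for t in rec_times.get(key, [])):
            matched.append(share)
    return matched

# Made-up events, for illustration only.
recs = [{"appInstallId": "u1", "pageId": 1, "timestamp": datetime(2016, 1, 1, 12)}]
shared = [{"appInstallId": "u1", "pageId": 1, "timestamp": datetime(2016, 1, 2, 12)}]
print(len(recommended_shares(recs, shared)))
```

The same join key could serve the long-CTR measure, with timeSpent or maxPercentViewed read from the matched page view instead of a share event.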

Event Timeline

Aklapper renamed this task from "Define success criteria" to "Define success criteria for performing A/B tests". Nov 1 2016, 6:10 PM

@mschwarzer: Hi! This task was assigned to you a while ago. Could you share an update? Do you still plan to work on this task, or do you need any help?

If this task has been resolved in the meantime: Please update the task status (via Add Action...Change Status in the dropdown menu).
If this task is not resolved and only if you do not plan to work on this task anymore: Please consider removing yourself as assignee (via Add Action...Assign / Claim in the dropdown menu): That would allow others to work on this (in theory), as others won't think that someone is already working on this. Thanks! :)

Discovery team: Ping - is this still planned? Has this ever happened?