Design structure for UI-less experiment comparing API recommendation quality
Closed, Resolved | Public | Spike

Description

Background

  • We would like to test which recommendation APIs are most helpful to users. This task will track the experimental design for this test.

User story

  • As a product team, we want to be able to compare how useful recommendation APIs are to readers, so that we know which ones to continue experimenting with in the future.

Requirements

Acceptance criteria

  1. Determine experimental design based on the following questions:
    • What tool do we use to compare just the content of the APIs?
    • At what scale will usage of these tools be useful?
    • Which pages/types of pages do we want to run the experiment on?
    • How do we want to sample users/articles?
    • Which parts of the experiment do we want to test? (E.g., do we want to experiment with a different number of recommendation items? Do we want to mix recommendations?)
  2. Get data for experiment

This task was created by Version 1.2.0 of the Web team task template using phabulous

Event Timeline

ovasileva triaged this task as High priority.

From meeting with @JScherer-WMF and @NBaca-WMF

Decision: Use a Google Form for the survey itself; recruit on Userlytics
Decision: We generate 3 different articles to show users, so that they can select the one they are most interested in based on the topic. Question: “How likely are you to read about this topic on your own outside of this study?”
Potential idea: List of articles to choose from:
An interesting “did you know”
Something trending/in the news
Evergreen - Moon, Mars, etc
Potential idea: Paper
Decision: Should screen for intrinsic learning first
Decision: We will show them a Wikipedia page screenshot and then ask the questions on top of it. We will make an API call for each of the three articles and manually populate the table that the form is generated from. For the test itself we will de-duplicate and then go back to the original table to compare results. Any vote for a duplicate would be a point for each of the APIs the suggestion came from.
Decision: Use quicksurveys
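
For illustration, the de-duplication and scoring rule from the decision above (a vote for a duplicate counts as a point for each API that suggested it) could be sketched roughly as follows. This is only a sketch under assumptions: the API names (`api_a`, `api_b`), the function names, and the sample titles are hypothetical placeholders, and in practice the table would be populated manually from the real API calls.

```python
from collections import defaultdict


def build_survey_options(recommendations):
    """Merge per-API suggestions into one de-duplicated option list.

    recommendations: dict mapping a (hypothetical) API name to the list
    of article titles it suggested for one source article.
    Returns (options, sources), where sources maps each title back to
    the set of APIs that suggested it (the "original table").
    """
    sources = defaultdict(set)
    for api, titles in recommendations.items():
        for title in titles:
            sources[title].add(api)
    # Sort alphabetically so the option order does not hint at the source API.
    return sorted(sources), dict(sources)


def score_apis(votes, sources):
    """Give one point per vote to every API that suggested the voted title."""
    scores = defaultdict(int)
    for title in votes:
        for api in sources[title]:
            scores[api] += 1
    return dict(scores)


# Hypothetical data: "Mars" is suggested by both APIs, so it is shown once.
recommendations = {
    "api_a": ["Moon", "Mars"],
    "api_b": ["Mars", "Apollo 11"],
}
options, sources = build_survey_options(recommendations)
votes = ["Mars", "Moon", "Mars"]  # titles picked by three participants
scores = score_apis(votes, sources)
print(options)
print(sorted(scores.items()))
```

With this sample data, each vote for "Mars" counts for both APIs, so `api_a` ends with 3 points and `api_b` with 2.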

Action items:

  • Olga to start protocol draft
  • Olga to make a ticket about setting up quicksurvey
    • One survey for UI experiment
    • One survey for non-UI experiment
    • Maybe two languages?
  • Justin to then improve the protocol and get feedback on it
  • Olga to make a ticket about getting a privacy policy

We need to go through articles after 2020; Kim has been helping Justin look into this.
Justin and Olga have a 1-on-1 tomorrow to go through this, which will help us wrap it up this week.