Please respond to the following questions, and provide as much detail as possible for each.
Scoping details
- Use case: Describe the user-facing experience(s) that this model will serve. Who is the intended audience? Where and how will the model outputs be surfaced to users? Please feel free to link to any demos, prototypes, design files, etc.
Readers are looking for low-friction ways to get information, and we are interested in hands-free audio features to meet that need. We envision two use cases: (1) As a reader, I can tap/click a button and have Wikipedia read an article to me, with navigation controls (skip ahead/back 10s, skip to section, 0.75x-2x playback speed). (2) As a reader, I can speak a question into the in-article search and have Wikipedia read the answer back to me as it jumps to the highlighted answer within the article.
- Model purpose: What should the model do? What does it need to predict or generate?
The model should generate audio (speech) from article content at varying speeds, along with timestamps that allow navigation within the audio.
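To make the timestamp requirement more concrete, here is a minimal sketch of how a client might use section-level timestamps for the navigation controls described above. Everything in it is an assumption for illustration: the `SectionMark` shape, the function names, and the choice to store offsets in seconds at 1x speed are hypothetical, not an agreed output format.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SectionMark:
    """Hypothetical timestamp record: where a section starts in the audio."""
    title: str
    start_s: float  # offset in seconds, measured at 1x playback speed


def skip(position_s: float, delta_s: float, duration_s: float) -> float:
    """Move the playhead by delta_s seconds, clamped to the audio bounds."""
    return max(0.0, min(duration_s, position_s + delta_s))


def current_section(marks: list[SectionMark], position_s: float) -> SectionMark:
    """Return the section containing position_s (marks sorted by start_s)."""
    starts = [m.start_s for m in marks]
    index = bisect_right(starts, position_s) - 1
    return marks[max(index, 0)]


def section_start(marks: list[SectionMark], title: str) -> float:
    """Return the offset to seek to for 'skip to section'."""
    for mark in marks:
        if mark.title == title:
            return mark.start_s
    raise KeyError(title)


def wall_clock_s(offset_s: float, speed: float) -> float:
    """Convert a 1x offset to elapsed listening time at a given speed."""
    if not 0.75 <= speed <= 2.0:
        raise ValueError("speed outside the assumed 0.75x-2x range")
    return offset_s / speed
```

Storing offsets at 1x speed is one possible design choice: it would let a single set of timestamps serve every playback speed, with the client doing the conversion.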
- Goal: What's the goal of this user experience? What patterns in user behavior do we want to impact? What metrics will let us know we're successful?
We want to help readers see that Wikipedia is an easy place to find information, and thereby encourage them to come back more often.
North star metric is 21d logged-out reader retention. We'd also look at usage metrics, e.g., click-through.
- Prior art: How much of the UI for this experience has already been developed and/or tested? Are there any previous models or manually-created rules that we can learn from?
I think Apps previously did an early exploration with ElevenLabs on a version of this feature that didn't pass muster.
Prioritization details
- Timing: When are you hoping to launch an experiment or feature using this model? How flexible is your timeline? Is there any other planned work that's blocked by this experiment or feature?
FY26-27. The timeline is flexible, and we would love to partner on determining the work-back timing together.
- KR impact: Which KRs are enabled by this project, and how critical is this project for moving the needle on those KRs?
OW3.1
Other comments
- [Optional] Model requirements: If you have any specific concerns around model performance (latency, cost, etc.) or model output quality (likelihood of false positives, ability to detect all possible instances, etc.), please note them here.
- [Optional] Is there anything else you'd like to share?