
Analyze the Impact of MinT for Wiki Readers on Pre-pilot wikis
Closed, ResolvedPublic

Description

As part of the MinT for Wiki Readers experiment, the plan is to expose the feature on 4 wikis (T390023) before running a full A/B test on 13 wikis (T388402).

This ticket proposes to perform some checks and analyses on these 4 wikis, which would help make any necessary adjustments before the A/B experiment phase:

  • Key events and instrumentation; if we are correctly capturing all the user interactions.
  • Funnel overview from the different entry points; to determine if there are any user drop-offs.
  • MT response time; to determine if users are getting the optimum experience from the feature. This would also guide an additional increase in server capacity (T386371).
  • Feedback from the survey (T381886) for early insights about the user experience.

Event Timeline

PWaigi-WMF renamed this task from Analyze the impact of MinT for Wiki Readers with pre-pilot wikis to Analyze the Impact of MinT for Wiki Readers on Pre-pilot wikis. Apr 8 2025, 3:19 PM
PWaigi-WMF updated the task description.
KCVelaga_WMF changed the task status from Open to In Progress. May 19 2025, 8:21 AM
KCVelaga_WMF claimed this task.
KCVelaga_WMF triaged this task as Medium priority.
KCVelaga_WMF moved this task from Incoming to In progress on the LPL Analytics board.
KCVelaga_WMF added a subscriber: Pginer-WMF.

Key events and instrumentation; if we are correctly capturing all the user interactions.

Yes, all events are being captured correctly. We might have to log an additional event to break down session initiation into two or more steps. I will create a ticket after discussing with @Pginer-WMF.


newplot.png (525×864 px, 88 KB)

  • Overall, Asturian Wikipedia showed the most interaction with the feature, both in terms of overall events logged and views to machine translated and human created content.
    • It is also worth noting that, at the time of deployment, only Asturian Wikipedia had 100% localization of the feature (T390043#10724728), and it is the largest of the four pre-pilot wikis.
  • There are various types of views that can be considered as part of the feature:
    • Initial view: View to machine translated content when readers first select an article to view and initial content is loaded, which can include the lead section along with 2-3 additional sections.
    • Section expansion view: Readers view additional content by expanding a section (which was previously collapsed).
    • Referred views to existing articles: Readers click to view human created content and visit an existing article (outside the feature).
  • For the period observed, the various types of pageviews generated through the feature contributed only a minute percentage of the overall mobile user pageviews received by the respective Wikipedias. However, it is worth noting some unusual spikes in mobile user pageviews to these sites during the same period, which should be explored further.

Usage of entry points

Funnel overview from the different entry points; to determine if there are any user drop-offs.

  • The article footer was the most used entry point to access the feature, followed by the language selector menu. Only a small portion of the sessions were initiated by directly opening the Special:AutomaticTranslation page.
  • For both the article footer and language selector entry points, about 50% of users dropped off after session initiation without taking any further action. The remaining half proceeded to take at least one action. Note: This doesn't necessarily mean viewing an automatic translation; it could also include searching for another article, changing the target language, etc. It only implies that at least one other action was taken apart from session initiation.
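The drop-off figure above can be reproduced with a simple per-session count. The following is a minimal sketch, not the query used in the analysis: the event tuples, the entry point labels, and the `session_init` action name are all hypothetical, and the sample data is made up for illustration.

```python
from collections import defaultdict

# Hypothetical action name for the session initiation event.
SESSION_INIT = "session_init"


def drop_off_by_entry_point(events):
    """Return the share of sessions per entry point with no action
    other than session initiation.

    events: iterable of (session_id, entry_point, action) tuples
    (an assumed shape, not the real instrumentation schema).
    """
    actions_per_session = defaultdict(set)
    entry_of_session = {}
    for session_id, entry_point, action in events:
        actions_per_session[session_id].add(action)
        # Attribute each session to the first entry point seen for it.
        entry_of_session.setdefault(session_id, entry_point)

    totals = defaultdict(int)
    dropped = defaultdict(int)
    for session_id, actions in actions_per_session.items():
        entry = entry_of_session[session_id]
        totals[entry] += 1
        # A session counts as a drop-off if it logged nothing beyond initiation.
        if actions <= {SESSION_INIT}:
            dropped[entry] += 1

    return {entry: dropped[entry] / totals[entry] for entry in totals}


# Made-up sample events, for illustration only.
sample_events = [
    ("s1", "footer", SESSION_INIT),
    ("s1", "footer", "view_translation"),
    ("s2", "footer", SESSION_INIT),
    ("s3", "language_selector", SESSION_INIT),
    ("s3", "language_selector", "search"),
    ("s4", "language_selector", SESSION_INIT),
    ("s5", "direct", SESSION_INIT),
]
drop_off = drop_off_by_entry_point(sample_events)
```

Any action other than session initiation (search, language change, viewing a translation) keeps a session out of the drop-off count, which matches the note above.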

MT response time; to determine if users are getting the optimum experience from the feature. This would also guide an additional increase in server capacity

newplot (1).png (525×864 px, 42 KB)

  • Response time and percentage of sessions:
    • < 5 seconds: 70%
    • < 3 seconds: 32%
    • < 2 seconds: 10%
  • The average and median response times are 4.5 and 3.75 seconds, respectively.
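The thresholds listed above are cumulative (sessions under 2 seconds are also counted under 3 and 5 seconds). A minimal sketch of how such a summary could be computed from per-session response times is shown below; the function name and the sample values are hypothetical and do not come from the analysis data.

```python
from statistics import mean, median


def summarize_response_times(times_s, thresholds_s=(5, 3, 2)):
    """Summarize per-session MT response times given in seconds.

    Returns the cumulative share of sessions strictly below each
    threshold, plus the mean and median.
    """
    if not times_s:
        raise ValueError("no response times provided")
    total = len(times_s)
    shares = {
        threshold: sum(1 for t in times_s if t < threshold) / total
        for threshold in thresholds_s
    }
    return {"shares": shares, "mean": mean(times_s), "median": median(times_s)}


# Made-up response times, for illustration only.
sample_times = [1.0, 2.5, 3.5, 4.0, 6.0]
summary = summarize_response_times(sample_times)
```

A mean above the median, as in the reported 4.5 versus 3.75 seconds, suggests that a tail of slow sessions pulls the average up, so percentile-style thresholds may be more informative than the average alone.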

Feedback from the survey (T381886) for early insights about the user experience.


A more detailed breakdown of user funnel (including potential drop off points) and additional details are available in the complete analysis.

Thanks for this analysis, @KCVelaga_WMF. This is really useful. One minor comment: for the "Survey responses" section it may be useful to have (a) an overview of the responses, and (b) a screenshot of what the survey questions look like (or a link to T381886).

Based on the data and conversations about it, there are some aspects that we may want to consider in preparation before the experiment is launched:

  • T390043#10942900 We may want to encourage localization of the feature in the pilot wikis, since low localization percentages may affect access to the feature.
  • T386371#10943001 We may want to request more server capacity to improve loading times, so that slowness does not add noise to the experiment (drop-off due to slow loading).
  • T397821 We may want to instrument the Confirm step so that we can better identify whether there is a correlation between drop-offs and high loading times.
KCVelaga_WMF added a subscriber: cchen.