
Measure the impact of MinT for Wiki Readers MVP
Open, MediumPublic

Assigned To
Authored By
Pginer-WMF
Sep 3 2024, 11:21 AM

Description

As part of the work on MinT for Wiki Readers MVP (T359072), we want to learn about its impact.
This ticket proposes to analyze the available data to identify signs that show both its usefulness to users and any potential problematic consequences based on the data we capture from the 23 pilot Wikipedias.

An initial analysis was produced after a high peak of activity that occurred when the entry point at the article footer (T363338) was made available (it was later disabled to keep the servers from going over capacity). We want to make sure that the article footer entry point has been restored before completing the analysis in the current ticket.

As part of this ticket, we may want to explore some relevant questions:

  • User interest. How many users access machine-translated articles, and what percentage of the mobile web traffic for their wiki do they represent (our initial target was to reach 3%)?
  • Negative effect on other reading activities. Does reading translations result in reading fewer articles on the wiki?
  • Negative effect on other editing activities. Does the availability of translations result in less editing activity to create articles in the local wiki?
  • Funnel overview. What is their usage pattern? Do they spend time reading several sections of the article, or do they navigate to other articles or translations? How much time do users spend in a machine-translated article, and how does this compare to the time spent in other articles?
  • Content discovery. What is the difference in coverage between the translation consumed and the local article available (e.g., MT may be used more on articles that exist but are less than 5 paragraphs long, compared to missing articles or longer ones)? Do people select different source languages based on the content available in them? What kinds of topics do users read as machine translations, what coverage does the wiki provide for those, and how do those compare to non-translated articles?
  • Appeal to translators. Is this a feature only useful for readers, or is it also encouraging the creation of translations?
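
As a rough illustration of the user-interest question in the list above, the share of mobile web traffic can be computed as a simple ratio and compared with the 3% target. This is only a minimal sketch: the function names and the example counts are hypothetical, not taken from the pilot data, and it assumes the target is read as a share of mobile web pageviews.

```python
def mt_traffic_share(mt_views: int, mobile_web_views: int) -> float:
    """Fraction of a wiki's mobile web pageviews that are machine-translation views."""
    if mobile_web_views <= 0:
        raise ValueError("mobile_web_views must be positive")
    return mt_views / mobile_web_views


def meets_target(mt_views: int, mobile_web_views: int, target: float = 0.03) -> bool:
    """True when the share reaches the target (3% by default, per the description)."""
    return mt_traffic_share(mt_views, mobile_web_views) >= target


# Hypothetical counts, for illustration only.
share = mt_traffic_share(mt_views=350, mobile_web_views=10_000)
print(f"{share:.1%}", meets_target(350, 10_000))
```

The same ratio could be computed per wiki, since demand is expected to differ considerably between the pilot Wikipedias.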

More context about the annual plan target:

This area of work supports the Wikimedia Foundation Product & Technology Objectives and Key Results (OKRs) for the 2023-24 period. Although the MVP was launched at the end of that period, the idea was to leave time for users to use the feature before assessing the impact. In particular, the target was defined for the key result WE2.2 in the following hypothesis:

Scaling Open Translation service will increase page interactions from underserved communities
Exposing our machine translation service to support the automatic translation of Wikipedia article contents. Make the service available for 10 languages not supported by commercial vendors such as Google Translate. This will increase page interactions from underserved communities by making millions of Wikipedia articles more accessible with automatic translation.
Success will be measured after 6 months by a 3% increase in page interactions on the pilot wikis driven by the views to automatic translations of content.


This ticket is focused on measurements. Related tickets enable direct user input:

Event Timeline

Pginer-WMF triaged this task as Medium priority.Sep 3 2024, 11:21 AM
Pginer-WMF created this task.
Pginer-WMF moved this task from Backlog to Product integration on the MinT board.
Pginer-WMF added a subscriber: PWaigi-WMF.

Please help me to understand the measurement plans here. Using a real example will help us in getting clarity.

  • The Irak article in ff.wikipedia.org is a single-line article of 127 bytes
  • It is the most viewed article in ff.wikipedia.org, with 6,483 pageviews from 1/1/2023 to 11/30/2024 (23 months)
  • It is receiving 9 pageviews (user, not automated) per day on average.
  • It has 0 edits and 0 editors for the time period of 1/1/2023 - 11/30/2024 (23 months)

image.png (777×1 px, 431 KB)
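
As a quick sanity check on the figures above, the daily average can be recomputed from the total and the date range. This is a minimal sketch that assumes the range includes both end dates; the variable names are illustrative.

```python
from datetime import date

total_pageviews = 6_483
start, end = date(2023, 1, 1), date(2024, 11, 30)

# Inclusive day count: the difference excludes the end date, so add one.
days = (end - start).days + 1
daily_average = total_pageviews / days

print(days, round(daily_average, 2))
```

Under that assumption the result is a little over 9 pageviews per day, in line with the average quoted in the list.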

Suppose we add entry points for MinT for Readers on the Irak page. The entry point is a footer link to access the machine-translated version.
Can you explain what we are measuring here?

  • User interest. How many users access machine-translated articles, and what percentage of the mobile web traffic for their wiki do they represent (our initial target was to reach 3%)?
  • Will there be an increase in the number of page views for this page, indicating that people are accessing the one-line page so that they can read the English (for example) version, which has more content, by clicking on the footer link? For unknown reasons, the 'Irak' article's viewership has been declining since last year (from 16 to 9 page views per day).
  • If so, please provide some example numbers that would indicate user interest

Negative effect on other reading activities. Does reading translations result in reading fewer articles on the wiki?

Since the 'Irak' article is the most read article in ff.wikipedia.org, would a decrease in its average page views indicate a negative effect? I understand that a single article cannot be used as a proxy for the entire wiki. So let us look at the pageviews for the entire wiki:

image.png (639×1 px, 405 KB)

We see that the wiki has had an average of 2,209 page views per day over the last 23 months. It shows that readership is increasing slightly, but with fluctuations. Since there can be many reasons for the ups and downs in this graph, we need to find a way to attribute the ups and downs that are related to the availability of synthetic content. That sounds hard to measure. What would be a practical and sensible way to measure it?
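
One possible way to frame the attribution question, offered only as a sketch and not as the plan for this ticket, is a difference-in-differences comparison: the change in daily pageviews on a pilot wiki is compared with the change over the same period on a similar wiki without the feature. The function name, the idea of a comparison wiki, and all numbers below are hypothetical, and the approach assumes both wikis would otherwise have followed similar trends.

```python
from statistics import mean


def diff_in_diff(pilot_before, pilot_after, control_before, control_after):
    """Change on the pilot wiki minus the change on the comparison wiki.

    Each argument is a sequence of daily pageview counts for one period.
    """
    pilot_change = mean(pilot_after) - mean(pilot_before)
    control_change = mean(control_after) - mean(control_before)
    return pilot_change - control_change


# Hypothetical daily pageviews, for illustration only.
effect = diff_in_diff(
    pilot_before=[2200, 2200],
    pilot_after=[2100, 2100],
    control_before=[1000, 1000],
    control_after=[1050, 1050],
)
print(effect)
```

A negative value would suggest that the pilot wiki lost pageviews relative to the comparison wiki. On very low-traffic wikis the estimate would be noisy, so it would need to be read with care.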

Negative effect on other editing activities. Does the availability of translations result in less editing activity to create articles in the local wiki?

Currently there are 0 edits and 0 editors for the Irak article in ff.wikipedia.org for the past 23 months. Anything other than 0 would be very nice.

There are around 2k monthly edits in ff.wikipedia.org these days, which is more than last year.

image.png (662×1 px, 68 KB)

There is an increased activity this year for creating new articles:

image.png (662×1 px, 70 KB)

Interestingly, all those new articles were created using translations with CX, with high-machine-translation tags. The high number of edits comes from one user who created 2065 new articles in the first 70 days after joining that wiki (about 30 articles per day) and another user with 1400+ edits in a similar time span.

Since these unusual edit activities account for most of the edit activity in ff.wikipedia.org, what is our expectation regarding an increase or decrease in editor activity?

Funnel overview. What is their usage pattern? Do they spend time reading several sections of the article, or do they navigate to other articles or translations?

  • If users land on the Special:AutomaticTranslation page but leave without clicking on the explicit CTA to read the translation, would that count as 'non-interest'?
  • If users land on the Special:AutomaticTranslation page, access the translation of the lead section and do not expand any other sections, what would the interpretation be?
  • If the user expands all sections on the Special:AutomaticTranslation page, I guess that clearly indicates time spent on the page, and therefore interest.
  • How do we measure navigation to other articles from Special:AutomaticTranslation? Do we have instrumentation for that?
  • Since there is no edit button on the MinT for Wiki Readers page, we won't know if readers try to edit or correct the page. The edit icon at the top of the page might look like a call to action, but it is not a button

image.png (108×451 px, 10 KB)

  • However, if users try to change the language on the page, there are options inviting them to edit. That seems like something to fix? I don't think people will change the language often and then notice the call for contributions.

image.png (72×1 px, 2 KB)

image.png (398×496 px, 21 KB)
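
To make the funnel questions above more concrete, sessions could be grouped into stages according to the events they contain. This is a minimal sketch under stated assumptions: the event names (`view_translation`, `expand_section`) and the stage labels are hypothetical and do not correspond to any confirmed instrumentation.

```python
from collections import Counter


def classify_session(events: list[str]) -> str:
    """Assign a session on Special:AutomaticTranslation to a funnel stage.

    Event names are hypothetical placeholders for real instrumentation.
    """
    if "view_translation" not in events:
        return "left_before_translation"  # landed but never opened the translation
    if events.count("expand_section") == 0:
        return "lead_only"  # read the lead section, expanded nothing else
    return "expanded_sections"


def funnel_counts(sessions: list[list[str]]) -> Counter:
    """Count how many sessions fall into each funnel stage."""
    return Counter(classify_session(events) for events in sessions)


# Hypothetical sessions, for illustration only.
sessions = [
    ["land"],
    ["land", "view_translation"],
    ["land", "view_translation", "expand_section", "expand_section"],
]
print(funnel_counts(sessions))
```

How each stage should be interpreted (for example, whether leaving before opening the translation counts as 'non-interest') remains the open question raised in the list.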

Content discovery. Do people select different source languages based on the content available in them?

That is an interesting thing to measure. I hope we have instrumentation for this. However, the language selector is very minimal (with bugs) and gives no indication of which language to choose in order to read more machine-translated content.

There is also the question of whether people choose different articles now that they have discovered the tool.
The "Random Topic" button on the page is a dummy button, though. It has no action.

image.png (474×512 px, 25 KB)

Appeal to translators. Is this a feature only useful for readers, or is it also encouraging the creation of translations?

As noted above, there is unusual translation activity in ff.wikipedia.org without the presence of the MinT for Wiki Readers entry point. I wonder how to reconcile that interest (although unusual) with the introduction of the new entry point to MinT for Readers.
As for readers of machine translation becoming editors, please note the above comments about the edit options being under the language selector.

The regions where these languages are spoken also need to be considered to get the full picture. Nigeria, for example, has primary education in English and a literacy rate of 68%. How much of this literate, internet-accessing population whose primary language is English would like to use the ff, ig or ki languages for reading encyclopedic content is very much the question here. The very low traffic to these wikis underlines this issue.

The Iraq article in ki.wikipedia.org has 2 page views per day on average this year. While Iraq was the top visited article in 2023, this year the article about "Wikipedia" seems to be the top one, with 28 page views per day.
ki.wikipedia.org has fewer than 10 new articles created per month, with about 10 editors per month

image.png (749×1 px, 477 KB)

On very low-activity wikis like this, it will be challenging to measure any ups or downs in activity. These wikis are unfortunately basically inactive, and there are valid reasons for that considering the demographics of the communities.

Please help me to understand the measurement plans here.

The idea of the experiment is to cover a broad set of wikis and to get the data based on the activity on all articles. For example, the situation for a particular article on Fula Wikipedia may be very different from the situation for another article on Korean Wikipedia. In the initial enablement, data shows that Korean was getting over 1K times more user sessions requesting MT than Fula (4K vs. just 3). Persian got more than 5 times the requests that Korean got (30K vs. 4K). So demand is expected to diverge a lot by wiki, but looking at the overall numbers, as opposed to specific articles, would provide a clearer understanding of those differences. By including a broad range of wikis in the experiment, we can get a broader view of how MT is useful or not in different contexts.
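
The ratios mentioned above can be reproduced from the rounded session counts. This is a minimal sketch: the dictionary and the helper name are illustrative, and the inputs are the approximate figures quoted in the paragraph, not exact data.

```python
# Rounded counts of user sessions requesting MT, as quoted above.
mt_sessions = {"fa": 30_000, "ko": 4_000, "ff": 3}  # Persian, Korean, Fula


def demand_ratio(wiki_a: str, wiki_b: str) -> float:
    """How many times more MT sessions wiki_a had than wiki_b."""
    return mt_sessions[wiki_a] / mt_sessions[wiki_b]


print(round(demand_ratio("ko", "ff")))  # Korean vs. Fula
print(demand_ratio("fa", "ko"))         # Persian vs. Korean
```

With these rounded inputs, Korean comes out at over a thousand times the Fula figure and Persian at several times the Korean figure, which matches the orders of magnitude described.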

This ticket is intended to capture which aspects we want to measure to identify the potential positive and negative impact of making MT available to readers in different wikis. The description is an initial brainstorm of those aspects. If there are new ones to consider, feel free to add them. I'll defer to the analytics expert, @KCVelaga_WMF, to turn those into measurable signals and to answer the specific questions.