
Enable the access to MinT for Wiki Readers MVP in 3 wikis of different sizes
Open, Medium, Public

Description

As part of the MinT for Wiki Readers MVP (T359072), access to MinT for Wiki Readers was enabled from the footer of Wikipedia articles (T363338) on mobile web for 23 pilot wikis in July 2024. This caused a sharp increase in machine translation requests that exceeded server capacity, and the access point was disabled after a few days. Data from this period shows an increase in daily user sessions once the option to learn more about the current article from other languages became available.

Plans to check server capacity (T368418) and to gradually enable the feature on the affected wikis (T373890) were explored. Based on input from SRE in T370755#10210414, this ticket proposes an initial step: deploying on 3 Wikipedias (1 small, 1 medium and 1 large) from that set of 23 pilot wikis. The SRE comment reads:

First off, I'd suggest we reprioritize the caching vs capacity planning parts. We know we only use 4 pods for MinT, increasing them is possible. First off it will provide more CPU to the deployment, avoiding the high throttling levels. We already kinda have a first order approximation (~6rps) when enabling 23 languages. To err a bit on the safe side, I suggest we pick 3 wikipedias, 1 small, 1 medium and 1 large from that set of 23, enable the feature for them, let it be for 1 week and gauge the effects. I expect we'll see way less CPU throttling and requests (and I might be proven wrong). From there we guesstimate (nothing particularly fancy, it will prove to be wrong anyways) required number of pods. This alone might solve most problems. From there on, we talk whenever more languages are added and continue. Note that memory usage requirements are quite high (32GB per pod), so we might end up having to stall waiting for new hardware.

This proposal takes a different approach from the gradual enablement defined in T373890, which proposed starting with the wiki with the least mobile traffic and iterating toward those with more traffic.
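The "guesstimate" of required pods mentioned in the SRE comment could look roughly like the sketch below. It is only a back-of-the-envelope illustration, not the SRE team's actual method: the ~6 rps figure and the 32 GB per pod come from the comment, while `rps_per_pod` and `headroom` are hypothetical placeholders that would need to be replaced with values measured during the trial.

```python
import math

# From the SRE comment: each MinT pod needs roughly 32 GB of memory.
MEMORY_PER_POD_GB = 32


def estimate_pods(expected_rps: float, rps_per_pod: float, headroom: float = 0.25) -> int:
    """Estimate the number of pods needed to serve expected_rps.

    rps_per_pod is the request rate one pod can sustain without heavy
    CPU throttling (an assumed value, to be measured during the trial).
    headroom is an extra safety margin expressed as a fraction.
    """
    if rps_per_pod <= 0:
        raise ValueError("rps_per_pod must be positive")
    return max(1, math.ceil(expected_rps * (1 + headroom) / rps_per_pod))


def total_memory_gb(pods: int) -> int:
    """Total memory the deployment would request for the given pod count."""
    return pods * MEMORY_PER_POD_GB


if __name__ == "__main__":
    # ~6 rps was observed with 23 wikis enabled; 1.0 rps per pod is a
    # made-up placeholder, not a measured number.
    pods = estimate_pods(expected_rps=6.0, rps_per_pod=1.0)
    print(f"pods: {pods}, memory: {total_memory_gb(pods)} GB")
```

The memory total matters because, as the comment notes, the pod count may be limited by available hardware rather than by CPU.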

Proposed candidates:

  • Persian (fa)
  • Icelandic (is)
  • Fon (fon)

From the list of 23 pilot wikis, these three seem well distributed in terms of wiki size, share of mobile web views, demand for machine translation based on the initial exposure, and machine translation quality signals. A filtered view of the table in T373890 follows:

| Wiki           | Mobile Web Views | Overall Size Rank | Notes              |
| Persian (fa)   | 89.6%            | 12                |                    |
| Icelandic (is) | 41.7%            | 105               |                    |
| Fon (fon)      | 21.8%            | 680               | <- start from here |

Process:

  • We can enable the feature in the three selected wikis (Fon, Icelandic, and Persian) in one go. If that puts too much load on the servers, we can consider an alternative set with a smaller large wiki (Fon, Icelandic, and Korean) or, as a final alternative, skip the large wiki altogether (going with Fon and Icelandic only).
  • Once the three languages are enabled, we may want to leave the feature available for at least two weeks. This is longer than the one week originally recommended. The reason is that we can use this enablement as an opportunity to start measuring the impact (T373862), and we may want to capture potential effects on editing activity, which may take longer to show up in smaller editing communities.