ga.wikipedia - change content translation tools to be enabled for Extended Confirmed users only
Open, Stalled, Needs Triage · Public · Request



I'm global User:Alison, an admin and bureaucrat on the Irish language Wikipedia. I'm representing the community here today, and conveying the wishes of the ga.wikipedia editors to make some changes to the Content Translation settings on our wiki.

We have !voted and concluded that the Content Translation tool should be limited to Extended Confirmed users only.

Best regards,

  • Allie Cassidy

Local discussion:éid%3AHalla_baile&diff=1133474&oldid=1133457#Content_translation_tools_/_machine_translation_-_community_poll

Event Timeline

Change 912805 had a related patch set uploaded (by MarcoAurelio; author: MarcoAurelio):

[operations/mediawiki-config@master] [gawiki] Restrict CX publishing to NS_MAIN to extendedconfirmed only

Hello @Alison: gawiki does not have an extendedconfirmed user group (see group list, just noticed after creating the patch). Did you mean autoconfirmed instead? Otherwise gawiki would have to discuss the creation of the extendedconfirmed group first.

Hi Marco. Thanks for the update. I could go back to the community and request the extended confirmed option, though it's going to be a pain to put them back through the process again. I think Auto-confirmed is a bit of a low bar, though.

I'll bring it back to the community for discussion. If we need extended confirmed, can we roll this into the same ticket here?

  • Allie

Hello @Alison. I think two tickets would be better as they are technically two different requests (create the extended confirmed group and restrict ContentTranslation publishing) but we can try and deploy both requests the same day in a chain. Best regards.

Out of curiosity, are there any implications or other effects of having an extended confirmed group? And is the move into the extended confirmed group automatic after 500 edits?
DeirgeDel (formerly Djm-leighpark)

@MarcoAurelio Please also add the ContentTranslation tag in the future, so the Language team is aware of any technical or non-technical issues the community faces.

Out of curiosity, are there any implications or other effects of having an extended confirmed group?

@DeirgeDel Hello. It means having another user group, which gawiki will have to maintain, and gawiki will also have to decide which permissions the group has. I'm going to ping @Urbanecm since he can provide more accurate advice here. As far as I can see, you'd like to use this group only to restrict CX use and for no other purpose. @KartikMistry Are there any alternatives for this without creating a new user group?

And is the move into the extended confirmed group automatic after 500 edits?

No. If you wish to configure this user right as an autopromote group, gawiki will need to tell us the specific conditions so we can configure it. A list of these can be found here.
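As a rough illustration of what "specific conditions" could look like, here is a minimal Python sketch of an autopromotion check. It is not MediaWiki code: in MediaWiki such rules are expressed through the `$wgAutopromote` setting with conditions such as `APCOND_EDITCOUNT` and `APCOND_AGE`. The 500-edit threshold is the figure mentioned in this thread; the 30-day account age, the `Account` class and the function name are assumptions made up for the example, and gawiki would choose its own conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Hypothetical stand-in for the user data an autopromote rule would inspect."""
    edit_count: int
    age_days: int


# Illustrative thresholds only: 500 edits comes from the discussion above,
# the 30-day minimum account age is an assumption for this sketch.
MIN_EDITS = 500
MIN_AGE_DAYS = 30


def qualifies_for_autopromotion(
    account: Account,
    min_edits: int = MIN_EDITS,
    min_age_days: int = MIN_AGE_DAYS,
) -> bool:
    """Return True only when every condition is met (conditions are AND-ed)."""
    return account.edit_count >= min_edits and account.age_days >= min_age_days


if __name__ == "__main__":
    # An account just under the edit threshold is not promoted, whatever its age.
    print(qualifies_for_autopromotion(Account(edit_count=499, age_days=365)))
    # An account meeting both thresholds is promoted.
    print(qualifies_for_autopromotion(Account(edit_count=500, age_days=30)))
```

If the group is instead granted manually, no such conditions are needed and promotion is left to whoever holds the right to assign the group.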

Thanks for the information, guys. As background, I am the primary troublemaker who (at least in recent times) caused issues by using the tool with insufficient knowledge of Gaeilge grammar; the auto-translators seem not to be as good inbound for this VSO language as for, e.g., English or French. I'm just under 500 edits but could zoom up to any arbitrary edit count by working on Categories. At the same time, from the conversation on :ga I know there is a use case for groups of advanced Gaeilge language students, under WMF project or other supervision, using the tool and having the skills to correct its output, although at times I suspect they would struggle with the tool itself, as it does seem to have its quirks. So my personal preference would be not to have it as an autopromote group. It is also interesting that ''Extended Confirmed'' does not (currently) have any other implications, which may also help in progressing this. I hope to speak to others on :ga about this within a couple of days. Thank you.

I have been analyzing some data to try to understand the activity patterns for the translations on Irish Wikipedia.

Looking at the users who have published a translation since 2020, grouped by edit count, it seems that the great majority of translations are already driven by users with 500 or more edits (what would be considered "extended confirmed"). In the data for 2023 so far, 434 translators (93%) had 500 or more edits, and only 31 users (7%) had fewer than 500. In fact, most of the translations are published by users with over 10K edits (orange line in the graph below).

[Graph: monthly translations by user edit count bucket]
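The 2023 percentages quoted above can be reproduced from the two counts given in the comment. The short Python sketch below only re-derives the shares from those counts; the function name is made up for illustration, and it assumes the two groups together make up all 2023 translators.

```python
def percent_share(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100 * part / total


# Counts quoted in the analysis above for 2023.
translators_500_plus = 434
translators_below_500 = 31

# Assumption: the two groups together are the full set of 2023 translators.
total_translators = translators_500_plus + translators_below_500

share_500_plus = percent_share(translators_500_plus, total_translators)
share_below_500 = percent_share(translators_below_500, total_translators)

if __name__ == "__main__":
    print(f"500 or more edits: {share_500_plus:.1f}%")
    print(f"below 500 edits:   {share_below_500:.1f}%")
```

Rounded to whole numbers, these shares match the 93% and 7% stated in the comment.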

The only period in which users below 500 edits made a significant contribution was June 2022, when they represented 41% of the translators that month. However, the deletion rate for translations published during that period was 2%, which is not particularly high compared to the usual values captured in the graph below. This suggests that less experienced users are not a clear source of low-quality translations.

For reference, on Irish Wikipedia since 2020 the deletion rate for articles created with Content Translation is much lower than for articles created from scratch: 1.3% of the articles created with Content Translation are deleted, compared to 5.4% of articles created without it. So the rationale for restricting newcomers' access to translation is a bit unclear, since it leaves them with an alternative path where they are less likely to succeed.

[Graph: monthly rate of deleted translations]

Looking at the number of translations over time, there seems to be a spike of activity in the June-September 2022 period, when less experienced users participated more than usual. It would be interesting to hear if anyone in the community has a sense of what may have generated such a spike (in some cases these are the result of contests or similar events).

[Graph: monthly translations at top 10 Wikipedias]

In conclusion, the data does not suggest that restricting less experienced users' access to the tool will have a big and/or positive impact. I'm OK with communities adjusting the tools in whatever way best suits their needs, but I think the above information can be useful for a better shared understanding of the translation activity at large. Also, data may not be telling the whole story, so I'm totally happy to hear more about how translation tools are working from the community perspective and how to better support them.

MarcoAurelio changed the task status from Open to Stalled. Sep 28 2023, 10:44 AM
MarcoAurelio removed MarcoAurelio as the assignee of this task.

Removing myself as assignee and setting as stalled pending community discussion and Pau's analysis.

I'd like to get this issue back on track if possible. Alison had originally been shepherding it through this process, but she's been quiet on our Wikipedia the last several months.

I'm grateful for Pau's analysis above, but it indeed doesn't tell the full story. For example, deletion rate isn't a particularly good metric for us, certainly not historically. Up until a couple of years ago, the only articles the admins of the Irish Wikipedia would delete, more or less, were those that were clearly spam. A lot of machine translated content was left undeleted/unchanged. I delete those articles now when I come across them but there's a lot of pollution on the site. I think there's also a tendency among the experienced editors to work to bring new articles, even those clearly machine-translated, up to an acceptable standard instead of deleting. The upshot is that this creates a burden on the very small number of editors who have good enough Irish to do this kind of work — this is what led to the vote to disable the translation tool for newcomers.

What do I need to do to get this moving again?