Enable Content Translation in Catalan Wikipedia for all logged-in users
Open, High, Public

Description

Content Translation has been used by the Catalan Wikipedia community to create more than 30,000 articles, which suggests that the tool is useful for those who enabled the beta feature. It seems reasonable to expose the tool to more users in the future by making it available by default to logged-in users.

Before we provide access to the tool by default, we want to identify the main blockers based on community feedback. We'll keep working on improving the tool in many different areas, but it is important for us to know when the community considers the tool ready for more exposure. So feel free to point out key issues that you expect would have a negative impact if the tool were made available in its current state.

Event Timeline

Pginer-WMF raised the priority of this task from to Needs Triage.
Pginer-WMF updated the task description.
Pginer-WMF added a project: ContentTranslation.
Pginer-WMF subscribed.

I created a discussion topic on the project page and a post on the local village pump to start the conversation.

I think we should fix these bugs that make wikitext dirty before users who won't review the resulting code start using this functionality.

I agree with Gerardduenas; these little errors should be fixed, especially:

  • the red internal links
  • the internal links which are removed or which are moved without any sense; also, they sometimes include the final period of a sentence, or nearby characters that should not be included
  • the translation of internal links: for example, yesterday I translated an article which had the internal link "E Pluribus Unum" (in Spanish); in Catalan, even though the link pointed to the correct article "E Pluribus Unum", the link text was translated as "I Pluribus Unum". This happens with many links; if the article exists, I cannot see why the link text cannot be taken from the article name

Besides that, I love the tool :)

@Gerardduenas and @Arnaugir, thanks for the feedback.

I think we should fix these bugs that make wikitext dirty before users who won't review the resulting code start using this functionality.

I see that you added a relevant bug as a blocker; if we identify similar issues, we should add them too.

the red internal links

T78133 and T78695 should facilitate the creation of red links. I added them to the list.

the internal links which are removed or which are moved without any sense

T90718 should already make the link infrastructure more robust. If we identify new issues, we can capture them in new tickets.

the translation of internal links

The problem with using the article title is that it lacks the sentence context that the translation service is supposed to take into account (although the automatic translation can also fail in some cases). For example, an article containing a link "apples" that points to the "Apple" article in English is expected to be translated as "pomes" (plural) in Catalan and point to the "Poma" (singular) article. If we just use the article name ("Poma"), the user has to edit it every time to match the sentence context (here, put it in the plural form).
This is just one example; other particularities of titles, such as disambiguation clarifications (e.g., "Orange (color)") or scientific names (e.g., the text "tomato" becoming the Latin term "Solanum lycopersicum" when translated to Spanish), can also be problematic.

Unless we find a smart way to use the title info to correct the initial automatic translation, I think we can only rely on the translation service for now.
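
To make the trade-off above concrete, here is a minimal sketch in Python. It is not Content Translation code: the helper names `strip_disambiguation` and `build_wikilink` are hypothetical, and it only assumes standard wikitext link syntax (`[[Target|text]]`). It shows why the link target can come from the article title while the visible text still has to come from the translated sentence.

```python
import re


def strip_disambiguation(title: str) -> str:
    """Drop a trailing parenthetical such as ' (color)' from an article title.

    Hypothetical helper: illustrates why a raw title is not always
    usable as link text.
    """
    return re.sub(r"\s*\([^()]*\)$", "", title)


def build_wikilink(target_title: str, translated_text: str) -> str:
    """Build a wikitext link that points to target_title but displays
    translated_text (the text produced in sentence context).

    Hypothetical helper, not part of any MediaWiki API.
    """
    if translated_text == target_title:
        # Target and display text are identical: a plain link is enough.
        return f"[[{target_title}]]"
    # Otherwise keep the contextual text (e.g. a plural) as the label.
    return f"[[{target_title}|{translated_text}]]"


# The plural "pomes" is kept as the label; the link points to "Poma".
print(build_wikilink("Poma", "pomes"))
# A disambiguated title loses its clarification when used as text.
print(strip_disambiguation("Orange (color)"))
```

If the title alone were used as the label, the first call would produce `[[Poma]]` and the plural would be lost, which is the manual correction described above.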

Thanks, Pau, especially for the answer on "the translation of internal links". I will think about it more thoroughly.

Based on the discussion on Catalan Wikipedia, I can summarise the main points that were mentioned (and the relevant tickets I could find or create in Phabricator):

  • Infoboxes. Two aspects were mentioned. On the one hand, that infobox support would be great to have (it didn't sound like a blocker, but something that would save time, since adapting infoboxes manually happens often). On the other hand, it was mentioned that references in infoboxes got lost, leaving traces of code in the content and affecting the rest of the references (we may want to reproduce what happens in that context, since I don't have much more detail). I created a tracker ticket for better template support (T102964), since a central ticket to group the different issues reported in that regard seemed to be missing.
  • Add a template to mark the page as a translation. We knew that different wikis add different templates to the article talk page. I added the details for Catalan in the relevant ticket (T98126).
  • Link modifications and "nowiki" tags. That is already expected to be improved by several tickets making links more reliable (T78133, T78695, T95271).
  • Continue published translations. Users expect to keep using Content Translation after the initial publication. I created a ticket (T102966) to allow it when there is no potential conflict.
  • Mark article as work in progress. There were some concerns about the need for room to improve the translation after initial publishing. I asked for some more details, since we can approach this from different perspectives (encouraging users to keep editing in-progress translations, suggesting an "in progress" template after publishing, etc.).

Based on recent testing, it seems that template mapping fails when the template's metadata is in the documentation tab, which seems to be common on Catalan Wikipedia. Reported at T219346: CX2: TemplateData is ignored when it is provided in the "documentation" tab