During the 2019-20 fiscal year the Language team plans to support cross-wiki propagation of content via Section translation (check the [[ https://commons.wikimedia.org/wiki/File:Cross-wiki_content_propagation_concept.pdf | general concept ]] and the [[ https://commons.wikimedia.org/wiki/File:Section_translation_initial_designs.pdf | initial design ideas ]]) as part of the [[ https://www.mediawiki.org/wiki/Content_translation/Boost | Translation Boost initiative ]]. That is, supporting users to expand existing articles by translating a new section from another language (on desktop and mobile) when that content is missing from the article in their local language. For example, we want to make it easy for a user to translate and transfer the "Construction sequence" information from the English version of the "Suspension bridge" article into the Portuguese version of the article (where it is missing), or to expand the "Ukulele" article in Tagalog by translating the "history" section from English.
This ticket provides an overview of current and future research work that could help in this context. The areas below describe the support needed, in which ways such support would help, and fallback approaches that can be applied while the necessary capabilities are not yet available.
# Template parameter mappings
Templates capture relevant contents such as references and infoboxes. Transferring such content across languages is challenging since templates are defined independently in each wiki. Lack of support results in incomplete content and requires additional efforts by editors.
Research in this area can help to map template parameters automatically. Automating this mapping makes it easy to transfer the content structured in templates across languages. For example, when translating a new paragraph about the latest scientific discovery, the reference will be kept in the translation with all its information (book title, page number, etc.), resulting in higher-quality content.
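As a rough illustration of what a parameter mapping enables, the sketch below renames the parameters of a source template using a lookup table. The table, the function name, and the template and parameter names are hypothetical and have not been checked against the live templates; a real mapping would come from the machine-learning work or from manual curation.

```python
# Hypothetical mapping for illustration only: keys are
# (source language, target language, source template name).
PARAM_MAP = {
    ("en", "pt", "cite book"): {
        "target_template": "citar livro",
        "params": {"title": "título", "author": "autor", "page": "página"},
    },
}


def transfer_template(source_lang, target_lang, name, params):
    """Map a template's parameters to the target wiki.

    Returns (target_name, mapped_params, unmapped_params), or None when
    no mapping is known for this template and language pair.
    """
    entry = PARAM_MAP.get((source_lang, target_lang, name.strip().lower()))
    if entry is None:
        return None  # unsupported template: caller applies a fallback
    mapped, unmapped = {}, {}
    for key, value in params.items():
        target_key = entry["params"].get(key)
        if target_key is None:
            unmapped[key] = value  # could be highlighted for manual completion
        else:
            mapped[target_key] = value
    return entry["target_template"], mapped, unmapped
```

Returning the unmapped parameters separately keeps the door open for showing the user what could not be transferred, instead of dropping it silently.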
**Status and fallbacks**
Work has already started to explore how to automatically map template parameters for popular templates using machine learning ({T221211}).
For templates that cannot be supported with this approach, we can consider the following fallback approaches:
- Ignore unsupported templates. Remove the unsupported templates when presenting the content to translate to the user.
- Highlight the content that could not be transferred, so that the user can add it manually later.
- Prevent suggestions which include content that cannot be translated. When suggesting content to translate, avoid surfacing content that contains unsupported templates.
These fallback strategies seem acceptable since users don't need to translate every single piece of content, and the translation can still be a useful contribution even if it does not include all the source contents.
# Section mappings
Sections of an article represent relevant aspects of a topic. Sections are useful as content units to work with: they make it possible to compare which aspects have been covered and which ones may be missing in two language versions of an article. For a given article and language pair, we want to present which sections are present and which are missing in each version. In this way, users can select which aspects to expand by translating from another language. The example below shows that "history" is a section present in English but missing in Tagalog for the Ukulele article:
{F31464972, width=80%}
Research in this area can help to identify sections that are available in one given language and missing in another in order to surface opportunities for the user to contribute. In addition, identifying which of those potential contributions are more relevant (in general or for the current user) can be useful. For example, a German user adds a new section about the latest space mission, and another multilingual user interested in the topic receives a suggestion to translate it into Korean. Then the user, who speaks both German and Korean, checks which other sections are still missing in the Korean version to consider adding them from the German one.
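The comparison described in this section can be sketched as a lookup against a table of equivalent section titles. This is a minimal sketch: `missing_sections` and `title_map` are hypothetical names, and the table stands in for the section mapping that the research work would provide. The Tagalog titles in the usage note are illustrative only.

```python
def missing_sections(source_sections, target_sections, title_map):
    """List source sections with no known counterpart in the target article.

    title_map maps a source section title to its equivalent title in the
    target language. Titles without an entry in title_map are reported as
    missing, which may overestimate what is actually missing.
    """
    present = {title.casefold() for title in target_sections}
    missing = []
    for title in source_sections:
        equivalent = title_map.get(title)
        if equivalent is None or equivalent.casefold() not in present:
            missing.append(title)
    return missing
```

For example, calling `missing_sections(["History", "Types"], ["Mga uri"], {"History": "Kasaysayan", "Types": "Mga uri"})` reports only "History" as a candidate for translation.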
**Status and fallbacks**
Research has already worked on identifying relevant missing sections that users can add to an article based on those present in other languages. Initial discussions suggest that this work can be repurposed to identify sections that exist in one given language and are missing in another one.
Until this approach is available, simpler (although limiting) approaches can be used to prevent this advanced section mapping work from becoming a blocker:
- Focus on target articles with no sections at all, to make sure that any section present in another language is missing from the target article.
- Focus on articles that were created with Content translation where no additional sections were added after they were published. In this way the section mapping is already available.
- Let the users check (and report) if the page contains a given section.
# Identifying meaningful facts and updates
During this fiscal year we'll focus on sections since there are tools available to deal with those more easily, but there are other relevant updates in content that users may be interested in transferring across languages. For example, a couple of sentences can capture a new fact about a new scientific discovery. Transferring this fact to as many languages as possible enables more people to access this knowledge.
Currently our systems understand modifications in terms of characters and edits, but that is not enough to understand what constitutes a meaningful unit of knowledge increment. For example, adding a single numeric value such as the death year may represent a meaningful change about the topic, while a paragraph rewrite that adds no new information may be irrelevant for propagation despite consisting of more edited text.
Research can help to identify meaningful content changes that are worth propagating across languages.
**Status and fallbacks**
This is not a priority yet, but will expand the use cases of cross-wiki propagation beyond article sections.
While this is not available, focus for translations will be limited to sections as described above.
# Suggestions for sections
In addition to letting users pick a specific article to expand with a new section, we want to surface suggestions that users can translate. That is, we want to surface the opportunity to translate the history section of the Ukulele article in the same way we are currently surfacing opportunities to translate new articles in Content translation. The idea is illustrated below (note how the suggestions list includes two parts: "new pages" and "expand with new sections"):
{F31464984, width=80%}
**Status and fallbacks**
Currently, only articles missing in the target language are surfaced in the recommendation system.
A possible fallback would be to suggest missing sections for:
- translations that the user has created previously (more targeted to encourage the user to continue the work rather than help discover new topics)
- articles featured in the source language that are present in the target language as non-featured (as an indicator of the potential for expansion)
# Suggestions for specific topic areas
When suggesting contents to translate (whether articles, sections, or smaller units), users may be more motivated to work on topics in their areas of interest. Allowing users to select general knowledge areas such as "science" or "art" will allow them to easily discover content they are motivated to translate.
Research can help to integrate topic maps and recommendation systems in a way that makes it possible to get suggestions for specific topic areas.
**Status and fallbacks**
Article recommendation mechanisms are currently available. They can be customized by providing an example article ("article seed") to get suggestions similar to that article. Some work has been done with topic maps, but those seem to be disconnected for now.
As a fallback approach, the recent edits by the user can be used as article seeds to find relevant articles that can be used to propose expanding with missing sections, paragraphs, etc.
# Other related aspects
There are two additional aspects that may intersect with the work above and that Research can help with:
- **Section relevance.** How we determine which sections are relevant for being translated. This can inform how we present sections to translate, suggest them or notify users that a new section has been created that is worth translating.
- **Custom suggestions.** We are exploring how the current suggestions can focus on a particular topic area. That is, exposing a catalog of topics (Geography, Maths, etc.) for users to select. This is something that could be supported in the current recommendation system by using a seed article, but a similar need would arise for suggested sections.