The [[ https://www.mediawiki.org/wiki/Content_translation | Content Translation tool ]] has supported the creation of over 400,000 articles across a wide variety of Wikipedia language projects. We have very little understanding, however, of which parts of the tool work well and what happens to the articles after they have been created. For instance: What types of article sections are translated as-is? Which sections are changed substantially? Do translated articles see subsequent editing and linking to other articles in the project?
This task links to a number of related projects that could address these larger questions about the adoption of translated content on Wikipedia. We hope for a mixed-methods approach that uses both quantitative analyses (e.g., edit counts, topics that are more frequently translated) and qualitative analyses (e.g., content analysis of translated pages and subsequent edits, talk pages).
= Mentors
- @isaac (IRC channel: `#wikimedia-research`)
= Students
- TBD
= Skills
You should have at least one of the skills below and a strong desire to learn:
- [[ https://wikitech.wikimedia.org/wiki/PAWS | Jupyter Notebooks]]: this is how we would like to present the output of any project. If you have prior experience with Jupyter notebooks, great! If not, we can help you learn it.
- Quantitative: Python has the most generous support for analyzing Wikimedia data and is well-suited to these sorts of analyses. Other languages such as R might be appropriate for data visualization or other related tasks but are secondary.
- Qualitative: a basic level of familiarity or strong desire to learn qualitative coding or other [[ https://en.wikipedia.org/wiki/Content_analysis | content analysis]] techniques.
- Language: not required, but if you are confident reading at least one language other than English, that can help us define a focus area and will make qualitative analyses and the checking/debugging of code much simpler for translations that involve that language.
= Set-up
- For all microtasks, this example Jupyter notebook that I put together will be a good resource: https://paws-public.wmflabs.org/paws-public/User:Isaac_(WMF)/Content%20Translation%20Example.ipynb
- Make sure that you can log in to the [[ https://wikitech.wikimedia.org/wiki/PAWS | PAWS service]] with your wiki account: https://paws.wmflabs.org
= Micro Tasks
Example analyses are shown in the Content Translation Example notebook linked in Set-up. Using that notebook as a starting point, create your own notebook and complete one or both of the analyses listed below. All PAWS notebooks have the option of generating a public link, which you can share back so that we can evaluate what you did. Use a mixture of code cells and markdown to document what you find and your thoughts.
- Exploratory qualitative analysis
- Exploratory quantitative analysis
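
As a rough illustration of what a first quantitative step might look like, the sketch below counts how often each source/target language pair appears in a list of translation records. The records and their field names (`source`, `target`, `title`) are hypothetical placeholders made up for this example; they are not the format used by the example notebook, and in practice you would load real records there.

```python
from collections import Counter

# Hypothetical toy records for illustration only; real data would be
# loaded in your notebook. Field names are assumptions, not a real schema.
translations = [
    {"source": "en", "target": "es", "title": "Article A"},
    {"source": "en", "target": "es", "title": "Article B"},
    {"source": "en", "target": "fr", "title": "Article C"},
    {"source": "es", "target": "ca", "title": "Article D"},
]


def count_language_pairs(records):
    """Count how often each (source, target) language pair occurs."""
    return Counter((r["source"], r["target"]) for r in records)


pair_counts = count_language_pairs(translations)

# Show the most common language pairs first.
for (src, tgt), n in pair_counts.most_common():
    print(f"{src} -> {tgt}: {n}")
```

A `Counter` keeps this kind of tally short and makes it easy to move on to other groupings (for example, by target language only) once real data is loaded.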
= Further reading
- https://wikimediafoundation.org/2019/01/09/you-can-now-use-google-translate-to-translate-articles-on-wikipedia/
- https://www.mediawiki.org/wiki/Content_translation/Product_Definition/analytics
- https://www.mediawiki.org/wiki/Content_translation/Published_translations