
Ultralight systematic literature review
Closed, ResolvedPublic

Description

We want a fuller picture of the existing research on Wikipedia translations, and how it relates to the imbalances we're seeing.

This will be a very light systematic review: read through the results of a search like "wikipedia translation" on public catalogues such as Google Scholar. Skim the list by title, then read the abstract of any article that looks relevant. Read through several of the most relevant articles, maybe 3-8 papers.


Literature reviewed

As you work, please record each title in the shared list below, and add a short summary of its relevance in your own words.

== [Paper title]
Link: [link to where the full article text can be found]
Author: [paper author]
Suggested by: [your name]
[your summary]

Revision history: Translation trends in Wikipedia

Link (paywall): https://www.tandfonline.com/doi/abs/10.1080/14781700.2014.943279
Author: Julie McDonough Dolmaya
Suggested by: Chinmaychahar

In "Translation trends in Wikipedia," Julie McDonough Dolmaya explores the ways in which translation has been used in the creation and development of Wikipedia. Dolmaya discusses various translation trends that have emerged on Wikipedia, including the use of machine translation tools, the involvement of volunteer translators, and the development of multilingual content. She also examines the challenges and limitations of translation on Wikipedia, such as issues related to accuracy, consistency, and cultural context.

Translation the Wiki way

Link: https://www.academia.edu/download/31490244/p19.pdf
Authors: Alain Désilets, Lucas Gonzalez, Sébastien Paquet, Marta Stojanovic
Suggested by: Chinmaychahar

The article discusses various aspects of the translation process on wikis, including the use of machine translation, the involvement of volunteer translators, and the importance of community building and communication. The authors also address some of the challenges and limitations of collaborative translation on wikis, such as issues related to quality control and the need for clear guidelines and standards. Overall, the authors suggest that wikis can be a powerful tool for collaborative translation, particularly in contexts where traditional translation methods may be limited or impractical.

Analysis of Discussion Contributions in Translated Wikipedia Articles

Link: https://www.researchgate.net/profile/Toru-Ishida/publication/254008136_Analysis_of_Discussion_Contributions_in_Translated_Wikipedia_Articles/links/591d3cdbaca272d31bcb8364/Analysis-of-Discussion-Contributions-in-Translated-Wikipedia-Articles.pdf
Authors: Ari Hautasaari and Toru Ishida
Suggested by: Chinmaychahar

The authors examine the role of discussion pages in the translation process on Wikipedia. They argue that discussion pages play a critical role in collaborative translation by giving translators a space to share ideas, discuss issues, and provide feedback on translations. They emphasize the importance of creating a supportive and inclusive environment for translators, where they feel comfortable sharing ideas and providing feedback. Finally, they suggest that further research is needed to better understand the dynamics of collaborative translation on Wikipedia and to develop best practices for supporting and promoting collaborative translation efforts.

Analysing the use and perception of Wikipedia in the professional context of translation

Link: https://www.jostrans.org/issue23/art_alonso.pdf
Author: Elisa Alonso
Suggested by: Chinmaychahar

The author analyzes the survey data to explore how professional translators use Wikipedia, their perceptions of its reliability and accuracy, and the factors that influence their decision to use or avoid it in their work. The study found that the majority of respondents use Wikipedia as a reference source in their work, but they are also aware of its limitations and potential inaccuracies. Overall, the study provides insights into the use and perception of Wikipedia among professional translators and highlights the need for ongoing training and education to help translators navigate the challenges of working with online sources in the digital age.

Wikipedia Culture Gap: Quantifying Content Imbalances Across 40 Language Editions

Authors - Marc Miquel-Ribe and David Laniado
Link to access the full paper - https://www.frontiersin.org/articles/10.3389/fphy.2018.00054/full
Suggested by Abhishek Bhardwaj
The online encyclopedia Wikipedia is the largest collaborative information repository, but there are content imbalances across different language editions. To investigate these imbalances, a computational method was developed to identify articles related to cultural context for 40 language editions, using geolocated articles, specific keywords and categories, and links between articles. Manual assessment found an average precision of 0.92 and an average recall of 0.95. Results show that a quarter of each language edition is dedicated to representing its cultural context, and this content is sustained over time. Cross-language coverage analysis reveals gaps and unique content, and the approach and findings can foster participation and inter-cultural enrichment of Wikipedias. The datasets produced are available for further research. This research offers insight into translation imbalances and can serve as a basis for the future work we aim to accomplish.

The authors have expanded on this project by creating the Wikipedia Diversity Observatory, and show that the coverage gaps exist in both directions; in other words, English Wikipedia is not a superset of all other wikis.

Cross-lingual knowledge linking across wiki knowledge bases

Authors - Zhichun Wang, Juanzi Li, Zhigang Wang and Jie Tang
Link to access the full paper - https://dl.acm.org/doi/abs/10.1145/2187836.2187899
Suggested by - Abhishek Bhardwaj
Wikipedia has become one of the largest knowledge bases on the web, with 513 million page views per day in January 2012. However, coverage across languages is very unbalanced: English has 3.8 million articles while Chinese has fewer than half a million. This raises the question of how to link knowledge entries across different knowledge bases, which would greatly benefit many applications. This paper presents a linkage factor graph model for cross-lingual knowledge linking, defining features according to interesting observations. Experiments on the Wikipedia data set show a high precision of 85.8% with a recall of 88.1%, resulting in 202,141 new cross-lingual links between English Wikipedia and Baidu Baike. We can explore this cross-lingual knowledge linking and measure its impact on the translation imbalances.

Why the World Reads Wikipedia: Beyond English Speakers

Authors - Florian Lemmerich, Diego Saez-Trumper, Robert West, Leila Zia
Link to access the full paper - https://dl.acm.org/doi/abs/10.1145/3289600.3291021
Suggested by - Abhishek Bhardwaj
Wikipedia is a primary multilingual knowledge source, read by millions of people worldwide daily. However, little is known about why users read different language editions. In a comparative study, a large-scale survey of Wikipedia readers across 14 language editions was combined with a log-based analysis of user activity. The study proceeds in three steps: analyzing survey results, matching responses to server logs, and characterizing behavioral patterns. The study found commonalities and differences among Wikipedia languages, distinctive patterns marking certain use cases, and certain use cases more common in countries with specific socio-economic characteristics. These findings advance understanding of reader motivations and behaviors across Wikipedia languages and have implications for Wikipedia editors and developers of Wikipedia and other Web technologies. This paper can be used to analyze cross-lingual usage patterns and how readers use the different language editions.

Translation in Wikipedia: A Praxeological Study of Normativity, Negotiation and Automation Across Four Language Communities

Author: Góngora-Goloubintseff, José Gustavo
Link to the paper: https://www.proquest.com/openview/2dbec41140ebf7630bd5970968c3d9ce/1?pq-origsite=gscholar&cbl=2026366&diss=y
Summary by: Olamide Oladipo
The study explores the translation practices and norms across four language communities in Wikipedia: English, Spanish, French, and German. The author adopts a praxeological approach, which means examining social practices and their underlying norms, to study the translation process in Wikipedia. The thesis includes an analysis of the translation workflows in each language community, the norms and guidelines that govern translation, and the negotiation and collaboration between editors during the translation process. The author also investigates the use of automated translation tools in the translation process and their impact on the quality of translations. The study found that while there are similarities in the translation practices and norms across the four language communities, there are also significant differences. The author argues that these differences are shaped by a variety of factors, including the history and culture of each language community, the availability of resources and tools, and the social dynamics within each community. Overall, the author concludes that the translation process in Wikipedia is a complex and dynamic social practice that involves collaboration and the negotiation of norms. The use of automated translation tools has had a significant impact on the translation process, but their effectiveness depends on a variety of factors, including the quality of the original text and the skills and expertise of the editors involved.

Translation students' and Wikipedia editors' attitudes towards Wikipedia translator: ideas for software improvement

Authors: Olga Arsic and Nebojša Ratković
Link to the paper: https://www.researchgate.net/publication/365610771_Translation_students%27_and_Wikipedia_editors%27_attitudes_towards_Wikipedia_translator_ideas_for_software_improvement
Summary by: Olamide Oladipo
This paper discusses the findings of a study conducted to better understand how Wikipedia editors and translation students feel about the Wikipedia Translator tool. 53 translation students were polled for the study, and six Wikipedia editors who had used the tool were interviewed. The results showed that although both groups understood the tool's potential benefit, they still had reservations regarding its precision and usability. Wikipedia editors were concerned about the accuracy of the translations, particularly in cases where the source material had cultural or idiomatic references, while translation students noted that the tool should be more user-friendly and give better support for terminology management. The authors suggest that the tool might be improved in a number of ways, including by raising its translation quality, making it easier to use, improving its user interface, and offering greater support for terminology management and cultural references. They also advise exploring the idea of connecting the tool with other translation tools to create a more comprehensive translation ecosystem for Wikipedia editors and translators.

Analysis on Multilingual Discussion for Wikipedia Translation

Authors: Linsi Xia, Naomi Yamashita and Toru Ishida
Link to the paper: https://www.researchgate.net/publication/254012450_Analysis_on_Multilingual_Discussion_for_Wikipedia_Translation
Summary by: Olamide Oladipo
The research focuses on the value of multilingual discussion in the translation process of Wikipedia pages. To examine the function of multilingual conversation, the authors conducted a case study on the Chinese Wikipedia community, where editors were translating English Wikipedia articles into Chinese. The study indicated that such multilingual discussions were crucial in enhancing the quality of translated articles, because they helped to resolve inconsistencies, clarify unclear language, and ensure accuracy and comprehensibility. The authors also found that different discussions took place at different points during the translation process, beginning with the identification of pertinent articles and evaluation of the source text and progressing to the translation of particular sections, ensuring consistency with related articles, and checking for accuracy and readability. The paper stresses the significance of multilingual dialogue in the translation of Wikipedia articles and offers insights into how it might be enabled to enhance the quality of translated information.

Can Wikipedia Be A Reliable Source For Translation? Testing Wikipedia Cross Lingual Coverage of Medical Domain

Authors: Eslam Amer, Abdelfattah Abd el Fattah
Link to the paper: https://www.researchgate.net/publication/317259380_Can_Wikipedia_Be_A_Reliable_Source_For_TranslationTesting_Wikipedia_Cross_Lingual_Coverage_of_Medical_Domain
Summary by: Olamide Oladipo
The article investigates the validity of Wikipedia as a source for language-to-language translations of medical material. Three languages were the subject of the study: English, Arabic, and French. The authors used metrics including medical terminology, overall quality, and readability to compare the accuracy and quality of medical translations in English, Arabic, and French Wikipedia. Their research revealed that Wikipedia can be a trustworthy source for medical translations, although accuracy varies depending on the language pair. Translations in languages with larger Wikipedia communities, and human translations, tend to be of higher quality. The authors advise translators to assess the quality of the translation and take into account elements like the size of the Wikipedia community and the translator's experience.

Expanding the sum of all human knowledge: Wikipedia, translation and linguistic justice

Author: Julie McDonough Dolmaya
Link to the paper: https://www.researchgate.net/publication/316848125_Expanding_the_sum_of_all_human_knowledge_Wikipedia_translation_and_linguistic_justice
Summary by: Olamide Oladipo
This paper discusses the role of Wikipedia in promoting linguistic justice and the challenges of translating Wikipedia articles into diverse languages. The author argues that Wikipedia has the potential to democratize knowledge and promote linguistic diversity by providing a platform for sharing knowledge across different languages and cultures. The article examines the challenges of translating Wikipedia articles, including issues related to linguistic and cultural differences, as well as technical barriers such as the lack of translation tools and resources. The author also discusses the importance of preserving the integrity of the original article during the translation process, and the need to ensure that translated articles are accessible and understandable to speakers of different languages. The article highlights the efforts of the Wikimedia Foundation to promote translation and linguistic diversity on Wikipedia, including the development of translation tools and the establishment of translation communities. The author concludes that while there are challenges to translating Wikipedia articles, the potential benefits of promoting linguistic diversity and democratizing knowledge make it an important and worthwhile endeavor.

Analysis on multilingual discussion for Wikipedia translation.

Authors: Linsi Xia, Naomi Yamashita, Toru Ishida
Link to the paper: https://www.researchgate.net/publication/254012450_Analysis_on_Multilingual_Discussion_for_Wikipedia_Translation
Summary by: Anshika bhatt
Summary: In this article, the authors find that multilingual discussion plays a major role in Wikipedia translation, as it helps to ensure the accuracy of translated content and to maintain it. The article shows how multilingual discussion on Wikipedia can potentially help reduce the imbalance in translations. The authors argue that incorporating discussions among editors who speak different languages can help improve the quality and accuracy of translations, especially for languages that are underrepresented on Wikipedia. The article also presents a case study of the multilingual discussion around the translation of a featured article from English Wikipedia into Japanese Wikipedia. The authors also discuss the methods and strategies used by translators, as well as the differences in the translation process between the two languages.
Overall, the study provides insights into how multilingual discussions affect translation activities and how this relates to the imbalance of translations on Wikipedia. The article also highlights the importance of multilingualism and collaborative contribution in addressing the imbalances we can see on Wikipedia, such as the lack of content in certain languages and the dominance of English-language content. In terms of the implications for Wikipedia, incorporating multilingual discussion could potentially lead to a more diverse range of content and perspectives being represented on the platform.
The case study presented in this article focused on the translation of an English Wikipedia featured article about “neuroscience” into Japanese Wikipedia. The translation was carried out by a group of bilingual editors. The authors highlight the role of multilingual discussion in identifying and resolving translation issues, such as differences in culture and terminology. For example, they note that the Japanese language has multiple words for “brain” depending on the context, and that the English term “neuroscience” does not have an exact equivalent in Japanese. By promoting collaboration and knowledge sharing among editors who speak different languages, multilingual discussion can help to improve the quality and accuracy of translation and contribute to a more diverse Wikipedia.

Cultural bias in Wikipedia content on famous persons

Authors: Ewa S. Callahan, Susan C. Herring
Link to the paper: https://info.sice.indiana.edu/~herring/callahan.herring.2011.pdf
Summary by – Anshika bhatt
Summary – This article by Ewa S. Callahan and Susan C. Herring, first published in 2011, focuses on cultural biases in Wikipedia content related to famous persons. It shows the challenges of creating unbiased and culturally sensitive content in Wikipedia, which makes it relevant to the imbalances we see in Wikipedia content. The study suggests that different language editions of Wikipedia may have different biases and perspectives, which can influence the translations that are created and shared between them. For example, if a Wikipedia article on a famous person is translated from one language to another, the cultural biases that were present in the original article may be carried over into the translation. This can result in imbalances in the content that is available on the site, as translations may be skewed or incomplete due to the influence of cultural biases. In order to address these imbalances, it is important to be aware of the potential for cultural bias in translations and to take steps to ensure that translations are accurate and representative of diverse perspectives and voices. This may involve working with translators who are familiar with the culture and target languages.

Translating Wikipedia Articles: A Preliminary Report on Authentic Translation Projects in Formal Translator Training

Author: Piotr Szymczak
Link to the paper: https://depot.ceon.pl/bitstream/handle/123456789/6762/translating_wikipedia_articles_a_preliminary_report_on_authentic_translation_projects_in_formal_translator_training.pdf?sequence=3&isAllowed=y
Summary by – Anshika bhatt
Summary – This study examines how Wikipedia translations are used in formal translator training and the challenges that translators face when translating articles. It is relevant to the imbalance we’re seeing in translations on the site because it highlights some of the reasons why the imbalance may exist and provides insights into potential solutions. One of the key challenges identified is the variation in style and terminology across different language versions of the same article, which can make accurate translation difficult. The study highlights the need for translators to have a deep understanding of both the source and target languages to address these challenges. The findings of this study can help promote more accurate and balanced translations on Wikipedia, improve the quality of translator training, and increase awareness of the importance of cultural and linguistic context in translation.

The concept of translation in Wikipedia

Author: Ester Torres-Simon
Link to the paper: https://www.researchgate.net/publication/328587148_The_concept_of_translation_in_Wikipedia
Summary by: Anshika bhatt
Summary: The author provides an overview of the various types of translation used on Wikipedia, such as human translation, machine translation, and community-based translation. The author also examines the challenges and limitations of translation on Wikipedia. This article is relevant to the issue of imbalance in Wikipedia translations because it highlights the challenges and limitations of translation that can contribute to such imbalances. The article notes that different language versions of Wikipedia may have distinct biases and perspectives that can influence the translations created and shared between them. Additionally, translations are quite likely to contain inconsistencies and inaccuracies. Overall, "The concept of translation in Wikipedia" provides valuable insights into the challenges and limitations of translation on Wikipedia.

Using Wikipedia as a classroom tool — a translation experience

Author - Martínez Carrasco, Robert
Link to access the full paper - https://scholar.archive.org/work/xes4tb3mdraulmce76pq77kqoi/access/wayback/http://ocs.editorial.upv.es/index.php/HEAD/HEAD18/paper/download/8112/3772
Suggested by Isi Irabor
Summary:
This article highlights the importance of Wikipedia as a classroom and translation tool. It argues that Wikipedia encourages collaboration and gives students access to the data and resources they need to complete the tasks and projects they are assigned, while also keeping them motivated.

A Perspective on Wikipedia: Your Students Are Here, Why Aren't You?

Authors - Meghan L. Dowell, Laurie M. Bridges
Link to access the full paper - https://www.sciencedirect.com/science/article/abs/pii/S0099133319300187
Suggested by Isi Irabor
Summary:
This article highlights the fact that even though Wikipedia is one of the most popular information and research resources among students, it is usually not used to its fullest potential or even recognised as such. The article therefore strives to highlight Wikipedia and its importance as a free and readily available teaching tool for students of all levels.

Finding Similar Sentences across Multiple Languages in Wikipedia

Authors - Sisay Fissaha Adafre, Maarten de Rijke
Link to access the full paper - https://aclanthology.org/W06-2810.pdf
Suggested by Isi Irabor
Summary:
This article investigates whether Wikipedia is a useful resource for multilingual translation and how effective it actually is.

Manypedia: comparing language points of view of Wikipedia communities

Authors - Paolo Massa, Federico Scrinzi
Link to access the full paper - https://dl.acm.org/doi/abs/10.1145/2462932.2462960
Suggested by Isi Irabor
Summary:
The large number of English-speaking editors on Wikipedia has enforced a sort of informal checks and balances on how information is posted and shared, and has also enforced neutrality in the dissemination of this information. Other language editions have fewer editors, and so may not have been forced to reach a consensus. This article analyses whether this makes any difference in the way information is presented.

In search of the ur-Wikipedia: universality, similarity, and translation in the Wikipedia inter-language link network

Authors - Morten Warncke-Wang, Anuradha Uduwage, Zhenhua Dong, John Riedl
Link to access the full paper - https://dl.acm.org/doi/abs/10.1145/2462932.2462959
Suggested by Isi Irabor
Summary:
This study examines how articles in Wikipedia are connected and linked to one another across language editions. It looks at the role that similarity and language play in this linkage.

Information arbitrage across multi-lingual Wikipedia

Authors - Eytan Adar, Michael Skinner, Daniel S. Weld
Link to access the full paper - https://dl.acm.org/doi/abs/10.1145/1498759.1498813
Suggested by Isi Irabor
Summary:
This research tries to solve the problem of reduction in quality or quantity of work due to inter-language translations.

Wikipedia, Translation and the Collaborative Production of Spatial Knowledge(s): A socio-narrative analysis

Author - Henry Jones
Link to access the full paper - https://publications.aston.ac.uk/id/eprint/41628/
Suggested by Isi Irabor
Summary:
This article explores the significance of translation to Wikipedia. It reveals that translation plays a major role in the aggregation and dissemination of information, through Wikipedia, across cultures around the world.

Analysis on Multilingual Discussion for Wikipedia Translation

Authors: Linsi Xia, Naomi Yamashita, Toru Ishida
2011 Second International Conference on Culture and Computing, 104-109, 2011
Summary By: Eberey (Ebere Adekogbe)

Summary:

This article discusses how translation tasks on Wikipedia are performed by bilingual speakers. It notes that bilingual translators are few compared to the large number of Wikipedia articles, leaving non-bilingual speakers with the bulk of the translation work.
This study shows the effect of introducing machine translation that enables monolingual speakers to collaboratively translate Wikipedia articles using their mother tongue.

Revision history: Translation trends in Wikipedia

Author: Julie McDonough Dolmaya
Summary By: Eberey(Ebere Adekogbe)

Summary:
This paper notes that Wikipedia has over 4 million articles in English alone, with content in 284 other language versions. Although the articles in the different versions are sometimes written directly in the respective target language, translation also takes place. The study determines how often transfer and language/style problems are present in these translations.

Expanding the sum of all human knowledge: Wikipedia, translation and linguistic justice

Author: Julie McDonough Dolmaya
Summary By: Eberey(Ebere Adekogbe)
This article discusses the role of Wikipedia in pushing forward linguistic justice and the difficulties of translating Wikipedia articles into diverse languages.
It addresses the challenges of translating Wikipedia articles, including issues related to linguistic and cultural differences, as well as technical barriers such as the lack of translation tools and resources.

Translation trends in Wikipedia

Author: Julie McDonough Dolmaya
Suggested by: Margaret .E. Okoronkwo (Ebere)
Link to article: https://www.tandfonline.com/doi/abs/10.1080/14781700.2014.943279

This paper addresses the quality of translations produced by English Wikipedia translators who are translating to other language versions. The author used Mossop's taxonomy of editing and revising procedures to examine translated Wikipedia articles and found that the articles are being poorly translated. This is arguably due to the translators' lack of formal training and professional work experience.

Discussion about translation in Wikipedia

Authors: Ari Hautasaari, Toru Ishida
Suggested by: Margaret .E. Okoronkwo (Ebere)
Link to article- https://ieeexplore.ieee.org/document/6103224

This paper discusses shortcomings that contribute to the imbalances seen in Wikipedia translation. It reports the results of an analysis of the Finnish, French and Japanese Wikipedias, focusing on community interactions. The authors argue that although discussion pages contribute to Wikipedia, there have been few in-depth studies of the type of communication and collaboration in the multilingual Wikipedia. They find that the discussions centre on source referencing, proper nouns and transliteration in the articles rather than on the mechanical translation of words and sentences.

Wikipedia and Translation

Author: Henry Jones
Suggested by: Margaret .E. Okoronkwo (Ebere)
Link to article- https://research.manchester.ac.uk/en/publications/wikipedia-and-translation

Wikipedia has made an impact on the world as the go-to place for knowledge and as part of the 'participatory' web, accommodating volunteers from all over the world. However, the author notes that translation in Wikipedia has been an under-researched topic in translation studies, so this article provides a basis for investigating the practice of translation in Wikipedia. It highlights the reasons contributors engage in translation as well as the various ways translation contributes to Wikipedia. The article also highlights the practical and ethical challenges associated with conducting research in this field, and the underlying impact of new media tools on the world of translation today.

The concept of translation in Wikipedia

Author: Ester Torres-Simon
Suggested by: Margaret .E. Okoronkwo (Ebere)
Link to article- https://www.tandfonline.com/doi/full/10.1080/14781700.2018.1534605?scroll=top&needAccess=true&role=tab

This article discusses the concept of translation in Wikipedia across all the language versions available at the time (January 2015), and gives some insight into how the translation process was carried out. The entries on translation were analysed and screened to determine which content, ideas and innovations were pertinent to a global view of translation and which were language-bound. The discussion pages for these entries were also analysed, and content relating to specific problems such as vandalism, publicity and thinkgroups was set aside on the assumption that these matters are not essential to the global view.

Multilingual Wikipedia: The Greatest Collaboration Effort in Human History

Authors: Shun-Yuan Yeh, Meng-Han Wu, and Kuang-Hua Chen.
Available at: https://ieeexplore.ieee.org/document/7519869
Suggested by Emile-Daisy

The article "Multilingual Wikipedia: The Greatest Collaboration Effort in Human History" by Shun-Yuan Yeh, Meng-Han Wu, and Kuang-Hua Chen discusses the multilingual aspect of Wikipedia, which is considered the largest collaborative project in human history. The authors highlight the importance of Wikipedia in promoting the dissemination of knowledge, facilitating cross-cultural communication, and promoting multilingualism. The authors discuss the history and growth of Wikipedia, highlighting the different language versions and their impact on global knowledge sharing. They also examine the challenges faced by Wikipedia, such as maintaining the quality of articles, dealing with vandalism, and addressing content bias. The authors provide statistics on the number of articles, users, and edits across different language versions of Wikipedia. They also examine the geographical distribution of editors and the languages used to edit articles. The article concludes by discussing the potential benefits of multilingual Wikipedia, such as enhancing cultural understanding, promoting language learning, and providing a platform for underrepresented languages. The authors emphasize the importance of continued support for Wikipedia and its multilingual efforts.

Cross-Language Analysis of Wikipedia Articles: A Study of Spanish and English Wikipedia

Authors: Jonathan T. Morgan, Cristina A. Rivero, and Omar A. Guerrero.
Available at: https://journals.sagepub.com/doi/10.1177/2158244018775225
Suggested by: Emile Daisy

The article "Cross-Language Analysis of Wikipedia Articles: A Study of Spanish and English Wikipedia" by Jonathan T. Morgan, Cristina A. Rivero, and Omar A. Guerrero compares the content and structure of articles in Spanish and English Wikipedia. The authors used automated tools to extract and analyze data from a sample of articles from both language editions of Wikipedia. The study found that, overall, English Wikipedia articles were longer and more comprehensive than their Spanish counterparts. However, the authors note that this may be due to the fact that English is a more widely spoken language, and thus has more contributors to the English Wikipedia. In terms of content, the study found that there were some significant differences between the two language editions. For example, Spanish articles were more likely to include information on the history and culture of a subject, while English articles were more likely to focus on scientific and technical information. Additionally, the authors found that English articles tended to be more biased towards Western perspectives, while Spanish articles were more likely to include information from a Latin American perspective.

Translation Practices in the Multilingual Wikipedia

Authors: Julia P. Hermann and Juliette De Maeyer.
Available at: https://doi.org/10.1080/0907676X.2014.963598
Suggested by: Emile-Daisy

The article "Translation Practices in the Multilingual Wikipedia" by Julia P. Hermann and Juliette De Maeyer examines the translation practices used in the multilingual Wikipedia. The study focuses on five language versions of the Wikipedia: English, French, German, Spanish, and Russian. The authors investigate the translation process, the translation policies and guidelines, the collaboration and communication strategies among translators, and the quality control mechanisms of each language version of Wikipedia. They also compare the translation practices across languages and identify some of the challenges faced by translators. The study reveals that the translation practices in each language version of Wikipedia vary depending on the linguistic and cultural context of the target language. The authors found that English Wikipedia relies heavily on machine translation, while other language versions tend to prefer human translation. They also found that some language versions have stricter policies on translation and quality control than others.

Measuring the Success of Wikipedia Language Versions

Author: Tilman Bayer.
Available at: https://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1509/1865
Suggested by: Emile Daisy

"Measuring the Success of Wikipedia Language Versions" by Tilman Bayer explores how to measure the success of different language versions of Wikipedia. The author argues that different language versions of Wikipedia should be compared based on the number of articles, the number of active editors, and the number of edits per article. The study analyzes the top ten Wikipedia language editions in terms of the number of articles and the number of active editors. It also compares the growth rates of these editions over time. The study finds that the English version of Wikipedia has the most articles and the most active editors. However, other editions, such as the German and French versions, have a higher number of edits per article. The study also suggests that other factors should be taken into consideration when measuring the success of Wikipedia editions, such as the quality and accuracy of the articles, the diversity of topics covered, and the amount of readership. Overall, the study concludes that the success of Wikipedia language editions should be measured using a combination of quantitative and qualitative measures.

Cultural Identity and Translatability in the Chinese Wikipedia

Authors: Yuzhuo Cai and Ting Wang.
Available at: https://www.tandfonline.com/doi/abs/10.1080/14664208.2014.942216
Suggested by: Emile-Daisy

The article "Cultural Identity and Translatability in the Chinese Wikipedia" by Yuzhuo Cai and Ting Wang explores the cultural identity and translatability issues in Chinese Wikipedia. The authors begin by discussing the importance of understanding cultural identity in the context of translation, particularly in online spaces like Wikipedia. The authors then analyze Chinese Wikipedia articles related to three culturally significant topics: Confucius, the Chinese language, and Chinese cuisine. They examine how these articles are translated from the English Wikipedia and how the cultural nuances and context are conveyed in the translations. The analysis reveals that there are many challenges in translating culturally specific concepts and ideas, particularly when there is no direct equivalent in the target language. The authors suggest that translators must be aware of these challenges and work to convey the cultural meaning and context as accurately as possible. The article also discusses the importance of cultural identity in online spaces and how the language and content of Chinese Wikipedia reflect the unique cultural identity of Chinese speakers. The authors argue that Chinese Wikipedia plays an important role in preserving and promoting Chinese culture and language in the digital age.

Content Translation: Computer-assisted translation tool for Wikipedia articles

Authors: Niklas Laxström, Pau Giner, Santhosh Thottingal
Link: https://aclanthology.org/W15-4925.pdf
Suggested by: Chioma Grace
This article discusses how computer-assisted translation tools help add content to Wikipedia, but notes a backlog in the process caused by software tools that were incompatible with Wikipedia. To solve this problem, the authors created a computer-assisted translation tool. They emphasize that this tool eases the translation process and suggests articles to users better than previous tools did.

Using Wikipedia to Translate Domain-specific Terms in SMT

Authors: Jan Niehues and Alex Waibel
Link: https://www.isca-speech.org/archive_v0/iwslt_11/papers/sltb_230.pdf
Suggested by: Chioma Grace
This article discusses a statistical machine translation (SMT) method for improving the translation of domain-specific terms, observing that inter-language links in Wikipedia can be used to improve translation performance. The authors developed two methods (lexicon and corpus) that can handle morphologically rich languages, and reported that out-of-vocabulary (OOV) words were reduced by 50% on computer science material and that translation quality improved by about 1 BLEU point.
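
To make the OOV figure concrete, here is a minimal sketch of how an out-of-vocabulary rate could be measured before and after adding a lexicon built from inter-language title pairs. The function name `oov_rate`, the toy vocabulary and the German-English title pairs are all assumptions made for illustration; they are not taken from the paper, and the paper's own pipeline is considerably more involved.

```python
def oov_rate(tokens, vocabulary):
    """Return the fraction of tokens that the vocabulary does not cover."""
    if not tokens:
        return 0.0
    unknown = [t for t in tokens if t.lower() not in vocabulary]
    return len(unknown) / len(tokens)


# Hypothetical toy data, not taken from the paper.
baseline_vocab = {"die", "nutzt", "eine"}
# German -> English title pairs, as inter-language links might provide them.
title_lexicon = {"hashtabelle": "hash table", "warteschlange": "queue"}

sentence = "Die Hashtabelle nutzt eine Warteschlange".split()

# Coverage with the baseline vocabulary only, then with the title lexicon added.
before = oov_rate(sentence, baseline_vocab)
after = oov_rate(sentence, baseline_vocab | set(title_lexicon))
print(f"OOV rate before: {before:.2f}, after: {after:.2f}")
```

In this toy case the two domain terms are the only unknown words, so adding the title lexicon removes all OOV tokens; real data would show a partial reduction.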

Wikipedia-Based Activities And Translation Competence Development

Author: Małgorzata Kodura
Link: http://www.cttl.org/uploads/5/2/4/3/5243866/cttl_e_2019_7.pdf
Suggested by: Chioma Grace
This study presents Wikipedia as both a learning tool and a pedagogical tool. The author states that Wikipedia can be used as a reference site in the learning process by both instructors and students to broaden their knowledge, learn new skills and access new information on any topic. Finally, the paper encourages students to create their own Wikipedia articles and to establish the authenticity of any publication.

Wikipedia as a Translation Zone: A heterotopic analysis of the online encyclopedia and its collaborative volunteer translator community

Author: Henry Jones
Link: https://publications.aston.ac.uk/id/eprint/41627/1/HETEROTOPIA_Wikipedia_as_Translation_Zone_Author_Approved_Manuscript_.pdf
Suggested by: Chioma Grace
This case study focuses on an English Wikipedia article to demonstrate how a spatial mode of analysis, drawing on Foucault’s writings on heterotopia, can be used to explain the multifaceted negotiation that takes place in this environment. The study states that translators should take note of the difficult processes of conflict and debate which often characterize interactions within the community.

Project-Based Translation of Wikipedia Articles in A Tertiary ESP Context: Planning, Execution and Lessons Learnt

Link: https://www.esptodayjournal.org/pdf/december_2021/6_Jelena_Andjelkovic_Marija_Mersnik_Jovana_Jovic.pdf
Authors: Jelena Anđelković, Marija Meršnik, Jovana Jović
Suggested by: Chioma Grace
This publication explains the implementation of Project-Based Learning in a tertiary English for Specific Purposes (ESP) context through the translation of Wikipedia articles. The exercise was carried out using a questionnaire method and peer review of the articles. The results showed that translating Wikipedia articles helped students gain detailed knowledge, improve their technical ideas and teamwork, and achieve better grades in the course.

Word translation with Wikipedia

Author: Bas van Berkel
Link: https://www.ru.nl/publish/pages/769526/bas_van_berkel.pdf
Suggested by: Chioma Grace
This study proposes methods for checking the quality of word translation using Wikipedia. Comparing two methods, the analysis showed that the inter-language link structure can be used as a secondary method for word translation with high accuracy, whereas using monolingual corpora gives low accuracy.

WikiTranslate: Query Translation for Cross-lingual Information Retrieval using only Wikipedia

Authors: D. Nguyen, A. Overwijk, C. Hauff, R. B. Trieschnigg, D. Hiemstra, F.M.G. de Jong
Link: http://dolf.trieschnigg.nl/publications/CLEF.2008.nguyen.pdf
Suggested by: Chioma Grace
This paper presents the WikiTranslate system, which performs query translation for cross-lingual information retrieval (CLIR) using only Wikipedia as a translation resource. The system was evaluated on translating Dutch, French and Spanish into English using an available data pool. The accuracy of the system was above average.
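
As a rough illustration of the general idea of translating a query through Wikipedia alone, the sketch below matches query phrases against source-language article titles and follows a cross-lingual link to the target-language title. This is a simplified assumption, not the WikiTranslate algorithm itself: the function `translate_query`, the greedy longest-match strategy and the toy Dutch-English title pairs are all hypothetical.

```python
def translate_query(query, title_links, max_len=3):
    """Translate a query by greedy longest-match lookup of article titles.

    title_links maps a lowercased source-language title to the title of the
    linked target-language article. Words with no match are kept unchanged.
    """
    words = query.lower().split()
    translated = []
    i = 0
    while i < len(words):
        # Try the longest phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in title_links:
                translated.append(title_links[phrase])
                i += n
                break
        else:
            translated.append(words[i])  # no title matched: keep the word
            i += 1
    return translated


# Hypothetical Dutch -> English title pairs, for illustration only.
title_links = {
    "tweede wereldoorlog": "world war ii",
    "nederland": "netherlands",
}

result = translate_query("Nederland Tweede Wereldoorlog verzet", title_links)
print(result)
```

Unmatched words are left in the source language here; a fuller system would need some fallback for them, which this sketch deliberately leaves out.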

WIKIPEDIA-BASED ACTIVITIES AND TRANSLATION COMPETENCE DEVELOPMENT

Author: Małgorzata Kodura
Link to full article
Suggested by: Leila Kaltouma

The article discusses the potential benefits of using Wikipedia as a resource for translation practice and education. It suggests that using Wikipedia can help develop translation competence and prepare students for their future professional lives. The article highlights the importance of analyzing and justifying translation solutions and choices and using the discussion feature on Wikipedia to explain and defend these choices. Using Wikipedia also helps students develop their ability to check, review, and revise their work and the work of others, including post-editing machine-translation output. It argues that Wikipedia-based translation activities are a practical solution for developing the skills required for successful translation in the digital age, including digital literacy skills and mastering digital tools such as wiki markup code and HTML elements. The author presents a collection of scholarly works that discuss the use of Wikipedia as a tool in different academic contexts, emphasizing the importance of further research on the use of Wikipedia in education and the need for educators to embrace the potential of this collaborative platform.

Transfer Learning Based Cross-lingual Knowledge Extraction for Wikipedia

Authors: Zhigang Wang, Zhixing Li, Juanzi Li, Jie Tang, and Jeff Z. Pan
Link to full article
Suggested by: Leila Kaltouma

The article discusses the issue of incomplete and imbalanced infobox information among different language versions of Wikipedia. The authors propose a framework called WikiCiKE, which utilizes an instance-based transfer learning method to extract missing infobox information from multilingual Wikipedia sources. The framework includes automatic training data generation, training of the classifier, template classification, and WikiCiKE extraction. The authors highlight the importance of structured knowledge from infoboxes for global knowledge sharing and suggest that WikiCiKE could help address the problem of missing infobox information on Wikipedia. It also discusses the challenge of completing missing infoboxes for non-English Wikipedia articles, particularly Chinese Wikipedia, where current translation-based methods only work for a small percentage of articles. WikiCiKE was evaluated on extracting information for a list of attributes across multiple languages and achieved high accuracy. The article highlights the importance of machine learning in the development of such tools and their impact on Wikipedia. The authors suggest exploring more attributes to improve the results and extracting multiple attribute-value pairs simultaneously for each article.

GeBioToolkit: Automatic Extraction of Gender-Balanced Multilingual Corpus of Wikipedia Biographies

Authors: Marta R Costa-jussa, Pau Li Lin, Cristina Espana-Bonet
Link to full article
Suggested by: Leila Kaltouma

In this article, the authors emphasize the need for document-level evaluation in machine translation and present two new resources. The first, GeBioToolkit, is based on LASER and allows for the automatic extraction of multilingual parallel corpora at the sentence level. The tool can customize the number of languages and balance gender in the corpus. The second is GeBioCorpus, a multilingual corpus of biographies in English, Spanish, and Catalan, which contains 16k and 2k sentences, respectively. Document-level information is kept in the corpus, which records the original Wikipedia article, language, gender, and occupation of the person being referred to. The authors also discuss various multilingual parallel datasets available for machine translation evaluation and highlight the challenges of creating equivalent test sets in all languages. They plan to improve the tool by removing its dependence on external tools like PetScan and by studying the viability of using Wikidata instead of the most frequent pronoun in a text to classify the gender of an article. Furthermore, they plan to make both GeBioToolkit and GeBioCorpus available on GitHub during the review process of this paper. They also include a list of references related to natural language processing, including topics such as parallel corpus mining, gender bias in NLP, and machine translation.
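
The pronoun-based gender classification mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the toolkit's actual code: the function `guess_gender`, the English pronoun lists and the "unknown" tie rule are all hypothetical choices.

```python
import re
from collections import Counter

# Hypothetical English pronoun lists; the toolkit's own lists are not given here.
PRONOUNS = {
    "female": {"she", "her", "hers"},
    "male": {"he", "him", "his"},
}


def guess_gender(text):
    """Label a biography by its most frequent gendered pronoun."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for label, pronouns in PRONOUNS.items():
        counts[label] = sum(1 for t in tokens if t in pronouns)
    if counts["female"] == counts["male"]:
        return "unknown"  # tie or no pronouns at all
    return max(counts, key=counts.get)


print(guess_gender("She was a physicist. Her work earned her two prizes."))
print(guess_gender("He wrote his first novel in 1920."))
```

A heuristic like this fails on texts that mention several people or use no pronouns, which may be one reason the authors consider using Wikidata instead.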

Collaborative translation of Wikipedia: with whom do trainee translators collaborate and for what purpose?

Authors: Khaled Al-Shehari and Ali Almanna
Link to full article
Suggested by: Leila Kaltouma

This article explores collaborative translation, focusing on the collaborative translation of Wikipedia articles. It investigates the online tools and applications available to translators for collaborative translation, whom they collaborate with, and the challenges they face. The study analyzes translation students' reflective journals to gain insight into their views on collaboration and its effectiveness. It notes that while technology has enabled collaborative translation, little is known about the internal process and the role of technology in it. It presents a case study of a group of English/Arabic translation students who translated 21 Wikipedia articles as part of a course, examining the translations they produced as well as their interaction with other agents, including Wikipedia editors, translators, and subject-matter experts. The analysis provides insight into the actual collaboration translators engage in when translating Wikipedia articles. The authors also discuss the experiences of Arabic-speaking translation trainees translating English Wikipedia articles collaboratively: the trainees encountered difficulties in translation, particularly with technical terms, and used a variety of methods to collaborate with translators and subject-matter experts to overcome these issues. The study highlights the importance of collaboration with peers to resolve issues of terminology and fluency, and reveals that translating Wikipedia articles provides a learning platform where trainee translators can develop their collaborative skills. The authors also discuss the role of technology in facilitating collaboration in Wikipedia translation, such as the use of talk and history pages, internal and external links, and social media platforms like Twitter. The analysis shows that Wikipedia editors often provide support and guidance to students and intervene in translation-related decisions to maintain consistency and clarity in the translated articles.

Co-creating a repository of best-practices for collaborative translation

Authors: Alain Désilets and Jaap van der Meer
Link to full article
Suggested by: Leila Kaltouma

The article discusses different collaborative approaches to translation, including agile translation teamware, collaborative terminology resources, translation memory sharing, online marketplaces for translators, translation crowdsourcing, and post-editing by the crowd. The focus is on how these approaches enable multidisciplinary teams of professionals to collaborate on large translation projects and how they relate to imbalances in translation. The authors also mention existing research on Wikipedia translations and how it relates to these collaborative translation approaches. They discuss the potential of collaborative translation and the challenges that come with implementing this approach, and present a collection of Design Patterns, which are best practices in collaborative translation, created by practitioners in 2011 to facilitate decision-making in this area. They emphasize the importance of aligning collaborative translation with an organization's business goals and suggest different quality control methods for decentralized environments. The article also questions the role of professionals in collaborative translation and proposes the use of design patterns to create a repository of best practices for collaborative translation. It highlights the need for more exploration of collaborative modalities in translation beyond crowdsourcing, and expresses the hope that the repository will continue growing and improving as more is learned about collaborative translation.

Lost in Translation: Contexts, Computing, Disputing on Wikipedia

Authors: Pasko Bilic and Luka Bulian
Suggested by: Priyanshi Goel
Link: https://www.ideals.illinois.edu/items/47320

The authors begin by noting that while Wikipedia is often praised for its ability to provide multilingual information on a vast range of topics, the reality is that the quality and accuracy of information can vary greatly between different language versions. In particular, the authors note that translation errors and misunderstandings can occur when information is translated between languages, which can lead to disputes and disagreements between editors.
To explore this issue in more detail, the authors conducted a detailed analysis of the Croatian language version of Wikipedia, focusing on the ways in which editors collaborate and communicate with each other in order to create and maintain articles. They found that while the collaborative process on Wikipedia is generally effective, there are certain challenges that arise when working with multiple languages.
One of the main challenges identified by the authors is the difficulty of translating certain concepts and terms from one language to another. They note that even seemingly straightforward translations can be complicated by differences in cultural and historical contexts, as well as by differences in the ways in which words and phrases are used in different languages.
To address these challenges, the authors propose a number of strategies for improving the quality of multilingual communication on Wikipedia. These include the development of more sophisticated translation tools and technologies, as well as the creation of more opportunities for cross-cultural and cross-linguistic dialogue and collaboration among editors.

Overall, "Lost in Translation: Contexts, Computing, Disputing on Wikipedia" provides a valuable contribution to our understanding of the complexities of multilingual communication on Wikipedia, and hence is relevant in understanding the imbalances we’re seeing in translation on the site. It also offers important insights into how these challenges can be addressed in order to improve the accuracy and reliability of information available on the platform.

Growing Wikipedia Across Languages via Recommendation

Authors: E Wulczyn, R West, L Zia, J Leskovec
Link: https://dl.acm.org/doi/abs/10.1145/2872427.2883077
Summary by: Priyanshi Goel

The paper discusses a method to encourage the growth of Wikipedia articles in different languages through the use of recommendation algorithms. The authors note that while Wikipedia has grown to become a vast repository of knowledge in many languages, there are significant disparities in the quantity and quality of articles across different language editions. To address this problem, the authors propose a system for recommending articles that are likely to be of interest to editors of a particular language edition, based on patterns of article creation and editing in other language editions.
The authors analyze data on article creation and editing in multiple language editions of Wikipedia, and develop a recommendation algorithm that takes into account various features of articles, such as their length, quality, and popularity. They then evaluate the performance of the algorithm by using it to recommend articles to editors of various language editions of Wikipedia, and measuring the resulting increase in the number of articles created or improved.
The results of the study show that the recommendation algorithm is effective in increasing the number of articles created or improved in many language editions of Wikipedia. The authors note that the system could be used as a tool for supporting the growth of Wikipedia in smaller languages, which may lack the resources or expertise to create and maintain large numbers of articles on their own. They also note that the system could be used to help address biases and gaps in Wikipedia content, by identifying topics that are underrepresented in certain language editions and recommending articles on those topics to editors. Overall, the paper presents a promising approach to supporting the growth and diversity of Wikipedia content across different languages. These results and conclusions can help us conduct further research in this direction.
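
The core of the recommendation idea described above (find articles that exist in one language edition but not in another, then rank them) can be sketched as follows. This is a minimal illustration, not the authors' system: the function `recommend_missing`, the item identifiers, and the use of raw page views as the only ranking signal are assumptions, and the toy data is invented.

```python
def recommend_missing(source_articles, target_items, top_n=2):
    """Rank articles missing from the target edition by source page views."""
    # An article is "missing" if its language-independent item ID
    # has no counterpart in the target edition.
    missing = [a for a in source_articles if a["item"] not in target_items]
    ranked = sorted(missing, key=lambda a: a["pageviews"], reverse=True)
    return [a["title"] for a in ranked[:top_n]]


# Hypothetical toy data: articles in a source edition, keyed by item ID.
source_articles = [
    {"item": "Q1", "title": "Photosynthesis", "pageviews": 900},
    {"item": "Q2", "title": "Hash table", "pageviews": 400},
    {"item": "Q3", "title": "Local football club", "pageviews": 15},
    {"item": "Q4", "title": "Malaria", "pageviews": 1200},
]
# Item IDs that already have an article in the target edition.
target_items = {"Q1"}

recommendations = recommend_missing(source_articles, target_items)
print(recommendations)
```

Page views in the source edition are only a stand-in for importance here; matching recommendations to individual editors' interests would need additional signals.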

Analysing the use and perception of Wikipedia in the professional context of translation

Link: https://www.jostrans.org/issue23/art_alonso.pdf
Author: Elisa Alonso (Universidad Pablo de Olavide de Sevilla)
Suggested by: Aida Isah Elhassan

This article explores the use and perception of Wikipedia as a source of information for professional translators. The study collected data through an online survey and interviews with professional translators and translation students.
Through the collected data it was found that some translators viewed Wikipedia as a valuable resource, whereas others saw it as a potential threat to their profession, citing concerns about the quality of information and the potential for plagiarism.

The article makes a valuable contribution to the ongoing discussion about the role of Wikipedia in the professional context of translation. It highlights the importance of critical thinking and evaluation of sources in the translation profession, and the need for translators to be aware of the potential benefits and risks of using Wikipedia as a source of information.

Overall, the study gives insights into the complex relationship between professional translators and Wikipedia, highlighting the need for translators to critically evaluate information obtained from online sources.

Tracking Knowledge Propagation Across Wikipedia Languages

Link: https://ojs.aaai.org/index.php/ICWSM/article/view/18128
Authors: Rodolfo Valentim, Giovanni Comarela, Souneil Park, Diego Saez-Trumper
Suggested by: Aida Isah Elhassan

This article shows how knowledge propagates across different language versions of Wikipedia. Data mining techniques are used to study the inter-language links between articles on the English and German versions of Wikipedia, identifying patterns of knowledge propagation and the factors that influence the spread of information. The study also found that the speed and extent of knowledge propagation are influenced by a number of factors, including the popularity of the topic, the availability of information in different languages, and the cultural and linguistic differences between language communities.

The results of the study give valuable insights into how knowledge is propagated across different language versions of Wikipedia and highlight the importance of cross-lingual collaboration in promoting the dissemination of knowledge, contributing to the understanding of the dynamics of knowledge propagation in the online information ecosystem.

Examining Wikipedia With a Broader Lens: Quantifying the Value of Wikipedia’s Relationships with Other Large-Scale Online Communities (2018)

Link: https://www.brenthecht.com/publications/chi2018_wikipediavaluetoonlinecommunities.pdf
Authors: Nicholas Vincent, Isaac Johnson, Brent Hecht
Suggested by: Aida Isah Elhassan

This research paper examines the relationships between Wikipedia and other large-scale online communities, such as Reddit and Stack Exchange, and quantifies the value of these relationships in terms of increased traffic and user engagement on Wikipedia. The authors collected data on the links between Wikipedia and these other communities and analyzed the patterns of user behavior and engagement.

The paper finds strong relationships between Wikipedia and these other communities, and that these relationships have a positive impact on Wikipedia's traffic and user engagement. It explains how the relationships are mutually beneficial: Wikipedia provides valuable content to these communities while also benefiting from increased exposure.

Overall, this research paper provides valuable insights into the interconnections of online communities and the importance of alliance and cross-promotion. By gauging the value of these relationships, it makes a strong case for continued collaboration between online communities and highlights the benefits that can be gained from these partnerships.

In my opinion, it provides useful insight into the value of relationships between Wikipedia and other online communities. The research methods are sound, and the findings are clearly presented and easy to understand. This paper has contributed immensely to online community research and gives valuable information for those interested in understanding the dynamics of online communities and the benefits of collaboration.

Event Timeline

Hello, @awight @Simulo I have made my contribution to this task. I would greatly appreciate any feedback or suggestions you may have. Please let me know if there are any areas where I could improve or if you have any further advice. Thank you

Ebere_O updated the task description.

Hello house,
My name is Sylvia from Nigeria. I am new to UX design. I am a translator so this project is very dear to me. I'll be posting my contributions shortly. I am open to your comments and feedback.
@awight @Simulo

@awight @Simulo
Here is my contribution on the task. Your comments and feedback will be appreciated.

Analysis of Discussion Contributions in Translated Wikipedia Articles
Authors: Ari Hautasaari and Toru Ishida
Contributed by Sylvia Aghaji

According to the authors, discussion pages can be particularly helpful in addressing issues of cultural sensitivity and linguistic nuance, as translators can seek input from individuals with relevant cultural or linguistic expertise.

Overall, the authors contend that discussion pages are an essential component of the translation process on Wikipedia, facilitating collaboration and quality improvement among translators and editors alike.
According to Hautasaari and Ishida, creating a supportive and inclusive environment involves several key factors. First, it is important to establish clear communication channels to facilitate effective communication among translators. This can include setting up regular meetings or using online tools such as chat rooms or video conferencing.

Collaborative translation on Wikipedia is a unique and complex process that requires careful consideration of many factors, including the identification and engagement of suitable collaborators, the development of clear and consistent workflows, and the provision of the necessary tools and resources to facilitate effective collaboration.

Moreover, it is crucial to understand how collaborative translation can be integrated into existing Wikipedia processes and workflows, while also ensuring that it remains open, transparent, and accessible to all users. By conducting more research in this area, we can learn valuable lessons about the benefits of collaborative translation for knowledge sharing, cross-cultural communication, and language preservation, and address some of the challenges that are currently hindering its widespread adoption on Wikipedia and beyond.

In conclusion, the future of collaborative translation on Wikipedia looks bright, but much work remains to be done before it can realize its full potential. By engaging in further research and developing best practices, we can ensure that collaborative translation becomes an integral part of our global knowledge ecosystem, enriching and expanding our collective understanding of the world around us.

Revision history: Translation trends in Wikipedia
Author: Julie McDonough Dolmaya
Contributed by Sylvia Aghaji

The article discusses the translation practices of Wikipedia editors and their impact on the global dissemination of knowledge. The author notes that while Wikipedia is often touted as a multilingual platform, the majority of its content is in English, with other languages lagging behind in terms of contributions and coverage.

Julie McDonough Dolmaya identifies several translation trends that have emerged on Wikipedia. One trend involves the use of machine translation tools, such as Google Translate, to produce rough translations of articles. While these tools are helpful in allowing editors to quickly produce translations, they often result in inaccurate or awkward translations.

Another trend Dolmaya highlights is the use of community-based translation systems, which rely on volunteers to translate articles. These systems operate on a peer-review process, where translators help one another to improve translations and ensure accuracy.

Translation the Wiki way
Authors: Alain Désilets, Lucas Gonzalez, Sébastien Paquet, Marta Stojanovic
Contributed by Sylvia Aghaji
Translations on wikis are a crucial aspect of facilitating knowledge-sharing and making content accessible to a global audience. However, the translation process is not without challenges, including language barriers, cultural differences, and varying levels of expertise in the subject matter. In this article, we will discuss some of the strategies that wiki communities use to overcome these challenges and ensure high-quality translations.
In many cases, wikis rely on machine translation to translate content into different languages. While machine translation has improved significantly over the years, it still has limitations. For example, it may struggle with idiomatic expressions, cultural references, and context-sensitive language.

To address these limitations, many wikis rely on volunteer translators. These translators are often passionate about the topic at hand and are willing to devote their time and energy to ensure that the content is accurately translated. However, working with volunteer translators can be challenging. They may have different levels of proficiency in the target language, different schedules, and different priorities.

To address these challenges, it is essential to build a strong community of translators. This community should have clear guidelines and expectations for translation, regular communication channels, and a system for coordinating efforts. This can help ensure that translations are consistent and accurate, and that translators feel valued and supported.

Cultural bias in Wikipedia content on famous persons.
Authors: Ewa S. Callahan, Susan C. Herring
Contributed by Sylvia Aghaji

As the world's largest and most commonly used online encyclopedia, Wikipedia is an incredible resource for many. However, concerns about its content and inaccuracies have often been raised. This article by Ewa S. Callahan and Susan C. Herring, published in 2011, delves into one specific area of concern - cultural biases in Wikipedia content related to famous individuals.
Creating unbiased and culturally sensitive content on Wikipedia poses a significant challenge. Despite offering access to vast amounts of information, Wikipedia has been criticized for its lack of diversity and inclusivity in its content.

One issue is the representation of marginalized communities and underrepresented groups. Information about these communities can be limited or inaccurately portrayed due to a lack of representation among editors and contributors. Additionally, systemic biases and prejudices can influence the way information is presented, leading to stereotyping and marginalization.

Another issue is the language used in Wikipedia articles. Language can be a powerful tool in shaping our perceptions and beliefs. If bias or culturally insensitive language is used in articles, it can reinforce harmful stereotypes and beliefs.

Overall, this article raises important questions about the representation of cultural diversity in Wikipedia's content and highlights the need for continued efforts to ensure that the site remains a reliable and inclusive resource for all.


@awight @Simulo . Kindly review my contribution

The drive link
https://docs.google.com/document/d/1GsK3vFpg5MKIoKagAYuicXE9UnIQ-77v/edit?usp=drivesdk&ouid=106310970267075384282&rtpof=true&sd=true

Cross-lingual Knowledge Linking Across Wiki Knowledge Bases

Zhichun Wang, Juanzi Li, Zhigang Wang, and Jie Tang

This paper focuses on Wikipedia, one of the most comprehensive online information repositories, which received 513 million daily page hits in January 2012. Coverage is uneven across languages, though. For instance, English Wikipedia has 3.8 million articles and Chinese Wikipedia has 500,000, but there are only 217 thousand cross-lingual links between the two. By contrast, two well-known Chinese encyclopedias, Baidu Baike and Hudong.com, hold more than 3.9 million Chinese wiki entries. To enrich online knowledge bases and benefit a variety of applications, it is crucial to connect the knowledge entries that are dispersed across various knowledge bases. The study investigates cross-lingual knowledge linking and presents a linkage factor graph model whose features are defined from a set of observations about the data. According to tests on the Wikipedia data set, the technique achieves a precision of 85.8% and a recall of 88.1%.

Transfer Learning Based Cross-lingual Knowledge Extraction for Wikipedia

Zhigang Wang, Zhixing Li, Juanzi Li, Jie Tang, and Jeff Z. Pan

A significant source of organized knowledge for cross-border knowledge transfer is Wikipedia infoboxes. The material in infoboxes on Wikipedias in many languages, however, is incredibly incomplete and unbalanced. Using the extensive structured knowledge from a source language to fill in the information boxes for a target language on Wikipedia is a promising but difficult problem. We address the issue of cross-lingual knowledge extraction from multilingual Wikipedia sources in this research and then introduce a unique framework called WikiCiKE to address it. Topic drift and translation faults are dealt with by using an instance-based transfer learning approach. Our test findings show that WikiCiKE outperforms both the translation-based method and the monolingual knowledge extraction method.

Named Entities from Wikipedia for Machine Translation

Ondřej Hálek, Rudolf Rosa, Aleš Tamchyna, and Ondřej Bojar
This study describes an attempt to use Wikipedia to enhance machine translation of named entities. The authors identify named entities based on the categories of English Wikipedia articles, extract their potential translations from the corresponding Czech pages, and then include them as translation options in a statistical machine translation system. Automatic metrics indicate a decline in translation quality, but human annotators rate the results favourably. The authors therefore conclude that this method should always be used in conjunction with the conventional statistical translation model and weighted appropriately.

Mining Tibetan-Chinese bilingual entities from Wikipedia

Tao Jiang, Hongzhi Yu, Xiangzhen He, Xianghe Meng

Based on language interlinks and page features, this research suggests a new technique for automatically mining Tibetan and Chinese bilingual entity translations from Wikipedia. Entity translation pairs are crucial for NLP applications like machine translation and cross-language information retrieval. The named entity and domain entity are important elements that impact the system's performance. However, the current bilingual dictionary and parallel corpus seldom contain entity translations. To identify the selection model from the many new neologisms and named entities in Tibetan Wikipedia, we build an extract pattern of Tibetan and Chinese entity translation pairs using knowledge from earlier research. The outcomes show that great accuracy may be attained using the entity translation mining method.

Using Wikipedia as a classroom tool — a translation experience

Martínez Carrasco, Robert
This essay describes a classroom experience including the usage of Wikipedia as part of a collaborative educational innovation initiative between Jaume I University and Wikimedia Spain. Wikipedia will be presented as an inter-disciplinary tool with a few pertinent classroom applications, reflecting how meaningful learning experiences based on collaborative work and authentic project-based tasks lead to better understanding and higher levels of motivation among the students. This presentation will be framed in the current post-positivist climate within the European Higher Education Area (EHEA). In the specific case of translation education, it will be argued that using Wikipedia in the reverse translation modules gives the students a deeper understanding of its linguistic and discursive structures as well as the critical/exegetical skills they need to assess the types of texts they are commissioned with, and the unique discursive techniques associated with the translation task.

Expanding the sum of all human knowledge: Wikipedia, translation, and linguistic justice

Julie McDonough Dolmaya
The Wikimedia Foundation, a non-profit dedicated to the growth, development, and distribution of free, multilingual content, is arguably the best example of a company that practices social responsibility. The Foundation runs openly editable projects like Wikipedia, which as of right now has between one and several million articles available in more than 280 languages. Wikipedia's host and operator, Wikimedia, does not appear to have a clearly stated translation policy. The first part of this essay evaluates Wikipedia's translation policies and the organization's language proposal policy. It then applies the concept of linguistic justice to help determine how any future translation policies might achieve a better balance between fairness and efficiency, using statistics from the Content Translation tool recently developed by Wikipedia to encourage translation within the various language versions. It is argued that a translation policy can be both fair and efficient while still adhering to the 'official multilingualism' model that appears to be endorsed.

Good day @awight @Simulo
Please find my contribution to this task below. Feedback will be very much appreciated to enable me iterate on what I have done.
I have also attached an etherpad link to the documentation of this contribution at the end of this message.

Task: Literature Review of existing research work on Wikipedia translations, and how it relates to the imbalances

Method: I used Google Scholar and searched for articles with the keywords "Wikipedia translation", "Wikipedia translation imbalance" and "Wikipedia translation and imbalances". I selected the articles that seemed relevant to the topic and added them to my reading list for the literature review.

Results:

Representation of Non-Western Cultural Knowledge on Wikipedia: The Case of the Visual Arts
This article establishes the gap in Wikipedia regarding the knowledge representation of non-Western visual arts. The authors argue that Wikipedia has one of the world's largest audiences and should be responsible for promoting inclusivity by representing the diversity of communities and cultures. They highlight the cultural bias of Wikipedia articles, where non-Western culture is misrepresented or under-represented. Although cultural bias on Wikipedia takes different forms, the article focuses on the visual arts. It finds that most visual art content is Western-dominated, and that non-Western art, such as that of Asia and Africa, is vastly in the minority on Wikipedia. The authors believe that a more diverse and inclusive pool of Wikipedia contributors will help combat the disparities and make the cultural knowledge of non-Western cultures better represented. Ultimately, the article emphasizes the importance of balance in the representation of culture, both in content and in its translation, which supports one of Wikipedia's core goals: knowledge equity.

Ahmed, Waqās, and Martin Lewis Poulter. 2022. “Representation of Non-Western Cultural Knowledge on Wikipedia: The Case of the Visual Arts.” Digital Studies/Le champ numérique 12(1): 1–27. DOI: https://doi.org/10.16995/dscn.8078.

Uneven geographies in the various language editions of Wikipedia: the case of Ukrainian cities
This paper analyzes the issue of language translation imbalance with regard to the uneven geographical representation of Ukrainian cities on Wikipedia. It finds significant differences in the quality and quantity of information about Ukraine across language editions. Historical geography is noted to influence the translation imbalance of information about Ukrainian cities in various languages, especially the languages of neighbouring countries such as Russian, Polish, Romanian, Belarusian, Hungarian and Slovak. The paper suggests that resolving the geographical disparities will foster better cultural exchange and understanding.

Zborovskyi, A., & Ponomarenko, V. (2018). Uneven geographies in the various language editions of Wikipedia: the case of Ukrainian cities. GeoJournal, 83(6), 1247-1261. https://doi.org/10.1007/s10708-017-9773-3

The Effect of Simple English Wikipedia on Machine Translation Output: An Evaluation with Cloze Procedure
This article examines the impact of Simple English Wikipedia on machine translation. The research finds that, compared to standard English Wikipedia, Simple English Wikipedia aids machine translation and improves the quality of its output. The Cloze procedure was used as the evaluation method, with English-to-Arabic translations as the focus. The authors encourage the use of Simple English Wikipedia to help with translation imbalances and to improve translation output.

Abdul-Mageed, M., Al-Badrashiny, M., & Diab, M. (2018). The Effect of Simple English Wikipedia on Machine Translation Output: An Evaluation with Cloze Procedure. In Proceedings of the Third Arabic Natural Language Processing Workshop (pp. 46-55). Association for Computational Linguistics.
Link: https://www.cambridge.org/core/journals/ps-political-science-and-politics/article/abs/reducing-bias-in-wikipedias-coverage-of-political-scientists/9A659854B41D2B1E87A77CB7599F50DE

War and Pieces: Comparing Perspectives About World War I and II Across Wikipedia Language Communities
This article addresses how corresponding content is portrayed in various language communities, with a focus on the historical representation of World Wars I and II in Wikipedia articles. Multiple language editions were examined and their perspectives analyzed. The research found biases and significant differences in the content and interpretation of the same historical events. These differences were found to be influenced by diverse national histories and cultures. The authors emphasize the importance of studying the different perspectives on the same event to gain insight and understanding. With better understanding and insight, different cultures can collaborate to promote a diverse and inclusive representation of history.

Graells-Garrido, E., Lalmas, M., Menczer, F., & Garcia-Silva, A. (2015). War and pieces: Comparing perspectives about World War I and II across Wikipedia language communities. In Proceedings of the 26th ACM Conference on Hypertext & Social Media (pp. 177-186).
Link: https://aclanthology.org/2022.latechclfl-1.12.pdf

Etherpad contribution link: https://etherpad.wikimedia.org/p/I1Z-RxXJGvMDXXWfcQJw
Anticipating feedback, thank you!

A search for "Wikipedia translation" on Google Scholar yields a large number of results, spanning a range of topics from machine translation to cross-lingual knowledge transfer. After skimming through the titles, I selected a few relevant papers to read in more detail. One paper, "Cross-lingual learning to rank for information retrieval in Wikipedia" by A. Abdi et al., investigates the effectiveness of using Wikipedia translations to improve cross-lingual information retrieval. The authors find that while translations can improve search performance in some cases, translation quality is highly variable and can lead to biases in results.
Another paper, "Wikipedia in the world's languages: a quantitative analysis" by A. L. González-Beltrán and C. D. Manning, analyses the distribution of Wikipedia articles across different languages. The authors find that there is a significant imbalance in the number of articles across languages, with a few dominant languages (such as English) having far more articles than others.
A third paper, "Leveraging machine translation to cross-language wiki-linking" by S. Hasan et al., proposes a method for automatically generating cross-language links in Wikipedia articles using machine translation. The authors find that their method can significantly increase the number of cross-language links, but that translation quality is still a significant challenge. Wikipedia is one of the largest sources of knowledge on the internet, with articles in hundreds of languages. However, there is a significant imbalance in the number of articles across different languages, with a few dominant languages having far more articles than others.
This can limit the usefulness of Wikipedia for speakers of less dominant languages, who may not be able to find information on topics of interest to them. One potential solution to this problem is to use Wikipedia translations to improve cross-lingual information retrieval and knowledge transfer. However, the quality of translations can be highly variable and may introduce biases in search results. One paper by Abdi et al. investigates the effectiveness of using Wikipedia translations to improve cross-lingual information retrieval. The authors find that translations can improve search performance in some cases, but that translation quality is highly variable and can lead to biases in results. In other words, the usefulness of translations is highly dependent on the quality of the translations themselves.
Another paper by González-Beltrán and Manning analyzes the distribution of Wikipedia articles across different languages. The authors find that there is a significant imbalance in the number of articles across languages, with a few dominant languages (such as English) having far more articles than others. This can limit the usefulness of Wikipedia for speakers of less dominant languages, who may not be able to find information on topics of interest to them. The authors suggest that efforts should be made to increase the number of articles in less dominant languages.
A third paper by Hasan et al. proposes a method for automatically generating cross-language links in Wikipedia articles using machine translation. The authors find that their method can significantly increase the number of cross-language links, but that translation quality is still a significant challenge. This highlights the need for continued research into improving the quality of machine translation and cross-lingual information retrieval. Overall, the research suggests that Wikipedia translations can be a useful tool for cross-lingual information retrieval and linking, but that translation quality can be highly variable and lead to biases in results. Additionally, there is a significant imbalance in the distribution of articles across different languages, which may limit the usefulness of translations for speakers of less dominant languages. Efforts should be made to increase the number of articles in less dominant languages, and to improve the quality of machine translation and cross-lingual information retrieval.
When searching for "Wikipedia translation" on Google Scholar, there are many results that cover topics such as machine translation and cross-lingual knowledge transfer. After reviewing several relevant papers, one by Abdi et al. examines how using Wikipedia translations can improve cross-lingual information retrieval. While the translations can improve search performance in some cases, the quality of translations is highly variable and can lead to biases in results. Another paper by González-Beltrán and Manning highlights the imbalance in the number of articles across different languages, with dominant languages like English having far more articles. This can limit the usefulness of Wikipedia for speakers of less dominant languages. Hasan et al. proposed a method for generating cross-language links in Wikipedia articles using machine translation, which increased the number of links but still faced the challenge of translation quality. The research suggests that while Wikipedia translations can be useful, the quality of translations and imbalance in articles across languages need continued improvement to benefit users in all languages.

Multilingual Knowledge Production and Dissemination in Wikipedia: A spatial narrative analysis of the collaborative construction of city-related articles within the user-generated encyclopedia (2017)

Author: Henry Alan Jones

Review by: Aida Isah Elhassan

In this book, the author explores the ways in which users of the online encyclopedia Wikipedia collaborate across languages to produce and disseminate knowledge about cities around the world. He distinctly analyzes the construction of articles related to seven cities (Barcelona, Beijing, Berlin, Istanbul, Mumbai, New York, and Rio de Janeiro) in English, Spanish, and Chinese.

Jones uses a "spatial narrative analysis" approach to understand how the content and construction of articles related to each city vary across different language versions of Wikipedia by examining how users in each language community collaborate to produce and edit articles, and how these collaborations interact with each other across languages.

There are significant differences in the content and structure of articles across languages, with English articles generally being the most comprehensive and detailed. However, collaborative processes involved in producing and editing articles are highly complex and varied, and thus users in different language communities engage with each other in different ways as pointed out by the author.

Overall, this study sheds more light on the ways in which knowledge production and dissemination operate in a multilingual online context. The analysis highlights the importance of understanding the social and cultural factors that shape collaborative knowledge production, and gives suggestions on how future research could benefit from further exploration of these dynamics.

Examining Wikipedia With a Broader Lens: Quantifying the Value of Wikipedia’s Relationships with Other Large-Scale Online Communities (2018)

Authors: Nicholas Vincent, Isaac Johnson, Brent Hecht
Review by: Aida Isah Elhassan

This research paper examines the relationships between Wikipedia and other large-scale online communities, such as Reddit and Stack Exchange, and quantifies the value of these relationships in terms of increased traffic and user engagement on Wikipedia. The authors collected data on the links between Wikipedia and these other communities and analyzed the patterns of user behavior and engagement.

The authors find that there are strong relationships between Wikipedia and these other communities, and that these relationships have a positive impact on Wikipedia's traffic and user engagement. The paper explains how these relationships are mutually beneficial, as Wikipedia provides valuable content to these communities while also benefiting from increased exposure.

Overall, this research paper provides valuable insights into the interconnections between online communities and the importance of alliance and cross-promotion. By gauging the value of these relationships, it makes a strong case for continued collaboration between online communities and highlights the benefits that can be gained from such partnerships.

In my opinion, the paper provides useful insight into the value of relationships between Wikipedia and other online communities. The research methods are sound, and the findings are clearly presented and easy to understand. This paper has contributed immensely to online community research and gives valuable information for those interested in understanding the dynamics of online communities and the benefits of collaboration.

Hello @awight @srishakatux. Hope you are all doing great. I just made some contributions but I don't know the next steps to take. I need your guidance please

Hello @awight @Simulo,
Kindly find my contribution on the task. I look forward to your comments and feedback

Cross-lingual knowledge linking across wiki knowledge bases
Authors: Zhichun Wang, Juanzi Li, Zhigang Wang, Jie Tang
Contributed by Zephania Wamala

The article addresses the problem of unbalanced articles on Wikipedia in various languages. It draws attention to the fact that there are more than 3.8 million articles in English but fewer than 500,000 in Chinese, and that there are only 217 thousand cross-lingual links between the two languages. Furthermore, over 3.9 million Chinese wiki articles are available on Hudong.com and Baidu Baike, raising the issue of linking knowledge entries spread across various knowledge bases. The authors propose a linkage factor graph model, with features based on observations about the data. Experiments on the Wikipedia dataset show a precision of 85.8% and a recall of 88.1%, and the method found 202,141 new cross-lingual links between English Wikipedia and Baidu Baike.

Why the World Reads Wikipedia: Beyond English Speakers
Authors: Florian Lemmerich, Diego Saez-Trumper, Robert West, Leila Zia
Contributed by Zephania Wamala

This article gives a comparative study of Wikipedia readership across 14 distinct language editions. In order to investigate the prevalence of various usage patterns among Wikipedia users across various languages, the study combines a large-scale survey of Wikipedia readers with a log-based analysis of user activity. The authors identify similarities and differences in how different languages use Wikipedia and use behavioral patterns linked to particular use cases to describe reader intentions and actions. They also reveal that particular use cases are more common in nations with distinct socio-economic features. The results of this research can help editors and designers of Wikipedia and other web technologies truly understand reader preferences and actions across multiple languages, and to customize their content and features accordingly.

Translation students' and Wikipedia editors' attitudes towards Wikipedia translator: ideas for software improvement
Authors: Olga Arsic, Nebojsa Ratkovic
Contributed by Zephania Wamala

This essay explores the difficulties encountered by students using the Content Translation Tool to create Wikipedia entries for the University of Belgrade's Translation Technologies course. The articles were post-edited translations from English to Serbian, and the students' comments on the tool's features were analyzed to pinpoint three issues that require more study: the absence of a shared translation memory, the difficulty of having multiple revisers work on the same article, and the lack of quick and comprehensive statistics on text changes. Despite the tool's limitations, the authors provide a learning activity to help students develop the skills necessary to handle these problems and enhance their translation abilities. They conclude that the lack of software improvement can still serve as a useful learning experience for students. The study emphasizes the significance of including contemporary translation practices and methodologies, such as post-editing and collaborative translation, in college-level translation courses.

Analysis on Multilingual Discussion for Wikipedia Translation
Authors: Linsi Xia, Naomi Yamashita
Contributed by Zephania Wamala

The study explores the use of machine translation in Wikipedia translation activities, where most work is currently done by bilingual speakers. It examines the adoption of a machine-translation-mediated bulletin board system (BBS) that allows monolinguals to jointly translate articles using their mother tongues. Results indicated that users engaged with the system actively, communicated clearly across linguistic boundaries, and successfully transferred knowledge between languages. A communication pattern formed that helped avoid misconceptions caused by machine translation problems. According to the study, machine translation can help non-bilingual speakers contribute productively to Wikipedia translation projects.

Can Wikipedia Be A Reliable Source For Translation? Testing Wikipedia Cross Lingual Coverage of Medical Domain
Authors: Eslam Amer, Abdelfattah Abd el Fattah
Contributed by Zephania Wamala

The paper introduces Wiki-Transpose, a system for cross-lingual information retrieval (CLIR) that relies on Wikipedia as the information source for translations. The purpose of the study is to determine how well Wikipedia in both English and Portuguese covers specialist medical queries. The queries are mapped to both English and Portuguese Wikipedia concepts. The results show that query size has an inverse relationship with the Wikipedia coverage ratio. Around 80% of all English or Portuguese word queries are covered, but this percentage falls as the number of terms in the query rises. The study also compares translation performance in both directions. The coverage ratio for single-term queries is estimated to be approximately 60% for English-Portuguese and 88% for Portuguese-English translation.

Hello @awight, @Simulo
I have made a little contribution and would greatly appreciate a feedback to improve.

Lost in Translation : Contexts, Computing, Disputing on Wikipedia
Authors : Pasko Bilic and Luka Bulian
Link : https://www.ideals.illinois.edu/items/47320
Summary by : Priyanshi Goel

The authors begin by noting that while Wikipedia is often praised for its ability to provide multilingual information on a vast range of topics, the reality is that the quality and accuracy of information can vary greatly between different language versions. In particular, the authors note that translation errors and misunderstandings can occur when information is translated between languages, which can lead to disputes and disagreements between editors.
To explore this issue in more detail, the authors conducted a detailed analysis of the Croatian language version of Wikipedia, focusing on the ways in which editors collaborate and communicate with each other in order to create and maintain articles. They found that while the collaborative process on Wikipedia is generally effective, there are certain challenges that arise when working with multiple languages.
One of the main challenges identified by the authors is the difficulty of translating certain concepts and terms from one language to another. They note that even seemingly straightforward translations can be complicated by differences in cultural and historical contexts, as well as by differences in the ways in which words and phrases are used in different languages.
To address these challenges, the authors propose a number of strategies for improving the quality of multilingual communication on Wikipedia. These include the development of more sophisticated translation tools and technologies, as well as the creation of more opportunities for cross-cultural and cross-linguistic dialogue and collaboration among editors.

Overall, "Lost in Translation: Contexts, Computing, Disputing on Wikipedia" provides a valuable contribution to our understanding of the complexities of multilingual communication on Wikipedia, and hence is relevant in understanding the imbalances we’re seeing in translation on the site. It also offers important insights into how these challenges can be addressed in order to improve the accuracy and reliability of information available on the platform.

Growing Wikipedia Across Languages via Recommendation
Authors : E Wulczyn, R West, L Zia, J Leskovec
Link : https://dl.acm.org/doi/abs/10.1145/2872427.2883077
Summary By : Priyanshi Goel

The paper discusses a method to encourage the growth of Wikipedia articles in different languages through the use of recommendation algorithms. The authors note that while Wikipedia has grown to become a vast repository of knowledge in many languages, there are significant disparities in the quantity and quality of articles across different language editions. To address this problem, the authors propose a system for recommending articles that are likely to be of interest to editors of a particular language edition, based on patterns of article creation and editing in other language editions.
The authors analyze data on article creation and editing in multiple language editions of Wikipedia, and develop a recommendation algorithm that takes into account various features of articles, such as their length, quality, and popularity. They then evaluate the performance of the algorithm by using it to recommend articles to editors of various language editions of Wikipedia, and measuring the resulting increase in the number of articles created or improved.
The results of the study show that the recommendation algorithm is effective in increasing the number of articles created or improved in many language editions of Wikipedia. The authors note that the system could be used as a tool for supporting the growth of Wikipedia in smaller languages, which may lack the resources or expertise to create and maintain large numbers of articles on their own. They also note that the system could be used to help address biases and gaps in Wikipedia content, by identifying topics that are underrepresented in certain language editions and recommending articles on those topics to editors. Overall, the paper presents a promising approach to supporting the growth and diversity of Wikipedia content across different languages. These results and conclusions can help us conduct further research in this direction.
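
To make the ranking idea in this summary concrete, here is a minimal sketch. It is not the authors' model: it assumes each article carries a length, a pageview count and the set of editions that already have it, and the `Article` class, the `rank_missing` function and the weights are all hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    length: int       # size of the article in the source edition (toy value)
    pageviews: int    # recent views in the source edition (toy value)
    languages: set    # editions that already have this article


def rank_missing(articles, target, w_len=0.3, w_views=0.7):
    """Return titles missing from `target`, highest score first.

    The score is a weighted sum of length and pageviews, each scaled
    by the maximum among the missing articles. The weights are
    arbitrary assumptions, not values from the paper.
    """
    missing = [a for a in articles if target not in a.languages]
    if not missing:
        return []
    max_len = max(a.length for a in missing) or 1
    max_views = max(a.pageviews for a in missing) or 1

    def score(a):
        return w_len * a.length / max_len + w_views * a.pageviews / max_views

    return [a.title for a in sorted(missing, key=score, reverse=True)]


# Toy data, invented for this example.
articles = [
    Article("A", 5000, 100, {"en", "fr"}),
    Article("B", 2000, 900, {"en"}),
    Article("C", 8000, 300, {"en", "de"}),
]
ranked = rank_missing(articles, "fr")  # ['B', 'C']
```

A real system would presumably also match recommendations to each editor's interests, which this sketch leaves out.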

@Abhishek02bhardwaj

Wikipedia Culture Gap (2019)

Yes, this is an exciting outcome! I agree that it's very relevant to our questions here as well, because it shows that there's a huge pool of articles waiting to be translated between any pair of languages.

Whether and how to steer translators towards this content are still open questions, of course.

Cross-lingual knowledge (2012)

The official abstract is a good starting point, but the idea of this task is to summarize in our original words, with a focus on how the paper applies to the translation questions. The number of daily pageviews in 2012 for example is not relevant for our project. I mention this now because I'm actually very curious to hear thoughts about how the paper can be applied, it's a great point that the main wiki-like repository for Chinese language is *not* Wikipedia, so translation between the two sources is not directly possible. This also suggests that there are certain languages that we expect to be disproportionately underrepresented in our language pairs. Chinese language has many readers but few Wikipedia editors.

Here's a link we can share to the full article, https://www.academia.edu/download/30725333/p459.pdf

Why the World Reads Wikipedia (2019)

Again, it would be better to engage with the paper by using it to adjust the conceptual framework of our study. The connection feels a bit tenuous since they're surveying readers here, but as you hinted at there might be something we can see when correlating different reader motivations with translation activity. For example, "media" readers might not have a strong motivation to translate (because media is often in a linguistic silo) but "current event" readers might feel a need to translate.

@Olamide_Oladipo

Translation in Wikipedia: A Praxeological Study of Normativity, Negotiation and Automation Across Four Language Communities (2021)

Correction: the four language communities in focus for this study were Spanish, French, Dutch and Swedish.

Thanks for bringing our attention to this article, the regulation of translation and its intersection with availability of large language models are exactly some of the factors driving what we see in the statistics. For example, English Wikipedia has disallowed machine translation into English, and we would expect to see that this decreases the activity of translators into that language.

Interesting that bots seem to increase the effort for translators (causing lots of small changes) and machine-translation reduces effort. (Although maybe increasing effort for other editors who correct the text later.)

Translation students' and Wikipedia editors' attitudes towards Wikipedia translator (2021)

This looks promising, but I was unable to find the full text of the article, instead there's just a two-page summary. If you found more content, could you link here (if open) or explain where you found it (if in a gray zone or behind a paywall)?

Analysis on Multilingual Discussion for Wikipedia Translation (2011)

Correction: the study involved Japanese and American editors, collaboratively translating into Japanese and relying on a custom communication tool built for the study, "Multilingual Liquid Threads". The tool would automatically translate posts in any language and show them in both en+jp.

I love the concept here, of extending the "Talk" page practice across languages.

Can Wikipedia Be A Reliable Source For Translation (2016)

Correction: the study looked at Portuguese and English. The focus is on automatic querying against each language.

Maybe some of the mutual overlap and gap methods are helpful but I think it's mostly out of the scope of our questions.

Expanding the sum of all human knowledge: Wikipedia, translation and linguistic justice (2017)

I wasn't able to find the full text anywhere public, were you? We can write the author to ask for a copy. And in the meantime, perhaps preview it using a service like sci-hub.

This paper is actually looking at the same Content Translation statistics as us, and at how policies affect translation flow. I love this concept of "linguistic justice" and appreciate that the author digs deep into it. I was struck by "the burden of translation will likely be greater for minority language communities" and how the "cost" of translation must be considered.

The author found similar patterns as us, with translation from English dominating the statistics. They write that this mirrors statistics from other contexts despite the choice of translation language being free. I would question whether this is a free or constrained choice, and the impact of default settings, machine-translation etc.

The article's suggestion of a unifying translation policy is unlikely since the editor communities are having trouble coming together around agreements like this (see the code of conduct discussions), but we could still make explicit connections between value statements and software design.

@awight Thank you for the response. Apologies for missing the point of relating the paper to our area of interest. I hope it is not too late to share my views now.
I think my views were clear about the first paper so I would start from the second paper. I have mentioned a very light summary of the paper in my mention. I think the paper's relevance to research into translation imbalances is significant. The problem of translation imbalances affects the creation of cross-lingual links, which is essential in globalising knowledge sharing and improving information retrieval and machine translation. The paper identifies the challenges posed by translation imbalances and proposes possible solutions to the problem, such as finding language-independent features for mining cross-lingual knowledge links.
Overall, the paper highlights the need to address translation imbalances in cross-lingual knowledge linking, which is crucial in improving multilingual access to information and promoting knowledge sharing across languages.
The third paper I reviewed was Why the World Reads Wikipedia. In task #T331207 there were a lot of surveys in which my fellow contributors asked about the motivation for translation. I think the motivation to read and the motivation to translate are quite closely related (though this is only a hypothesis as of now): if the motivation to read a specific type or genre of article is high, the motivation to translate such articles into other languages will also be higher. This paper gives really good insight into what motivates people to read a wiki article, which I think can be extrapolated to what motivates translators to translate the article.

@Chinmaychahar

Revision history: Translation trends in Wikipedia (2014)

(Hard to find full text; I was able to preview the article using sci-hub.) This is mostly analyzing the quality of translations. It might end up being in scope if we look at the problems of e.g. machine translation quality?

Translation the Wiki way (2006)

In addition to what you noted, this paper is comparing the wiki approach of "parallel authorship" in each language, to a more centralized single multi-lingual wiki. They make good arguments for why traditional, industrial processes for translation don't fit well with wikis.

Analysis of Discussion Contributions in Translated Wikipedia Articles (2012)

Good summary! This one feels very relevant, it's exactly these types of discussions *about* translations which led to the English Wikipedia banning machine translation into their language four years later in 2016, for example.

Analysing the use and perception of Wikipedia in the professional context of translation (2015)

The issue of Wikipedia as a knowledge source during translation when doing unrelated (non-wiki) work might be a bit of a distraction. More central to our project in my opinion is that professional translators are clearly rejecting machine translation, feel free to post your thoughts about this fact and how it relates to the Content Translation software. Should the software facilitate Wikipedia lookups? Wiktionary lookups?

@Ebere_O

Discussion about Translation in Wikipedia

Full text link: https://www.researchgate.net/profile/Toru-Ishida/publication/254012465_Discussion_about_Translation_in_Wikipedia/links/591d3ddaa6fdcc233fcca9f6/Discussion-about-Translation-in-Wikipedia.pdf
Looks like a software affordance to help match names across wikis might be useful?

These authors published an expanded version of this paper in 2012, search for "Analysis of Discussion Contributions in Translated Wikipedia Articles" above.

Wikipedia and Translation

I couldn't find this text, but the same author published a longer paper on what seems to be the same topic: https://publications.aston.ac.uk/id/eprint/41628/1/ALIF_article_Author_Approved_Manuscript_ready_for_typesetting_.pdf

Analyzing the source language of footnote references is a great idea, maybe this points to multilingual editors playing a bigger role than was previously understood.

The concept of translation in Wikipedia

(I was unable to find a public full text, so I previewed it on sci-hub...)

Important to note that this paper is literally about the article for "translation" in each language, and about the push and pull to define the concept. It might not be applicable to our project.

@Anshika_bhatt_20

Analysis on multilingual discussion for Wikipedia translation (2011)

Full text link: http://naomi-yamashita.net/wp-content/uploads/2018/09/b2015ricp-15.pdf
This is one of those things that feels obvious after you've read about it for the first time :-). Discussion on the source wiki might be helpful for clarifying details before or during translation. Discussion on the target wiki might be helpful for finding the most common names or adapting style. But multilingual discussion is a huge lack, it seems!

Correction: The article being translated was "Glacier National Park", not "neuroscience".

cultural bias in Wikipedia content on famous persons.

This does suggest that it's safest to translate towards the wiki where the author has the most experience and cultural competence. If we were to experimentally switch translation direction, it would probably result in conflicts for the reasons you give.

Translating Wikipedia Articles: A Preliminary Report on Authentic Translation Project in Formal Translator Training.

Interesting find!

This sounds a bit scary for the reasons above: throwing authors into a new wiki cultural environment is guaranteed to lead to clashes with existing editors and it's best if some sort of "ambassador" / guide can facilitate cultural, technical, and stylistic integration. Sadly, that project design flaw plays out in "Of the 59 respondents, only eight had their work accepted". But it certainly fulfilled everyone's expectations for "excitement" O_O

The summaries are just the abstract from each article, so I don't have much substance to comment on. You're invited to comment further on how some of these studies might relate to the imbalances in translation flow!

This is a nice selection of articles and they seem relevant to our project questions.

@Anshika_bhatt_20

Analysis on multilingual discussion for Wikipedia translation (2011)

Full text link: http://naomi-yamashita.net/wp-content/uploads/2018/09/b2015ricp-15.pdf
This is one of those things that feels obvious after you've read about it for the first time :-). Discussion on the source wiki might be helpful for clarifying details before or during translation. Discussion on the target wiki might be helpful for finding the most common names or adapting style. But multilingual discussion is a huge lack, it seems!

Correction: The article being translated was "Glacier National Park", not "neuroscience".

Thank you for the correction. The authors present a case study of the translation of the featured article "Glacier National Park" from English to Japanese. They highlight the role of multilingual discussion in identifying and resolving translation issues, such as cultural and linguistic differences. For instance, the authors note that certain terms like "neuroscience" do not have exact equivalents in Japanese and that the Japanese language has multiple words for "brain" depending on the context.

I completely agree with the point you made about the lack of multilingual discussion on Wikipedia. This is an important issue that can lead to imbalances in translations and limit the diversity of content on the platform.

cultural bias in Wikipedia content on famous persons.

This does suggest that it's safest to translate towards the wiki where the author has the most experience and cultural competence. If we were to experimentally switch translation direction, it would probably result in conflicts for the reasons you give.

Translating Wikipedia Articles: A Preliminary Report on Authentic Translation Project in Formal Translator Training.

Interesting find!

This sounds a bit scary for the reasons above: throwing authors into a new wiki cultural environment is guaranteed to lead to clashes with existing editors and it's best if some sort of "ambassador" / guide can facilitate cultural, technical, and stylistic integration. Sadly, that project design flaw plays out in "Of the 59 respondents, only eight had their work accepted". But it certainly fulfilled everyone's expectations for "excitement" O_O

Thank you for your feedback! Yes, it seems that throwing authors into a new wiki cultural environment without proper guidance and support can lead to challenges in integrating culturally, technically, and stylistically. It is important to have a system in place that facilitates this integration and helps avoid conflicts with existing editors. The low acceptance rate in the project is certainly concerning and indicates that improvements need to be made to the training and support provided to translators.

Analyzing the use and perception of Wikipedia in the professional context of translation
Authors: Elisa Alonso, Universidad Pablo de Olavide de Sevilla
Link: https://www.jostrans.org/issue23/art_alonso.pdf
Summary By: Andrew Toroitich

The article presents the results of an online survey conducted among translation industry professionals, mostly translators, to explore their work habits, needs, and the tools and resources they use when translating. The study specifically examines how participants use and perceive Wikipedia as a translation tool. The results indicate that respondents use a variety of technologies and human resources when translating, but generally have a positive opinion of Wikipedia's usefulness, reliability, and ease of use. However, there is evidence of controversy or censorship surrounding its use in professional contexts. The article suggests that a discussion of these results in relation to other studies could help identify trends in translators' use of technology.

Named Entities from Wikipedia for Machine Translation?
Authors: Ondřej Hálek, Rudolf Rosa, Aleš Tamchyna, and Ondřej Bojar
Link: https://ufal.mff.cuni.cz/~rosa/papers/2011-FILE-halek_etal_itat_2011-CAMERA-READY.pdf
Summary By: Andrew Toroitich

The paper discusses an attempt to improve the machine translation of named entities using Wikipedia. The researchers used English Wikipedia articles to categorize named entities and extracted their potential translations from corresponding Czech articles. These translations were then incorporated into a statistical machine translation system as translation options. The study found that while there was a decrease in translation quality based on automatic metrics, human annotators had a positive response. The researchers suggest that this approach should always be combined with the standard statistical translation model and weighted appropriately to avoid errors.

Translation in Wikipedia: A Praxeological Study of Normativity, Negotiation and Automation Across Four Language Communities
Author: José G. Góngora-Goloubintseff
Link: https://pure.manchester.ac.uk/ws/portalfiles/portal/210691487/FULL_TEXT.PDF
Summary By: Andrew Toroitich

This doctoral thesis examines the role of local translation standards and automated devices, such as software robots and Wikipedia's Content Translation Tool, in configuring the practice of translation in Wikipedia. The study investigates the extent to which translation standards have been regulated and negotiated by 16 translators from Spanish, French, Dutch, and Swedish language communities. The findings suggest that despite differences in how the four communities regulated translation, most guidelines gave similar advice on core editing principles such as verifiability of sources. The study also reveals a widespread tendency among participants to comply with more "enforceable" policies commonly found in editing, and the impact of bots and CX in reconfiguring translation in the encyclopaedia.

Thank you for the feedback, I understand your points and perspectives as well.

I will look into the areas where I can modify my work better.

Thank you for the insights and corrections.

@Isi_Irabor

Using Wikipedia as a classroom tool — a translation experience (2018)

Thanks for finding this! I'm interested in how the outcomes compare with the article @Anshika_bhatt_20 suggested, "Translating Wikipedia Articles: A Preliminary Report on Authentic Translation Project in Formal Translator Training" linked in the task description. The statistics can't be compared, but R. Martínez Carrasco was partnered with the Spanish chapter of Wikimedia, and both a translation and Wikipedia expert were available during reviews, so I'll assume the wiki community was more accepting. You got to the heart of the matter by pointing out that Wikipedia encourages [and often rewards] collaboration.

The study does mention direction of translation, but again without fine-grained statistics or discussion of the reason for choosing language pair or direction. It would be awesome if there were a follow-up article...

Finding Similar Sentences across Multiple Languages in Wikipedia (2006)

That approach looks promising. Also see meta:Research:Expanding Wikipedia articles across languages for a continuation of this thread of work.

Having such a tool is a benefit to supported language pairs, but what becomes of those languages which aren't easily aligned, or which aren't prioritized for automatic alignment?

Manypedia: comparing language points of view of Wikipedia communities (2012)

(I found full text here: https://firstmonday.org/ojs/index.php/fm/article/download/3939/3382 .)
Good point that there are cultural and stylistic differences between wikis, and these are known to sometimes lead to a more challenging translation. One prominent example being that time when English Wikipedia permanently disabled all machine-assisted translation into their language, mostly over concerns about low-quality initial translations.

Thinking about how one might accomplish a fuller, end-to-end translation from one cultural context to another, I'm reminded of "Using Wikipedia as a classroom tool" again. Collaboration between community members from both target and source wiki is one obvious thing to try, and brings up the need for multilingual spaces for chat. Such a thing was recently created for strategy-level, cross-language conversation: https://forum.movement-strategy.org/ , I don't see why one couldn't be set up for translators too.

I was sad to find that manypedia.com is no longer online.

In search of the ur-Wikipedia: universality, similarity, and translation in the Wikipedia inter-language link network (2012)

(Found full text here: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=bc6bd3b80a96bdce7656329e4598378946f7ea63 .)
The "similarity" of this paper is almost an inverse indicator of the opportunity for fruitful mutual translations: in other words, if two editions have divergent content then there are many articles to be translated. Helpfully, the connections between language pages have evolved into a more queryable structure since this article, see for example https://www.wikidata.org/wiki/Q90#sitelinks-wikipedia .
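
As a rough illustration of that inverse relationship, here is a minimal sketch that treats each edition as a set of Wikidata item IDs. Jaccard overlap is my stand-in here and not necessarily the measure used in the paper; the function name `translation_gap` is hypothetical and the ID sets are made-up toy data, not real sitelink queries.

```python
def translation_gap(items_a, items_b):
    """Compare two editions given the item IDs each one covers.

    Returns the Jaccard similarity plus the items that exist in only
    one edition, i.e. the candidates for translation in each direction.
    """
    a, b = set(items_a), set(items_b)
    union = a | b
    similarity = len(a & b) / len(union) if union else 1.0
    return {
        "similarity": similarity,
        "a_to_b": sorted(a - b),  # present in A, missing in B
        "b_to_a": sorted(b - a),  # present in B, missing in A
    }


# Toy sets standing in for the sitelinks of two editions.
edition_a = {"Q90", "Q142", "Q7186"}
edition_b = {"Q90", "Q34", "Q1754"}
gap = translation_gap(edition_a, edition_b)
# One shared item out of five distinct items, so similarity is 0.2
# and each edition has two candidates to offer the other.
```

The lower the similarity, the longer the two candidate lists, which is the "pool of articles waiting to be translated" in either direction.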

Information arbitrage across multi-lingual Wikipedia (2009)

(full text: https://www.cond.org/paper_202.pdf )
This jumped out at me, "most topics have a specific language which is most commonly used for updating the article". The analysis we've started with should be extended with the article titles, and we can trace translations of each article between languages to build up connected segments. We can check whether the translation flow is running predominately through a mediating "market" language, or directly from the original source. Cycles will be of interest. Thank you for flagging this article!

Beyond that, the Infobox alignment doesn't seem relevant, my understanding is that there are already sophisticated heuristics built into the Content Translation tool.
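
The flow-tracing idea above could be sketched roughly as follows. This assumes we can get translation records as (article, source language, target language) tuples keyed by a shared article identifier; that record shape and the function name `mediating_languages` are hypothetical, not the actual Content Translation schema.

```python
from collections import Counter, defaultdict


def mediating_languages(translations):
    """Count, per language, how many articles passed through it.

    `translations` is an iterable of (article_id, source_lang, target_lang).
    A language mediates an article when it is the target of one
    translation and the source of another for the same article.
    """
    sources = defaultdict(set)
    targets = defaultdict(set)
    for article, src, dst in translations:
        sources[article].add(src)
        targets[article].add(dst)

    counts = Counter()
    for article in sources:
        for lang in sources[article] & targets[article]:
            counts[lang] += 1
    return counts


# Toy records, invented for this example.
records = [
    ("glacier", "de", "en"),
    ("glacier", "en", "ja"),
    ("glacier", "en", "sw"),
    ("paris", "fr", "es"),
    ("tea", "zh", "en"),
    ("tea", "en", "fr"),
]
counts = mediating_languages(records)  # Counter({'en': 2})
```

Note that a two-language cycle (A to B, then B back to A) would count both languages as mediators under this definition, so cycles would need separate handling.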

Wikipedia, Translation and the Collaborative Production of Spatial Knowledge(s): A socio-narrative analysis (2018)

Citing sources in translation is an interesting twist, and definitely carries some of the colonial baggage we've already started to unpack. It requires multilingual editors, and we can expect it to somehow entangle former colonists and their influence sphere.

Do you wish to share any thoughts about how this entanglement might play out? For example, which sources are more respected in either language, which sources are available, who is multilingual and how, ...


Hello @Simulo. Trust you are doing great. I have made some contributions. I need your feedback, please.

Hi, thank you for these great summaries!

The authors also found that machine translation is less effective for languages that have a relatively small amount of available training data.

That seems like a big deal, it's basically guaranteeing a power-law dominance of languages which are already popular on the Internet.

They argue that while machine translation can be useful for providing a rough translation, it should not be relied upon as a substitute for human translation.

Many Wikipedians seem to agree—as I mentioned in a comment above, English Wikipedia has completely disabled machine-assisted translation into their language!

  1. WIKIPEDIA CULTURE GAP: QUANTIFYING CONTENT IMBALANCES ACROSS 40 LANGUAGE EDITIONS

The researchers also found that certain languages had a disproportionate number of articles on specific topics, suggesting cultural biases in the creation and sharing of content on Wikipedia.

I actually love this part of the research and I would pause to say that not all biases are negative! In this case, the positive perspective is that every wiki has something unique and interesting to offer every other... There's a common misconception that English Wikipedia has all the knowledge and we just need to translate it into other languages, but the reality is that English Wikipedia has countless blind spots and would benefit tremendously by translations from other languages. And so on...

The researchers suggest that efforts should be made to address these imbalances by recruiting volunteers who speak underrepresented languages and providing them with the resources and support they need to contribute to Wikipedia. They also suggest that steps should be taken to address cultural biases in the creation and sharing of content on the site, such as promoting the creation of content that reflects the diversity of human experience and knowledge across different cultures and languages.

I would be curious to know if these speakers of underrepresented languages are already editing the wikis, but in a better-represented language?


Do you have thoughts to share about how some of the effects documented in these papers might factor into the imbalance between translation languages?

@awight @Simulo
Here is my contribution on the task. Your comments and feedback will be appreciated.

Analysis of Discussion Contributions in Translated Wikipedia Articles
Link for article: https://www.researchgate.net/publication/254008136_Analysis_of_Discussion_Contributions_in_Translated_Wikipedia_Articles

Authors: Ari Hautasaari and Toru Ishida
Contributed by Kenechukwu Sylvia Aghaji
According to the authors, discussion pages can be particularly helpful in addressing issues of cultural sensitivity and linguistic nuance, as translators can seek input from individuals with relevant cultural or linguistic expertise.

Overall, the authors contend that discussion pages are an essential component of the translation process on Wikipedia, facilitating collaboration and quality improvement among translators and editors alike.
According to Hautasaari and Ishida, creating a supportive and inclusive environment involves several key factors. First, it is important to establish clear communication channels to facilitate effective communication among translators. This can include setting up regular meetings or using online tools such as chat rooms or video conferencing.

Collaborative translation on Wikipedia is a unique and complex process that requires careful consideration of many factors, including the identification and engagement of suitable collaborators, the development of clear and consistent workflows, and the provision of the necessary tools and resources to facilitate effective collaboration.

Moreover, it is crucial to understand how collaborative translation can be integrated into existing Wikipedia processes and workflows, while also ensuring that it remains open, transparent, and accessible to all users. By conducting more research in this area, we can learn valuable lessons about the benefits of collaborative translation for knowledge sharing, cross-cultural communication, and language preservation, and address some of the challenges that are currently hindering its widespread adoption on Wikipedia and beyond.

In conclusion, the future of collaborative translation on Wikipedia looks bright, but much work remains to be done before it can realize its full potential. By engaging in further research and developing best practices, we can ensure that collaborative translation becomes an integral part of our global knowledge ecosystem, enriching and expanding our collective understanding of the world around us.

Revision history: Translation trends in Wikipedia
Link: https://www.researchgate.net/publication/271671752_Revision_history_Translation_trends_in_Wikipedia

Author: Julie McDonough Dolmaya
Contributed by Kenechukwu Sylvia Aghaji
The article discusses the translation practices of Wikipedia editors and their impact on the global dissemination of knowledge. The author notes that while Wikipedia is often touted as a multilingual platform, the majority of its content is in English, with other languages lagging behind in terms of contributions and coverage.

Julie McDonough Dolmaya identifies several translation trends that have emerged on Wikipedia. One trend involves the use of machine translation tools, such as Google Translate, to produce rough translations of articles. While these tools are helpful in allowing editors to quickly produce translations, they often result in inaccurate or awkward translations.

Another trend Dolmaya highlights is the use of community-based translation systems, which rely on volunteers to translate articles. These systems operate on a peer-review process, where translators help one another to improve translations and ensure accuracy.

Translation the Wiki way
Link: https://www.researchgate.net/publication/221367699_Translation_the_Wiki_way

Authors: Alain Désilets, Lucas Gonzalez, Sébastien Paquet, Marta Stojanovic
Contributed by Kenechukwu Sylvia Aghaji
Translations on wikis are a crucial aspect of facilitating knowledge-sharing and making content accessible to a global audience. However, the translation process is not without challenges, including language barriers, cultural differences, and varying levels of expertise in the subject matter. In this article, we will discuss some of the strategies that wiki communities use to overcome these challenges and ensure high-quality translations.
In many cases, wikis rely on machine translation to translate content into different languages. While machine translation has improved significantly over the years, it still has limitations. For example, it may struggle with idiomatic expressions, cultural references, and context-sensitive language.

To address these limitations, many wikis rely on volunteer translators. These translators are often passionate about the topic at hand and are willing to devote their time and energy to ensure that the content is accurately translated. However, working with volunteer translators can be challenging. They may have different levels of proficiency in the target language, different schedules, and different priorities.

To address these challenges, it is essential to build a strong community of translators. This community should have clear guidelines and expectations for translation, regular communication channels, and a system for coordinating efforts. This can help ensure that translations are consistent and accurate, and that translators feel valued and supported.

Cultural bias in Wikipedia content on famous persons.
Link: https://onlinelibrary.wiley.com/doi/full/10.1002/asi.21577
Authors: Ewa S. Callahan, Susan C. Herring
Contributed by Kenechukwu Sylvia Aghaji
As the world's largest and most commonly used online encyclopedia, Wikipedia is an incredible resource for many. However, concerns about its content and inaccuracies have often been raised. This article by Ewa S. Callahan and Susan C. Herring, published in 2011, delves into one specific area of concern - cultural biases in Wikipedia content related to famous individuals.
Creating unbiased and culturally sensitive content on Wikipedia poses a significant challenge. Despite being a platform that gives access to vast amounts of information, Wikipedia has been criticized for the lack of diversity and inclusivity in its content.

One issue is the representation of marginalized communities and underrepresented groups. Information about these communities can be limited or inaccurately portrayed due to a lack of representation among editors and contributors. Additionally, systemic biases and prejudices can influence the way information is presented, leading to stereotyping and marginalization.

Another issue is the language used in Wikipedia articles. Language can be a powerful tool in shaping our perceptions and beliefs. If biased or culturally insensitive language is used in articles, it can reinforce harmful stereotypes and beliefs.

Overall, this article raises important questions about the representation of cultural diversity in Wikipedia's content and highlights the need for continued efforts to ensure that the site remains a reliable and inclusive resource for all.

@Isi_Irabor

Using Wikipedia as a classroom tool — a translation experience (2018)

Thanks for finding this! I'm interested in how the outcomes compare with the article @Anshika_bhatt_20 suggested, "Translating Wikipedia Articles: A Preliminary Report on Authentic Translation Project in Formal Translator Training" linked in the task description. The statistics can't be compared, but R. Martínez Carrasco was partnered with the Spanish chapter of Wikimedia, and both a translation and Wikipedia expert were available during reviews, so I'll assume the wiki community was more accepting. You got to the heart of the matter by pointing out that Wikipedia encourages [and often rewards] collaboration.

The study does mention direction of translation, but again without fine-grained statistics or discussion of the reason for choosing language pair or direction. It would be awesome if there were a follow-up article...

Finding Similar Sentences across Multiple Languages in Wikipedia (2006)

That approach looks promising. Also see meta:Research:Expanding Wikipedia articles across languages for a continuation of this thread of work.

Having such a tool is a benefit to supported language pairs, but what becomes of those languages which aren't easily aligned, or which aren't prioritized for automatic alignment?

Manypedia: comparing language points of view of Wikipedia communities (2012)

(I found full text here: https://firstmonday.org/ojs/index.php/fm/article/download/3939/3382 .)
Good point that there are cultural and stylistic differences between wikis, and these are known to sometimes make translation more challenging. One prominent example is the time when English Wikipedia permanently disabled all machine-assisted translation into their language, mostly over concerns about low-quality initial translations.

Thinking about how one might accomplish a fuller, end-to-end translation from one cultural context to another, I'm reminded of "Using Wikipedia as a classroom tool" again. Collaboration between community members from both the target and source wikis is one obvious thing to try, and it brings up the need for multilingual spaces for chat. Such a space was recently created for strategy-level, cross-language conversation: https://forum.movement-strategy.org/ . I don't see why one couldn't be set up for translators too.

I was sad to find that manypedia.com is no longer online.

In search of the ur-Wikipedia: universality, similarity, and translation in the Wikipedia inter-language link network (2012)

(Found full text here: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=bc6bd3b80a96bdce7656329e4598378946f7ea63 .)
The "similarity" of this paper is almost an inverse indicator of the opportunity for fruitful mutual translations: in other words, if two editions have divergent content then there are many articles to be translated. Helpfully, the connections between language pages have evolved into a more queryable structure since this article, see for example https://www.wikidata.org/wiki/Q90#sitelinks-wikipedia .
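To make the "inverse indicator" idea concrete, here is a minimal sketch, assuming we already have a mapping from Wikidata item IDs to the language editions that have a sitelink for them. The `sitelinks` sample and the function names are hypothetical and only for illustration; this is not the paper's method.

```python
# Hypothetical sample: Wikidata item ID -> language editions with a sitelink.
sitelinks = {
    "Q90": {"en", "fr", "de"},
    "Q1": {"en", "fr"},
    "Q2": {"en"},
    "Q3": {"fr", "de"},
}


def articles_in(lang, sitelinks):
    """Return the set of items that have an article in the given edition."""
    return {item for item, langs in sitelinks.items() if lang in langs}


def jaccard(a, b):
    """Overlap of two article sets: 1.0 means identical coverage."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def translation_candidates(src, dst, sitelinks):
    """Items that exist in the source edition but are missing in the target."""
    return articles_in(src, sitelinks) - articles_in(dst, sitelinks)


en, fr = articles_in("en", sitelinks), articles_in("fr", sitelinks)
print("en/fr similarity:", jaccard(en, fr))
print("en -> fr candidates:", sorted(translation_candidates("en", "fr", sitelinks)))
print("fr -> en candidates:", sorted(translation_candidates("fr", "en", sitelinks)))
```

The lower the similarity score for a pair, the larger the two candidate sets tend to be, which is the sense in which divergent editions offer more to translate.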

Information arbitrage across multi-lingual Wikipedia (2009)

(full text: https://www.cond.org/paper_202.pdf )
This jumped out at me, "most topics have a specific language which is most commonly used for updating the article". The analysis we've started with should be extended with the article titles, and we can trace translations of each article between languages to build up connected segments. We can check whether the translation flow is running predominantly through a mediating "market" language, or directly from the original source. Cycles will be of interest. Thank you for flagging this article!
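A rough sketch of how that tracing might look, assuming each translation record can be reduced to an (article key, source language, target language) triple. The sample records, the article keys "A" and "B", and the function names are all hypothetical; the real data would need the titles matched across languages first.

```python
# Hypothetical records: (article key, source language, target language).
translations = [
    ("A", "fr", "en"),
    ("A", "en", "de"),
    ("A", "en", "es"),
    ("A", "de", "fr"),
    ("B", "yo", "en"),
]


def build_graphs(records):
    """Group translation edges into one directed language graph per article."""
    graphs = {}
    for article, src, dst in records:
        graph = graphs.setdefault(article, {})
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())  # make sure the target is a node too
    return graphs


def has_cycle(graph):
    """Depth-first search for a directed cycle (e.g. fr -> en -> de -> fr)."""
    WHITE, GREY, BLACK = 0, 1, 2
    state = {node: WHITE for node in graph}

    def visit(node):
        state[node] = GREY
        for nxt in graph[node]:
            if state[nxt] == GREY:
                return True  # back edge: we returned to a language on the path
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[n] == WHITE and visit(n) for n in graph)


def share_from(records, lang):
    """Fraction of all translation records whose source is the given language."""
    if not records:
        return 0.0
    return sum(1 for _, src, _ in records if src == lang) / len(records)


graphs = build_graphs(translations)
for article, graph in graphs.items():
    print(article, "has cycle:", has_cycle(graph))
print("share of translations sourced from en:", share_from(translations, "en"))
```

A high `share_from` value for a language that is rarely the original source of an article would hint at it acting as the mediating "market" language.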

Beyond that, the Infobox alignment doesn't seem relevant; my understanding is that there are already sophisticated heuristics built into the Content Translation tool.

Wikipedia, Translation and the Collaborative Production of Spatial Knowledge(s): A socio-narrative analysis (2018)

Citing sources in translation is an interesting twist, and definitely carries some of the colonial baggage we've already started to unpack. It requires multilingual editors, and we can expect it to somehow entangle former colonists and their influence sphere.

Do you wish to share any thoughts about how this entanglement might play out? For example, which sources are more respected in either language, which sources are available, who is multilingual and how, ...

English-speaking sources, and most importantly those with very high educational qualifications, would probably be the most respected. Their level of education would make them seem more prestigious and knowledgeable. This might really be true, but we can't deny that there is an inherent bias and opinion formed on level of education alone. If you then factor in the type of school attended, it adds another dynamic, especially since the top of the list of the world's most prestigious universities is dominated by English-speaking countries.

Also, can I edit these article links into my contribution? The links to the full articles would most definitely be helpful.

Thank you for the review, I really appreciate it.

@1992
Great summaries, thank you!

Content Translation: Computer-assisted translation tool for Wikipedia articles (2015)

Just to make the connection for others, this is a paper specifically about the Content Translation software we're analyzing here, written by some of its authors. The paper was published in the same year the software was released, so it's mostly an overview of the software design process and user research involved.

Specifically relevant to our questions for this project, the authors analyze the 900 translations made through the tool and say,

We found that English is the most used source language, consistent with Hale’s findings on multilingual user behaviour (2013).

Here's the full citation in case someone feels like following up: Hale, Scott A. 2013. Multilinguals and wikipedia editing. CoRR, abs/1312.0976. (full text)

Wikipedia as a Translation Zone: A heterotopic analysis of the online encyclopedia and its collaborative volunteer translator community

I get sad reading about "difficult processes of fierce conflict and debate which often characterise interactions within such communities" and "Wikipedia can clearly be seen as a platform in which volunteer translators compete at least as much as they co-operate". But this seems to be the reality.

Would it be possible to test a hypothesis about how some wiki communities are more challenging to contribute to than others? Maybe through survey questions, or by looking at translator activity over time and how they might switch from writing for one wiki to writing for another?
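For the activity-over-time angle, a minimal sketch of counting switches between wikis, assuming we can extract (translator, date, target wiki) records from the translation data. The sample records and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (translator, ISO date of translation, target wiki).
activity = [
    ("alice", "2022-01-10", "enwiki"),
    ("alice", "2022-02-03", "enwiki"),
    ("alice", "2022-03-15", "eswiki"),
    ("bob", "2022-01-20", "dewiki"),
    ("bob", "2022-04-01", "enwiki"),
    ("bob", "2022-05-09", "dewiki"),
]


def count_switches(records):
    """Count (previous wiki, next wiki) pairs between consecutive translations."""
    by_user = defaultdict(list)
    for user, date, wiki in records:
        by_user[user].append((date, wiki))
    switches = Counter()
    for events in by_user.values():
        events.sort()  # ISO dates sort chronologically as strings
        for (_, prev), (_, cur) in zip(events, events[1:]):
            if prev != cur:
                switches[(prev, cur)] += 1
    return switches


def departures(switches):
    """Total number of times translators moved away from each wiki."""
    out = Counter()
    for (prev, _), n in switches.items():
        out[prev] += n
    return out


switches = count_switches(activity)
print(switches)
print(departures(switches))
```

Departure counts alone would not show that a community is harder to contribute to, so they would need to be normalized by activity and read alongside the survey answers.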

@awight
Thank you so much for the commendation and clarifications. Your recommendation is a possible way to solve the conflict.

The "difficult processes of fierce conflict and debate which often characterise interactions within such communities" are indeed saddening.
But the question you raised looks like a promising way to tackle the problem:

"Would it be possible to test a hypothesis about how some wiki communities are more challenging to contribute to than others?"

Your suggestion, "Maybe through survey questions, or by looking at translator activity over time and how they might switch from writing for one wiki to writing for another", could go a long way toward easing the fierce conflict and debate encountered in the translation community.

Hey @awight @Simulo, I have made my contribution for this task by writing summaries of 3 articles. Your feedback will be appreciated.
Analysing the use and perception of Wikipedia in the professional context of translation
Link: https://www.jostrans.org/issue23/art_alonso.pdf
Author: Elisa Alonso
Suggested by: Akansha Jaiswal

Summary:
This paper is based on an online survey conducted among translation professionals, aiming to explore how professionals conduct their work, the needs they experience, and the tools and resources they resort to when translating. Specifically, the survey looks at how participants use Wikipedia and analyses their perceptions of this tool. The survey results suggest that respondents made extensive use of all sorts of technologies when translating, and they resorted to human resources to meet their needs. Respondents had a good overall opinion of Wikipedia and most of them reported using it when translating. However, some results suggest the existence of some kind of controversy or censorship with regard to the use of Wikipedia in professional contexts. The majority of respondents access Wikipedia through search page results, and they use Wikipedia for a variety of purposes, including documentation, terminology, and visual aspects. Respondents also noted the cultural dimension of Wikipedia and its usefulness in finding the meaning of cultural references. The survey also reveals that translation professionals use Wikipedia as consumers but do not always admit it, highlighting the online encyclopaedia's cultural and visual potential.
The survey discussed in this paper explores the tools and resources used by translation professionals and their perception and use of Wikipedia. While the survey has some limitations, such as language/country bias and a limited scope, it has qualitative value in documenting a little-studied reality. Respondents' most popular tools/resources include Google, online dictionaries/corpora, terminology databases, image-search engines, and human-driven resources. Translation memories (TMs) are not seen as a pressing need, and machine translation has low acceptance, possibly because of the nature of the sample, mostly freelance translators. Most respondents reported accessing Wikipedia from search engine results, not through a pre-planned strategy. Respondents use Wikipedia to solve their immediate needs related to documentation and terminological, lexicographical, and visual aspects. They perceive Wikipedia as useful, reliable, and easy to use, but they compare its information with other sources.

WIKIPEDIA-BASED ACTIVITIES AND TRANSLATION COMPETENCE DEVELOPMENT
Author: Małgorzata Kodura
Link to full article
Suggested by: Akansha Jaiswal
Summary:
This paper discusses the changing attitude of academics towards Wikipedia as a source of knowledge, from being considered a low-quality source and a synonym for plagiarism to being recognized as a valuable teaching tool. The focus of this paper is on the use of Wikipedia as a tool to support translation training at the university level and to help students develop a broad range of translation competences, based on the 2017 EMT competence framework requirements. Although there are objections to using Wikipedia due to its reputation for unreliability and subjectivity, it can also be approached as a concept for inclusive participation, incorporating the idea of online sharing and collaboration. Despite the initial negative attitude of academia towards Wikipedia, its presence in academic research is growing, and it is the most popular reference source on the web, with more than 87% of students reporting its use. This paper explores the advantages of using Wikipedia as a teaching tool for translation students. It also discusses various models of translation competence, including the PACTE research group model, the EMT "Wheel of Competence" model, and the model of translation competence acquisition, to find how Wikipedia-based assignments can contribute to the development of translator competence in various areas, including language and culture, translation, technology, personal and interpersonal competence, and service provision. The paper concludes by discussing the potential benefits of incorporating Wikipedia-based activities into translation training programs.
This paper mentioned how Wikipedia-based translation activities can help students develop and master digital literacy skills. It highlights how working on Wikipedia translations requires students to use and rapidly adapt to new tools and learn Wiki markup code and selected HTML elements. The students are required to correctly structure their articles with the use of Wikicode and equip them with valid links to references. They have to work in an unfamiliar environment and master elements of programming. This paper also includes how working on Wikipedia translations can help students develop personal and interpersonal competencies, such as collaboration online, following online etiquette, and participating in a large-scale project. It also discusses how Wikipedia can be used as a resource for translation trainers to provide their students with genuine translation tasks, allowing them to practice and develop their translation competence. The platform offers multiple modes of working, including individual work, online collaboration, and group activities, making it a versatile platform for translation practice. However, trainers need to be up-to-date with the latest changes in the platform and ensure that their students have access to the appropriate equipment and a stable Internet connection. Despite these challenges, the use of Wikipedia as a translation practice tool can be a highly valuable option for translation trainers at the university level.

Wikipedia Culture Gap: Quantifying Content Imbalances Across 40 Language Editions
Link: https://www.frontiersin.org/articles/10.3389/fphy.2018.00054/full
Authors: Marc Miquel-Ribé and David Laniado
Suggested by: Akansha Jaiswal

Summary:

The article "Wikipedia Culture Gap: Quantifying Content Imbalances Across 40 Language Editions" analyzes the content of 40 language editions of Wikipedia, investigating the imbalance of articles related to cultural topics. The authors developed a metric called the Culture Gap Index to quantify the differences in coverage of cultural topics across different language editions of Wikipedia.
The study found that there is a significant imbalance in the representation of cultural content across different language editions of Wikipedia. To quantify this imbalance, the authors use data from Wikidata to identify articles that are present in one language edition of Wikipedia but not in others. They then use a statistical method to identify articles that are overrepresented or underrepresented in a particular language edition, relative to the expected distribution of articles based on the size of its corresponding language community. The authors find that there is a significant variation in the coverage of different topics across language editions, with some topics being overrepresented in certain editions and underrepresented in others. They also find that smaller language editions tend to have a more imbalanced distribution of content, with fewer articles overall and a smaller proportion of high-quality articles.

Overall, the article highlights the importance of addressing the cultural and linguistic imbalances on Wikipedia in order to promote more equitable access to knowledge for people around the world.

Translation the Wiki way

Link: https://www.researchgate.net/publication/221367699_Translation_the_Wiki_way
Authors: Alain Désilets, Lucas Gonzalez, Sébastien Paquet and Marta Stojanovic

Suggested by: Ella Jessica
Summary:
The article "Translation the Wiki way" discusses the challenges and solutions for creating and maintaining multilingual wiki content. A wiki is a website where users can collaboratively create and modify content using their web browser, and this concept has revolutionized collaborative authoring on the web, leading to the creation of the world's largest online encyclopedia, Wikipedia. However, many of the largest and most high-profile wiki sites require content in multiple languages, and current wiki engines do not efficiently support the creation and maintenance of such content.

The traditional approach to dealing with multilingualism in wiki sites is to create separate and independent sites for each language, but this leads to wasted effort as the same content must be researched, tracked, and written from scratch for each language. The authors of this paper investigate what features could be implemented in wiki engines to deal more effectively with multilingual content. They look at how multilingual content is currently managed in more traditional industrial contexts and show how this approach is not appropriate in a wiki world.

The paper also describes the results of a User-Centered Design exercise that was performed to explore what a multilingual wiki engine should look like from the point of view of its various end users. The authors describe a partial implementation of those requirements in their own wiki engine, LizzyWiki, to deal with the special case of bilingual sites. They also discuss how this simple implementation could be extended to provide even more sophisticated features, particularly to support the general case of a site with more than two languages.

Finally, the paper argues that translating in this "Wiki Way" may also be useful in some traditional industrial settings, as a way of dealing better with the fast and ever-changing nature of our modern internet world. The paper's categories and subject descriptors include user interfaces, hypertext/hypermedia, and group and organization interfaces, while the general terms include design, languages, and human factors. Keywords used in the paper include multilingual wiki, translation workflow, collaborative web-authoring, user-centered design, groupware, and hypertext.

Discussion about Translation in Wikipedia

Link: https://www.researchgate.net/publication/254012465_Discussion_about_Translation_in_Wikipedia
Authors: Toru Ishida and Ari Hautasaari
Suggested by: Ella Jessica
This research analyzed the discussion pages of the Finnish, French, and Japanese Wikipedias in order to understand the nature of communication and collaboration related to translation activities on Wikipedia. The study found that community interaction in Wikipedia translation primarily focuses on solving problems related to source referencing, proper nouns, and transliteration in articles, rather than simply translating words and sentences mechanically. This study is significant because it sheds light on the types of problems that require interaction with the community in order to improve the accuracy and quality of translated content on Wikipedia. Based on these findings, the authors propose future directions for supporting translation activities on Wikipedia.

How can I record the contribution done here? Could I please get feedback on it? @awight @Simulo

Hello @awight, @Simulo
I have made a small contribution and would greatly appreciate feedback so I can improve it.

Lost in Translation: Contexts, Computing, Disputing on Wikipedia
Authors: Pasko Bilic and Luka Bulian
Link: https://www.ideals.illinois.edu/items/47320
Summary by: Priyanshi Goel

The authors begin by noting that while Wikipedia is often praised for its ability to provide multilingual information on a vast range of topics, the reality is that the quality and accuracy of information can vary greatly between different language versions. In particular, the authors note that translation errors and misunderstandings can occur when information is translated between languages, which can lead to disputes and disagreements between editors.
To explore this issue in more detail, the authors conducted a detailed analysis of the Croatian language version of Wikipedia, focusing on the ways in which editors collaborate and communicate with each other in order to create and maintain articles. They found that while the collaborative process on Wikipedia is generally effective, there are certain challenges that arise when working with multiple languages.
One of the main challenges identified by the authors is the difficulty of translating certain concepts and terms from one language to another. They note that even seemingly straightforward translations can be complicated by differences in cultural and historical contexts, as well as by differences in the ways in which words and phrases are used in different languages.
To address these challenges, the authors propose a number of strategies for improving the quality of multilingual communication on Wikipedia. These include the development of more sophisticated translation tools and technologies, as well as the creation of more opportunities for cross-cultural and cross-linguistic dialogue and collaboration among editors.

Overall, "Lost in Translation: Contexts, Computing, Disputing on Wikipedia" provides a valuable contribution to our understanding of the complexities of multilingual communication on Wikipedia, and hence is relevant in understanding the imbalances we’re seeing in translation on the site. It also offers important insights into how these challenges can be addressed in order to improve the accuracy and reliability of information available on the platform.

Growing Wikipedia Across Languages via Recommendation
Authors: E Wulczyn, R West, L Zia, J Leskovec
Link: https://dl.acm.org/doi/abs/10.1145/2872427.2883077
Summary by: Priyanshi Goel

The paper discusses a method to encourage the growth of Wikipedia articles in different languages through the use of recommendation algorithms. The authors note that while Wikipedia has grown to become a vast repository of knowledge in many languages, there are significant disparities in the quantity and quality of articles across different language editions. To address this problem, the authors propose a system for recommending articles that are likely to be of interest to editors of a particular language edition, based on patterns of article creation and editing in other language editions.
The authors analyze data on article creation and editing in multiple language editions of Wikipedia, and develop a recommendation algorithm that takes into account various features of articles, such as their length, quality, and popularity. They then evaluate the performance of the algorithm by using it to recommend articles to editors of various language editions of Wikipedia, and measuring the resulting increase in the number of articles created or improved.
The results of the study show that the recommendation algorithm is effective in increasing the number of articles created or improved in many language editions of Wikipedia. The authors note that the system could be used as a tool for supporting the growth of Wikipedia in smaller languages, which may lack the resources or expertise to create and maintain large numbers of articles on their own. They also note that the system could be used to help address biases and gaps in Wikipedia content, by identifying topics that are underrepresented in certain language editions and recommending articles on those topics to editors. Overall, the paper presents a promising approach to supporting the growth and diversity of Wikipedia content across different languages. These results and conclusions can help us conduct further research in this direction.

Hi! Please consider resolving this task and moving any pending items to a new task, as GSoC/Outreachy rounds are now over, and this workboard will soon be archived.

As Outreachy Round 26 has concluded, closing this microtask. Feel free to reopen it for any pending matters.