
Anshika_bhatt_20 (anshika)
User

Projects

User does not belong to any projects.

User Details

User Since
Mar 6 2023, 4:48 PM (58 w, 3 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Anshika bhatt 20 [ Global Accounts ]

Recent Activity

Apr 15 2023

Anshika_bhatt_20 added a comment to T331204: Produce flow diagrams illustrating translation imbalances.

Hello @awight, I tried to create a scatterplot using the CSV file you provided, to visualize the relationship between each language's translation ratio and its number of Wikipedia articles.

translation ratio vs wiki article count.png (663×1 px, 94 KB)

Apr 15 2023, 4:38 PM · Outreachy (Round 26), Outreach-Programs-Projects

Apr 3 2023

Anshika_bhatt_20 renamed T333831: Outreachy Proposal: Anshika Bhatt from outreachy proposal: Anshika Bhatt to Outreachy Proposal: Anshika Bhatt.
Apr 3 2023, 2:17 PM · Outreachy (Round 26)
Anshika_bhatt_20 claimed T333831: Outreachy Proposal: Anshika Bhatt.
Apr 3 2023, 1:15 PM · Outreachy (Round 26)
Anshika_bhatt_20 created T333831: Outreachy Proposal: Anshika Bhatt.
Apr 3 2023, 1:12 PM · Outreachy (Round 26)

Mar 26 2023

Anshika_bhatt_20 added a comment to T331200: Ultralight systematic literature review.

@Anshika_bhatt_20

Analysis on multilingual discussion for Wikipedia translation (2011)

Full text link: http://naomi-yamashita.net/wp-content/uploads/2018/09/b2015ricp-15.pdf
This is one of those things that feels obvious after you've read about it for the first time :-). Discussion on the source wiki might be helpful for clarifying details before or during translation. Discussion on the target wiki might be helpful for finding the most common names or adapting style. But multilingual discussion is a huge lack, it seems!

Correction: The article being translated was "Glacier National Park", not "neuroscience".

Thank you for the correction. The authors present a case study of the translation of the featured article "Glacier National Park" from English to Japanese. They highlight the role of multilingual discussion in identifying and resolving translation issues, such as cultural and linguistic differences. For instance, the authors note that certain terms like "neuroscience" do not have exact equivalents in Japanese and that the Japanese language has multiple words for "brain" depending on the context.

Mar 26 2023, 5:00 PM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 updated the task description for T331200: Ultralight systematic literature review.
Mar 26 2023, 4:28 PM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 updated the task description for T331200: Ultralight systematic literature review.
Mar 26 2023, 3:44 PM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects

Mar 23 2023

Anshika_bhatt_20 added a comment to T331204: Produce flow diagrams illustrating translation imbalances.
Mar 23 2023, 12:18 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331204: Produce flow diagrams illustrating translation imbalances.

Hello, @Simulo. I have tried to create a Sankey diagram. Using the public Content Translation data source as a reference, I wrote Python code to generate a Sankey diagram that displays the translation relationships between several languages, such as Spanish, Italian, French, Japanese, Korean, Vietnamese, Chinese, Portuguese, Arabic, and more. I refined and extended the illustration to highlight additional details and intriguing imbalances or balances in translation.
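For readers who want to try something similar, here is a minimal sketch of the data preparation step, not the code from the submission. It assumes the translation counts are already available as a dictionary keyed by (source, target) language pair; the function name `sankey_links` and the sample numbers in the usage note are hypothetical. The output uses the index-based layout (node labels plus parallel source, target, and value lists) that Sankey plotting libraries such as Plotly expect.

```python
def sankey_links(translation_counts):
    """Turn {(source_lang, target_lang): count} into index-based Sankey link lists."""
    labels = []
    index = {}

    def node(name):
        # Assign each distinct node label the next free index.
        if name not in index:
            index[name] = len(labels)
            labels.append(name)
        return index[name]

    sources, targets, values = [], [], []
    for (source_lang, target_lang), count in translation_counts.items():
        # Separate "source" and "target" nodes keep the diagram free of cycles
        # when a language appears on both sides.
        sources.append(node(f"{source_lang} (source)"))
        targets.append(node(f"{target_lang} (target)"))
        values.append(count)

    return {"labels": labels, "source": sources, "target": targets, "value": values}
```

Calling `sankey_links({("en", "es"): 10, ("es", "en"): 3})` with made-up counts gives four nodes and two links, so an imbalance between the two directions shows up as links of different widths.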

This is the link to the original live Sankey diagram http://127.0.0.1:5500/sankey_diagram.html

Is there something in the repo that I should run, which will serve on this port? Or maybe you're using a local Flask command? If you wish, the commands to run each script and serve the pages could be documented in the README. Also, if you want to experiment with Markdown syntax there is a way to render images in the repository inlined in the README (docs) so you could use that file to showcase and explain your work.

You might also be interested in building a Jupyter notebook which offers a nice environment for exploring visualizations using python and can be installed locally. You could import and run your existing scripts from a notebook.

Also, I attempted to create a scatterplot of the translation ratio against the Wikipedia article count for different languages. However, I was unable to find a convenient source of data for this purpose. As a result, I estimated the translation ratios based on my research. It is important to note that these are rough estimates, and the actual translation ratios may be different. Additionally, there is no way to verify the accuracy of these estimates, as there is no official data available. I appreciate your input and look forward to your response. @awight

I don't know of a convenient data source for article count, either. It might exist but I haven't found it yet. My workaround would be to run two APIs, first the sitematrix listing of all wikis, then filter to just Wikipedias, and then run an additional siteinfo API call on each of those sites. It's kind of a pain so I've posted a CSV you can use, in this repo: https://gitlab.com/wmde/technical-wishes/wiki-article-counter
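The workaround described above could be sketched roughly as follows. This is an illustration under assumptions, not the code behind the linked CSV: it assumes the sitematrix response lists language groups under numeric keys, marks Wikipedias with the site code "wiki", and flags closed wikis with a "closed" key, and that the siteinfo statistics response exposes the count as `query.statistics.articles`. The function names and the User-Agent string are made up for the example.

```python
import json
import urllib.request

SITEMATRIX_URL = "https://meta.wikimedia.org/w/api.php?action=sitematrix&format=json"
SITEINFO_QUERY = "/w/api.php?action=query&meta=siteinfo&siprop=statistics&format=json"


def wikipedia_urls(sitematrix_response):
    """Return the URLs of all open Wikipedias listed in a sitematrix response."""
    urls = []
    for key, group in sitematrix_response["sitematrix"].items():
        # Language groups sit under numeric keys; "count" and "specials" are skipped.
        if not key.isdigit():
            continue
        for site in group.get("site", []):
            if site.get("code") == "wiki" and "closed" not in site:
                urls.append(site["url"])
    return urls


def article_count(siteinfo_response):
    """Extract the article count from a siteinfo statistics response."""
    return siteinfo_response["query"]["statistics"]["articles"]


def fetch_json(url):
    """Fetch a URL and decode the JSON body (needs network access)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "article-count-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_article_counts():
    """Run one sitematrix call, then one siteinfo call per Wikipedia."""
    counts = {}
    for url in wikipedia_urls(fetch_json(SITEMATRIX_URL)):
        counts[url] = article_count(fetch_json(url + SITEINFO_QUERY))
    return counts
```

Only `fetch_article_counts()` touches the network, and it makes one request per Wikipedia, which is why a prepared CSV is the more convenient option.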

Mar 23 2023, 12:02 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331204: Produce flow diagrams illustrating translation imbalances.

Mar 23 2023, 12:00 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T332647: Compare config scraper output with config API.

Hello, @awight and @Simulo. Here is my submission for this task: GitHub repository. I have checked the accuracy of all the other CSV files mentioned in task #T331201. @Abhishek02bhardwaj and @Ahn-nath mentioned that my CSV file was not reaching 100% accuracy, so I went back and fixed the code. Here are my observations for each CSV file:

  1. @Anshika_bhatt_20

CSV file compared: supported_pairs.csv
Accuracy percentage: 100%

  2. @Ahn-nath

CSV file compared: cx_server_parsed.csv
Accuracy percentage: 100%

  3. @JaisAkansha

CSV file compared: supported_language_pairs.csv
Accuracy percentage: 0.85%
Reason: Upon analyzing the results, it seems that there is a disparity in the accuracy due to the handler files not being handled by @JaisAkansha's code. This is resulting in mismatched results for files using Google.yaml and Yandex.yaml, even for the ones where the source language is incorrect. I think it's important to address this issue in order to achieve consistent accuracy across all files.

  4. @Emile-Daisy

CSV file compared: supported_language_pairs.csv
Accuracy percentage: 0.85%
Reason: After reviewing the code, I observed that it doesn't seem to consider the preferred engines or the mt-defaults.wikipedia.yaml file. Additionally, it appears that non-standard YAML files, like Google and Yandex, are being ignored. This could be the reason for the significant difference in total lines between the scraper output and the handler files.

  5. @Abhishek02bhardwaj

CSV file compared: supported_pairs.csv
Accuracy percentage: 100%

  6. @LeilaKaltouma

CSV file compared: langs.csv
Accuracy percentage: 100%

Mar 23 2023, 7:38 AM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 updated the task description for T331200: Ultralight systematic literature review.
Mar 23 2023, 2:11 AM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331199: Read paper and make guesses about how it applies to translators.

True, "participation in Wikipedia is highly uneven" and this problem is reflected and magnified on many levels. There has been a running debate on the question since at least 2006 (notably beginning with an exchange between Aaron Swartz and Jimmy Wales, "Who writes Wikipedia?") but it's worth mentioning that this hasn't been resolved conclusively, as far as I know. This analysis gets into some of the challenges to proving that certain editors or demographics are contributing a given percentage of the content. Another related fact worth mentioning is that the ratio of "policy" and talk page editing to content editing has been increasing over time.

The lens provided by Hypothesis 1 feels like an important contribution to the discussion: translation skills are not the same as being multilingual, and a small group is going to be doing this work. The designers of the Content Translation tool have made intentional affordances to reduce the needed technical proficiency by including a visual editor and doing some automatic transformation of wikitext templates, but there are still rough edges which require translators to be knowledgeable of specific characteristics (style, citation norms, templates...) of both the source and target wikis.

As a side note, it's interesting what shape the selection for multilingualism takes—for example, there are 3x as many second-language speakers of English as first-language speakers. I would make a guess with no evidence, that many people will feel more comfortable writing in their first language and reading (translating from) their second language.

Translators will need to be aware of these issues and make changes in their translations.

I'm not sure about this, I would like to learn more—it seems like a question that could be answered by including in a survey to translators? My assumption would be that translators avoid making any factual corrections beyond replacing citations with same-language sources, but maybe this is an assumption I'm transferring from the outside world where a translator is expected to translate "accurately" which means even translating a mistake, in the extreme case. It would make sense if wikis work differently.

What you describe in Hypothesis 3 could be related to hegemonic culture. Generally there are large and two-directional culture gaps between different language wikis as can be seen in the Diversity Observatory (in other words, English is not a superset of all content), but if a few dominant languages cover content from a homogenous perspective then I can see the effect you’re describing causing this singularizing content to be replicated exponentially throughout wikis, due to a lack of alternatives.

Mar 23 2023, 1:27 AM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 22 2023

Anshika_bhatt_20 added a comment to T332647: Compare config scraper output with config API.

Hello, @Anshika_bhatt_20. I hope to offer some guidance based on my understanding of the task, so that you can at least move forward before the mentors respond.

Firstly, one of the DataFrames is missing a column called 'is_preferred_engine'. I'm not sure if I should remove this column from the other DataFrame or consider it as a difference between the two sources. Could you please clarify what would be the best approach in this case?

It depends. The instructions indicate that you need to adjust the data shape so that both datasets can be compared. This means that, depending on the direction you decided to take, dropping or adding columns may be necessary so that the comparison is on a middle ground. For example, if you transform from the JSON file, then you would need to drop the preferred engine column and add it as a key to the transformed CSV file so that it matches the JSON structure (the “defaults” key would be the equivalent of the preferred engine column). If you transform from the CSV file, then you would need to figure out a way to include the “preferred engines” column so that it matches the structure of the CSV file.

Secondly, the column names in the two DataFrames are slightly different. One DataFrame has column names with underscores (e.g. 'source_language', 'target_language', 'translation_engine') while the other has column names without underscores (e.g. 'source language', 'target language', 'translation engine'). I'm not sure if I should rename the columns, add or remove anything to change the structure, or if these differences should be considered errors. Could you please advise me on what would be the best way to handle this situation?

The task specifies that we “want to see whether the information is lost or changed by the config scraper.” Information conveys meaning, so I would take that as the focus of the task.

The header row is not as relevant for data comparison because the headers are semantically the same (they carry the same meaning). So you can either normalize it so that it does not affect the comparison (rename the columns so that they match) or ignore it completely if you know the header row is not relevant for data comparison. Another example is that some people may put “False”, some “false”, or even “0” as the value of the preferred engine column. You may count it as a difference and mention it, but apart from that data representation, the rest of the values and the line can mean the same: "nl,en,Google,False" is semantically equivalent to "nl,en,Google,false", but not literally equal, so that is worth noting.
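One way to make that kind of comparison concrete is sketched below. It is only an illustration: the function names are hypothetical, the column name `is_preferred_engine` is taken from the question above, and the sets of accepted boolean spellings are an assumption. Headers are lowercased with spaces turned into underscores, and boolean spellings are normalized only in the preferred engine column, so that a language code such as "no" is never mistaken for a boolean.

```python
import csv
import io

TRUE_SPELLINGS = {"true", "1"}
FALSE_SPELLINGS = {"false", "0"}


def normalize_header(name):
    """Make 'Source Language' and 'source_language' compare as equal."""
    return name.strip().lower().replace(" ", "_")


def normalize_boolean(value):
    """Map the accepted boolean spellings to 'true'/'false'; leave anything else alone."""
    lowered = value.lower()
    if lowered in TRUE_SPELLINGS:
        return "true"
    if lowered in FALSE_SPELLINGS:
        return "false"
    return value


def load_rows(csv_text, boolean_columns=("is_preferred_engine",)):
    """Read CSV text into a set of normalized rows, ignoring row order."""
    rows = set()
    for record in csv.DictReader(io.StringIO(csv_text)):
        normalized = {}
        for header, value in record.items():
            column = normalize_header(header)
            value = (value or "").strip()
            if column in boolean_columns:
                value = normalize_boolean(value)
            normalized[column] = value
        rows.add(tuple(sorted(normalized.items())))
    return rows
```

With this, `load_rows(a) ^ load_rows(b)` (the symmetric difference) lists the rows that differ in meaning, while purely literal differences such as "False" versus "false" can still be reported separately by comparing the raw text.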

Also, I am new to programming, so I'm having some trouble understanding how to proceed with these issues. Any guidance or suggestions you could provide would be greatly appreciated. Thank you for your help!

I would download the https://cxserver.wikimedia.org/v2/list/mt file and use a JSON viewer to explore the data structure. Then focus on adjusting that structure to a CSV version of my contribution scraper output or someone else's contribution. That way you can design or use a parser that makes the “JSON” file have the same structure and format as my CSV file. Once you see that they have the same structure/shape and the same column and row order, you can start observing differences and asking questions. For example: is data missing, do they have the same number of rows, and do the preferred engines specified in the “defaults” key match the preferred engine assignment of my language pairs? You can also do it the other way around, starting from the CSV file.
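As a rough sketch of that reshaping step, the function below flattens the JSON into rows shaped like the CSV. The JSON layout is an assumption that should be checked in a JSON viewer first: here each engine name is assumed to map to a dictionary of source language to list of target languages, and the “defaults” key is assumed to map "source-target" strings to the preferred engine name. The function name `flatten_mt_config` is hypothetical.

```python
def flatten_mt_config(config):
    """Flatten the assumed engine -> source -> [targets] layout into CSV-like rows."""
    # Assumed layout of "defaults": {"source-target": "EngineName"}.
    defaults = config.get("defaults", {})
    rows = []
    for engine, pairs in config.items():
        if engine == "defaults":
            continue
        for source, targets in pairs.items():
            for target in targets:
                preferred = defaults.get(f"{source}-{target}") == engine
                rows.append((source, target, engine, preferred))
    # Sorting gives a stable row order that is easy to compare with another file.
    return sorted(rows)
```

Each resulting tuple is (source language, target language, translation engine, is preferred engine), which can then be compared row by row with the scraper output.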

Mar 22 2023, 1:41 PM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 21 2023

Anshika_bhatt_20 added a comment to T332647: Compare config scraper output with config API.

Hello, @awight I'm working on the task to compare data from two sources in our project, but I've encountered a couple of issues that I'm not sure how to handle.

Mar 21 2023, 8:34 AM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T332647: Compare config scraper output with config API.
Mar 21 2023, 12:33 AM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 19 2023

Anshika_bhatt_20 added a comment to T331201: Extract cxserver configuration and export to CSV.

@awight Thank you so much for your time and guidance. I wanted to let you know that I made the changes you suggested in the code. I have also added the necessary tests to ensure that the code is working as expected. Could you please take a look and let me know if there are any further changes I should make? Thank you for your feedback, it was very helpful.

Mar 19 2023, 3:44 AM · CX-cxserver, Outreachy (Round 26), Outreach-Programs-Projects

Mar 18 2023

Anshika_bhatt_20 added a comment to T331207: Compose a short survey for Content Translation users.

Thank you @Simulo for giving us this valuable feedback. Based on your input, I made some changes to the survey questions, which I believe have improved the clarity and relevance of the survey.

Mar 18 2023, 4:20 AM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 17 2023

Anshika_bhatt_20 added a comment to T331204: Produce flow diagrams illustrating translation imbalances.

Can I use Python to explore the data, please?

Yes, python is no problem!

Mar 17 2023, 3:30 AM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hello Everyone!
I am Anju Maurya, an Outreachy applicant from India. I am glad to be a part of this awesome community. I would like to contribute to "Research imbalances in translation between languages on Wikipedia" project. Can I start here by attempting the microtasks mentioned above?

Mar 17 2023, 2:58 AM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331201: Extract cxserver configuration and export to CSV.

Hello, @awight and @Simulo. This is my submission for this task: https://github.com/anshikabhatt/Extract-cxserver-configuration-and-export-to-CSV. Please have a look and give me your valuable feedback on this. Thanks in advance.

Mar 17 2023, 2:21 AM · CX-cxserver, Outreachy (Round 26), Outreach-Programs-Projects

Mar 16 2023

Anshika_bhatt_20 added a comment to T331202: Configuration evolution over time.

I hear that it wasn't obvious to everyone that I've commented in GitHub. Please see the comments linked from the commit history of your repositories, like in this screenshot:

image.png (350×492 px, 18 KB)

Mar 16 2023, 3:17 PM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 13 2023

Anshika_bhatt_20 added a comment to T331204: Produce flow diagrams illustrating translation imbalances.

@Maryam_Gbemisola: You can use Power BI (or any other tool) to explore the data, for final results we would prefer if they are created with an open source software (Python, R, calc, Orange3…).

Mar 13 2023, 9:41 AM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331202: Configuration evolution over time.

@awight @Simulo For this task, I have tried to create a tool to track the evolution of language support for machine translations (and other software features) over time. To do this, I created a new Git repository containing a CSV file with a simple data structure (city and temperature). I made several Git commits with changes to this data, including adding and removing rows.
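The core of such a tool can be sketched as a comparison of two snapshots of the CSV file. The example below is an assumption about one possible approach, not the submitted code: the function names are hypothetical, and the snapshot text is assumed to have been read already, for instance from the output of `git show <commit>:<path>` for two commits.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a set of row tuples, skipping empty lines."""
    return {tuple(row) for row in csv.reader(io.StringIO(csv_text)) if row}


def diff_snapshots(old_csv, new_csv):
    """Report which rows were added or removed between two versions of a CSV file."""
    old_rows = read_rows(old_csv)
    new_rows = read_rows(new_csv)
    return {
        "added": sorted(new_rows - old_rows),
        "removed": sorted(old_rows - new_rows),
    }
```

A changed value shows up as one removed row plus one added row, so a later step could pair them up by key column (here the city) to report modifications separately.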

Mar 13 2023, 5:54 AM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 12 2023

Anshika_bhatt_20 added a comment to T331204: Produce flow diagrams illustrating translation imbalances.

Hello, @Simulo @awight. I am trying to write Python code to create a scatterplot of languages, and I think there may be an issue with the API response format. I am getting an error which suggests that the translations list contains strings instead of dictionaries. Please provide guidance on how to resolve the issue. Thank you in advance.

Mar 12 2023, 10:24 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331201: Extract cxserver configuration and export to CSV.

@Anshika_bhatt_20 Hey Anshika, do you mind taking a look at the repository and the CSV file and sharing your views? I'll be really thankful.

Mar 12 2023, 5:39 PM · CX-cxserver, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331201: Extract cxserver configuration and export to CSV.

@Abhishek02bhardwaj I think to use the handler.js file to access the source and target languages, you would need to modify the code in the supported_pairs.py file to include the logic to parse the handler.js files. Please correct me if I am wrong.

@Anshika_bhatt_20 You mean the "first working test.py"? Yes, in the parser code I have handled the two types of files separately:

  1. Files which are using the standard configuration (They have been dealt with and are included in the CSV file)
  2. Files which are not using the standard configuration and instead use the handler file "Transform.js" to deal with the format.

I am trying to figure out the logic to use the handler file.

Mar 12 2023, 8:15 AM · CX-cxserver, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331201: Extract cxserver configuration and export to CSV.

Hey @Abhishek02bhardwaj, can you tell me about the issue in detail? Where exactly are you facing the issue?

@Anshika_bhatt_20 Hi, Anshika. Actually, the parser is almost done; the only thing left is to deal with the two YAML files which are not using the standard configuration and instead use a handler file "Transform.js".

Mar 12 2023, 8:08 AM · CX-cxserver, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331201: Extract cxserver configuration and export to CSV.

Hey @Abhishek02bhardwaj, can you tell me about the issue in detail? Where exactly are you facing the issue?

@Anshika_bhatt_20 Hi, Anshika. Actually, the parser is almost done; the only thing left is to deal with the two YAML files which are not using the standard configuration and instead use a handler file "Transform.js".
"Transform.js" is a JavaScript file that exports a class called TransformLanguages using module.exports.
The first line of the file is "use strict"; which enables strict mode in JavaScript, providing better error handling and preventing certain types of mistakes.
The TransformLanguages class takes a configuration object as a parameter in its constructor. The configuration object has two properties: languages and notAsTarget.
The languages property is an array of language codes that will be used to create a matrix of languages. The notAsTarget property is an optional array of language codes that should not be included as target languages in the matrix.
The class has a getter method called languages which creates and returns a matrix of languages. The matrix is an object with keys as the language codes from the languages property and values as arrays of language codes that are not the same as the key and are not included in the notAsTarget property. The englishVariants array is used to exclude certain variants of English from being included as target languages for each other.
Finally, the TransformLanguages class is exported so that it can be used in other modules of a JavaScript application.
This is all of my understanding of the "Transform.js". I am trying to find a way to use this handler in the parser to get the target and source language for the Google.yaml and Yandex.yaml files.
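To use that logic from a Python parser, one option is to re-implement the matrix construction described above instead of executing the JavaScript. The sketch below follows the description only, so it is an approximation: the function name is hypothetical, and the contents of the English variants list ("en" and "simple") are an assumption that should be checked against the handler file.

```python
# Assumption: the variants listed in the handler's englishVariants array.
ENGLISH_VARIANTS = ("en", "simple")


def build_language_matrix(languages, not_as_target=()):
    """Map each source language to the target languages allowed for it."""
    matrix = {}
    for source in languages:
        matrix[source] = [
            target
            for target in languages
            if target != source
            and target not in not_as_target
            # English variants are not offered as targets for each other.
            and not (source in ENGLISH_VARIANTS and target in ENGLISH_VARIANTS)
        ]
    return matrix
```

The `languages` and `notAsTarget` lists from Google.yaml or Yandex.yaml can be passed in directly, and every (source, target) pair in the resulting matrix becomes one CSV row for that engine.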

Mar 12 2023, 7:57 AM · CX-cxserver, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hello, I'm in the process of recording my work in progress contribution, but I'm not sure which URL to include in the submission form. I was wondering if anyone knows which link to use? I want to make sure that I'm providing the most relevant and useful URL. @awight @Simulo @srishakatux

Hi @Anshika_bhatt_20 , what task are you recording? If it is for the survey questions, I shared the link to my etherpad as that was the relevant link for that particular task but @awight @Simulo or any other intern should please draw my attention if that is not the relevant link.

I think we have the relevant link for the survey questions task, but what about the other two tasks, "Read paper and make guesses about how it applies to translators" and "Ultralight systematic literature review"? We don't have a relevant link for those.

@Anshika_bhatt_20. I think you can save your contribution in a Google Doc, or use any word processing application and upload the file to Drive. Then you can generate a link that way. I hope this is helpful.

Mar 12 2023, 5:15 AM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331201: Extract cxserver configuration and export to CSV.

@Abhishek02bhardwaj I think to use the handler.js file to access the source and target languages, you would need to modify the code in the supported_pairs.py file to include the logic to parse the handler.js files. Please correct me if I am wrong.

Mar 12 2023, 3:05 AM · CX-cxserver, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331201: Extract cxserver configuration and export to CSV.

Hey @Abhishek02bhardwaj, can you tell me about the issue in detail? Where exactly are you facing the issue?

Mar 12 2023, 2:58 AM · CX-cxserver, Outreachy (Round 26), Outreach-Programs-Projects

Mar 11 2023

Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hello Everyone, I am Tosin from Nigeria, an Outreachy applicant. I am excited to be here and looking forward to making contributions and connecting.

Mar 11 2023, 7:59 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hello, I'm in the process of recording my work in progress contribution, but I'm not sure which URL to include in the submission form. I was wondering if anyone knows which link to use? I want to make sure that I'm providing the most relevant and useful URL. @awight @Simulo @srishakatux

Hi @Anshika_bhatt_20 , what task are you recording? If it is for the survey questions, I shared the link to my etherpad as that was the relevant link for that particular task but @awight @Simulo or any other intern should please draw my attention if that is not the relevant link.

Mar 11 2023, 5:28 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hello, I'm in the process of recording my work in progress contribution, but I'm not sure which URL to include in the submission form. I was wondering if anyone knows which link to use? I want to make sure that I'm providing the most relevant and useful URL. @awight @Simulo @srishakatux

Mar 11 2023, 1:46 AM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331200: Ultralight systematic literature review.

Hello, @awight @Simulo I have made my contribution to this task. I would greatly appreciate any feedback or suggestions you may have. Please let me know if there are any areas where I could improve or if you have any further advice. Thank you

Mar 11 2023, 1:27 AM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 updated the task description for T331200: Ultralight systematic literature review.
Mar 11 2023, 1:12 AM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 updated the task description for T331200: Ultralight systematic literature review.
Mar 11 2023, 1:11 AM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 updated the task description for T331200: Ultralight systematic literature review.
Mar 11 2023, 1:01 AM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 updated the task description for T331200: Ultralight systematic literature review.
Mar 11 2023, 1:00 AM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 updated the task description for T331200: Ultralight systematic literature review.
Mar 11 2023, 12:57 AM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects

Mar 10 2023

Anshika_bhatt_20 added a comment to T331204: Produce flow diagrams illustrating translation imbalances.

Hello, @awight I have been working on this project and wanted to clarify some details.

Mar 10 2023, 8:21 PM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 9 2023

Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hi everyone. I'm Omotayo a UX/UI designer... Happy to be contributing to this project.

Mar 9 2023, 10:39 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hi,
My name is Lydia Edogiawerie, I am a UX designer and Data scientist. I would love to contribute on the project "Research imbalances in translation between languages on Wikipedia" for May 2023 Outreachy Internship.

Mar 9 2023, 10:35 PM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 8 2023

Anshika_bhatt_20 added a comment to T331207: Compose a short survey for Content Translation users.

@awight @Simulo Here is the link to my survey for this task. Please have a look and give me your valuable feedback. I would greatly appreciate any constructive feedback you have.

Mar 8 2023, 11:22 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hi, thank you for reaching out at the program beginning! For chatting and meta-discussion about the project, please find me under my real name (Adam Wight) on Zulip.

If you want to engage with the subject matter already, we've listed some mini-tasks above for your consideration. Feel free to try out your ideas in any one of these areas! The tasks were written to be easily shared and can be accomplished through many valid approaches, so don't be put off if someone else has started commenting.

Mar 8 2023, 6:59 PM · Outreachy (Round 26), Outreach-Programs-Projects

Mar 7 2023

Anshika_bhatt_20 added a comment to T331199: Read paper and make guesses about how it applies to translators.

I wanted to reach out to ask if you would be willing to review the task I worked on and give me your feedback and suggestions. I am open to any suggestions you may have, and I am committed to using your feedback to make improvements. @awight @Simulo @srishakatux @Aklapper

Mar 7 2023, 8:14 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331199: Read paper and make guesses about how it applies to translators.

Digital Divisions of Labor and Informational Magnetism: Mapping Participation in Wikipedia.
In their 2015 paper “Digital Divisions of Labor and Informational Magnetism: Mapping Participation in Wikipedia,” Graham, Straumann, and Hogan explore the ways in which participation in Wikipedia is shaped and strongly influenced by digital divisions of labor and the informational magnetism of certain topics.
A digital division of labor refers to the unequal distribution of labor and skills required for digital work, particularly in the online environment. It can result in unequal access to and control over digital resources. Similarly, “informational magnetism” refers to the phenomenon where certain individuals or groups have a disproportionate amount of influence or power in online communities such as Wikipedia. It also refers to the ability of certain articles or topics to attract more attention and contributions than others, leading to an unequal distribution of labor and contributions in the online community.
The authors use a combination of quantitative and qualitative methods to analyze the distribution of editors across language versions of Wikipedia.
They find that participation in Wikipedia is highly uneven, with a small group of highly active editors responsible for a disproportionate amount of content creation. These editors tend to concentrate their editing activity around certain topics. The authors also find a significant degree of informational magnetism, with certain articles attracting a great deal of attention and activity while others are largely neglected.
Based on the paper, we can expect to see certain patterns in a dataset of translations between different Wikipedias. There are several potential implications for translators.
• Hypothesis 1
Unequal distribution of contribution: The paper found that a small number of editors were responsible for the majority of articles and topics on Wikipedia. We might therefore expect certain language communities to participate more in Wikipedia and to be more likely to contribute translations. This suggests that there may be a similar concentration of translation work among a small group of translators.
Informed guess: Translation requires proficiency in both the source and target languages, so it is quite possible that only a small group of bilingual individuals will be able to contribute translations of Wikipedia content, much like the small group of highly active editors described in the paper.
• Hypothesis 2
Informational magnetism: The distribution of contributions may be influenced by “informational magnetism”. As the authors said, “a small number of contributors to Wikipedia have a large impact on its content”. The presence of informational magnetism may affect the accuracy and quality of translations, and as a result articles may be biased or inaccurate. Translators will need to be aware of these issues and adjust their translations accordingly. The contributions of high-profile editors were more likely to be adopted by other editors. This also means that high-profile translators may have a disproportionate impact on the translation community, and that their work may be more likely to be recognized and adopted by others. Articles that have already been translated extensively between different Wikipedias are more likely to attract further translation activity.
Informed guess: If an article has already been translated extensively, it has likely received a lot of attention and is therefore more likely to be of high quality, which makes it more attractive to potential translators.
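One way to check Hypothesis 1 against a translation dataset would be to measure how much of the work the most active translators account for. The following is only a minimal Python sketch under my own assumptions, not a method from the paper: the function name `top_translator_share` and the sample usernames are hypothetical, and the input is assumed to be a list with one translator name per published translation.

```python
from collections import Counter


def top_translator_share(translators, top_n=1):
    """Return the fraction of all translations made by the top_n most active translators.

    `translators` is assumed to hold one username per translation (hypothetical input format).
    """
    counts = Counter(translators)
    if not counts:
        raise ValueError("no translations given")
    # Sort per-translator totals from most to least active.
    ranked = sorted(counts.values(), reverse=True)
    return sum(ranked[:top_n]) / sum(ranked)


# Made-up sample: one translator made 8 of the 10 translations.
sample = ["user_a"] * 8 + ["user_b", "user_c"]
print(top_translator_share(sample, top_n=1))
```

A share close to 1 for a small `top_n` would be consistent with the concentration the hypothesis predicts, while a share close to `top_n` divided by the number of translators would suggest the work is spread evenly.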

Mar 7 2023, 7:56 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331199: Read paper and make guesses about how it applies to translators.

@Simulo Hello, can you please clarify where I should submit the summary for this task? Is there a specific format or method of submission?

Mar 7 2023, 4:30 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331200: Ultralight systematic literature review.

@awight Hello, can you please clarify where I should send the review? Is there a specific format or method of submission?

Mar 7 2023, 4:26 PM · Internet-Archive, Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331199: Read paper and make guesses about how it applies to translators.
Mar 7 2023, 4:16 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T331199: Read paper and make guesses about how it applies to translators.

Hello, please clarify where to write the summary.

Mar 7 2023, 4:13 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 claimed T331199: Read paper and make guesses about how it applies to translators.
Mar 7 2023, 3:13 PM · Outreachy (Round 26), Outreach-Programs-Projects
Anshika_bhatt_20 added a comment to T328597: [Outreachy round 26] Research into translation imbalances.

Hello everyone, my name is Anshika and I am excited to be joining this Wikipedia project. I am looking forward to contributing and helping to develop it further.

Mar 7 2023, 1:35 PM · Outreachy (Round 26), Outreach-Programs-Projects