Compose a short survey for Content Translation users
Closed, Resolved, Public

Description

Draft a survey for Content Translation users, but do not send it out. Work in a new etherpad, using whatever language you are most comfortable writing in.
The goal of the survey is to learn more about how this software is used, and how translation languages are chosen.

Please note: Several people can work on this Phabricator task - please do not claim / assign this task to yourself. Thanks!

  • Write a short introduction to the survey, explaining what data we want to collect.
  • Quantitative
    • first language
    • most common source and target language (Give examples of how to write this clearly, e.g. "en -> es".)
  • Qualitative
    • biggest difficulties encountered translating
    • what should newcomers be aware of

NOTES (You do not need to read them all to get started, I just could not pin the post with them, so I pasted them here!)

  • General concept: It helps to know what you want to find out and particularly, what your assumptions are. If you know your assumptions, you can put them to a test. We broadly wrote "The goal of the survey is to learn more about how this software is used" but that is pretty vague, maybe actually a question for a qualitative study. How do you think people translate? Can you learn something about translations from published research that informs your assumptions? With assumptions and questions about them you can focus your survey questions and the survey design.
  • Introduction: This is very important, as it is what participants use to decide whether they want to participate. Give the purpose of the survey but avoid merely repeating the questions you are going to ask. Check that the language is clear. In this case, it might be tempting to use terms such as "qualitative" and "quantitative", but these are researcher jargon that might be unfamiliar to the participants.
  • Questions:
    • Aim to make them both a) short and clear b) self-contained, so they can be understood without reading previous questions
    • Check if they make sense in the context of the tool. E.g. asking what people translate with the tool or whether they used customer service might generally be good ideas, but they do not apply to this tool, which is usually used to translate Wikipedia articles and has no customer support in the conventional sense.
  • Answers: As important as the questions you ask are the answers you allow…
    • Age: People might not like to give their age in years for various reasons, so age ranges are a good practice. They need to be unambiguous: 20-30, 30-40 are ambiguous – what do I check when I am 30? Better: 20-29, 30-39 or 21-30, 31-40, …
    • Gender: "How to Do Better with Gender on Surveys: A Guide for HCI Researchers" is my go-to resource (They recommend a man/woman/non-binary/prefer to self define/prefer not to answer). However, this does not mean that you need to take this route; some people might also opt for an open text field (in which case you need to think about how to analyse that) or not ask gender at all.
    • Nationality/Country: It is important to keep in mind that the state controlling the territory they are on might be different from the nation they see themselves as belonging to.
    • Language (proficiency): Not easy, because it is unclear what skill level is meant. You could set a definition like "good enough to write a Wikipedia article in that language" (which is not a perfect criterion, but at least something). If it is important to you, you could ask for their proficiency level. Whether that makes sense depends on your research interest (you can use more powerful analysis methods with rank-able items like a proficiency scale than with a binary competent/not-competent scale, but it is more difficult to answer, so the question is whether it is worth it).
    • Asking for frequencies: To better compare these, give some hints about what you mean: "In the last month, how often did you…": Never / 1-2 times / 3-10 times / 11-50 times / more than 50 times (there are different scales you can use – look at some examples)
  • Reasons for translations, motivations, other qualitative questions: To ask this you should have a good hypothesis about what you can do with the data. As merely descriptive data about what participants are like, they are not very informative (they depend on how people interpret them, how desirable the answers are, etc.). They can be useful if you have a hypothesis like "people who are motivated by increasing coverage in their native language edit smaller Wikipedias" (which would not be surprising, but can be tested).
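
As a small illustration of the age-range advice above, here is a minimal Python sketch that checks whether a set of answer ranges is unambiguous, i.e. whether every value fits exactly one option. It is not part of the task itself, and the function names `make_brackets` and `matching_brackets` are hypothetical helpers made up for this example.

```python
def make_brackets(start, stop, width):
    """Return non-overlapping (low, high) pairs covering start..stop inclusive."""
    brackets = []
    low = start
    while low <= stop:
        high = min(low + width - 1, stop)
        brackets.append((low, high))
        low = high + 1
    return brackets


def matching_brackets(value, brackets):
    """Return every bracket that contains value (ideally exactly one)."""
    return [(low, high) for (low, high) in brackets if low <= value <= high]


# Ambiguous ranges from the note above: a 30-year-old fits two options.
ambiguous = [(20, 30), (30, 40)]

# Unambiguous alternative from the note above: 20-29, 30-39.
clear = make_brackets(20, 39, 10)

print(matching_brackets(30, ambiguous))  # both ranges match
print(matching_brackets(30, clear))      # only one range matches
```

The same check can be run over frequency scales ("1-2 times", "3-10 times", …) before a survey is sent out, to catch both overlaps and values that no option covers.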

Here are some more resources to learn about survey design:

Event Timeline

Hello @awight and @Simulo, here is the link to my contribution for the content translation survey. I look forward to your feedback.
https://etherpad.wikimedia.org/p/Olamide_Oladipo_Outreachy_Contribution

Hello @awight and @Simulo, my contribution link for this microtask, "survey for content translation users", is http://localhost:9001/p/Akansha_outreachy_contribution. I look forward to your suggestions and feedback. According to my analysis, the survey would be filled in by a sample of the population chosen by simple random sampling. I added some relevant nominal and interval questions to keep the survey short and informative, which should make the data easy to analyze.

Thank you for your answers, I will summarize my feedback here:

General concept: It helps to know what you want to find out and particularly, what your assumptions are. If you know your assumptions, you can put them to a test. We broadly wrote "The goal of the survey is to learn more about how this software is used" but that is pretty vague, maybe actually a question for a qualitative study. How do you think people translate? Can you learn something about translations from published research that informs your assumptions? With assumptions and questions about them you can focus your survey questions and the survey design.

Introduction: This is very important, as it is what participants use to decide whether they want to participate. Give the purpose of the survey but avoid merely repeating the questions you are going to ask. Check that the language is clear. In this case, it might be tempting to use terms such as "qualitative" and "quantitative", but these are researcher jargon that might be unfamiliar to the participants.

Questions: Aim to make them both a) short and clear b) self-contained, so they can be understood without reading previous questions

Answers: As important as the questions you ask are the answers you allow…

Age: People might not like to give their age in years for various reasons, so age ranges are a good practice. They need to be unambiguous: 20-30, 30-40 are ambiguous – what do I check when I am 30? Better: 20-29, 30-39 or 21-30, 31-40

Gender: "How to Do Better with Gender on Surveys: A Guide for HCI Researchers" is my go-to resource (however, this does not mean that you need to take this route; some people might also opt for an open text field (in which case you need to think about how to analyse that) or not ask gender at all)

Nationality/Country: It is important to keep in mind that the state controlling the territory they are on might be different from the nation they see themselves as belonging to.

Languages: Not easy, because it is unclear what skill level is meant. You could set a definition like "good enough to write a Wikipedia article in that language" (which is not a perfect criterion, but at least something). If it is important to you, you could ask for their proficiency level. Whether that makes sense depends on your research interest (you can use more powerful analysis methods with rank-able items like a proficiency scale than with a binary competent/not-competent scale, but it is more difficult to answer, so the question is whether it is worth it).

Reasons for translations, motivations etc.: To ask this you should have a good hypothesis about what you can do with the data. As merely descriptive data about what participants are like, they are not very informative (they depend on how people interpret them, how desirable the answers are, etc.). They can be useful if you have a hypothesis like "people who are motivated by increasing coverage in their native language edit smaller Wikipedias" (which would not be surprising, but can be tested).

Asking for frequencies: To better compare these, give some hints about what you mean: "In the last month, how often did you…": Never / 1-2 times / 3-10 times / 11-50 times / more than 50 times (there are different scales you can use – look at some examples)

A very constructive blueprint, thanks a lot @Simulo

Hello everyone, I'm Treasure Okafor, an aspiring Data and Research Analyst. Happy to contribute to this project.

@JaisAkansha you linked to localhost, which is only available on your own computer.

General feedback:

  • Age: Please use brackets that go up to 70 or 80, not just "50 or older" (I used only the first few brackets in my example, simply because I did not want to write a very long list)
  • Name: I am curious: Why do so many surveys ask for a name? What do you want to do with it in your analysis?
  • Type of content to translate: Quite a few surveys ask "What type of online content do you translate?". The tool we analyse in this task is primarily used to translate Wikipedia articles. You can ask what content people usually translate if you have a particular interest in that, but the tool in question is not meant for translating, e.g., social media posts.

If you are just getting started or are reviewing your survey draft, please also consider the newly added links in the task description:

You are welcome. Hope you have made your contribution.

@awight @Simulo
Here's the link to my contribution.
I would appreciate your review and feedback. Thank you.

https://etherpad.wikimedia.org/p/Tee_Kafy

I have made necessary corrections🙏

Thanks for the general feedback @Simulo. I've updated my survey accordingly. Looking forward to further feedback!
https://etherpad.wikimedia.org/p/r.cc0bec633fccda663ae326275df91fee

@awight and @Simulo, kindly review my short survey on Content Translation. I have added the link as a contribution on the Outreachy website. Thank you.

https://etherpad.wikimedia.org/p/Maryam_Gbemisola

Here is a link to my survey, I'm open to corrections. @Simulo and @awight I await a review and feedback

https://etherpad.wikimedia.org/p/Precious

@Simulo Thank you for the feedback. I have updated my survey form accordingly. Looking forward to your review.
Link to my survey form - https://etherpad.wikimedia.org/p/xGzVywcafj65F66Gea2n

Hello mentors! @Simulo, @awight
Having had the opportunity to explore the fields of UX and HCI, this project feels very welcoming! I am excited to contribute to it!
Here's the link to my contribution: https://etherpad.wikimedia.org/p/r.a3c94f150d87bdf6d54ab3d6edce6449
Awaiting your feedback!

General feedback:

  • Some questions ask what content people translate with the tool. This will usually be Wikipedia articles. There might be some reasons to ask what other content people translate (if you have a good hypothesis on that), but that particular question appeared in several surveys without further justification. (But if you know why, please tell me the reasons.)
  • Customer support: There is no customer support in the conventional sense for that tool. When you ask questions, please check what services exist and how they are named. (People might have asked on the tool's talk page or on another talk page – this is probably the closest equivalent to customer support.)

Hey, @awight, @Simulo, Thank you for the feedback. I have made necessary corrections as required.

https://etherpad.wikimedia.org/p/Eberey

Thank you for your feedback @Simulo. I will update my survey accordingly.

@Simulo @awight Thank you for the feedback. Please see my updated responses. I look forward to any feedback. https://etherpad.wikimedia.org/p/9xZPgnnIBJlX3Vk6DgPQ

@Simulo @awight Here is the link for the survey I developed: https://etherpad.wikimedia.org/p/survey . Your feedback and review are highly appreciated.

Hi everyone, my name is Victory Lelekumo. I'm a Data Scientist, and I am very excited to be working on this project. @awight @Simulo I look forward to your guidance through the course of working on this project. Thank you.

Hello and welcome Victory. Happy to have you here.
You can start making contributions by working on any of the tasks above.
All the best👍🏽

Good day. I am done with creating the survey questions and would really appreciate your feedback and corrections. Thank you @awight @Simulo

https://etherpad.wikimedia.org/p/hjFVz4RHT38qJM-mvuLo

I've made some changes and would gladly appreciate any corrections.

https://etherpad.wikimedia.org/p/Keith

@Simulo
How about outliers with age groups?

  • For example, in my survey, I specify 18 – 25 as the first group, but according to Wikipedia, there are no age restrictions on who can write and edit its articles. Just because an eight-year-old writing a good-quality article is unlikely or rare does not mean that it is impossible. However, I would rather add a new option to the list that covers "18 or younger" than include age groups that may seem unlikely but are still possible. The same happens with outliers at the other extreme. For example, the current life expectancy in countries like the US is around 80 years, which could make adding the range "100 - 115 years old" seem absurd, but it is possible, and such an outlier may not be covered by the list. As of now, I simply include the “other” option for this case.

On diversity-related questions
On a similar note, although I considered asking about age, I have decided not to include fields like age, gender, and ethnic group. These may correlate with aspects we do want to study (for example, age relates to education level and therefore to the ability to speak more languages, and ethnic group and country of origin relate to access to resources and therefore to the likelihood of not having regular internet access). Still, I feel it is much more valuable to ask questions that bear more directly on those aspects, such as, where necessary and valuable for the assumptions made, country of origin, education level, access to an internet connection, or comfort with the tool depending on device and connectivity.

Lastly, do you suggest that we always post our links for you to review, even if I feel I have addressed previous feedback and posting may be redundant?

Thanks for your time.

Dear @awight @Simulo, I am done creating the survey questions and I would be glad to get your feedback and corrections if any. Thank You. https://etherpad.wikimedia.org/p/Tamarakedein

Thank you for the warm welcome, I appreciate it.

This brings up an important side question: data protection laws prevent collecting any personal data from young people, at a cut-off age that varies by country. Here's a reference: https://commission.europa.eu/law/law-topic/data-protection/reform/rights-citizens/how-my-personal-data-protected/can-personal-data-about-children-be-collected_en . My understanding is that we shouldn't collect data from anyone under the age of 17.

According to the same article, “the age threshold for obtaining parental consent is established by each EU Member State and can be between 13 and 16 years”. Because it can start from 13, or even 12, there are workarounds for asking for their data, such as obtaining the parent's consent, if we truly consider that diversity, and these cases in particular, are relevant for our analysis. The article you used as a reference applies to the EU. I imagine that if we do intend to collect data from countries in South America (or other continents), which I believe we should, then more rules, or more variations of those rules, apply to the same survey. Moreover, this is not the only option. Let us keep in mind that this must also depend on the type of data collected as well as on the technique or way in which it is analyzed. As far as I understand, the data will not be studied individually, which is one of the reasons why asking for the person's name is unnecessary.

Nevertheless, I am more inclined to believe that it may not be relevant for the survey to collect these (age, gender, and ethnic group), as I asked in another paragraph, and I personally decided to remove them. I was asking whether you, as mentors, see this as appropriate or absolutely necessary. If so, a follow-up question for the collaboration with the selected applicant would be whether outliers like minors or people older than the average life expectancy need to be included.

Thanks for your involvement and response.

Hello @Simulo and @awight, Thanks for the feedback. I have updated my survey accordingly. https://etherpad.wikimedia.org/p/leila_kaltouma

Thank you @Simulo for your general feedback. I've updated my survey accordingly: https://etherpad.wikimedia.org/p/r.d2df6a5eade68d182cc6a3dbe0027529
I am looking forward to your feedback.

Thank you @Simulo for giving us this valuable feedback. Based on your input, I made some changes to the survey questions, which I believe have improved the clarity and relevance of the survey.

I would greatly appreciate it if you could take a few minutes to review the updated survey at https://etherpad.wikimedia.org/p/Anshika#L3 and provide me with any additional feedback or corrections you may have. Thank you for your time and support.

Thank you for the feedback @Simulo @awight, very much appreciated.

I have updated my survey to reflect your feedback, please see updated work via this link: https://etherpad.wikimedia.org/p/PGEe1EPuPjQochJHlUSc

Thanks!

Dear @Simulo @awight, I have made the survey for the Content Translation software at the link below. It was really great to do. Thanks a lot.

https://etherpad.wikimedia.org/p/Romy_Ardianto_Survey_for_Content_Translation_users

Hello @awight @Simulo, Here is a link to my contribution. I will appreciate your review and feedback. Thank you.

https://etherpad.wikimedia.org/p/Balogun_Elizabeth

Thanks @jan for the feedback.
I have made the necessary corrections and here is the link
https://etherpad.wikimedia.org/p/Keith

Hello @Simulo @awight. I have made a survey in my native language, Hausa. I would appreciate your review and feedback. The link is below:

https://etherpad.wikimedia.org/p/8MfaMKo2UZ2SaVkw9uQg

Thank you!

Hello, @Simulo.

I completed this contribution a few weeks ago but did not ask for feedback because I felt that the general feedback applied to my case and was enough to fix my issues/make improvements. Nevertheless, @srishakatux mentioned in a Zulip chat that the completion date corresponds to the date when my contribution was reviewed.

If the review process is still open, I would appreciate it if you could take a look and provide feedback. Thanks.

Contribution: https://etherpad.wikimedia.org/p/r.df3d6f2e35e02a3cfa8912b58abb6e36

Hello, I would really love feedback on the survey I drafted so I can make any relevant corrections. Thank you @awight @Simulo

@Theodorahmbedzi8

It is not okay for you to remove subscribers because they are likely other contributors who are interested in the task, just as you are. Please avoid that and simply add your contribution in a comment by tagging the right mentors.

Apologies for not being knowledgeable about this. Thank you.

Hello, @Theodorahmbedzi8

I see you are still struggling with contributions. Please feel free to send me a message via https://wikimedia.zulipchat.com/#narrow/pm-with/402795-Nathaly-Toledo, after creating your Zulip/Wikimedia account so that I can guide you better. I will be available tomorrow. It is better to move your queries to Zulip because these posts are better suited for contributions or asking for feedback on related tasks.

Hello @Simulo @awight, I have made a survey in the Hausa language. Should I translate it back into English?
I would really appreciate your reviews and feedback. This is the link below:

https://etherpad.wikimedia.org/p/8MfaMKo2UZ2SaVkw9uQg

@1_3_5_7_9, yes, when you translate it back to English (does not need to be perfect) I can give general feedback on the questions/answers.

Hi @Simulo @awight

I made a questionnaire in English, with 3 sections and a total of 15 questions. Kindly review it; the link is below:

https://etherpad.wikimedia.org/p/XRErIjFa8JopBiGSw2Ot

Thanks for looking into the nuances around age. I think your decision to remove the age question entirely was the right one for our problem domain--although we could be surprised by finding something in that data, at the moment there's no theoretical model for age being an important factor. For the remaining legal issues, we can add a standard disclaimer to the survey if necessary, like "don't fill this out if you are under age X". Anyway, my apologies for bringing up this distracting legal quirk.

Your survey is concise and focused, and I like that you considered the interface as well. Perhaps we could strike "country of origin", my understanding is that migration background can be quite complex, personal, and not easy to integrate in a study. Whatever we hope to learn there is probably also available from the questions about mother tongue and multilingualism.

I would be curious about machine translation in particular: do they use Content Translation's built-in machine translation, do they paste into an external service, what are their thoughts about the benefits and limitations...

@Simulo @awight would really love a review of my survey. Thank you

https://etherpad.wikimedia.org/p/hjFVz4RHT38qJM-mvuLo

Hi, I like how this survey is respectful of its audience, and a good length.

"what is your gender?" can be a free-form field, I've heard this is the current best practice.

I hesitate to ask what country a person is born in, do you think this is an important demographic feature, or could be an interesting factor to correlate with translations?

"how do you select what language to translate into?" should include a free-form option. All the qualitative questions look good.

Maybe we should ask about machine translation? Is it a factor in choosing languages to translate? Is it allowed by the interface and the target language wiki community? Are they pasting into an external service? (if so, which one?)

I don't know how to best ask this, but I'd also be curious about "do you translate mostly from language A -> B, or also in reverse B -> A, or between multiple languages?"

I've seen another good question asked, "do you make factual corrections while translating?"

Thank you so much for the insight you posted here and directly on the survey too. I will make the necessary changes you suggested now.

Thanks for the feedback. These all make sense.

Hi! Please consider resolving this task and moving any pending items to a new task, as GSoC/Outreachy rounds are now over, and this workboard will soon be archived.

As Outreachy Round 26 has concluded, closing this microtask. Feel free to reopen it for any pending matters.