Support getting community involved section translation
Closed, ResolvedPublic

Description

The goal is to get section titles translated between language pairs, across 6 languages.

At the moment the following has been done:

  • users have been listed according to their Babel boxes and their use of CX
  • a message inviting people to help with the translations has been written, and that message has been translated into the 6 target languages
  • the messages have been sent to the users

No positive replies have been received so far.

The goal is to get pairs translated, including at least 2 "unusual pairs" (pairs from different language families and with different alphabets), for research significance. Pairs are mutual, for instance: fr -> ja and ja -> fr. If two mutual pairs are translated at 50% (starting from the top), that is the minimum Research can work with. It is also possible to create new pairs if there are volunteers to handle them: the data will then be extracted.

Ideally, the pairs should be translated by the end of the month.

CL support is requested to contact people who can get those translations done if possible.

Code: https://github.com/wikimedia/research-gapfindertools
Tool: http://gapfinder-tools.wmflabs.org/section-alignment
Translations: https://meta.wikimedia.org/wiki/Research:Expanding_Wikipedia_articles_across_languages/Tool_Translations (all done and integrated)

Event Timeline

Trizek-WMF updated the task description.May 18 2018, 6:40 PM

I've contacted the Embassies, which are places where people speaking multiple languages offer to help others. I've checked the document to find the most uncompleted pairs and left a message for those languages.

I've also sent a more global message to the translators mailing-list.

Hi @Trizek-WMF! Thanks for all the work you have done in so little time.

@leila and I have been discussing how to improve the interface for this task. @bmansurov and I will be working on this over the next few days. In the meantime, we can ask people to collaborate on the synonym labeling (T184213).

For sure, if there are people willing to help with section translation using the spreadsheet interface, that's OK, but we would prefer to wait for the new interface before contacting more people. Does that make sense to you?

leila added a comment.May 23 2018, 5:50 PM

@Trizek-WMF To add to Diego's point, it is important that we finalize the list of requirements for the people you'd help us reach out to before you do further outreach. More specifically, we are strictly limited to working with Wikipedians for both of the labeling tasks (synonym and translation), not simply people who know the language pairs. The reason is that we need encyclopedic translations, and we can have good confidence in such translations only if Wikipedians provide them.

@diego, makes sense to have a better interface and potentially wait, but I clearly remember you asked me to have it done by the end of the month, and the interface would take time to be created. That's why I've made those messages.

Also, can you track the evolution of the translations in the spreadsheet a better way than I do (aka manually)? I need to know which languages get enough responses and which ones need more efforts.

@leila, is the task you ask for support related to T191675: CL support for increasing contributor diversity experiments. I'm a little bit lost...

leila added a comment.May 23 2018, 6:30 PM

@leila, is the task you ask for support related to T191675: CL support for increasing contributor diversity experiments. I'm a little bit lost...

I just pinged you so that we can meet to discuss the details and make sure both sides are on the same page. For now: no, that project is a separate one. The synonym mapping refers to the labeling needs captured at https://docs.google.com/spreadsheets/d/1SFCjUNPVsZSq74UduQxiIbuQnsixymbz6qeXLxXMq4k/edit#gid=752807413 Let's discuss first, though, to make sure we are clear about the needs before you spend more time on it. :)

diego added a comment.May 23 2018, 7:00 PM

@diego, makes sense to have a better interface and potentially wait, but I clearly remember you asked me to have it done by the end of the month, and the interface would take time to be created. That's why I've made those messages.

True. My mistake. We need to get things done as soon as possible, and I was trying to find the fastest possible solution. But I should have confirmed this with the rest of the team first. Sorry for the confusion.
About the messages outside the Wikipedia community: my understanding of our conversation was that it was clear we are looking for people who are active Wikipedia editors. Maybe I should have been more precise about that.

Also, can you track the evolution of the translations in the spreadsheet a better way than I do (aka manually)? I need to know which languages get enough responses and which ones need more efforts.

The only way that I know to track this is to use the Version History button in the File menu.

@leila, is the task you ask for support related to T191675: CL support for increasing contributor diversity experiments. I'm a little bit lost...

@diego, about the messages outside the Wikipedia community, I haven't posted those because, as you say, we are looking for people who are active Wikipedia editors. However, some messages may be posted on social media if they are aimed at active Wikipedia editors.

leila moved this task from Staged to In Progress on the Research board.May 24 2018, 5:27 PM
leila added a comment.May 24 2018, 5:38 PM

For archive happiness (and please point out if I miss something, @diego @Trizek-WMF):

  • Trizek-WMF, diego, and I met today.
  • There was a misunderstanding on diego's and my end re embassies. These are on-wiki embassies ;) which is great.
  • diego will send Trizek-WMF the updated list of users to contact, taking into account users who have made at least 10 edits (from the beginning of time?) in the destination language.
  • Trizek-WMF will wait until early next week when the app discussed at T194467 will be ready. (bmansurov will link to that task once ready.)
  • Until then, Trizek-WMF will look into synonym mapping task T184213 . Link at https://docs.google.com/spreadsheets/d/1SFCjUNPVsZSq74UduQxiIbuQnsixymbz6qeXLxXMq4k/edit#gid=752807413 For this task, we ideally need 2-3 editors each doing one full sheet. For example, fr-r1 is already done by Trizek-WMF and fr-r2 and fr-r3 can be shared with two other frwiki editors. If finding people who do one full sheet is hard, we should discuss allowing partial work to be submitted. @Trizek-WMF Please ping us if that happens and we can discuss the best way to proceed. Given the speed at which you did the task, we hope that this is a much easier task that experienced editors can help with quickly.
diego added a comment.May 24 2018, 9:09 PM

@Trizek-WMF : I've updated the list of candidates.
For candidates coming from the Babel template, I'm filtering out the ones with fewer than 11 revisions in each wiki.
For candidates coming from CX, I'm filtering out the ones with fewer than 11 translations. I'm also providing the number of translations that they have already done.

You can find the new list here:
https://docs.google.com/spreadsheets/d/1RYIjcpRcAiwcFie_1Wq5g22HxjimM0NDhYRFdX4x1f4/edit?usp=sharing

Thank you both! Keep me posted when T194467 will be ready!

Concerning partial completion of a column, that's more your decision than mine: you will know better than me if there is more effort to make for a given language. If there is a particular effort needed, give me the pair, and I'll try to find people to achieve it.

leila added a comment.May 25 2018, 5:06 PM

Thank you both! Keep me posted when T194467 will be ready!

yup.

Concerning partial completion of a column, that's more your decision than mine: you will know better than me if there is more effort to make for a given language. If there is a particular effort needed, give me the pair, and I'll try to find people to achieve it.

Got it. Some of the pages are already assigned to some volunteers. Please start with the sheets that are not assigned at all. I will mark them in green in the sheet itself so you can prioritize them.

I've contacted more volunteers, based on the coordination tab in the translations document.

leila added a comment.Jun 1 2018, 5:44 AM

@bmansurov can you confirm that your comment at T195354#4247521 means that @Trizek-WMF can start reaching out to editors who can translate from en to the other 5 languages?

@Trizek-WMF You can pass the source and destination to the url. Example for English to Russian translations is http://gapfinder-tools.wmflabs.org/section-alignment/?s=en&d=ru . And as soon as bmansurov confirms that the system is ready, please feel free to start reaching out to (en, *) possible translators. Thanks!
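For anyone preparing outreach messages, the `s` and `d` URL parameters described above can be filled in programmatically. This is only a minimal sketch: the base URL, the two parameter names, and the six language codes come from this thread, while the helper names `pair_url` and `urls_from` are made up for illustration.

```python
from urllib.parse import urlencode

# Base URL and the s/d parameters are taken from the example link in this thread.
BASE_URL = "http://gapfinder-tools.wmflabs.org/section-alignment/"

# The six project languages mentioned in the task.
LANGUAGES = ["ar", "en", "es", "fr", "ja", "ru"]


def pair_url(source: str, destination: str) -> str:
    """Return the tool URL for one source -> destination pair (hypothetical helper)."""
    return BASE_URL + "?" + urlencode({"s": source, "d": destination})


def urls_from(source: str) -> dict[str, str]:
    """Return one URL per destination language for a given source (hypothetical helper)."""
    return {dest: pair_url(source, dest) for dest in LANGUAGES if dest != source}


if __name__ == "__main__":
    # Print the links for the (en, *) pairs that outreach starts with.
    for dest, url in urls_from("en").items():
        print(dest, url)
```

Calling `pair_url("en", "ru")` reproduces the English to Russian example link given above.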

Thatʼs right. Please go ahead with the announcement. As you said, letʼs start with a small number of users.

Trizek-WMF added a comment.EditedJun 1 2018, 8:32 AM

I would indeed not go now for pairs that don't include English, since the interface is not translated for other languages.

I'll go for English, then.

Do you have a way to measure the completion of a given pair of languages? That would help to know where we need to focus the efforts.

Here are the stats:

lang    done    not done
ar-*    702     4203
en-*    2101    2894
es-*    380     2255
fr-*    321     2324
ja-*    1       3794
ru-*    703     1722
diego added a comment.Jun 1 2018, 1:38 PM

Thanks for these stats, @bmansurov

Please, could you also provide the details for each specific pair? (eg. es-ar)

Here you go:

language pair    done    not done
ar-en            467     514
ar-es            0       981
ar-fr            206     775
ar-ja            2       979
ar-ru            27      954
en-ar            82      917
en-es            66      933
en-fr            852     147
en-ja            280     719
en-ru            801     198
es-ar            0       527
es-en            249     278
es-fr            131     396
es-ja            0       527
es-ru            0       527
fr-ar            38      491
fr-en            156     373
fr-es            55      474
fr-ja            20      509
fr-ru            52      477
ja-ar            0       759
ja-en            0       759
ja-es            0       759
ja-fr            1       758
ja-ru            0       759
ru-ar            2       483
ru-en            417     68
ru-es            0       485
ru-fr            284     201
ru-ja            0       485

Great! Thank you @bmansurov!

Would it be possible to have the stats updated regularly? And have percentages? Thanks. :)

We can do it, but it involves setting things up. I'd rather generate these numbers when we need them, probably once/twice a week.

We can do it, but it involves setting things up. I'd rather generate these numbers when we need them, probably once/twice a week.

I understand. Works for me. :)
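In the meantime, the percentages asked for above can be derived by hand from any done / not done snapshot. The sketch below is not part of the tool: the four sample counts are copied from the per-pair stats posted earlier in this thread, the 50% threshold is the minimum mentioned in the task description, and the function names are hypothetical.

```python
# (done, not done) counts for a few pairs, copied from the stats in this thread.
PAIR_STATS = {
    "en-fr": (852, 147),
    "ar-en": (467, 514),
    "fr-ja": (20, 509),
    "ja-fr": (1, 758),
}


def completion_percent(done: int, not_done: int) -> float:
    """Share of section titles already translated, in percent."""
    total = done + not_done
    if total == 0:
        return 0.0
    return 100.0 * done / total


def summarize(stats, minimum=50.0):
    """Return (pair, done, not_done, percent, meets_minimum) rows, sorted by pair."""
    rows = []
    for pair in sorted(stats):
        done, not_done = stats[pair]
        pct = completion_percent(done, not_done)
        rows.append((pair, done, not_done, round(pct, 1), pct >= minimum))
    return rows


if __name__ == "__main__":
    for pair, done, not_done, pct, ok in summarize(PAIR_STATS):
        flag = "meets 50%" if ok else "below 50%"
        print(f"{pair}  {done:>4} / {done + not_done:<4}  {pct:5.1f}%  {flag}")
```

Note that this only flags each direction separately; checking that both directions of a mutual pair (for instance fr-ja and ja-fr) reach the minimum would need one more comparison on top of it.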

bmansurov updated the task description.Jun 4 2018, 2:30 PM

Fresh set of stats as of now:

leila added a comment.Jun 8 2018, 5:42 PM

@bmansurov thanks!

@diego I somehow had blanked that we didn't translate the messages to the other languages. Can you work with bmansurov to make that happen? (I guess we're talking about a translatewiki project? or some other way.) It would be great if we have the translations ready no later than Wednesday next week so Trizek can start pushing on other pairs, too.

Thank you for the update, @bmansurov! I'll ping more people for the English to other languages pairs on Monday. Can't wait to have the all-languages translations.

@leila, @diego we wanted to use translatewiki the last time around and it won't work for our purposes this time either. See discussion at https://translatewiki.net/wiki/Thread:Support/Request_to_create_a_new_project.

@diego I suggest looking into using translatable pages on Meta: https://meta.wikimedia.org/wiki/Translatability

leila added a comment.Jun 8 2018, 6:09 PM

@bmansurov I may be missing something. We are talking about translating the buttons and instructions in the tool on translatewiki and not the section titles themselves. Are you saying that doesn't work either?

Trizek-WMF added a comment.EditedJun 8 2018, 6:26 PM

If you can use Meta as the translation system for that project, please create a page there and list the elements you want to be translated. Create a sub-page for your project, with only the words and sentences you need translations for. Wrap them with <translate></translate> tags. Then ask a Translation admin to mark the page for translation. You will then get separate elements (like that one from that page) you may use in the tool you've built.

I'll ping community members on Monday morning (my time) to have the translations done.

@leila that could work, but potentially involves setting up many moving pieces. For instructions and a couple of buttons that would be an overkill. @Trizek-WMF's instructions seem like a good alternative.

leila added a comment.Jun 8 2018, 7:46 PM

works for me, @bmansurov . Who will do what Trizek says? :)

bmansurov added a comment.EditedJun 8 2018, 8:05 PM

I thought @diego as he's leading the project and the task is non-technical. I'll help out if he needs technical help. And I'll take care of integrating translations into the software.

leila added a comment.Jun 8 2018, 8:09 PM

got it.

@diego the floor is yours. ;)

Trizek-WMF added a comment.EditedJun 11 2018, 12:14 PM

@diego I can create that page very quickly if you want. I just need to know on which sub-page I have to.
If you create the page by yourself, I can mark it for translation and then ping translators; just tell me when it's ready.

Trizek-WMF added a comment.EditedJun 11 2018, 3:23 PM

I've created a page and marked it for translation: https://meta.wikimedia.org/wiki/Research:Expanding_Wikipedia_articles_across_languages/Tool_Translations. It took me more time to create the page than creating that message.

Now you have 14 elements to use. They all use the same type of link: https://meta.wikimedia.org/wiki/Translations:Research:Expanding_Wikipedia_articles_across_languages/Tool_Translations/[translation unit]/[language] where:

  • [translation unit] is the number of the translation you need to use
  • [language] is the code for the language (like for the wikis).

For instance, to get the French translation of the <!-- T:7 --> element from the source page, you go to https://meta.wikimedia.org/wiki/Translations:Research:Expanding_Wikipedia_articles_across_languages/Tool_Translations/7/fr
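The URL pattern above can be sketched as a small helper, in case it is useful when pulling the translations into the tool. The base URL and the count of 14 elements come from this comment; that the units are numbered 1 to 14 without gaps is an assumption, and `unit_url` is a hypothetical name.

```python
# Base of the translation-unit URLs, as given in this comment.
BASE = (
    "https://meta.wikimedia.org/wiki/Translations:Research:"
    "Expanding_Wikipedia_articles_across_languages/Tool_Translations"
)

# 14 elements are mentioned above; contiguous numbering 1..14 is an assumption.
UNIT_COUNT = 14


def unit_url(unit: int, language: str) -> str:
    """Build the URL of one translation unit in one language (hypothetical helper)."""
    if not 1 <= unit <= UNIT_COUNT:
        raise ValueError(f"translation unit must be between 1 and {UNIT_COUNT}")
    return f"{BASE}/{unit}/{language}"


if __name__ == "__main__":
    # Unit 7 in French, the example used in this comment.
    print(unit_url(7, "fr"))
```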

Please be careful about translation unit #8 in Arabic: {N} translations completed! will be RTL in your integration. I've excluded the {N} from being translated so that you will just have the message to use, @bmansurov

Concerning all translations, maybe some changes should be made to give a better understanding of what is asked. For instance, you use the word "section" to describe "section titles". A section on Wikipedia is the title and all sub-contents. I've tried to document some elements I've found problematic. You can review it or create documentation for some elements to translate here.

Translation is already done for French, btw.

Any updates about the source text? Please update the source if you want to make changes, but don't touch the <!-- T:1 --> stuff and <translate> tags. :)

When do you need the translations to be completed?

diego added a comment.Jun 12 2018, 5:56 PM

@Trizek-WMF : I have completed the translation to Spanish.

leila added a comment.Jun 12 2018, 6:27 PM

Any updates about the source text?

@bmansurov French and Spanish translations are ready at https://meta.wikimedia.org/wiki/Research:Expanding_Wikipedia_articles_across_languages/Tool_Translations . Can you update the tool with this information and let Trizek-WMF know when he can start reaching out to the pairs that start with es and fr?

When do you need the translations to be completed?

Translation of the instruction text? ASAP, please. :) Given that Diego has done es and you have done fr, hopefully you can get the other 3 in soon.

@bmansurov

Concerning the source, please read T195001#4272558. I've listed some points you should consider to change before I ask for more translations. That may impact already done translations.

@Trizek-WMF,

OK, removing {N} should be fine.

To get a better understanding of messages, you can look at the UI here: http://gapfinder-tools.wmflabs.org/section-alignment/

I can also create qqq messages later.

@bmansurov, if you select English as source and French as target, you will have the interface in English. It is actually easier to translate words and short section titles from a foreign language than to understand an explanation in a foreign language. To take one example, I would feel much more comfortable translating section titles from English to French than the opposite. In that case, I would appreciate having the explanations in French if I translate from English to French. Would it be possible to select the interface language instead?

When you click on "add an alternative translation", you don't have a focus on the new field. That's a detail, but that would make things more comfortable. :)

Trizek-WMF updated the task description.Jun 13 2018, 7:36 AM

Oh, and you haven't updated the source for translations. The English used is a bit complicated and doesn't give much context. I'll do my best to get the translations done without misinterpretations or approximations.

diego added a comment.Jun 13 2018, 3:32 PM

@Trizek-WMF, I agree with you that the instructions could be improved. I struggled a bit myself to translate them into Spanish. But we have already done many iterations to reach this quality of instructions, and although they are far from perfect, I don't think that we should be blocked on this. So, please, let's just use these instructions.

bmansurov added a comment.EditedJun 13 2018, 4:14 PM

@Trizek-WMF I've made the UI language the destination language as you requested. I've also made it so that newly added input boxes get autofocused (you may need to clear your browser's cache to see this). I'm a little tight on time right now, so I put the UI language selection off to a later date (but I doubt I'll manage to get to that).

I've updated the translation messages with instructions for translations: https://github.com/wikimedia/research-gapfindertools/blob/master/sectionalignment/locale/fr/LC_MESSAGES/django.po . Messages that start with "#. Translators:" are instructions, aka qqq messages.

Trizek-WMF added a comment.EditedJun 13 2018, 4:32 PM

@Trizek-WMF, I agree with you that the instructions could be improved. I struggled a bit myself to translate them into Spanish. But we have already done many iterations to reach this quality of instructions, and although they are far from perfect, I don't think that we should be blocked on this. So, please, let's just use these instructions.

Ok, I'm sending requests for translations help now. :)

@Trizek-WMF I've made the UI language the destination language as you requested. I've also made it so that newly added input boxes get autofocused (you may need to clear your browser's cache to see this). I'm a little tight on time right now, so I put the UI language selection off to a later date (but I doubt I'll manage to get to that).

Thank you for fixing that! That's great. :)

The translations have been changed to add the {N} in the right place, and the translations have been updated accordingly. @bmansurov, I hope you will be able to integrate it.

I'm still looking for translators in ar, ru and ja.

bmansurov added a comment.EditedJun 14 2018, 3:52 PM

@Trizek-WMF translations have been updated. I've also updated the instructions for that translations message. Notably, N is always bigger than 4. And the formatting has changed from {N} to %(N).
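To illustrate the placeholder change for translators: Python's named `%` formatting needs a conversion character after the parenthesised name, so the sketch below assumes the final form is `%(N)s`. The exact string stored in django.po is not quoted in this thread, and both function names are hypothetical.

```python
def to_percent_style(message: str) -> str:
    """Rewrite the old {N} placeholder as a named %-style one (assumed form: %(N)s)."""
    return message.replace("{N}", "%(N)s")


def render_completed(template: str, n: int) -> str:
    """Fill the N placeholder of a %-style message with a number."""
    return template % {"N": n}


if __name__ == "__main__":
    old = "{N} translations completed!"   # message text as quoted earlier in this task
    new = to_percent_style(old)           # "%(N)s translations completed!"
    # N is always bigger than 4 according to the translation instructions.
    print(render_completed(new, 5))       # prints "5 translations completed!"
```

The practical point for translators is that the whole `%(N)s` token has to be kept intact, although it can be moved to wherever the number belongs in the target language.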

Trizek-WMF added a comment.EditedJun 15 2018, 2:51 PM

@Trizek-WMF translations have been updated. I've also updated the instructions for that translations message. Notably, N is always bigger than 4. And the formatting has changed from {N} to %(N).

Great! That helps a lot.

Japanese and Russian are ready.

The Japanese and Russian translations are live.

Arabic done. We are all set. \o/

bmansurov added a comment.EditedJun 18 2018, 8:36 PM

FYI, I'm working on adding Arabic translations. This one's a little tricky because besides adding translations, I need to make the UI RTL.

Arabic translations and RTL support are live!

Trizek-WMF updated the task description.Jun 19 2018, 9:08 AM

Stats as of today:

Trizek-WMF added a comment.EditedJun 28 2018, 10:36 AM

At the moment, pairs that have less than 100 translations are: ar-es, ar-ja, ar-ru, en-ar, en-es, es-ar, es-ja, es-ru, fr-ar, fr-es, fr-ja, fr-ru, ja-ar, ja-en, ja-es, ja-fr, ja-ru, ru-en, ru-es, ru-ja.

Some pairs apparently do not exist in the user stats we have, like ja-es, ar-ru... Those pairs will be very difficult to solve, I'm afraid.

I've sent reminders to the users identified by @diego for the following pairs (that document includes some users who are no longer active). The number of active users contacted is given in parentheses.

  • fr-ar/ar-fr (9)
  • fr-ru (2)
  • fr-ja (1)
  • fr-es (2)
  • ja-fr (1)
  • en-ar (3, 2 shared with ar-fr)

It takes much more time than I had estimated because some people have already been contacted and others haven't. I had to adapt my messages to the circumstances, which is why I've apparently sent a limited number of messages for now. I was focusing on French and English only; the message is ready for Spanish speakers/translators (thank you @diego for reviewing my draft), and I'll post it soon.

Trizek-WMF added a comment.EditedJun 28 2018, 5:18 PM

I've updated Diego's documents with people I've contacted. That's easier than reporting pairs I've worked on in this document. :)

(I'm not putting the link here for privacy reasons.)

Vvjjkkii renamed this task from Support getting community involved section translation to mqcaaaaaaa.Jul 1 2018, 1:09 AM
Vvjjkkii removed Trizek-WMF as the assignee of this task.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from mqcaaaaaaa to Support getting community involved section translation.Jul 2 2018, 6:38 AM
CommunityTechBot assigned this task to Trizek-WMF.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.
Elitre awarded a token.Jul 3 2018, 2:45 PM

We are currently waiting for the fish to bite, while the tool is now displayed on CX and every possible user who can help has been identified.

diego added a comment.Jul 5 2018, 10:04 AM

@Trizek-WMF , thanks for your efforts.
Can we push more for translators from/to Japanese? This is the main missing piece.

leila added a comment.Jul 5 2018, 8:36 PM

@diego @Trizek-WMF, updated numbers below:

I've left a new message on Japanese Wikipedia's Embassy (an international place for contacting the English speakers of that wiki). This, combined with the other messages already left, is the maximum effort possible for this wiki, where the community dynamics are complicated (the community is not as structured as it can be elsewhere).

leila added a comment.Jul 6 2018, 5:32 PM

@Trizek-WMF thanks for the update.

@diego we will likely have one realistic option with regard to Japanese: drop the language and replace it with another language, for example, Korean or Vietnamese. We can also try Mechanical Turk, and while we won't run into the structural challenges Trizek-WMF mentions in that environment, we will still suffer from the small percentage of people who speak Japanese and *. We may be better off dropping the language and choosing another one where the community is more active with regard to providing labels. Please let me know what you think.

diego added a comment.Jul 7 2018, 4:47 PM

I'm OK with changing to another language. I don't have a strong preference between Korean and Vietnamese; I think we should select the one most likely to yield good/fast translations.

I have easier contacts with Korean speakers.

Are there any new developments on this task?

leila added a comment.Jul 18 2018, 4:38 PM

We discussed this today: it's best not to add a new language, yet, as we will run into similar issues with rare language pairs when we introduce another language. For example, it will most likely be hard to collect labels for Korean and Arabic. Diego is going to do one pass over the data to see how much we can infer from it. If we can't learn from it, we will push for more data again. @Trizek-WMF I think for now we can consider this task as done, and we can open a new one if there is a need in the future. Thanks for all your help! :)

Trizek-WMF closed this task as Resolved.Jul 24 2018, 1:42 PM

@Trizek-WMF I think for now we can consider this task as done, and we can open a new one if there is a need in the future. Thanks for all your help! :)

Great!

Elitre added a subscriber: Elitre.Oct 5 2018, 9:24 AM

Satisfaction survey reply received.