Explore a workflow for translating section titles
Closed, Resolved · Public

Description

As part of its work, the Research team needs to collect translations of section names. Based on T194467, Content Translation users will be invited to participate in the research.

The design goals are:

  • Clarity. Users need to understand what to do and why they are asked to do it.
  • Simplicity. This should be a simple data-gathering process to avoid scaring participants.
  • Encouragement. We want to encourage users to provide many translations.

Some of the ideas are illustrated in the following process:

  • A short introduction with instructions, plus an option to read more, provides context and a path to more details if needed.
  • The user is offered a section title to translate. The "view example" link shows an article with such a section to provide context.

  • Autocompletion helps to surface possible translations that match the user input. The "Suggestions" label is included to emphasise that those are not choices to select from.

  • Once the user has entered a translation, they can add an alternative translation or submit it. Both options are disabled until then.

  • When the user completes a translation, a congratulation message shows the total number of translations completed. A different emoji can be shown each time to spark the user's curiosity to make another translation.

Event Timeline

Restricted Application added a subscriber: Aklapper. May 23 2018, 10:51 AM

@Pginer-WMF would you be able to upload individual assets such as icons?

@diego, can you provide translations for messages in the mocks for all languages?

Thanks, both.

leila added a comment. May 23 2018, 4:50 PM

@bmansurov let's have the translations covered once texts are fully finalized (they are not, yet). I think it's safer for us to wait for you to create a first version and we test it, before spending more resources on translations.

@diego please investigate which option we'd like to go for translations: paid or volunteers. We had issues with the quality of paid translations the other time, so we have to change service provider, flag the issues to the service provider and have them fixed, or ask for help from volunteers. I really hope we can do this without taking volunteers' time, but you tell me what the best way to go with is. :)

@leila, @diego I've uploaded my work in progress to http://gapfinder-tools.wmflabs.org/section-alignment/. The app is by no means ready for production, or even for proper testing, as there's still work to do on both the front-end and back-end. That said, I thought you'd want to play around with the system as early as possible. Enjoy!

@bmansurov Thanks. On http://gapfinder-tools.wmflabs.org/section-alignment/mapping , would it be possible to let the user change the source and destination languages right at the top of the page, where they are displayed, instead of through a link at the bottom of the page that leads back to the first page?

This is looking really good. Looking forward to the changes. :)

@leila Yes! Check out today's changes. I think it's ready for testing. Please test and report any bugs. Thanks!

@bmansurov I tested and it looks really good. :) Thanks. A few suggestions on my end:

  • The text bar doesn't fully scale with window size. Check one example of a window size where it's off. Can you fix this?

  • Is there a way to pass the language pair via the web address? That would be handy when Trizek-WMF points specific language-pair communities to the tool.
  • Can you point to what data gets collected from the user?
  • In Firefox on Android, auto-complete doesn't work unless I press space after entering the first letter of the word I'm typing, which is unexpected. In Chrome, it works. It would be good to fix this.

@diego:

  • Let's fix the Arabic list and combine the section titles that are clearly the same. We have a few strategies for doing that. (Release should not be blocked on this, of course.)
  • I'll finalize the text and you can review it on Wednesday when you're back, and then we need translations for it. ;)

@Trizek-WMF @Pginer-WMF: Feel free to test it and let us know if we missed something. No obligation, of course. :)

leila moved this task from Staged to Done (current quarter) on the Research board.
leila moved this task from Done (current quarter) to In Progress on the Research board.

OK, I'll fix the issues you reported. Your screenshot also looks different from mine (especially the language selection buttons). Can you hard-refresh the page? Maybe some resources are cached.

You can programmatically pass languages as such: http://gapfinder-tools.wmflabs.org/section-alignment/?s=en&d=ru

s is the source language, d is the destination. The possible languages are 'ar', 'en', 'es', 'fr', 'ja', and 'ru'. Note that source and destination have to be different.
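For anyone linking to the tool, the rules above can be checked before sharing a URL. The following is only a minimal sketch in Python, not the tool's actual code: the function name `parse_language_pair` is hypothetical, and the only facts taken from this thread are the `s` and `d` parameters, the six language codes, and the rule that the two must differ.

```python
from urllib.parse import parse_qs, urlparse

# Language codes listed in the comment above.
SUPPORTED_LANGUAGES = {"ar", "en", "es", "fr", "ja", "ru"}


def parse_language_pair(url):
    """Return (source, destination) read from the s and d query parameters.

    Hypothetical helper: raises ValueError if a parameter is missing or
    unsupported, or if source and destination are the same.
    """
    query = parse_qs(urlparse(url).query)
    source = query.get("s", [None])[0]
    destination = query.get("d", [None])[0]
    for name, value in (("s", source), ("d", destination)):
        if value not in SUPPORTED_LANGUAGES:
            raise ValueError(f"missing or unsupported {name!r} parameter: {value!r}")
    if source == destination:
        raise ValueError("source and destination must be different")
    return source, destination


pair = parse_language_pair(
    "http://gapfinder-tools.wmflabs.org/section-alignment/?s=en&d=ru"
)
print(pair)
```

How the tool itself reacts to an invalid pair is not stated in this thread, so the `ValueError` behaviour here is purely illustrative.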

Re data, do you mean I should indicate it in the UI? Or are you interested in what's being collected? I'm saving the user's input only. Are you concerned about anything specific that I may be overlooking?

leila added a comment. May 25 2018, 9:00 PM

> OK, I'll fix the issues you reported. Your screenshot also looks different from mine (especially the language selection buttons). Can you hard-refresh the page? Maybe some resources are cached.

Hard refresh did it. Thanks.

> You can programmatically pass languages as such: http://gapfinder-tools.wmflabs.org/section-alignment/?s=en&d=ru
>
> s is the source language, d is the destination. The possible languages are 'ar', 'en', 'es', 'fr', 'ja', and 'ru'. Note that source and destination have to be different.

Thanks.

> Re data, do you mean I should indicate it in the UI? Or are you interested in what's being collected? I'm saving the user's input only. Are you concerned about anything specific that I may be overlooking?

Not in the UI, what's being collected in the back-end. We need some way of telling apart users or sessions, just in case bad data is entered in the system. For example, if we see suspicious translations, we will likely want to discard all data provided by the specific user/session. Last time we had an anonymous cookie in the browser. See footer.

I see. Right now, I'm not saving the user's cookie with their answers. But this is something I can add in the next iteration.

@leila I've fixed the issues you found. I've also made changes to record anonymous session tokens. Please test and let me know if you find any issues. You can also see the responses we got by going to http://gapfinder-tools.wmflabs.org/admin/ (I'll send you the password) and clicking on User Inputs. There you can filter by session ID. Those are all questions for users. Some are filled by users, some are still waiting to be filled.
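The session-based cleanup discussed above (discarding everything a suspicious session submitted) could look roughly like the sketch below. It is written in Python under stated assumptions: the record layout with `session_id` and `translation` keys and both function names are hypothetical, and nothing here reflects how the tool actually stores or generates its tokens.

```python
import secrets


def new_session_token():
    """Create a random, anonymous session token (32 hexadecimal characters)."""
    return secrets.token_hex(16)


def discard_sessions(user_inputs, suspicious_session_ids):
    """Return only the inputs that do not belong to a flagged session.

    user_inputs is assumed to be a list of dicts with a "session_id" key;
    this layout is an illustration, not the tool's real schema.
    """
    flagged = set(suspicious_session_ids)
    return [row for row in user_inputs if row.get("session_id") not in flagged]


inputs = [
    {"session_id": "a1", "translation": "voir aussi"},
    {"session_id": "b2", "translation": "asdf"},
    {"session_id": "a1", "translation": "liens externes"},
]
clean = discard_sessions(inputs, ["b2"])
print(len(clean))
```

Because the token is random and carries no account information, it lets reviewers group answers by session without identifying the person behind them.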

Trizek-WMF added a comment. Edited May 28 2018, 2:25 PM

Aren't you afraid of getting multiple new translations through the "add a translation" link, even some that don't exist? I understand your case, as I managed to force a new translation that was not listed in your samples (while translating "see also", the suggestion was "voir aussi", but that can also be translated as "voir également"). But is it really okay?

Also, please avoid the camel-casing in the title: "WikipediA". Just Wikipedia is fine. :)

@Trizek-WMF I think we want to gather translations that don't exist. @leila and @diego may want to add more.

Re camel casing, that's the standard it seems. The mock-ups in the task description have it. The mobile site has it also: https://en.m.wikipedia.org/wiki/Main_Page

> @Trizek-WMF I think we want to gather translations that don't exist. @leila and @diego may want to add more.

OK.
So there is also no need to close some combinations where you'll be sure to have a maximum of results. I was thinking about that earlier.

> Re camel casing, that's the standard it seems. The mock-ups in the task description have it. The mobile site has it also: https://en.m.wikipedia.org/wiki/Main_Page

Well, IMO, that's a bit ugly but I'll let you decide. :)

@Trizek-WMF can you elaborate on the following?

> So there is also no need to close some combinations where you'll be sure to have a maximum of results. I was thinking about that earlier.

> @Trizek-WMF can you elaborate on the following?
>
>> So there is also no need to close some combinations where you'll be sure to have a maximum of results. I was thinking about that earlier.

The tool allows, for instance, en ↔️ fr pairs. That's an easy pair to do. You will get a lot of pairs translated. Have you considered locking that pair at some point to make it unavailable?

leila added a comment. May 29 2018, 5:18 PM

> @Trizek-WMF I think we want to gather translations that don't exist. @leila and @diego may want to add more.

@Trizek-WMF Correct. Since we're not providing context (and that's intentional), one section title in the source language may have multiple section titles in the destination language, and we want to allow the user to provide multiple responses.

> In T195354#4240297, @Trizek-WMF wrote:
> The tool allows, for instance, en ↔️ fr pairs. That's an easy pair to do. You will get a lot of pairs translated. Have you considered locking that pair at some point to make it unavailable?

Oh I see. I haven't considered this, but once all sections are mapped for this pair, a message shows up telling users that they're all done. There are many improvements we can make to the project, and I hope I can do so in a prioritized way before we start getting labels. Thanks for bringing it up.

leila added a comment. May 29 2018, 7:52 PM

>> In T195354#4240297, @Trizek-WMF wrote:
>> The tool allows, for instance, en ↔️ fr pairs. That's an easy pair to do. You will get a lot of pairs translated. Have you considered locking that pair at some point to make it unavailable?
>
> Oh I see. I haven't considered this, but once all sections are mapped for this pair, a message shows up telling users that they're all done. There are many improvements we can make to the project, and I hope I can do so in a prioritized way before we start getting labels. Thanks for bringing it up.

The current messaging system you have suffices for this purpose. We should aim to have the tool up for en-as-source by Thursday, and for all other source languages as soon as the translations come in. I will finalize the en text today and will let you know.

@leila I've updated the code and made these changes:

  1. In the admin area, you can filter user inputs by status (done, not done) and by user session ID.
  2. In the admin area, you can select user inputs and download them by choosing "Download selected user inputs" from the action drop-down.
  3. Old inputs were not being saved as UTF-8, so you may see some gibberish when downloading them; new inputs don't suffer from this.
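To illustrate the kind of gibberish item 3 refers to, here is a small Python sketch of an encoding mismatch. It is an assumption-laden example, not a diagnosis: the thread does not say which encoding the old inputs were actually stored in, so Latin-1 is used only as a common stand-in, and both function names are hypothetical.

```python
def garble(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, then decode the bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled, wrong_encoding="latin-1"):
    """Undo garble(); works only when the wrong encoding is known."""
    return garbled.encode(wrong_encoding).decode("utf-8")


original = "voir \u00e9galement"  # "voir également"
broken = garble(original)
print(broken)
print(repair(broken) == original)
```

If the real mismatch involved a different encoding, or characters were lost when saving, a round trip like `repair` may not recover the original text.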
leila added a comment. May 31 2018, 5:33 AM

@Pginer-WMF We will be ready with the en as source language version in the next 24 hours or less, and the other source options will need translations that we will start soon (we expect those to be ready by early next week). We're basically ready for the banner in Content Translation. When do you think your team can have it up?

> @Pginer-WMF We will be ready with the en as source language version in the next 24 hours or less, and the other source options will need translations that we will start soon (we expect those to be ready by early next week). We're basically ready for the banner in Content Translation. When do you think your team can have it up?

I added T194467 to the current sprint, updated the description, and increased the priority. Depending on engineer availability and deployment schedules, I estimate it may be ready in 2-3 weeks.

Trizek-WMF added a comment. Edited May 31 2018, 9:01 AM

@bmansurov Is the interface translated? If users only speak Spanish and Japanese, an English interface may confuse them a bit. :)

leila added a comment. May 31 2018, 7:03 PM

>> @Pginer-WMF We will be ready with the en as source language version in the next 24 hours or less, and the other source options will need translations that we will start soon (we expect those to be ready by early next week). We're basically ready for the banner in Content Translation. When do you think your team can have it up?
>
> I added T194467 to the current sprint, updated the description, and increased the priority. Depending on engineer availability and deployment schedules, I estimate it may be ready in 2-3 weeks.

@Pginer-WMF If you get a chance to do it earlier, even better. Thanks! :)

leila added a comment. May 31 2018, 7:07 PM

@bmansurov final comments per IRC and email conversations:

  • Please change the text to: "Translate these section titles as you would expect to find them on Wikipedia articles in your language. If, depending on context, a section title can have multiple translations in your Wikipedia language, provide all translations. Use suggestions by auto-complete feature or write your own translation. If you don't know a translation or a translation doesn't exist, skip. The more you translate the better suggestions we'll be able to provide to editors in the future."
  • Thanks for removing the section titles for which we have already collected translations.

@leila the above are done. Also, language names now appear in the languages themselves.

Oh, btw, I've imported all spreadsheet data into the database so that when you download data from the admin interface, you'll be able to get previous data too. You can tell that data apart from the new data by its missing session ID.

Update: I've switched from sqlite3 to mysql as the database and I think the app is ready for production.

> @Pginer-WMF If you get a chance to do it earlier, even better. Thanks! :)

The code changes have already been reviewed and merged into the codebase. Unfortunately, deployments seem to be stopped for this week (T194467#4296360), which means users will have to wait one more week until they can see the change.

leila added a comment. Jun 18 2018, 4:56 PM

@Pginer-WMF excellent, and waiting until 2018-06-25 is fine at this stage given that it's unavoidable. :)

Vvjjkkii renamed this task from Explore a workflow for translating section titles to tgcaaaaaaa. Jul 1 2018, 1:08 AM
Vvjjkkii removed Pginer-WMF as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from tgcaaaaaaa to Explore a workflow for translating section titles. Jul 2 2018, 3:56 PM
CommunityTechBot assigned this task to Pginer-WMF.
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.
DarTar closed this task as Resolved. Jul 28 2018, 2:10 AM
DarTar edited projects, added Research-Archive; removed Research.
DarTar moved this task from Default to Q4-FY18 on the Research-Archive board. Jul 28 2018, 2:14 AM