
Possible collaborations between Wikimedia and Apertium
Closed, Resolved (Public)

Description

hi GSoC admins,

Apertium didn't get into GSoC this year :-(
Therefore our mentors have lots of spare time :-| ... I was wondering whether a joint GSoC project would be possible, related to your machine translation plugins, where Apertium is currently used?

Some ideas:

  • Are there specific Apertium language pairs that work, but not well enough, and that it would make sense to improve for Wikimedia?
  • Some packaging and other distribution issues are still unsolved: kart_ (Kartik Mistry) currently spends a lot of time on them, and we'd love to make the process more streamlined.

For language pairs... Implementing good ways to contribute from CX (that's the code name for ContentTranslation) back to Apertium is something that Wikimedia would love to have.
https://phabricator.wikimedia.org/T91492 is one simple thing - probably not something to take up a whole GSoC project, but a microtask for applicants. :)

(copying from IRC)
Finding a nice streamlined way to get the translators to contribute translations back to the current Apertium dictionaries would be a very cool thing.
https://phabricator.wikimedia.org/T91492 is about finding the missing words, but making them actually translated is the real value.
Looking into the parallel corpora and post-editing done via Content Translation would be the most ambitious option. :)
Just a simplistic thought: CX could show a box for every word that Apertium fails to translate, and the translator could fill it in.
Or better, the software could just grab this word straight from the translation. Collecting the sentences that end-users make changes to would be nice, so that Apertium workers can see input/MT/correction easily.
The project could be: build the "inbox" for such reports. @KartikMistry's current OPW project is kinda something like an inbox for new words to add to dictionaries, but it's for spelling dictionaries rather than translation. But maybe it could be adapted.
(<TinoDidriksen>): It's not that complex - just collect and align. It's made much easier since you basically have sentence alignment given by the MT output. Users don't need to do anything. It should be automatic, based on what edits they make.
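A minimal sketch of the "collect and align" idea, not an existing CX or Apertium API. It assumes Apertium's default behaviour of prefixing untranslated words with `*`, and it uses a plain token diff from Python's standard library in place of real word alignment. The names `Report` and `build_report`, and the sample sentences, are hypothetical.

```python
import difflib
import re
from dataclasses import dataclass

# Assumption: the MT output marks unknown words with a leading "*",
# which is Apertium's default behaviour.
UNKNOWN_WORD = re.compile(r"\*(\w+)")


@dataclass
class Report:
    """One hypothetical "inbox" entry: input / MT / correction."""
    source: str
    mt: str
    corrected: str
    unknown_words: list
    edits: list


def build_report(source, mt, corrected):
    """Collect unknown words and the token-level edits the translator made."""
    unknown = UNKNOWN_WORD.findall(mt)

    # Drop the "*" markers so they do not show up as spurious edits.
    mt_tokens = mt.replace("*", "").split()
    fixed_tokens = corrected.split()

    # A simple token diff stands in for alignment here; the sentence pair
    # is already aligned because the correction starts from the MT output.
    matcher = difflib.SequenceMatcher(a=mt_tokens, b=fixed_tokens, autojunk=False)
    edits = [
        (" ".join(mt_tokens[i1:i2]), " ".join(fixed_tokens[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return Report(source, mt, corrected, unknown, edits)


if __name__ == "__main__":
    report = build_report(
        "El gat menja peix",    # made-up source sentence
        "The cat eats *peix",   # made-up MT output with one unknown word
        "The cat eats fish",    # made-up post-edited translation
    )
    print(report.unknown_words)
    print(report.edits)
```

Nothing is asked of the translator here: the report is derived entirely from the MT output and the saved translation, which matches the "it should be automatic" point above.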

(proposed by jacobEo from #apertium. http://www.dtu.dk/Service/Telefonbog/Person?id=78778&tab=6 - feel free to edit!)

Event Timeline

Jn0101 raised the priority of this task to Needs Triage.
Jn0101 updated the task description.
Jn0101 subscribed.
Jn0101 set Security to None.
Jn0101 updated the task description.
Amire80 renamed this task from "Work closer with Apertium to get their stuff fit our needs" to "Possible collaborations between Wikimedia and Apertium". Mar 6 2015, 12:39 PM
Amire80 added a project: ContentTranslation.
Amire80 updated the task description.
Amire80 added subscribers: Niharika, Qgil.
Amire80 added a subscriber: KartikMistry.

I can probably co-mentor some of these things, although I receive a lot of mentorship requests and I'll have to pick carefully :)

I can guide people as needed, but I don't really have the capacity to become a co-mentor yet :)

Change 194856 had a related patch set uploaded (by Nemo bis):
Hide "prefershttps" preference on HSTS domains (ru): it has no effect

https://gerrit.wikimedia.org/r/194856

Amire80 triaged this task as Medium priority. Mar 12 2015, 9:01 AM

This is a message posted to all tasks under "Re-check in September 2015" at Possible-Tech-Projects. Outreachy-Round-11 is around the corner. If you want to propose this task as a featured project idea, we need a clear plan with community support, and two mentors willing to support it.

This is a message sent to all Possible-Tech-Projects. The new round of Wikimedia Individual Engagement Grants is open until 29 Sep. For the first time, technical projects are within scope, thanks to the feedback received at Wikimania 2015, before, and after (T105414). If someone is interested in obtaining funds to push this task, this might be a good way.

@Qgil - The Possible-Tech-Projects board may need a new column for "Featured in Grants" for such projects.

Qgil claimed this task.

This task was created for a collaboration during a GSoC round that has now passed. Now it's time for Outreachy-Round-11, and if you are interested in converting any of these bullet points into project ideas, we are all for it.

So the resolution of this task is: yes, Apertium is encouraged to propose project ideas and bring mentors, as long as the projects are beneficial for Wikimedia / MediaWiki.

There is no point in keeping this task open, because in itself it has nothing else actionable. New project ideas need to come with new tasks. See you around? :)