Possible collaborations between Wikimedia and Apertium
Closed, Resolved · Public

Assigned To

Authored By

	Jn0101
	Mar 6 2015, 12:30 PM

Description

Hi GSoC admins,

Apertium didn't get into GSoC this year :-(
Therefore our mentors have lots of spare time :-| .... I was wondering whether a joint GSoC project would be possible, related to your machine translation plugins, where Apertium is currently used?

Some ideas:

Are there specific Apertium language pairs that work, but not well enough, and that it would make sense to improve for Wikimedia?

Some packaging and other distribution issues are not solved - kart_ (Kartik Mistry) currently spends a lot of time on them, and we'd love to make the process more streamlined.

Nemo_bis and Nikerabbit would like an API endpoint to use in stock Translate https://gerrit.wikimedia.org/r/#/c/188570/

There is a lot to do with dictionaries, especially if bridging Wiktionary or Apertium or others with something that Content translation can use https://www.mediawiki.org/wiki/Content_translation/FAQ#What_dictionaries_will_be_available.3F

There is already https://phabricator.wikimedia.org/T31229 , maybe that can be turned into something that benefits both MediaWiki and Apertium

Magnus Manske is interested in adding machine translation to his Wikidata tool described in http://magnusmanske.de/wordpress/?p=265

For language pairs... Implementing good ways to contribute from CX (that's the code name for ContentTranslation) back to Apertium is something that Wikimedia would love to have.
https://phabricator.wikimedia.org/T91492 is one simple thing - probably not something to take up a whole GSoC project, but a microtask for applicants. :)

(copying from IRC)
Finding a nice streamlined way to get the translators to contribute translations back to the current Apertium dictionaries would be a very cool thing.
https://phabricator.wikimedia.org/T91492 is about finding the missing words, but making them actually translated is the real value.
Looking into the parallel corpora and post-editing done via Content translation would be the maximum. :)
Just a simplistic thought, CX could show a box for every word that Apertium fails to translate, and the translator could fill it.
Or better, the software could just grab this word straight from the translation. Collecting the sentences that end-users make changes to would be nice, so that Apertium workers can see input/MT/correction easily.
The project could be: build the "inbox" for such reports. @KartikMistry's current OPW project is kind of an inbox for new words to add to dictionaries, but it's for spelling dictionaries rather than translation. But maybe it could be adapted.
(<TinoDidriksen>): It's not that complex - just collect and align. It's made much easier since you basically have sentence alignment given by the MT output. Users don't need to do anything. It should be automatic, based on what edits they make.
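The "collect and align" idea above could look roughly like the sketch below. It is only an illustration, not how CX or Apertium actually do it: the function names are hypothetical, the tokenizer is deliberately naive, and it assumes the MT output marks unknown words with a leading asterisk. It uses Python's standard difflib to align the MT output with the translator's edited sentence and keep the spans that were changed.

```python
import difflib
import re


def tokenize(text):
    """Naive tokenizer: words (optionally prefixed with '*') and punctuation."""
    return re.findall(r"\*?\w+|[^\w\s]", text)


def unknown_words(mt_output):
    """Words the MT engine could not translate.

    Assumption: unknown words are marked with a leading asterisk.
    """
    return [
        tok[1:]
        for tok in tokenize(mt_output)
        if tok.startswith("*") and len(tok) > 1
    ]


def extract_corrections(mt_output, post_edited):
    """Align MT output with the post-edited sentence, return changed spans.

    Each result is a (mt_span, corrected_span) pair. Pure insertions and
    deletions are skipped here to keep the sketch short.
    """
    mt = tokenize(mt_output)
    pe = tokenize(post_edited)
    matcher = difflib.SequenceMatcher(a=mt, b=pe, autojunk=False)
    corrections = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            corrections.append((" ".join(mt[i1:i2]), " ".join(pe[j1:j2])))
    return corrections


if __name__ == "__main__":
    mt = "The *hund runs in the garden."
    edited = "The dog runs in the garden."
    print(unknown_words(mt))                # ['hund']
    print(extract_corrections(mt, edited))  # [('*hund', 'dog')]
```

An "inbox" for reports could then store the source sentence together with these pairs, so that Apertium workers see input/MT/correction side by side without the translator doing anything extra.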

(proposed by jacobEo from #apertium. http://www.dtu.dk/Service/Telefonbog/Person?id=78778&tab=6 - feel free to edit!)

Related Objects

Mentioned In: T112620: Select #Possible-Tech-Projects ready for Outreachy Round 11
T96165: Learn from user corrections to avoid editing the same term again and again

Event Timeline

Jn0101 created this task. Mar 6 2015, 12:30 PM

Jn0101 raised the priority of this task from to Needs Triage.

Jn0101 updated the task description.

Jn0101 subscribed.

Restricted Application added a subscriber: Aklapper. Mar 6 2015, 12:30 PM

Jn0101 updated the task description. Mar 6 2015, 12:33 PM

Jn0101 set Security to None.

Jn0101 updated the task description.

Jn0101 updated the task description. Mar 6 2015, 12:35 PM

TinoDidriksen subscribed. Mar 6 2015, 12:37 PM

Amire80 renamed this task from "Work closer with Apertium to get their stuff fit our needs" to "Possible collaborations between Wikimedia and Apertium". Mar 6 2015, 12:39 PM

Amire80 added a project: ContentTranslation.

Amire80 updated the task description.

Amire80 added subscribers: Niharika, Qgil.

Amire80 added a subscriber: KartikMistry.

I can probably co-mentor some of these things, although I receive a lot of mentorship requests and I'll have to pick carefully :)

Nemo_bis added a project: Possible-Tech-Projects. Mar 6 2015, 12:43 PM

I can guide people as needed, but I don't really have the capacity to become a co-mentor yet :)

Change 194856 had a related patch set uploaded (by Nemo bis):
Hide "prefershttps" preference on HSTS domains (ru): it has no effect

https://gerrit.wikimedia.org/r/194856

gerritbot added a project: Patch-For-Review. Mar 6 2015, 1:38 PM

Meh, sorry, ignore me.

Nemo_bis removed a project: Patch-For-Review. Mar 6 2015, 1:40 PM

Ankitashukla subscribed. Mar 8 2015, 5:46 AM

Unhammer subscribed. Mar 9 2015, 9:12 AM

santhosh subscribed. Mar 11 2015, 9:43 AM

Amire80 triaged this task as Medium priority. Mar 12 2015, 9:01 AM

Niharika moved this task from Backlog to Re-check in September 2015 on the Possible-Tech-Projects board. Mar 24 2015, 12:21 PM

Amire80 mentioned this in T96165: Learn from user corrections to avoid editing the same term again and again. May 4 2015, 4:50 PM

Ricordisamoa awarded a token. Jun 6 2015, 11:07 PM

Ricordisamoa subscribed.

• iecetcwcpggwqpgciazwvzpfjpwomjxn awarded a token. Jun 7 2015, 9:34 AM

• iecetcwcpggwqpgciazwvzpfjpwomjxn subscribed.

• Elitre subscribed. Jun 7 2015, 10:31 AM

• Purodha subscribed. Jun 7 2015, 12:43 PM

Amire80 moved this task from Needs Triage to Bugs on the ContentTranslation board. Jul 3 2015, 1:57 PM

Arnaugir awarded a token. Aug 16 2015, 6:23 PM

This is a message posted to all tasks under "Re-check in September 2015" at Possible-Tech-Projects. Outreachy-Round-11 is around the corner. If you want to propose this task as a featured project idea, we need a clear plan with community support, and two mentors willing to support it.

This is a message sent to all Possible-Tech-Projects. The new round of Wikimedia Individual Engagement Grants is open until 29 Sep. For the first time, technical projects are within scope, thanks to the feedback received at Wikimania 2015, before, and after (T105414). If someone is interested in obtaining funds to push this task, this might be a good way.

Already proposed such a project: https://meta.wikimedia.org/wiki/Grants:IEG/Pan-Scandinavian_Machine-assisted_Content_Translation :)

@Qgil - The Possible-Tech-Projects board may need a new column for "Featured in Grants" for such projects.

TasneemLo mentioned this in T112620: Select #Possible-Tech-Projects ready for Outreachy Round 11. Oct 5 2015, 4:44 AM

This task was created for a collaboration during a GSoC round that has now passed. Now it's time for Outreachy-Round-11, and if you are interested in converting any of these bullet points into project ideas, we are all for it.

So the resolution of this task is: yes, Apertium is encouraged to propose project ideas and bring mentors, as long as the projects are beneficial for Wikimedia / MediaWiki.

There is no point in keeping this task open, because in itself it has nothing else actionable. New project ideas need to come with new tasks. See you around? :)
