
InTense: Amass suitable projects from available datasets
Open, Normal, Public

Description

We need to add more projects to http://intense.wmflabs.org/.
An initial goal that would close this task could be adding 100 projects. That should let us list the blockers to reaching 1,000, then 10,000, and so on.

For simplicity, we must focus on projects which:

  1. use supported formats;
  2. follow ISO 639 codes in a way easily mappable to ours.

We will start looking where it's easiest, i.e. sites with many repositories and a dataset we can easily query to find suitable projects.

Currently excluded:

On their own:

  • openhub.net says it indexes 667k projects, but probably has no interest in offering downloads; a search there finds 34k pot files
  • ddtp.debian.net is stuck in ~2001: it calls Pootle a «new internationalisation framework» and has no structured l10n format

Old datasets (2005–2010) are often mentioned in research.

Event Timeline

Nemo_bis created this task. Jan 11 2015, 5:39 PM
Nemo_bis raised the priority of this task from to Normal.
Nemo_bis updated the task description.
Nemo_bis added a project: translatewiki.net.
Nemo_bis updated the task description.
Nemo_bis set Security to None.
Nemo_bis added a subscriber: Nemo_bis.
Nemo_bis updated the task description. Jan 11 2015, 5:46 PM
Nemo_bis updated the task description. Jan 17 2015, 6:02 PM
Nemo_bis updated the task description. Jan 17 2015, 6:10 PM
Nemo_bis updated the task description. Jan 17 2015, 6:50 PM
Nemo_bis updated the task description. Jan 21 2015, 9:03 AM
Nemo_bis updated the task description. Jan 23 2015, 4:04 PM
Nemo_bis claimed this task. Jan 28 2015, 9:23 AM

Reina et al. 2013 made me take notice of Damned Lies. Probably thanks to it, https://git.gnome.org/browse/ seems in a rather tidy shape, though not 100 % consistent. I'm currently cloning 579 repos on nemobis@ttmserver-mediawiki01:~/gnome; I'll then identify all "po" directories and pot files to start with and mass add them to InTense.
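The "identify all po directories and pot files" step could be sketched roughly as below. This is only an illustration, not what was actually run: the function name `find_po_dirs_and_pots` is hypothetical, and it assumes the clones sit side by side under one root directory.

```python
import os
from pathlib import Path


def find_po_dirs_and_pots(root):
    """Return (po_dirs, pot_files) under root as sorted relative POSIX paths."""
    root = Path(root)
    po_dirs, pot_files = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        rel = Path(dirpath).relative_to(root)
        for name in dirnames:
            if name == "po":
                po_dirs.append((rel / name).as_posix())
        for name in filenames:
            if name.endswith(".pot"):
                pot_files.append((rel / name).as_posix())
    return sorted(po_dirs), sorted(pot_files)
```

Repositories that have a "po" directory but no pot file in it would be the ones where a template still has to be generated.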

When this is done I'll publish the "sneak preview" blog post, so it will be the last call for review of it. :)

Nemo_bis updated the task description. Jan 29 2015, 12:28 AM
Nemo_bis added a comment. Edited Jan 29 2015, 11:21 AM

I have to figure out how to get or produce the pot file for GNOME projects. In the meantime I cloned all GNU git repos and added some.

In short I more or less did:

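# clone every repository listed in gnu-repos (one name per line)
while read -r repo; do git clone "git://git.savannah.gnu.org/$repo"; done < gnu-repos
# remove the repositories left out of this import
rm -rf gettext/ gcl/ bash/ www-ja/ ocitysmap/ childsplay/ www-fr/
# remove all "vendor" directories; -prune stops find from descending into what it deletes
find . -type d -name vendor -prune -exec rm -rf {} +
# list all gettext templates
find . -type f -name '*.pot' > gnu-pot.txt
# collect the base names of the .po files as candidate language codes
find . -type f -name '*.po' | sed --regexp-extended 's/.+\/([^./]+)\.po$/\1/' | sort -u > languages
# manual cleanup of the languages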
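As a possible aid for the manual cleanup of the languages, here is a minimal sketch that splits .po base names into plausible codes and leftovers. The shape assumed for a code (2 or 3 lowercase letters, optionally a region such as "_BR" and a modifier such as "@latin") is an assumption of this example, not a rule stated in this task, and `candidate_languages` is a hypothetical name.

```python
import re
from pathlib import PurePosixPath

# Assumed shape of a language code: ISO 639-like base, optional region, optional modifier.
CODE_RE = re.compile(r"^[a-z]{2,3}(_[A-Z]{2})?(@[A-Za-z0-9]+)?$")


def candidate_languages(po_paths):
    """Split .po file base names into (plausible codes, leftovers), both sorted."""
    plausible, leftovers = set(), set()
    for path in po_paths:
        p = PurePosixPath(path)
        if p.suffix != ".po":
            continue  # ignore .pot files and anything else
        target = plausible if CODE_RE.match(p.stem) else leftovers
        target.add(p.stem)
    return sorted(plausible), sorted(leftovers)
```

The leftovers still need a human look: a name like "messages" is clearly not a language, but a three-letter name that matches the pattern may not be a real code either.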

So I made the template P236 and, by replacing the pattern ^./([^/]+)/(.*po)/([^/]+).pot$, I produced P237, which should now add 36 more groups when fed to pagefromfile.py (python pwb.py scripts/pagefromfile.py -start:START -end:END -file:gnu-pot.txt -family:intense -lang:en -notitle).

To do after those:

./maposmatic/www/locale/django.pot
./scleaner/src/scleaner.pot
./gibbon/help/gibbon.pot
./freedink-data/dink/l10n/dink.pot
./bibledit-web/web/pot/bibledit.pot
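To see what the pattern above captures, here is a small sketch using Python's `re` module. The literal dots are escaped here, the function name `split_pot_path` and the path `./hello/po/hello.pot` are made up for illustration, and how the captured groups were substituted into the template is not shown in this task. Presumably the five paths listed above were left for later because the pattern requires the directory part to end in "po", which none of them do.

```python
import re

# The pattern from the comment above, with the literal dots escaped.
POT_PATH = re.compile(r"^\./([^/]+)/(.*po)/([^/]+)\.pot$")


def split_pot_path(path):
    """Return (repository, po_directory, template_name), or None if no match."""
    m = POT_PATH.match(path)
    return m.groups() if m else None


# Hypothetical path, for illustration only.
print(split_pot_path("./hello/po/hello.pot"))

# The paths listed as "to do after those" do not match the pattern.
leftovers = [
    "./maposmatic/www/locale/django.pot",
    "./scleaner/src/scleaner.pot",
    "./gibbon/help/gibbon.pot",
    "./freedink-data/dink/l10n/dink.pot",
    "./bibledit-web/web/pot/bibledit.pot",
]
for path in leftovers:
    print(path, split_pot_path(path))
```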

> I have to figure out how to get or produce the pot file for GNOME projects.

Running intltool-update --pot inside the /po subfolder is one option. It will create the file modulename.pot.
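Applied to many cloned repositories, that suggestion might look like the sketch below. It assumes intltool is installed and that every directory passed in is an intltool-style "po" directory; `update_pots` is a hypothetical helper, and only the `--pot` flag mentioned above is used.

```python
import subprocess
from pathlib import Path


def update_pots(po_dirs, dry_run=True):
    """Run `intltool-update --pot` in each directory; return the directories handled."""
    cmd = ["intltool-update", "--pot"]
    handled = []
    for po_dir in map(Path, po_dirs):
        if not dry_run:
            # check=False: one failing module should not abort the whole batch.
            subprocess.run(cmd, cwd=po_dir, check=False)
        handled.append(po_dir.as_posix())
    return handled
```

With the default `dry_run=True` nothing is executed, which makes it easy to review the list of directories before generating any files.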

Nemo_bis updated the task description. Jan 29 2015, 6:11 PM

>> I have to figure out how to get or produce the pot file for GNOME projects.

> Running intltool-update --pot inside the /po subfolder is one option. It will create the file modulename.pot.

Yep, that's what I did in the linked case, but the pot file wasn't recognised by Translate or something.

Nemo_bis updated the task description. Jan 29 2015, 6:49 PM
Nemo_bis updated the task description. Jan 29 2015, 7:28 PM
Nemo_bis updated the task description. Feb 9 2015, 4:36 PM
Nemo_bis updated the task description. Feb 19 2015, 10:40 AM
Qgil added a subscriber: Qgil.
Qgil added a comment. May 18 2015, 11:12 AM

It is time to promote Wikimedia-Hackathon-2015 activities in the program (training sessions and meetings) and main wiki page (hacking projects and other ongoing activities). Follow the instructions, please. If you have questions about this message, ask here.

Nemo_bis updated the task description. Sep 5 2015, 4:38 PM
Nemo_bis updated the task description. Sep 5 2015, 5:18 PM
Qgil added a comment. Sep 15 2015, 7:54 AM

Do you think this project could be suitable for Possible-Tech-Projects and Outreachy-Round-11?

No. Among other things, we don't have time this autumn.

Nikerabbit moved this task from Backlog to InTense on the translatewiki.net board. Apr 5 2016, 3:01 PM