
Add DiscordWikiBot to Translatewiki.net
Closed, ResolvedPublic

Description

DiscordWikiBot is a console application that powers WikiBot, a Discord bot used by a number of Wikimedia community servers on Discord. It currently has scattered translations in 9 languages supported by the Discord community server owners (see list). To make the translation process easier both for myself and for translators, I would like to add the project to Translatewiki.net.

Messages could live under the Wikimedia namespace if possible (with prefix dwb- or discordwikibot-). I would probably prefer less frequent commits (every two weeks?). There are a lot of really small messages, so maybe the threshold should also be higher (25-50%?). Deployments usually happen whenever there’s an update, but I will probably do them once a month anyway for new translations.

I’ll add message documentation and a translators page around the time we add the messages to Translatewiki.net.

  • Name: DiscordWikiBot
  • Logo: none yet
  • Repo: https://github.com/stjohann/DiscordWikiBot (MIT licence)
  • Description: Discord bot for Wikimedia projects and wiki sites
  • File format: Same as MediaWiki’s, except for differences in message syntax. Path to messages is ./DiscordWikiBot/i18n.

Localisation notes
Localisation files use SmartFormat library’s syntax. Plural support is available, but is done slightly differently ({0:singular|plural}). Gender support is available, but not used or supported in the bot.
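
As an illustration of the {0:singular|plural} syntax described above, here is a minimal Python sketch. It is not SmartFormat itself (a .NET library that picks plural rules per language); it only imitates the two-form, English-like case. The names PLACEHOLDER and render are hypothetical.

```python
import re

# Matches placeholders such as {0} or {0:page|pages}.
PLACEHOLDER = re.compile(r"\{(\d+)(?::([^{}]*))?\}")


def render(template, *args):
    """Toy renderer: English-like two-form plural selection only."""

    def substitute(match):
        value = args[int(match.group(1))]
        forms = match.group(2)
        if forms is None:
            # Plain {0}: insert the argument as text.
            return str(value)
        choices = forms.split("|")
        # Assumption: the first form is singular, the last covers the rest.
        return choices[0] if value == 1 else choices[-1]

    return PLACEHOLDER.sub(substitute, template)
```

With this sketch, render("{0} {0:page|pages}", 1) gives "1 page" and the same template with 3 gives "3 pages". Languages with more than two forms need a real plural rule, which this toy version does not attempt.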

Discord’s Markdown syntax displays line breaks immediately (\n), unlike MediaWiki’s syntax (\n\n). Discord (not the bot) has no RTL support (see feature request).

Possible problems
Existing Serbian localisation is filed under sr.json (Serbian), but possibly should be under sr-ec.json (Serbian Cyrillic).

There are some messages that could count as ‘lego messages’ (bullets/dashes); let me know if it is better to fix them.

Event Timeline

stjn created this task.Oct 25 2019, 7:48 PM
Restricted Application added subscribers: Petar.petkovic, Aklapper.Oct 25 2019, 7:48 PM
stjn updated the task description.Oct 25 2019, 7:55 PM
stjn updated the task description.Oct 25 2019, 8:26 PM
stjn updated the task description.Oct 25 2019, 8:42 PM
Base added a subscriber: Base.Oct 25 2019, 11:49 PM

Some observations:

  • Should add mandatory insertable variable validator for {\d+}
  • Should add insertable for {msg:.+}
  • Should add plural form count validator. Needs a custom validator due to different syntax from all others
  • "yes-no": "{0:yes|no}", is problematic because if we want to enforce the correct number of plural forms, this message would fail in languages with more or fewer forms. This ambiguous syntax is not good. Can they use separate messages?
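
To make the first two observations concrete, here is a rough Python sketch of what a mandatory-variable check and an insertable list could look like. It is not the Translate extension's implementation (that is PHP); the function names are hypothetical, and the {msg:...} pattern is narrowed so that it does not run past a closing brace, which is an assumption on top of the {msg:.+} written above.

```python
import re

VARIABLE = re.compile(r"\{\d+\}")          # e.g. {0}, {1}
MSG_REF = re.compile(r"\{msg:[^{}]+\}")    # e.g. {msg:some-key} (assumed shape)


def missing_variables(source, translation):
    """Return variables used in the source but absent from the translation."""
    return set(VARIABLE.findall(source)) - set(VARIABLE.findall(translation))


def insertables(source):
    """Return the snippets a translator could insert with one click."""
    found = set(VARIABLE.findall(source)) | set(MSG_REF.findall(source))
    return sorted(found)
```

For example, missing_variables("Hello {0}, you have {1} items", "Привет {0}") reports that {1} was dropped.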

Aside, why is the translatewiki.net deadline in the message on Thursday?

stjn added a comment.EditedOct 26 2019, 9:25 AM

"yes-no": "{0:yes|no}", is problematic because if we want to enforce the correct number of plural forms, this message would fail in languages with more or fewer forms. This ambiguous syntax is not good. Can they use separate messages?

Sure, I’ll change it. But just FYI, that syntax can also mean a true-false statement, like here, not just a plural form. I guess it is problematic to use it due to the ambiguity.

Aside, why is the translatewiki.net deadline in the message on Thursday?

The logic behind this was that on Thursdays WMF deployments happen, which is how the most recent localisation gets displayed. I believe I’ve asked on IRC about this before, but wasn’t sure about how to proceed, so I left it at that.

Tuesday is when new translations will be picked up by WMF, so reviews should happen by Monday.

Apparently all but one of the i18n files have a byte-order mark (BOM) at the beginning, which causes json_decode to silently(1) fail to parse them.
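
For anyone who wants to reproduce this outside PHP, here is a small Python sketch of the same problem and one way around it. Python's json.loads also rejects text that starts with U+FEFF, and the standard "utf-8-sig" codec drops a leading BOM while decoding. The function name load_json_bytes is hypothetical.

```python
import codecs
import json


def has_utf8_bom(raw):
    """True if the raw file contents start with the UTF-8 byte-order mark."""
    return raw.startswith(codecs.BOM_UTF8)


def load_json_bytes(raw):
    """Parse JSON from raw bytes, tolerating an optional leading BOM."""
    # "utf-8-sig" decodes UTF-8 and removes one leading BOM if present.
    return json.loads(raw.decode("utf-8-sig"))
```

Decoding with plain "utf-8" instead keeps the BOM as the first character, and json.loads then raises an error.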

stjn added a comment.EditedOct 26 2019, 6:17 PM

Noted about Monday, thank you.

What you describe is, apparently, Visual Studio’s default way of saving UTF-8 files. The one file that is unaffected was probably the one that was submitted via a pull request on GitHub (hr.json). I can fix this later today if that is causing problems.

Change 546337 had a related patch set uploaded (by Nikerabbit; owner: Nikerabbit):
[mediawiki/extensions/Translate@master] SimpleFFS: Strip Byte-Order Mark (BOM)

https://gerrit.wikimedia.org/r/546337

Change 546353 had a related patch set uploaded (by Nikerabbit; owner: Nikerabbit):
[translatewiki@master] Add DiscordWikiBot

https://gerrit.wikimedia.org/r/546353

stjn added a comment.Oct 27 2019, 10:51 AM

@Nikerabbit: Should I rename sr.json to sr-ec.json or does it not matter? (Expecting to push all the fixes today.)

If you don't rename it, we need to add a codemap. But for consistency you should rename it.

stjn added a comment.Oct 27 2019, 11:56 AM

I’ve split yes-no message everywhere, re-saved files without BOM and renamed sr.json to sr-ec.json. (Commit, contains some unrelated changes)

Change 546353 merged by jenkins-bot:
[translatewiki@master] Add DiscordWikiBot

https://gerrit.wikimedia.org/r/546353

Please add information to https://translatewiki.net/wiki/Translating:DiscordWikiBot

We'll add the plural form validator a bit later.

Nikerabbit triaged this task as Medium priority.Oct 28 2019, 8:39 AM
stjn added a comment.Oct 28 2019, 9:05 AM

Done, thank you. I will fill message documentation now, too.

abi_ added a subscriber: abi_.Oct 28 2019, 4:35 PM

Translations were exported from Translatewiki.net today to the repo. See commit - https://github.com/stjohann/DiscordWikiBot/commit/eda91b292b16479526f7848ff12a0c12e4392d01

Leaving this open since we still have to add the plural validators.

stjn awarded a token.Oct 28 2019, 5:21 PM

Change 547155 had a related patch set uploaded (by Nikerabbit; owner: Nikerabbit):
[translatewiki@master] Add plural validator for DiscordWikiBot

https://gerrit.wikimedia.org/r/547155

Change 546337 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] SimpleFFS: Strip Byte-Order Mark (BOM)

https://gerrit.wikimedia.org/r/546337

stjn added a comment.Nov 5 2019, 4:28 PM

Question: To be honest, I wanted less frequent updates than the current rate of twice a week, since I don’t really get to use them at that rate and I am not sure if there are any re-users. Is it possible to throttle it somehow, perhaps by adjusting the translation completion level needed for export?

I'd imagine the number of updates goes down as the most active languages complete the translation, unless you add new messages of course. Isn't it better to have the work waiting on git so you can use it whenever you need, rather than having the work stay at translatewiki.net where it certainly isn't going to be used?

Change 547155 merged by jenkins-bot:
[translatewiki@master] Add plural validator for DiscordWikiBot

https://gerrit.wikimedia.org/r/547155

abi_ added a comment.Nov 7 2019, 2:51 PM

We deployed the plural validator for SmartFormat on translatewiki.net yesterday, and then enabled it for this project today.

Works well. See screenshot below.

abi_ closed this task as Resolved.Nov 7 2019, 2:51 PM
stjn added a comment.Nov 7 2019, 3:17 PM

I'd imagine the number of updates goes down as the most active languages complete the translation, unless you add new messages of course. Isn't it better to have the work waiting on git so you can use it whenever you need, rather than having the work stay at translatewiki.net where it certainly isn't going to be used?

I guess I can live with it.

We deployed the plural validator for SmartFormat on translatewiki.net yesterday, and then enabled it for this project today.

Tested for Russian, which I know:

Why are 4 forms required here? I certainly can’t come up with 4 forms myself, hmm, and in MediaWiki it is 3 forms for Russian, not 4.

It comes from CLDR:

Category	Resolved String	Minimal Pair Template
one	из 1 книги за 1 день	из {NUMBER} книги за {NUMBER} день
few	из 2 книг за 2 дня	из {NUMBER} книг за {NUMBER} дня
many	из 5 книг за 5 дней	из {NUMBER} книг за {NUMBER} дней
other	из 1,5 книги за 1,5 дня	из {NUMBER} книги за {NUMBER} дня

From http://cldr.unicode.org/index/cldr-spec/plural-rules

Russian in MediaWiki has some particular shortcuts, if I remember correctly. But it also has 4 forms, though nobody uses the fourth form in practice, I think. It's not possible to derive min-required and max-possible forms from the CLDR data. We can make the validator less strict, but it means it won't catch issues with too few plural forms provided.
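
To show where the fourth form comes from, here is a short Python sketch of the CLDR cardinal rules for Russian. It is an illustration only, not code from the validator; the function name is hypothetical. In CLDR terms, i is the integer part and v is the number of visible fraction digits, and any number with fraction digits falls into "other".

```python
from decimal import Decimal


def russian_plural_category(number):
    """Return the CLDR cardinal category for Russian: one, few, many or other."""
    d = abs(Decimal(str(number)))
    exponent = d.as_tuple().exponent
    v = -exponent if exponent < 0 else 0   # visible fraction digits
    if v != 0:
        return "other"                     # e.g. 1.5
    i = int(d)
    if i % 10 == 1 and i % 100 != 11:
        return "one"                       # 1, 21, 31, ...
    if 2 <= i % 10 <= 4 and not 12 <= i % 100 <= 14:
        return "few"                       # 2-4, 22-24, ...
    return "many"                          # 0, 5-20, 25-30, ...
```

So a message that only ever receives whole numbers will hit one, few and many, and never other.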

stjn added a comment.Nov 7 2019, 3:44 PM

Ah, so the fourth is for values like 1.5, got it. Is it possible to set other as not important, or can it be really useful in some languages? Not sure how to work around this, really. I guess as long as people can save the edit according to their common sense, it would be alright.

Couple of options here:

  • Degrade it from error to warning
  • Disable this validator for Russian for this project
stjn added a comment.Nov 7 2019, 4:48 PM

Well, Russian translations shouldn’t be made anyway since I usually do them myself. Just not sure whether that will add problems in other languages or not (for example, other Slavic languages).

stjn added a comment.Apr 13 2020, 12:38 PM

Couple of options here:

  • Degrade it from error to warning
  • Disable this validator for Russian for this project

This is a very belated question to ask in this task, but can we do the first option? I couldn’t save a (Russian) message without filling out the fourth plural form, which would not be used anywhere in the bot anyway, since I don’t have non-integers anywhere, so that is probably inconvenient for some translators into other languages. I would prefer ignoring non-integer plural forms somehow and keeping it as an error, but that is probably too much work.
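
For reference, the difference between the two severities can be sketched in Python as below. This is a hypothetical illustration, not the translatewiki.net validator: the names are made up, and the category lists cover only English and Russian, taken from CLDR. The check is the same in both modes; only the label attached to each issue changes, and it would be up to the caller to block saving on "error" but not on "warning".

```python
import re

# A placeholder with at least two pipe-separated forms, e.g. {0:a|b|c}.
PLURAL = re.compile(r"\{\d+:([^{}]*\|[^{}]*)\}")

# CLDR cardinal categories (only two languages listed, for illustration).
CATEGORIES = {
    "en": ["one", "other"],
    "ru": ["one", "few", "many", "other"],
}


def check_plural_forms(translation, language, severity="error"):
    """Return (severity, message) pairs for placeholders with a wrong form count."""
    expected = len(CATEGORIES[language])
    issues = []
    for match in PLURAL.finditer(translation):
        count = len(match.group(1).split("|"))
        if count != expected:
            text = f"{match.group(0)}: expected {expected} forms, found {count}"
            issues.append((severity, text))
    return issues
```

A Russian message with three forms, such as "{0:книга|книги|книг}", would produce one issue either way; with severity="warning" the translator could still save it.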

MBH added a subscriber: MBH.May 6 2020, 10:13 AM