
Move as much grammar transformation code as possible from PHP and JS to separate generic data files
Open, Low, Public

Description

Core MediaWiki has code for simple grammar transformations, used with {{GRAMMAR}} in messages. This code has to be written (and tested) separately in PHP and JS (and also in jquery.i18n).

The logic is supposed to be the same for the backend and the frontend, so it makes sense to make as much of the code and the data as possible shared between the (programming) languages. For most (human) languages the idea is the same: if a word matches a pattern, transform it according to a rule.

@Nikerabbit and I (@Amire80) went over most of the current code that does this, and as far as we can see, it can be replaced with pairs of regular expression patterns and replacements.
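
To make the "pattern and replacement pairs" idea concrete, here is a minimal sketch in Python. The rules, the `RULES` table and the `apply_grammar` function are hypothetical and written for an imaginary language; they are not taken from MediaWiki. The sketch also assumes that the first matching pattern wins, which the task does not specify.

```python
import re

# Hypothetical rules for an imaginary language (NOT real MediaWiki data).
# Each grammatical case maps to an ordered list of (pattern, replacement) pairs.
RULES = {
    "genitive": [
        (r"(.+)a$", r"\1y"),  # words ending in "a": replace it with "y"
        (r"(.+)o$", r"\1a"),  # words ending in "o": replace it with "a"
    ],
}


def apply_grammar(word, case, rules=RULES):
    """Return the word transformed by the first rule whose pattern matches.

    Assumption: rules are tried in order and only the first match is applied.
    A word that matches no rule, or a case with no rules, is returned unchanged.
    """
    for pattern, replacement in rules.get(case, []):
        if re.search(pattern, word):
            return re.sub(pattern, replacement, word, count=1)
    return word


print(apply_grammar("wika", "genitive"))
print(apply_grammar("test", "genitive"))
```

Because the rule table is plain data, the same pairs could in principle be consumed by a PHP or JS function of the same shape, provided the regular expression syntax used stays within what all the engines support.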

The plan is more or less this:

  • Make sure that all the relevant unit tests are written.
  • Make the tests common to PHP and JS (T115218).
  • Find the patterns for each language, convert them from PHP and JS to regular expressions in JSON files, and delete the PHP and JS code.
  • Optional 1: Move these JSON files, the tests, and the PHP and JS logic that processes them from core to a separate library.
  • Optional 2: Allow sites to provide custom grammar rules (and possibly move custom $wgGrammarForms from PHP arrays to a more data-based format, but this requires some thought).
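
The plan above (rules in JSON, tests shared between implementations) could look roughly like the following Python sketch. The JSON layout, the key names `rules` and `tests`, and all function names are assumptions made for illustration; the actual file format is defined by the patches linked from this task. The sketch assumes replacements use the `$1` backreference style, which both PHP and JS accept, and converts it for Python.

```python
import json
import re

# Hypothetical data file contents: rules and test cases for an imaginary
# language, kept together as plain data (NOT the real MediaWiki format).
GRAMMAR_JSON = """
{
  "rules": {
    "genitive": [["(.+)a$", "$1y"], ["(.+)o$", "$1a"]]
  },
  "tests": [
    {"word": "wika", "case": "genitive", "expected": "wiky"},
    {"word": "foto", "case": "genitive", "expected": "fota"},
    {"word": "test", "case": "genitive", "expected": "test"}
  ]
}
"""


def to_python_replacement(replacement):
    """Convert "$1"-style backreferences to the form Python's re module expects."""
    return re.sub(r"\$(\d+)", lambda m: "\\g<" + m.group(1) + ">", replacement)


def transform(word, case, rules):
    """Apply the first matching (pattern, replacement) pair for the given case."""
    for pattern, replacement in rules.get(case, []):
        if re.search(pattern, word):
            return re.sub(pattern, to_python_replacement(replacement), word, count=1)
    return word


def run_shared_tests(data):
    """Run the test cases stored in the data and return a list of failures."""
    failures = []
    for case in data["tests"]:
        actual = transform(case["word"], case["case"], data["rules"])
        if actual != case["expected"]:
            failures.append((case["word"], case["case"], case["expected"], actual))
    return failures


data = json.loads(GRAMMAR_JSON)
print(run_shared_tests(data))
```

Keeping the test cases as data next to the rules means each implementation only needs a small runner like `run_shared_tests`, rather than a separately written test suite per programming language.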

Event Timeline

Amire80 created this task. Oct 11 2015, 6:01 PM
Amire80 raised the priority of this task from to Low.
Amire80 updated the task description.
Amire80 added subscribers: Amire80, Nikerabbit.
Restricted Application added a subscriber: Aklapper. Oct 11 2015, 6:01 PM
Amire80 claimed this task. Oct 11 2015, 6:01 PM
Amire80 set Security to None.

Change 241645 had a related patch set uploaded (by Amire80):
Make the code for grammar data processing common

https://gerrit.wikimedia.org/r/241645

Change 241499 had a related patch set uploaded (by Amire80):
Make grammar data loadable as an RL module and usable in JS

https://gerrit.wikimedia.org/r/241499

Change 245184 had a related patch set uploaded (by Amire80):
Move the Ukrainian grammar rules from PHP and JS to JSON

https://gerrit.wikimedia.org/r/245184

Change 241499 merged by jenkins-bot:
Make grammar data loadable as an RL module and usable in JS

https://gerrit.wikimedia.org/r/241499

Change 241645 merged by jenkins-bot:
Make the code for grammar data processing common

https://gerrit.wikimedia.org/r/241645

Change 245184 merged by jenkins-bot:
Move the Ukrainian grammar rules from PHP and JS to JSON

https://gerrit.wikimedia.org/r/245184

Amire80 moved this task from Untriaged to Grammar on the I18n board. Mar 1 2018, 4:17 PM