
In "fixInternalLinks", "normalizeRegex" lacks support for translated "Category" terms; issue with "linkPrefixRegex" when colon in page name
Open, Needs Triage, Public



I spotted some errors in the fixInternalLinks function of ext.translate.special.pagepreparation.js in the Translate extension. I am not familiar with regular expressions, so I had better ask here:

In the function "fixInternalLinks":

  • normalizeRegex lacks support for localized namespace names:
normalizeRegex = new RegExp( /\[\[(?!Category)([^|]*?)\]\]/gi );

It doesn't work if the Category namespace name is translated on your wiki. In my case, it is "Catégorie".

I worked around that by editing the JavaScript to use this instead:

normalizeRegex = new RegExp( /\[\[(?!Category|Catégorie)([^|]*?)\]\]/gi );

This is not very nice, but since I only have two languages, I'm fine with it.
But how to address it the right way?

  • linkPrefixRegex fails if there is a colon in the page name. Here is the regex:
linkPrefixRegex = new RegExp( '\\[\\[((?:(?:special(?!:MyLanguage\\b)|' + nsString + '):)?[^:]*?)\\]\\]', 'gi' );

Input is:

[[Toto{{#translation:}}]] becomes
[[Toto{{#translation:}}|Toto{{#translation:}}]] instead of

This bug also occurs if there is a colon in the title of my page. It works fine as long as there are no colons in the input.

For now, I have commented out the code that runs this regex and modified the input in my normalizeRegex.

You might not consider it a bug, but it's a bit annoying for automatic page preparation.
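
The colon behaviour can be reproduced outside MediaWiki. The sketch below rebuilds linkPrefixRegex as quoted above. The nsString value is made up for the demonstration, and the helper names buildLinkPrefixRegex and matchesLinkPrefix are hypothetical, not part of the extension. Since the title part of the pattern is [^:]*?, a colon that does not belong to a recognised namespace prefix appears to stop the match:

```javascript
// Build linkPrefixRegex from the pattern quoted in the report.
// nsString is a '|'-separated list of namespace names; the value used
// below is an assumption made only for this demonstration.
function buildLinkPrefixRegex( nsString ) {
	return new RegExp(
		'\\[\\[((?:(?:special(?!:MyLanguage\\b)|' + nsString + '):)?[^:]*?)\\]\\]',
		'gi'
	);
}

// A fresh RegExp per call avoids the lastIndex state that a reused
// global regex carries between .test() calls.
function matchesLinkPrefix( wikitext, nsString ) {
	return buildLinkPrefixRegex( nsString ).test( wikitext );
}

const nsString = 'help|template'; // hypothetical namespace list

console.log( matchesLinkPrefix( '[[Toto]]', nsString ) );                  // true
console.log( matchesLinkPrefix( '[[Help:Contents]]', nsString ) );         // true
console.log( matchesLinkPrefix( '[[Toto{{#translation:}}]]', nsString ) ); // false
console.log( matchesLinkPrefix( '[[Foo: bar]]', nsString ) );              // false
```

The last two inputs contain a colon outside a listed namespace prefix, so the regex skips them entirely.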

I use the latest MLEB (2016.08) with MediaWiki 1.27.1.

Event Timeline

Tuxxic created this task. Oct 21 2016, 2:54 PM
Restricted Application added a subscriber: Aklapper. Oct 21 2016, 2:54 PM
Aklapper renamed this task from Regexp on PagePreparation Javascript ext.translate.special.pagepreparation.js to In "fixInternalLinks", "normalizeRegex" lacks support for translated "Category" terms; issue with "linkPrefixRegex" when colon in page name. Oct 22 2016, 9:56 PM

> But how to address it the right way?

The JavaScript should fetch the list of namespace names and aliases from the siteinfo API and concatenate them in the regex. IIRC there was a patch for this somewhere, which may have been abandoned.
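
A minimal sketch of that idea, under some assumptions: the namespace names arrive as a plain object mapping each name to its namespace ID, which is the shape `mw.config.get( 'wgNamespaceIds' )` provides in the browser, and a siteinfo response could be reduced to the same shape. The sample map, the helper names, and the `'[[$1|$1]]'` replacement (inferred from the example in the report, not taken from the extension source) are illustrative only.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegex( text ) {
	return text.replace( /[.*+?^${}()|[\]\\]/g, '\\$&' );
}

// Collect every name that maps to the given namespace ID.
// namespaceIds is assumed to look like { category: 14, 'catégorie': 14, ... }.
function namesForNamespace( namespaceIds, id ) {
	return Object.keys( namespaceIds ).filter( function ( name ) {
		return namespaceIds[ name ] === id;
	} );
}

// Build normalizeRegex with all known names of the Category namespace
// in the negative lookahead instead of the hard-coded "Category".
function buildNormalizeRegex( categoryNames ) {
	// Fall back to the canonical name so the lookahead is never empty.
	const names = categoryNames.length ? categoryNames : [ 'Category' ];
	const alternatives = names.map( function ( name ) {
		// Namespace names may be written with spaces or underscores.
		return escapeRegex( name ).replace( /_/g, '[ _]' );
	} ).join( '|' );
	return new RegExp( '\\[\\[(?!' + alternatives + ')([^|]*?)\\]\\]', 'gi' );
}

const NS_CATEGORY = 14; // ID of the Category namespace in MediaWiki
// Made-up sample data for a French wiki.
const namespaceIds = { category: 14, 'catégorie': 14, help: 12, aide: 12 };

const normalizeRegex = buildNormalizeRegex( namesForNamespace( namespaceIds, NS_CATEGORY ) );
const input = '[[Catégorie:Foo]] [[Category:Bar]] [[Toto]]';
const output = input.replace( normalizeRegex, '[[$1|$1]]' );

console.log( output ); // [[Catégorie:Foo]] [[Category:Bar]] [[Toto|Toto]]
```

With the hard-coded `(?!Category)` lookahead, `[[Catégorie:Foo]]` would be rewritten like an ordinary link; here both spellings are left alone and only `[[Toto]]` is normalized.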