
Get rid of Cite formatting i18n messages that are not actually localized
Open, Medium, Public

Description

A number of i18n messages in the Cite extension don't appear to be localized at all; see below. It seems better to eliminate these messages. In particular, I don't see a reason to allow HTML structures to be localized. In Parsoid land, we rely on CSS to format and localize citations. Eliminating unused customizations could, in some cases, simplify the CSS rules that need to be written and maintained.

[subbu@earth:~/work/wmf/core/extensions/Cite] git grep cite_references_link_one i18n
i18n/en.json:   "cite_references_link_one": "<li id=\"$1\"$4><span class=\"mw-cite-backlink\">[[#$2|↑]]</span> $3</li>",
i18n/qqq.json:  "cite_references_link_one": "{{notranslate}}\n\nParameters:\n* $1 - references key\n* $2 - ref key\n* $3 - reference text\n* $4 - optional CSS class for direction",
[subbu@earth:~/work/wmf/core/extensions/Cite] git grep cite_reference_link i18n
i18n/en.json:   "cite_reference_link": "<sup id=\"$1\" class=\"reference\">[[#$2|&#91;$3&#93;]]</sup>",
i18n/qqq.json:  "cite_reference_link": "{{notranslate}}\n\nParameters:\n* $1 - ref key\n* $2 - references key\n* $3 - link label",
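
To make concrete why these messages are HTML templates rather than translatable text, here is a minimal sketch of positional parameter substitution applied to the cite_reference_link message quoted above. The expand_message helper is hypothetical and is not MediaWiki's message API; the parameter values are illustrative, using the default cite_ref- and cite_note- prefixes.

```python
import re

# Message text copied from i18n/en.json as quoted above.
CITE_REFERENCE_LINK = '<sup id="$1" class="reference">[[#$2|&#91;$3&#93;]]</sup>'


def expand_message(template: str, *params: str) -> str:
    """Replace $1, $2, ... with positional parameters.

    Hypothetical helper for illustration only; MediaWiki's real message
    handling does considerably more than this.
    """
    def substitute(match: re.Match) -> str:
        index = int(match.group(1)) - 1
        # Leave placeholders without a matching parameter untouched.
        return params[index] if 0 <= index < len(params) else match.group(0)

    return re.sub(r"\$(\d+)", substitute, template)


# Illustrative values: $1 = ref key, $2 = references key, $3 = link label.
rendered = expand_message(CITE_REFERENCE_LINK, "cite_ref-1", "cite_note-1", "1")
print(rendered)
```

Nothing in the resulting string is natural-language text: the only things a "translation" could change are the markup, the class name, and the bracket characters.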

Event Timeline

ssastry created this task.

@Jdforrester-WMF @Aaharoni-WMF @awight @thiemowmde @Izno Can any of you think of a reason why we shouldn't do this? If we agree this is a good change, we might have to send an email to wikitech-l / mediawiki-l for any third parties that might be using these messages. This is coming up as part of the Parsoid read views project: it is one of the last few incompatibilities that I am trying to work through and address. /cc @cscott

I've heard that there are some wikis/users who prefer to have class names and the like in their local language because people don't like it otherwise. (Total hearsay on my part.) I don't think it's worth paying that set of people any mind, otherwise there would be many more complaints about the software....

I don't see any strong reason for any of these messages to be translated.

Yeah, this is a long-standing issue.

The Cite extension massively abuses the message localisation system to inject raw HTML fragments into the DOM. This is (a) bad but (b) vital to several of the ways the extension has been used for ~15 years now to support some languages. ~8 years ago, in work led by Marc, the Editing and Parsoid teams (as they were then called) were working on replacing the need for this with CSS selectors. I believe that https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Cite/+/refs/heads/master/modules/ext.cite.style.fa.css was the exemplar done for this.

This was announced but I think it got lost over the years.

I believe the next steps would be T156351 and then T156350, after which point this task could be "drop old message-based display configuration of Cite"?

Right, I am picking this up now, and those two tasks are on my radar. In this task, though, I am focusing on messages that aren't even localized by anyone here. But based on what you wrote here, I clearly missed the fact that these messages could have been localized with on-wiki versions. I should figure out whether that is the case before we remove anything.

So, you are right, even for these seemingly unused messages, it might be better to get those other tasks done before getting to this.

Looks like cite_references_link_one has localized versions on several wikis, mostly, it appears, to pick the caret/arrow and to bold or italicize it. As we've already done for visual diff testing, that can easily be done with CSS. That apart, it looks likely that we could remove some of those messages now, once we confirm there are no localized versions of them on wikis.

And, here are the full results of which messages have localized versions on wikis (thanks to Reedy for running the db query for me across all wikis on the cluster):

[subbu@earth:~/work/wmf/core/extensions/Cite] grep Cite i18n.messages.usage.log | sed 's/.*Cite/Cite/g;s/\/.*//g' | sort | uniq -c| sort -nr
     93 Cite_references_link_many_format
     57 Cite_references_link_many_format_backlink_labels
     54 Cite_references_link_one
     53 Cite_references_link_many
     15 Cite_reference_link
      5 Cite_references_link_many_sep
      4 Cite_references_link_prefix
      3 Cite_references_link_many_and
      2 Cite_references_link_suffix
      2 Cite_reference_link_suffix
      2 Cite_reference_link_prefix
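
For anyone who wants to redo this tally without the shell pipeline, here is a rough Python equivalent of the grep / sed / sort / uniq -c chain above. The format of i18n.messages.usage.log is not shown in this task, so the sample lines below are invented purely for illustration; the assumption is only that each line contains the on-wiki message page title, possibly followed by a "/" suffix.

```python
import re
from collections import Counter


def count_overrides(lines):
    """Tally Cite message names, mirroring the pipeline above."""
    counts = Counter()
    for line in lines:
        if "Cite" not in line:  # grep Cite
            continue
        # sed 's/.*Cite/Cite/g': drop everything before the last "Cite".
        name = re.sub(r".*Cite", "Cite", line.rstrip("\n"))
        # sed 's/\/.*//g': drop the first "/" and everything after it.
        name = re.sub(r"/.*", "", name)
        counts[name] += 1
    # sort | uniq -c | sort -nr: highest count first.
    return counts.most_common()


# Invented sample input; the real log format is an assumption.
sample = [
    "zhwiktionary MediaWiki:Cite_reference_link_prefix",
    "dewiki MediaWiki:Cite_references_link_one",
    "frwiki MediaWiki:Cite_references_link_one/fr",
    "enwiki MediaWiki:Sidebar",
]
result = count_overrides(sample)
for name, count in result:
    print(f"{count:7d} {name}")
```

Against the real log this would be called as count_overrides(open("i18n.messages.usage.log")). The order of entries with equal counts may differ from what sort -nr prints.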

It is a little irritating that a handful of wikis there are customizing the link_suffix and link_prefix messages. Anyway, to be continued.

Looks like it is zhwiktionary which does most of the prefix / suffix customizations. kawikibooks and kawikiquote customize the Cite_references_link_prefix message. Not sure why they do this.

And, looks like kawikibooks and kawikiquote don't actually override Cite_references_link_prefix.

So that leaves us with zhwiktionary, which overrides the prefixes (not the suffixes). Instead of cite_ref- and cite_note-, they use _ref- and _note-.

We should figure out whether we can just get rid of that localized message on zhwiktionary. If so, we can get rid of these 5 cite messages right away: cite_reference_link_key_with_num, cite_reference_link_prefix, cite_reference_link_suffix, cite_references_link_prefix, cite_references_link_suffix.

Change 892491 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Cite@master] [WIP] Remove unused "HTML message" cite_references_no_link

https://gerrit.wikimedia.org/r/892491

Change 940163 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Cite@master] No expensive transformations on prefix/suffix messages

https://gerrit.wikimedia.org/r/940163

Change 940163 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] No expensive transformations on prefix/suffix messages

https://gerrit.wikimedia.org/r/940163

Change 892491 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Remove unused "HTML message" cite_references_no_link

https://gerrit.wikimedia.org/r/892491

Change 977756 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Cite@master] Drop unused …_suffix and …_key_with_num messages

https://gerrit.wikimedia.org/r/977756

Change 977756 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Drop unused …_suffix and …_key_with_num messages

https://gerrit.wikimedia.org/r/977756

Change 987766 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Cite@master] Drop unused cite_reference(s)_link_prefix messages

https://gerrit.wikimedia.org/r/987766

Change 987766 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Drop unused cite_reference(s)_link_prefix messages

https://gerrit.wikimedia.org/r/987766

Change 998778 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Cite@master] [POC] Call Parser::recursiveTagParse as early as possible

https://gerrit.wikimedia.org/r/998778