Page MenuHomePhabricator

PagePreparation doesn't detect languages tag already in page
Closed, ResolvedPublic

Description

When marking a page for translation using Special:PagePreparation:

If it has the following as the format:

<noinclude><languages /></noinclude>
//my page content//

Then it will become:

<languages/>
<translate>
<noinclude><languages /></noinclude>
//my page content//
</translate>

It should become:

<noinclude><languages /></noinclude>
<translate>
//my page content//
</translate>

Impact:

  1. 2 Language tags will be shown as in https://meta.miraheze.org/w/index.php?title=Training_modules/Dealing_with_online_harassment/slides/follow-up-tracking-and-appeals&diff=prev&oldid=97982
  2. The first translate section will be <noinclude><languages /></noinclude> which is untranslateable
MediaWiki Version1.34.0 (f0cbfa7) 02:15, 14 February 2020
Extension:Translate Version2020-01-23 (fa4d922) 16:09, 24 January 2020

Event Timeline

Reedy renamed this task from TranslatePreparation doesn't detect languages tag already in page to PagePreparation doesn't detect languages tag already in page.Jul 25 2023, 1:33 PM
Reedy updated the task description. (Show Details)

The problem is that the regex doesn't look for any whitespace in the initial check...

	/**
	 * Add the <languages/> bar at the top of the page, if not present.
	 * Remove the old {{languages}} template, if present.
	 *
	 * @param {string} pageContent
	 * @return {string}
	 */
	function addLanguageBar( pageContent ) {
		if ( !/<languages\/>/gi.test( pageContent ) ) {
			pageContent = '<languages/>\n' + pageContent;
		}
		pageContent = pageContent.replace( /\{\{languages.*?\}\}/gi, '' );
		return pageContent;
	}

Change 941415 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Translate@master] ext.translate.special.pagepreparation.js: Check for whitespace in <language/> element

https://gerrit.wikimedia.org/r/941415

I don't know whether we should be "standardising" <languages/> becoming <languages />; maybe we should...

Not quite sure what happens with the noincludes currently... Lets see.

Change 941415 merged by jenkins-bot:

[mediawiki/extensions/Translate@master] ext.translate.special.pagepreparation.js: Check for whitespace in <languages/> element

https://gerrit.wikimedia.org/r/941415

With the patch it results in

Screenshot 2023-07-25 at 19.04.30.png (214×2 px, 22 KB)

Change 941486 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Translate@REL1_39] ext.translate.special.pagepreparation.js: Check for whitespace in <languages/> element

https://gerrit.wikimedia.org/r/941486

Change 941487 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Translate@REL1_40] ext.translate.special.pagepreparation.js: Check for whitespace in <languages/> element

https://gerrit.wikimedia.org/r/941487

Change 941488 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Translate@REL1_35] ext.translate.special.pagepreparation.js: Check for whitespace in <languages/> element

https://gerrit.wikimedia.org/r/941488

Change 941486 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_39] ext.translate.special.pagepreparation.js: Check for whitespace in <languages/> element

https://gerrit.wikimedia.org/r/941486

Change 941487 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_40] ext.translate.special.pagepreparation.js: Check for whitespace in <languages/> element

https://gerrit.wikimedia.org/r/941487

With the patch it results in

Screenshot 2023-07-25 at 19.04.30.png (214×2 px, 22 KB)

Not quite...

Screenshot 2023-07-25 at 19.18.16.png (200×2 px, 24 KB)

Why is it adding </translate> without <translate>...

Change 941488 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_35] ext.translate.special.pagepreparation.js: Check for whitespace in <languages/> element

https://gerrit.wikimedia.org/r/941488

Screenshot 2023-07-25 at 19.18.16.png (200×2 px, 24 KB)

Why is it adding </translate> without <translate>...

	/**
	 * Add the <translate> and </translate> tags at the start and end of the page.
	 * The opening tag is added immediately after the <languages/> tag.
	 *
	 * @param {string} pageContent
	 * @return {string}
	 */
	function addTranslateTags( pageContent ) {
		pageContent = pageContent.replace( /(<languages ?\/>\n)/gi, '$1<translate>\n' );
		pageContent = pageContent + '\n</translate>';
		return pageContent;
	}

Because it's looking for the newline...

Not 100% clear what we should be doing here though, though a different bug to what was actually originally reported... Do we necessarily want to fix this?

Should we be looking for a newline? Swap the regex for /(<languages ?\/>\n?)/gi?

But then, in this case, for example, we end up with a translate tag inside a <noinclude>, something like:

<noinclude><languages /><translate></noinclude>
Text here
</translate>

Which is not right either.

If we can't add the <translate>, should we be adding the </translate> at the end? I suspect it won't save in either version anyway, so it's not helping the user at all.

Or should we skip it, and let the rest just do what it needs to do? Users could then manually add leading/trailing tags as appropriate.

Is this a common enough use case?

Certainly, this is commonly done in Templates...

<noinclude>
<languages/>
</noinclude>

Change 941492 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Translate@master] ext.translate.special.pagepreparation.js: Only add trailing </translate> if we added a leading <translate>

https://gerrit.wikimedia.org/r/941492

Change 941492 merged by jenkins-bot:

[mediawiki/extensions/Translate@master] ext.translate.special.pagepreparation.js: Only add trailing </translate> if we added a leading <translate>

https://gerrit.wikimedia.org/r/941492

Change 941867 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Translate@REL1_40] ext.translate.special.pagepreparation.js: Only add trailing </translate> if we added a leading <translate>

https://gerrit.wikimedia.org/r/941867

Change 941868 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Translate@REL1_39] ext.translate.special.pagepreparation.js: Only add trailing </translate> if we added a leading <translate>

https://gerrit.wikimedia.org/r/941868

Change 941870 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Translate@REL1_35] ext.translate.special.pagepreparation.js: Only add trailing </translate> if we added a leading <translate>

https://gerrit.wikimedia.org/r/941870

Change 941868 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_39] ext.translate.special.pagepreparation.js: Only add trailing </translate> if we added a leading <translate>

https://gerrit.wikimedia.org/r/941868

Change 941867 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_40] ext.translate.special.pagepreparation.js: Only add trailing </translate> if we added a leading <translate>

https://gerrit.wikimedia.org/r/941867

Reedy closed this task as Resolved.EditedJul 26 2023, 10:46 AM

It doesn't result in what was actually requested to begin with, but it handles some of these cases better, and doesn't duplicate syntax either.

As per T245811#9042640

<noinclude>
<languages/>
</noinclude>

should probably be used, rather than

<noinclude><languages/></noinclude>

Change 941870 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_35] ext.translate.special.pagepreparation.js: Only add trailing </translate> if we added a leading <translate>

https://gerrit.wikimedia.org/r/941870