Make RelatedArticles extension compatible with Parsoid
Open, Medium, Public

Description

Parsoid has its own extension API - see https://www.mediawiki.org/wiki/Parsoid/Extension_API.

The RelatedArticles extension needs an update to work directly with Parsoid so that we can switch Wikimedia wikis to use Parsoid instead of the core parser in late 2021.

The Parsing Team will work with you as required.

Looking at the code, it appears to use the setFunctionHook API, which ParsoidExtensionAPI does not support yet. Judging by the extension code, though, the actual retargeting work should be straightforward once that support is added.

Event Timeline

Jdlrobson added subscribers: JTannerWMF, ovasileva, Jdlrobson.

Please contact Jazmin or Olga to set expectations about timeline and ensure this work gets scheduled.

Yes, the web team maintains this extension. Please reach out to @ovasileva to make sure time is allocated to do this work.

Jdlrobson changed the task status from Open to Stalled. Nov 29 2023, 12:22 AM

Also, it's currently not clear what exactly you need the web team to do. From the ticket, my understanding is that ParsoidExtensionAPI needs to add support for the setFunctionHook API before we can do anything. Is there a ticket tracking that work?

The RelatedArticles extension registers a parser function for {{#related:page1|page2}}. Parsoid should already support that, and the plan (T268144#7950406) is to maintain legacy compatibility for these simple text-in/text-out parser functions (the legacy API might shift to get rid of the ParserFirstCallInit hook, but that's an orthogonal issue, T299528).

The main incompatibility I can see is in:

	public static function onFuncRelated( Parser $parser, ...$args ) {
		$parserOutput = $parser->getOutput();
		$relatedPages = $parserOutput->getExtensionData( 'RelatedArticles' );
		if ( !$relatedPages ) {
			$relatedPages = [];
		}

		// Add all the related pages passed by the parser function
		// {{#related:Test with read more|Foo|Bar}}
		foreach ( $args as $relatedPage ) {
			$relatedPages[] = $relatedPage;
		}
		$parserOutput->setExtensionData( 'RelatedArticles', $relatedPages );

		return '';
	}

Using getExtensionData and setExtensionData in this read-modify-write way is deprecated (T300981). Because Parsoid processes parser function invocations independently, it currently throws away the data from all but the last invocation of {{#related}}. If there's typically only one invocation per page, that might not be immediately obvious.

In any case, the fix is simple: the new-ish ParserOutput::appendExtensionData was added specifically to allow composition of lists of this sort. Switching to use that function should eliminate the Parsoid incompatibility AFAICT.
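
A minimal sketch of what that switch could look like, runnable without MediaWiki. StubParserOutput and getRelatedPages are hypothetical names introduced only for this illustration, and the handler takes the output object directly instead of a Parser. The stub assumes that appendExtensionData stores each appended value as an array key (a set, in insertion order), which is my understanding of the core method and should be checked against the ParserOutput documentation before relying on it.

```php
<?php
// Hypothetical stand-in for MediaWiki's ParserOutput, reduced to the
// extension-data methods discussed above so the sketch runs on its own.
class StubParserOutput {
	/** @var array<string,mixed> */
	private array $extensionData = [];

	public function getExtensionData( string $key ) {
		return $this->extensionData[$key] ?? null;
	}

	public function setExtensionData( string $key, $value ): void {
		$this->extensionData[$key] = $value;
	}

	// Assumption: like the core method, each value becomes an array key,
	// so repeated values collapse and insertion order is preserved.
	public function appendExtensionData( string $key, $value ): void {
		$this->extensionData[$key][$value] = true;
	}
}

// Sketch of the handler rewritten to append, with no read-modify-write.
// The real handler receives a Parser and calls $parser->getOutput().
function onFuncRelated( StubParserOutput $parserOutput, string ...$args ): string {
	// {{#related:Test with read more|Foo|Bar}}
	foreach ( $args as $relatedPage ) {
		$parserOutput->appendExtensionData( 'RelatedArticles', $relatedPage );
	}
	return '';
}

// Code that reads the data back recovers the list from the array keys.
function getRelatedPages( StubParserOutput $parserOutput ): array {
	return array_keys( $parserOutput->getExtensionData( 'RelatedArticles' ) ?? [] );
}

// Two separate invocations on the same page both contribute.
$output = new StubParserOutput();
onFuncRelated( $output, 'Foo', 'Bar' );
onFuncRelated( $output, 'Baz', 'Foo' );
print_r( getRelatedPages( $output ) );
```

Under that assumption, any code that reads the 'RelatedArticles' extension data would also need to change, since the stored value is keyed by page title and is no longer a plain list. Note too that PHP converts purely numeric string keys to integers, so a title such as "123" would come back as an integer from array_keys.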

Do you just need code review/testing for this? If the fix is simple, is it something the Content Transform Team plans to post patches for?

That's a good question! Ideally we would work in a way that shares knowledge about the Parsoid APIs with everyone involved, since Parsoid is expected to be the long-term MW parser.

My idea is to prioritise CTT patches for unmaintained extensions, and to support owners of maintained extensions in doing the work themselves with our review and help. How does that sound to you?

Jdlrobson changed the task status from Stalled to Open. Dec 5 2023, 5:33 PM
Jdlrobson assigned this task to ovasileva.

Yep, thanks! Passing to Olga to prioritize then.

ovasileva triaged this task as Medium priority. Dec 5 2023, 5:34 PM