
Some categories are not represented in Parsoid output
Closed, Resolved (Public)

Description

Some parser functions (and other things) add tracking categories directly, rather than emitting the wikitext [[Category:whatever]] and letting the parser add them. Parsoid currently doesn't include these categories when it parses pages. To fix this, when Parsoid calls the API's expandtemplates, it needs to request and use the "categories" output. A simple way to test this is to add {{#invoke:this is invalid}} to a page. If this bug is fixed, the page will end up in Category:Pages_with_script_errors.
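The request described above could look roughly like the following. This is a minimal sketch, not Parsoid's actual code: the helper names `build_expandtemplates_params` and `extract_categories` and the page title "Sandbox" are made up for illustration, and the sample response is hand-written rather than captured from a live wiki. It assumes each category entry carries its name under "*" (legacy JSON format) or under "category" (formatversion=2).

```python
from urllib.parse import urlencode


def build_expandtemplates_params(wikitext, title):
    """Build query parameters asking expandtemplates for wikitext and categories."""
    return {
        "action": "expandtemplates",
        "format": "json",
        "title": title,
        "text": wikitext,
        # Request the categories alongside the expanded wikitext.
        "prop": "wikitext|categories",
    }


def extract_categories(response):
    """Return the category names found in a decoded expandtemplates response.

    Assumption: the name is stored under "*" (legacy JSON format) or under
    "category" (formatversion=2).
    """
    entries = response.get("expandtemplates", {}).get("categories", [])
    names = []
    for entry in entries:
        name = entry.get("category", entry.get("*"))
        if name is not None:
            names.append(name)
    return names


# Hypothetical page title; the wikitext is the test case from the description.
params = build_expandtemplates_params("{{#invoke:this is invalid}}", "Sandbox")
query_string = urlencode(params)

# Hand-written sample response, for illustration only (not captured output).
sample_response = {
    "expandtemplates": {
        "wikitext": "(expanded wikitext)",
        "categories": [{"sortkey": "", "*": "Pages_with_script_errors"}],
    }
}
print(extract_categories(sample_response))
```

The sketch only builds the query string and reads a decoded response; sending the request and handling errors are left out.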


Version: unspecified
Severity: normal

Details

Reference
bz70196

Event Timeline

bzimport raised the priority of this task to Normal. Nov 22 2014, 3:38 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz70196.

How should these "hidden" categories be represented in Parsoid's HTML?

As a proof of concept, I've pushed https://gerrit.wikimedia.org/r/#/c/164054/ which adds each one of them as a <link rel="mw:PageProp/Category"> in the page's <head>.
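To make the proof of concept concrete, here is a rough sketch of turning a list of category names into such <link> elements. It is not the code from the Gerrit change: the function names are hypothetical, and the href shape ("./Category:Name" with an optional "#sortkey" fragment) is an assumption, since the comment above only specifies the rel value.

```python
from html import escape
from urllib.parse import quote


def category_link(name, sortkey=""):
    """Render one category as a <link> element for the page's <head>."""
    # Assumption: the href points at "./Category:<name>", with the sort key
    # (if any) carried in the fragment. Only the rel value comes from the thread.
    target = "./Category:" + quote(name.replace(" ", "_"), safe="")
    if sortkey:
        target += "#" + quote(sortkey, safe="")
    return '<link rel="mw:PageProp/Category" href="%s"/>' % escape(target, quote=True)


def head_category_links(categories):
    """Render (name, sortkey) pairs as one <link> element per line."""
    return "\n".join(category_link(name, sortkey) for name, sortkey in categories)


print(head_category_links([("Pages with script errors", "")]))
```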

If the expandtemplates call gives you the categories for the expansion (template / extension), you can add them as part of the template/expansion output itself.

For extensions, this should be straightforward since you get expanded HTML back and you can add these categories as well.

For template output, we should chat about it, but one option is to generate category wikitext for these categories and append it to the expanded wikitext, so that when the result gets parsed, the categories get wrapped as part of the template.
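The append-to-wikitext idea for templates might be sketched like this. The function name `append_category_wikitext` is hypothetical, and joining with a newline is just one choice; the thread does not settle how the generated links should be separated from the expansion.

```python
def append_category_wikitext(expanded_wikitext, categories):
    """Append [[Category:...]] links for (name, sortkey) pairs to expanded wikitext."""
    links = []
    for name, sortkey in categories:
        if sortkey:
            links.append("[[Category:%s|%s]]" % (name, sortkey))
        else:
            links.append("[[Category:%s]]" % name)
    if not links:
        # Nothing to add: return the expansion unchanged.
        return expanded_wikitext
    # Assumption: a newline separator; the thread does not specify one.
    return expanded_wikitext + "\n" + "\n".join(links)


print(append_category_wikitext("Some expanded text", [("Pages with script errors", "")]))
```

Because the links become part of the expanded wikitext, whatever wraps the template output when it is parsed would also cover the categories, which is the point of the suggestion.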

Change 164054 had a related patch set uploaded by Marcoil:
Bug 70196: Add tracking categories from extensions to <head>

https://gerrit.wikimedia.org/r/164054

Change 164054 merged by jenkins-bot:
Bug 70196: Add categories added directly from extensions

https://gerrit.wikimedia.org/r/164054

This doesn't quite work right. Details are on the Gerrit change.

Change 164333 had a related patch set uploaded by Marcoil:
Bug 70196: Add all categories from action=parse

https://gerrit.wikimedia.org/r/164333

Change 164333 merged by jenkins-bot:
Bug 70196: Add all categories from action=parse

https://gerrit.wikimedia.org/r/164333