Page MenuHomePhabricator

Parsoid shows bogus section edit links
Closed, ResolvedPublicBUG REPORT

Description

Steps to reproduce

  1. Open https://hu.wiktionary.org/w/index.php?title=sohase&useparsoid=0.
  2. Notice that there are no section edit links (even though both sections use ==, not <h#>).
  3. Open https://hu.wiktionary.org/w/index.php?title=sohase&useparsoid=1.
  4. Notice that the section edit links are there. Yay!
  5. Click on either of them.

Actual result

  1. You get a “cannot find section” error, because the link pointed to the template that’s directly transcluded in the article, but the section title is even deeper in the transclusion chain.

Expected result

  1. Either the section edit links work, or they aren’t there. (Since the legacy parser doesn’t show them either, not showing them wouldn’t be a regression.)

Event Timeline

Hmm, an </includeonly> on the same line as the heading suppresses the section edit link in the legacy parser.

For example, a template with the content,

<includeonly>
=== Hmm ===</includeonly>

won't produce a section edit link, but the following will,

<includeonly>
=== Hmm ===
</includeonly>

The second heading (További információk) doesn’t have any </includeonly>s, yet it still doesn’t get a section edit link. Going down the transclusion path:

TemplateMarkupLegacy read viewParsoid read view
Template:hnothing markup other than template transclusionno section edit linksbroken section edit links (numbered continuously, not even considering that the second heading comes from a different template than the first one)
Template:hu-linkssome HTML comments, but far away, and still no <includeonly> markupworking section edit link through transclusionbroken section edit link through transclusion (redirect not resolved)
Template:Further readingjust a redirect
Template:infosthe section syntax directly; complicated wikitext within it, but nothing around it; no <includeonly> markup at allworking section edit link (no transclusion)working section edit link (no transclusion)

So it’s pretty clear to me that the legacy parser simply gives up after a certain level of indirection instead of trying and failing, while Parsoid does try and fail: it simply numbers sections continuously, not considering that

  • If there are multiple templates on the page, they aren’t related to each other, so the section numbers should start from 1 for each template
  • The template name it uses may be a redirect, which should be resolved before generating a section edit link
  • The section may come via multiple levels of transclusion

It looks like I managed to find a perfect example, which demonstrates many ways in which Parsoid is broken. ;)

So it’s pretty clear to me that the legacy parser simply gives up after a certain level of indirection instead of trying and failing,

No, in Template:h there's {{címszó-nyelvkód|hu}} which expands to

__NOEDITSECTION__ 
==[[:w:magyar nyelv|Magyar]]==

The magic word hides the section edit links, but Parsoid isn't aware of that yet.

while Parsoid does try and fail: it simply numbers sections continuously, not considering that
If there are multiple templates on the page, they aren’t related to each other, so the section numbers should start from 1 for each template

Yup

The template name it uses may be a redirect, which should be resolved before generating a section edit link

Yup

The section may come via multiple levels of transclusion

Until Parsoid has native template expansion, this may be a fundamental limitation.

Change #1052832 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Set POF::NO_SECTION_EDIT_LINKS from behavior switch

https://gerrit.wikimedia.org/r/1052832

So it’s pretty clear to me that the legacy parser simply gives up after a certain level of indirection instead of trying and failing,

No, in Template:h there's {{címszó-nyelvkód|hu}} which expands to

__NOEDITSECTION__ 
==[[:w:magyar nyelv|Magyar]]==

I could swear I looked for __NOEDITSECTION__s everywhere…

The section may come via multiple levels of transclusion

Until Parsoid has native template expansion, this may be a fundamental limitation.

Couldn’t Parsoid

  • either ask the legacy parser (I assume it uses that) to add the section edit links in expanded templates or to provide the data necessary for adding the correct ones,
  • or notice if a heading comes through multiple levels of expansion, and hide the section edit link in that case (keeping the ones coming from direct transclusion)?

I think even the latter case is better than the current one: it doesn’t fix the broken links, but at least hides them (while keeping the working ones).

ABreault-WMF moved this task from Backlog to In Progress on the Content-Transform-Team-WIP board.

Change #1053075 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Reset section numbering per template

https://gerrit.wikimedia.org/r/1053075

Change #1053387 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] Add a test for section titles failing to resolve redirected templates

https://gerrit.wikimedia.org/r/1053387

I could swear I looked for __NOEDITSECTION__s everywhere…

Try Special:ExpandTemplates

  • either ask the legacy parser (I assume it uses that)

It does

to add the section edit links in expanded templates

Section edit links are only relevant to what we call "read views". The way the pipeline is constructed, Parsoid parses wikitext to a canonical output, which gets cached. Then, depending on the calling client, the cached output is transformed. See core's DefaultOutputPipelineFactory for the stages that are applied for desktop clients, one of which is HandleParsoidSectionLinks. These section edit links are undesirable for editing clients, like VE, or the mobile apps, etc. If the legacy parser was asked to apply them during preprocessing then they'd be in the cached output.

or to provide the data necessary for adding the correct ones,

That's effectively native template expansion. Eventually we'll get there but it's the current limitation.

  • or notice if a heading comes through multiple levels of expansion, and hide the section edit link in that case (keeping the ones coming from direct transclusion)?

I think even the latter case is better than the current one: it doesn’t fix the broken links, but at least hides them (while keeping the working ones).

Unfortunately, Parsoid only sees the flattened wikitext output post- template expansion.

Change #1052832 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Set POF::NO_SECTION_EDIT_LINKS from behavior switch

https://gerrit.wikimedia.org/r/1052832

Change #1054291 had a related patch set uploaded (by Isabelle Hurbain-Palatin; author: Isabelle Hurbain-Palatin):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.20.0-a13

https://gerrit.wikimedia.org/r/1054291

Change #1054291 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.20.0-a13

https://gerrit.wikimedia.org/r/1054291

Change #1053075 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Reset section numbering per template

https://gerrit.wikimedia.org/r/1053075

Change #1053387 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Add a test for section titles failing to resolve redirected templates

https://gerrit.wikimedia.org/r/1053387

Post-deploy, Parsoid now respects __NOEDITSECTION__ and the pages from the description now look the same, at least in terms of the section edit links,
https://hu.wiktionary.org/w/index.php?title=sohase&useparsoid=0
https://hu.wiktionary.org/w/index.php?title=sohase&useparsoid=1

The list on that page is styled differently though. It looks like an adjacent sibling selector (.ouac-acoln+ul or similar) is broken because Parsoid wraps whitespace in between in spans.

<div class="ouac-acoln ouac-acol4" about="#mwt1" id="mwGA"></div><span about="#mwt1" id="mwGQ">
</span><ul about="#mwt1" id="mwGg">...

That should be filed separately to consider our options.

In any case, there are more patches to deploy in this one before we can close it out.

Change #1055878 had a related patch set uploaded (by Isabelle Hurbain-Palatin; author: Isabelle Hurbain-Palatin):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.20.0-a14

https://gerrit.wikimedia.org/r/1055878

Change #1055878 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.20.0-a14

https://gerrit.wikimedia.org/r/1055878

The list on that page is styled differently though. It looks like an adjacent sibling selector (.ouac-acoln+ul or similar) is broken because Parsoid wraps whitespace in between in spans.

<div class="ouac-acoln ouac-acol4" about="#mwt1" id="mwGA"></div><span about="#mwt1" id="mwGQ">
</span><ul about="#mwt1" id="mwGg">...

That should be filed separately to consider our options.

T370751: Parsoid breaks next-sibling TemplateStyles hack