MediaWiki shouldn't assign section ids during tokenization, but instead only when headings are generated
Open, Needs Triage, Public

Description

Consider this wikitext:

==A==
{{#if:
==B==
}}
==C==

The PHP parser will emit:

<h2>...<a href="/~cananian/mediawiki/index.php?title=CLIParser&amp;action=edit&amp;section=1" title="Edit section: A">edit source</a>...</h2>
<h2>...<a href="/~cananian/mediawiki/index.php?title=CLIParser&amp;action=edit&amp;section=3" title="Edit section: C">edit source</a>...</h2>

Note that the section id for C is 3, not 2. This is because the preprocessor assigns a section id as soon as it sees a heading token ==...==, regardless of whether that token is inside a template argument. Here ==B== consumes id 2 even though it never becomes a heading in the output.

The preprocessor should instead look at its template nesting tree to determine whether it is in a template argument before assigning section ids.
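The difference between the two numbering strategies can be illustrated with a toy model. This is only a sketch in Python, not the actual preprocessor code: the function name `number_sections`, the `count_nested` flag, and the line-based brace counting are all assumptions made for illustration, whereas the real preprocessor builds a node tree.

```python
import re

# Hypothetical toy model: a heading is a whole line of the form ==Title==.
HEADING = re.compile(r"^(={1,6})\s*(.+?)\s*\1\s*$")

def number_sections(wikitext, count_nested):
    """Return [(title, section_id)] for headings outside any {{...}}.

    count_nested=True  mimics the behavior described in this task:
                       every heading token consumes an id, even inside
                       a template argument.
    count_nested=False mimics the proposed behavior: only headings at
                       template depth 0 consume an id.
    """
    depth = 0        # current {{ ... }} nesting depth (simplified)
    next_id = 1
    emitted = []
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            if depth == 0:
                emitted.append((match.group(2), next_id))
                next_id += 1
            elif count_nested:
                next_id += 1  # id consumed, but no heading is emitted
        # Naive brace counting; real wikitext needs a proper tokenizer.
        depth = max(depth + line.count("{{") - line.count("}}"), 0)
    return emitted

example = "==A==\n{{#if:\n==B==\n}}\n==C=="
current = number_sections(example, count_nested=True)
proposed = number_sections(example, count_nested=False)
```

With the wikitext from the description, `current` numbers C as section 3 and `proposed` numbers it as section 2.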

Parsoid was updated to match PHP's behavior in T213468; once this bug is fixed in core, Parsoid should be simplified again to match.

Event Timeline

I think the index could be assigned when "possible-h" nodes are promoted to "h" nodes, around line 785 of Preprocessor_Hash.php.

Change 730755 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] Preprocessor: Don't assign a heading index to a possible-h node

https://gerrit.wikimedia.org/r/730755

Change 730755 abandoned by Tim Starling:

[mediawiki/core@master] Preprocessor: Don't assign a heading index to a possible-h node

Reason:

The existing section numbering algorithm is a reasonable compromise between Parsoid's needs and MediaWiki's needs

https://gerrit.wikimedia.org/r/730755