<span dir="ltr"> in headings is ignored when the heading is shown in table of contents
Closed, Resolved
Public

Description

In an RTL Wikipedia a heading like "==<span dir="ltr">C++</span>==" shows correctly as a heading in the text flow, but it is shown as "++C" in the table of contents, because the <span> tag is omitted and an RTL direction is assumed.

Not all HTML is omitted - a heading like "==E = mc<sup>2</sup>==" shows correctly both in the heading and in the table of contents.


Version: unspecified
Severity: normal

Details

Reference
bz35167

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 12:15 AM
bzimport set Reference to bz35167.

EN.WP.ST47 wrote:

Confirmed on 1.19wmf1. From the parser:

$tocline = preg_replace(
    array( '#<(?!/?(sup|sub|i|b)(?: [^>]*)?>).*?'.'>#', '#<(/?(sup|sub|i|b))(?: .*?)?'.'>#' ),
    array( '',                          '<$1>' ),
    $safeHeadline
);
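
To see why the span disappears while sup survives, here is a rough Python port of the two substitutions above. It is only a sketch: it assumes Python's re module treats these particular patterns the same way PCRE does, and `toc_line` is a made-up helper name, not something from the parser.

```python
import re

# Port of the two preg_replace patterns quoted above.
# Assumption: these patterns behave the same in Python's re as in PCRE.
STRIP_DISALLOWED = re.compile(r'<(?!/?(sup|sub|i|b)(?: [^>]*)?>).*?>')
DROP_ATTRIBUTES = re.compile(r'<(/?(sup|sub|i|b))(?: .*?)?>')


def toc_line(safe_headline: str) -> str:
    """Hypothetical helper: run both substitutions in order,
    the way preg_replace does when given arrays."""
    # 1. Remove every tag that is not sup, sub, i or b.
    stripped = STRIP_DISALLOWED.sub('', safe_headline)
    # 2. Drop attributes from the tags that were kept.
    return DROP_ATTRIBUTES.sub(r'<\1>', stripped)


print(toc_line('<span dir="ltr">C++</span>'))  # C++
print(toc_line('E = mc<sup>2</sup>'))          # E = mc<sup>2</sup>
print(toc_line('<i class="x">a</i>'))          # <i>a</i>
```

The span is removed outright, so the TOC entry is left as bare "C++" text and inherits the RTL direction of the page.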

Those regexen are rather ugly, but let's see if we can't add a very limited allowance for span:

$tocline = preg_replace(
    array( '#<(?!/?(sup|sub|i|b|span dir="ltr")(?: [^>]*)?>).*?'.'>#', '#<(/?(sup|sub|i|b|span dir="ltr"))(?: .*?)?'.'>#' ),
    array( '',                          '<$1>' ),
    $safeHeadline
);
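
One thing to check with the version above: the alternation only matches the literal opening tag, so a closing </span> still seems to get stripped. Below is a Python port of those patterns (same assumption as before, that Python's re and PCRE agree on them), followed by one possible way to keep span tags balanced. `proposed_toc_line`, `balanced_toc_line` and the stack approach are illustrative only, not what the parser or the eventual patch does, and the sketch assumes well-formed, already sanitized input.

```python
import re

# Port of the proposed patterns (assumption: same behaviour as in PCRE).
ALLOWED = r'sup|sub|i|b|span dir="ltr"'
PROPOSED_STRIP = re.compile(r'<(?!/?(' + ALLOWED + r')(?: [^>]*)?>).*?>')
PROPOSED_NORMALIZE = re.compile(r'<(/?(' + ALLOWED + r'))(?: .*?)?>')


def proposed_toc_line(safe_headline: str) -> str:
    """Hypothetical helper applying the proposed substitutions in order."""
    stripped = PROPOSED_STRIP.sub('', safe_headline)
    return PROPOSED_NORMALIZE.sub(r'<\1>', stripped)


# Illustrative alternative: walk the tags and remember which spans were kept.
SIMPLE_TAGS = {'sup', 'sub', 'i', 'b'}
TAG_RE = re.compile(r'<(/?)(\w+)([^>]*)>')


def balanced_toc_line(safe_headline: str) -> str:
    """Hypothetical sketch: keep <span dir="ltr"> together with its own
    closing tag, and strip every other span."""
    span_stack = []  # one entry per open span: True if we kept it

    def replace(match):
        closing, name, attrs = match.groups()
        name = name.lower()
        if name in SIMPLE_TAGS:
            return '<' + closing + name + '>'  # keep tag, drop attributes
        if name == 'span':
            if closing:
                kept = span_stack.pop() if span_stack else False
                return '</span>' if kept else ''
            keep = attrs.strip() == 'dir="ltr"'
            span_stack.append(keep)
            return '<span dir="ltr">' if keep else ''
        return ''  # strip all other tags

    return TAG_RE.sub(replace, safe_headline)


print(proposed_toc_line('<span dir="ltr">C++</span>'))  # <span dir="ltr">C++
print(balanced_toc_line('<span dir="ltr">C++</span>'))  # <span dir="ltr">C++</span>
print(balanced_toc_line('<span class="x">foo</span>'))  # foo
```

The stack is there so that a closing tag is only emitted for a span that was actually kept. Simply whitelisting </span> in the regex would leave stray closing tags behind for spans that were stripped.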

But I have to ask - someone went through the effort of not having a ?> in that regex in two different places, and then left a ?> in another place, and I notice that the world hasn't exploded.

This is still an issue. Not critical, but annoying.

Thank you for the tip, Dan.

Proposed fix submitted in https://gerrit.wikimedia.org/r/#/c/22435/ . I'm working on tests for it.

Patch improved and tests added. Thanks to anybody who can review it.