
MediaWiki shouldn't include '<span></span>' in the TOC text
Closed, Resolved · Public

Description

Visit http://en.wikipedia.org/w/api.php?action=parse&page=USB&prop=sections
Some of the entries have crazy HTML in them, such as this one (see the "line" field):

{
    "toclevel": 3,
    "level": "4",
    "line": "<span></span><span></span><span></span><span></span><span></span>Prereleases",
    "number": "2.1.1",
    "index": "4",
    "fromtitle": "USB",
    "byteoffset": 13726,
    "anchor": "Prereleases"
},

This is due to the markup for that section being

==== {{anchor|0.7|0.8|0.9|0.99|1.0RC}}Prereleases ====

which translates to

==== <span id="0.7"></span><span id="0.8"></span><span id="0.9"></span><span id="0.99"></span><span id="1.0RC"></span>Prereleases ====

and when the parser strips attributes but leaves the <span> tags it gives the result shown. Tidy (probably) fixes it on WMF sites, but why generate it in the first place?
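The attribute stripping described above can be mimicked with a small Python sketch. This is not MediaWiki's parser code: the function name `strip_span_attributes` and the regular expression are assumptions made only for this illustration.

```python
import re


def strip_span_attributes(html: str) -> str:
    """Drop every attribute from opening <span> tags but keep the tags.

    Hypothetical helper for illustration; it only imitates the effect
    described in this report, not the real parser implementation.
    """
    return re.sub(r"<span\b[^>]*>", "<span>", html)


# Heading content after {{anchor|0.7|0.8|0.9|0.99|1.0RC}} has been expanded.
heading = (
    '<span id="0.7"></span><span id="0.8"></span><span id="0.9"></span>'
    '<span id="0.99"></span><span id="1.0RC"></span>Prereleases'
)

toc_line = strip_span_attributes(heading)
print(toc_line)
```

The printed value has the same shape as the "line" field in the API output: five empty span pairs followed by the heading text.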

Event Timeline

robertchin raised the priority of this task from to High.
robertchin updated the task description.
robertchin added a project: MediaWiki-API.
robertchin added a subscriber: robertchin.
Restricted Application added a subscriber: Aklapper. · Apr 15 2015, 4:21 PM
Anomie added a subscriber: Anomie. · Apr 15 2015, 5:48 PM

This is, in fact, the text that MediaWiki is producing for the TOC for that section (although Tidy is probably cleaning it up on WMF sites). The cause is Gerrit change 22435, which allows <span dir="..."> to be copied to the TOC. As a side effect, every other span has to be copied too, minus its attributes (copying id would break stuff anyway), and {{anchor}} happens to produce exactly these otherwise-empty spans.
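One way to picture a fix along these lines: keep the dir attribute on spans, strip everything else, then drop any span pair that ends up empty. The Python below is only a sketch under those assumptions. The name `toc_text`, the regular expressions, and the restriction to dir="ltr" and dir="rtl" are hypothetical and are not taken from the actual patch.

```python
import re

# Assumed patterns for this sketch only.
SPAN_OPEN = re.compile(r"<span\b([^>]*)>")
DIR_ATTR = re.compile(r'(?:^|\s)dir="(ltr|rtl)"')
EMPTY_SPAN = re.compile(r"<span></span>")


def _keep_dir_only(match: re.Match) -> str:
    """Rebuild an opening <span>, keeping only a dir attribute if present."""
    found = DIR_ATTR.search(match.group(1))
    return f'<span dir="{found.group(1)}">' if found else "<span>"


def toc_text(heading_html: str) -> str:
    """Return heading HTML for the TOC without attribute-less empty spans."""
    text = SPAN_OPEN.sub(_keep_dir_only, heading_html)
    # Repeat until stable so nested empty spans are removed as well.
    previous = None
    while previous != text:
        previous = text
        text = EMPTY_SPAN.sub("", text)
    return text


anchors = '<span id="0.7"></span><span id="0.8"></span>Prereleases'
print(toc_text(anchors))
print(toc_text('<span dir="rtl" class="x">abc</span>'))
```

With this approach the anchor spans disappear from the TOC text, while a span that carries a dir attribute or real content is left in place.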

I'm going to repurpose this bug as "MediaWiki shouldn't include '<span></span>' in the TOC text".

Anomie renamed this task from "api.php query to get sections returning erroneous data" to "MediaWiki shouldn't include '<span></span>' in the TOC text". · Apr 15 2015, 5:53 PM
Anomie lowered the priority of this task from High to Low.
Anomie updated the task description.
Anomie edited projects, added MediaWiki-Parser; removed MediaWiki-API.
Anomie set Security to None.

Change 204299 had a related patch set uploaded (by Anomie):
Parser: Avoid producing <span></span> in the TOC

https://gerrit.wikimedia.org/r/204299

Is there a timeline for when this will get merged?

Umherirrender closed this task as Resolved. · Jul 8 2015, 5:12 PM
Umherirrender assigned this task to Anomie.

Change 204299 merged by jenkins-bot:
Parser: Avoid producing <span></span> in the TOC

https://gerrit.wikimedia.org/r/204299