
Reference names can't contain square brackets in HTML5 fragment mode
Closed, Resolved (Public)

Description

"In HTML4 mode, wikitext characters get escaped in the anchor name, e.g. <ref name="foo [[bar]]" /> generates wikitext like <sup id="cite_ref-foo_.5B.5Bbar.5D.5D_0-0" class="reference">[1]</sup>; note the brackets have changed to ".5B" and ".5D". In HTML5 mode, these characters aren't escaped, generating <sup id="cite_ref-foo_[[bar]]_0-0" class="reference">#cite_note-foo_[[bar-0|[1]]]</sup> instead, with brackets inside the attempted wikilink. Since brackets cannot actually appear inside a wikilink, boom. Anomie⚔"
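The difference between the two modes can be sketched roughly as follows. This is an illustrative approximation, not MediaWiki's actual PHP code: the function names `legacy_escape_id` and `cite_ref_id` are hypothetical, and the escaping rule (percent-encode, then swap "%" for ".") is an assumption inferred from the ".5B"/".5D" output quoted above.

```python
from urllib.parse import quote


def legacy_escape_id(name: str) -> str:
    """Approximate the HTML4-style escaping (assumed rule, not MediaWiki's code)."""
    # Spaces become underscores, then everything outside a small safe set
    # is percent-encoded.
    encoded = quote(name.replace(" ", "_"), safe="")
    # Colons stay readable and "%" is swapped for ".", so "[" (%5B) becomes ".5B".
    return encoded.replace("%3A", ":").replace("%", ".")


def cite_ref_id(name: str, key: int, count: int, escape=legacy_escape_id) -> str:
    # Hypothetical helper mirroring the "cite_ref-<name>_<key>-<count>" shape
    # seen in the report.
    return f"cite_ref-{escape(name)}_{key}-{count}"


# HTML4-style: brackets are escaped, so the id is safe to embed in a wikilink.
print(cite_ref_id("foo [[bar]]", 0, 0))

# HTML5-style as described in the report: only spaces are normalised,
# so the raw brackets survive into the id.
print(cite_ref_id("foo [[bar]]", 0, 0, escape=lambda n: n.replace(" ", "_")))
```

The second id still contains "[[" and "]]", which is what breaks the internal link that Cite builds around it.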

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:33 PM
bzimport added a project: Cite.
bzimport set Reference to bz27694.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

This is actually due to $wgExperimentalHtmlIds, not $wgHtml5. It's just that the former does nothing unless the latter is true. This should be fixable in Cite by not encoding the ref name in the anchor name and generating an arbitrary one instead.
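A minimal sketch of that suggestion, assuming a simple per-page counter. The `AnchorAllocator` class and the "cite_note-<n>" format are hypothetical and only illustrate the idea of anchors that never embed the user-supplied name; this is not what the Cite extension actually does.

```python
class AnchorAllocator:
    """Hands out anchors that never contain the ref name (hypothetical design)."""

    def __init__(self) -> None:
        self._keys = {}

    def note_id(self, name: str) -> str:
        # The same name always maps to the same key; a new name gets the
        # next unused integer.
        key = self._keys.setdefault(name, len(self._keys))
        return f"cite_note-{key}"


alloc = AnchorAllocator()
first = alloc.note_id("foo [[bar]]")
again = alloc.note_id("foo [[bar]]")  # reuse of the same named ref
other = alloc.note_id("baz")
print(first, again, other)
```

The trade-off is that such anchors are no longer human-readable or stable across edits that reorder references, which may matter for anyone linking to a specific footnote.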

(In reply to comment #0)

Is this still an issue? According to https://noc.wikimedia.org/conf/InitialiseSettings.php.txt:


'wgHtml5' => array(
    'default' => false,
    'mediawikiwiki' => true,
    'testwiki' => true,
    'test2wiki' => true,
),

Using the following wikitext on test.wikipedia.org (testwiki) at https://test.wikipedia.org/wiki/Cite_using_HTML5:


Hello there.<ref name="foo [[bar]]">author</ref><ref name="foo [[bar]]" />

References

<references />

Generated HTML looks like this:


<p>Hello there.<sup id="cite_ref-foo_.5B.5Bbar.5D.5D_0-0" class="reference"><a href="#cite_note-foo_.5B.5Bbar.5D.5D-0">1</a><span class="reference_comma">,</span></sup><sup id="cite_ref-foo_.5B.5Bbar.5D.5D_0-1" class="reference"><a href="#cite_note-foo_.5B.5Bbar.5D.5D-0">1</a><span class="reference_comma">,</span></sup></p>
<h2> <span class="mw-headline" id="References">References</span></h2>
<ol class="references">
<li id="cite_note-foo_.5B.5Bbar.5D.5D-0"><span class="mw-cite-backlink">^ <span class="citerefmanylink"><a href="#cite_ref-foo_.5B.5Bbar.5D.5D_0-0"><sup><i><b>a</b></i></sup></a></span> <span class="citerefmanylink"><a href="#cite_ref-foo_.5B.5Bbar.5D.5D_0-1"><sup><i><b>b</b></i></sup></a></span></span> <span class="reference-text">author</span></li>

</ol>

This bug appears to be fixed. Can someone confirm?

(In reply to comment #2)

Using the following wikitext on test.wikipedia.org (testwiki) at

Does test.wikipedia.org have $wgExperimentalHtmlIds on or off?

ayg wrote:

Well, it defaults to false, because it's experimental:

https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=includes/DefaultSettings.php;h=35ae6bda3c99ac00fab215d39b71d8bb7a1ff5f1;hb=HEAD#l2500

If no one is actively developing it and fixing its bugs, it should probably remain false, or just be removed. I hope no one tried turning it on anywhere.

(In reply to comment #4)

I hope no one tried turning it on anywhere.

I guess someone did in February 2011.

MaxSem renamed this task from HTML5 (with experimental ids) anchor encoding bug (refs only?) to Reference names can't contain square brackets in HTML5 fragment mode. Nov 10 2017, 12:53 AM
MaxSem claimed this task.
MaxSem added a project: Community-Tech-Sprint.
MaxSem updated the task description.

Change 390356 had a related patch set uploaded (by MaxSem; owner: MaxSem):
[mediawiki/extensions/Cite@master] Don't break when reference names contain []

https://gerrit.wikimedia.org/r/390356

Change 391135 had a related patch set uploaded (by MaxSem; owner: MaxSem):
[mediawiki/core@master] Sanitizer::safeEncodeAttribute(): also encode ]

https://gerrit.wikimedia.org/r/391135

Change 390356 had a related patch set uploaded (by Legoktm; owner: MaxSem):
[mediawiki/extensions/Cite@master] Don't break when reference names contain []

https://gerrit.wikimedia.org/r/390356

Change 391135 merged by jenkins-bot:
[mediawiki/core@master] Sanitizer::safeEncodeAttribute(): also encode ]

https://gerrit.wikimedia.org/r/391135
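The change title only says that `]` is now encoded as well; the thread does not show the actual replacement table in core. As a rough sketch of the general idea, assuming brackets are replaced by numeric character references (the function name `encode_brackets` is hypothetical):

```python
import html


def encode_brackets(value: str) -> str:
    """Replace square brackets with numeric character references (illustrative only)."""
    return value.replace("[", "&#91;").replace("]", "&#93;")


target = "#cite_note-foo_[[bar]]-0"
encoded = encode_brackets(target)
print(encoded)

# The wikitext parser no longer sees literal brackets, but decoding the
# references yields the original string again.
print(html.unescape(encoded) == target)
```

Encoding both the opening and the closing bracket means neither half of a "[[...]]" pair can be mistaken for link syntax.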

Change 390356 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Don't break when reference names contain []

https://gerrit.wikimedia.org/r/390356