HTML tags represented with escaped < and > chars aren't always nowikied properly
Closed, Resolved, Public

Description

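This is self-explanatory: text containing HTML tags written with escaped < and > characters is serialized to wikitext as live tags.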

[subbu@earth lib] echo '&lt;h2&gt;foo&lt;/h2&gt;' | node parse --html2wt 
<h2>foo</h2>
[subbu@earth lib] echo '&lt;a rel="mw:ExtLink" href="http://www.google.com"&gt;Google&lt;/a&gt;' | node parse --html2wt
<a rel="mw:ExtLink" href="http://www.google.com">Google</a>

Both output strings need to be nowikied.

Event Timeline

ssastry raised the priority of this task from to Needs Triage.
ssastry updated the task description.
ssastry subscribed.
ssastry set Security to None.
ssastry moved this task from Needs Triage to VE Q3 on the Parsoid board.
ssastry lowered the priority of this task from High to Medium. (Mar 25 2015, 7:12 PM)
ssastry moved this task from VE Q3 to In Progress on the Parsoid board.
Jdforrester-WMF subscribed.

Removed this from the VE Q3 blockers list per Subbu.

Are we sure about the second case (&lt;a&gt;…)? Given that explicit <a> tags aren't allowed in wikitext, Parsoid already encodes it correctly:

echo '&lt;a rel="mw:ExtLink" href="http://www.google.com"&gt;Google&lt;/a&gt;' | node parse --html2html
…
<p data-parsoid='{"dsr":[0,59,0,0]}'>&lt;a rel="mw:ExtLink" href="http://www.google.com">Google&lt;/a></p>

Ah yes, that is true. That said, I think editors find the a-tags disconcerting, and the nowiki might serve as a visual cue that it is just text. If the code that handles these scenarios is going to be generic, handle both cases uniformly. If it looks like you would have to add special cases for the <a> tag scenario, then don't bother. This is a rare edge case that is not worth the complexity unless someone explicitly asks for it.
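
For illustration only, the "generic" handling discussed above might look roughly like the sketch below. This is not Parsoid's serializer code: the names TAG_LIKE and escapeTextForWikitext are hypothetical, and the regex is a simplifying assumption about what counts as tag-like text. The point is that one check covers both the <h2> and the <a> case with no per-tag special-casing.

```javascript
// Hypothetical sketch, NOT Parsoid's actual implementation.
// Assumption: "tag-like" means "<" or "</", a letter, optional
// alphanumerics, optional attributes, and a closing ">".
const TAG_LIKE = /<\/?[A-Za-z][A-Za-z0-9]*(?:\s[^<>]*)?\/?>/;

// Takes the decoded text content of a DOM text node and returns
// the wikitext to emit for it.
function escapeTextForWikitext(text) {
  if (!TAG_LIKE.test(text)) {
    // Nothing that looks like a tag: emit the text unchanged.
    return text;
  }
  // Assumption: a literal <nowiki> or </nowiki> inside the text would
  // end the wrapper early, so its "<" is re-escaped as an entity.
  const safe = text.replace(/<(\/?nowiki)/gi, '&lt;$1');
  return '<nowiki>' + safe + '</nowiki>';
}

console.log(escapeTextForWikitext('<h2>foo</h2>'));
console.log(escapeTextForWikitext('a < b'));
```

Because the check is tag-name agnostic, the <a> case falls out of the same code path as the <h2> case, which is the condition under which handling it is worth doing at all.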

Change 210712 had a related patch set uploaded (by Marcoil):
T93824: Put escaped HTML tags inside <nowiki>

https://gerrit.wikimedia.org/r/210712

Change 210712 merged by jenkins-bot:
T93824: Put escaped HTML tags inside <nowiki>

https://gerrit.wikimedia.org/r/210712