Flow\Exception\WikitextException: ParseEntityRef: no name
Closed, ResolvedPublic

Description

Error

Comes from requests made to https://www.mediawiki.org/wiki/Thread:Project:Support_desk/Extension:_BreadCrumbs_2_troubleshooting. It seems someone triggered invalid syntax in Parsoid/wikitext, using Flow as the way to insert the text.

Request ID: INSERT_ID

message
/wiki/Thread:Project:Support_desk/Extension:_BreadCrumbs_2_troubleshooting   Flow\Exception\WikitextException from line 428 of /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Conversion/Utils.php: htmlParseEntityRef: no name

htmlParseEntityRef: no name


From source content:
<html><head></head><body data-parsoid='{"dsr":[0,2967,0,0]}' lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body mw-body-content mediawiki" dir="ltr">
<p data-parsoid='{"dsr":[0,90,0,0]}'>It's close to working, however not fully. I must be messing something up with the Syntax. </p>
...
trace
#0 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Parsoid/ContentFixer.php(96): Flow\Conversion\Utils::createDOM(string)
#1 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Parsoid/ContentFixer.php(55): Flow\Parsoid\ContentFixer::createDOM(string)
#2 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Parsoid/ContentFixer.php(43): Flow\Parsoid\ContentFixer->apply(string, Title)
#3 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Templating.php(157): Flow\Parsoid\ContentFixer->getContent(Flow\Model\PostRevision)
#4 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Formatter/RevisionFormatter.php(284): Flow\Templating->getContent(Flow\Model\PostRevision, string)
#5 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Formatter/TopicFormatter.php(45): Flow\Formatter\RevisionFormatter->formatApi(Flow\Formatter\TopicRow, Flow\View)
#6 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Block/Topic.php(651): Flow\Formatter\TopicFormatter->formatApi(Flow\Model\Workflow, array, Flow\View)
#7 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Block/Topic.php(555): Flow\Block\TopicBlock->renderTopicApi(array)
#8 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/View.php(233): Flow\Block\TopicBlock->renderApi(array)
#9 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/View.php(71): Flow\View->buildApiResponse(Flow\WorkflowLoader, array, string, array)
#10 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Actions/Action.php(112): Flow\View->show(Flow\WorkflowLoader, string)
#11 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Actions/ViewAction.php(20): Flow\Actions\FlowAction->showForAction(string, OutputPage)
#12 /srv/mediawiki/php-1.33.0-wmf.20/extensions/Flow/includes/Actions/Action.php(50): Flow\Actions\ViewAction->showForAction(string)
#13 /srv/mediawiki/php-1.33.0-wmf.20/includes/MediaWiki.php(501): Flow\Actions\FlowAction->show()
#14 /srv/mediawiki/php-1.33.0-wmf.20/includes/MediaWiki.php(294): MediaWiki->performAction(Article, Title)
#15 /srv/mediawiki/php-1.33.0-wmf.20/includes/MediaWiki.php(867): MediaWiki->performRequest()
#16 /srv/mediawiki/php-1.33.0-wmf.20/includes/MediaWiki.php(517): MediaWiki->main()
#17 /srv/mediawiki/php-1.33.0-wmf.20/index.php(42): MediaWiki->run()
#18 /srv/mediawiki/w/index.php(3): include(string)
#19 {main}

Impact

Notes

Event Timeline

hashar created this task.Mar 6 2019, 1:15 PM
hashar triaged this task as High priority.
Restricted Application added a subscriber: Aklapper. Mar 6 2019, 1:15 PM

Seems like Flow is unable to process a result from Parsoid? :/

Restricted Application added a project: Growth-Team. Mar 6 2019, 1:16 PM
kostajh added a comment.EditedMar 6 2019, 3:17 PM

Copy/pasting the source content into VE mode in Flow and VisualEditor on a page in wmf.20 returns identical HTML:

<p id=\"mwAQ\">&lt;html>\n&lt;head>&lt;/head>\n&lt;body data-parsoid='{\"dsr\":[0,2967,0,0]}' lang=\"en\"\n
    class=\"mw-content-ltr sitedir-ltr ltr mw-body mw-body-content mediawiki\" dir=\"ltr\"></p>\n<p
        data-x-data-parsoid='{\"dsr\":[0,90,0,0]}' id=\"mwAg\">It's close to working, however not fully. I must be
    messing something up with the\n Syntax. </p>\n\n<p data-x-data-parsoid='{\"dsr\":[92,317,0,0]}' id=\"mwAw\">It
    should read = DIYAutoWiki Home <span typeof=\"mw:Entity\" id=\"mwBA\">></span> Manufacturers <span
            typeof=\"mw:Entity\" id=\"mwBQ\">></span> BMW <span typeof=\"mw:Entity\" id=\"mwBg\">></span> BMW 1 Series\n
    and 2 Series <span typeof=\"mw:Entity\" id=\"mwBw\">></span> E81 (2007 - 2012)\n <br
            data-x-data-parsoid='{\"stx\":\"html\",\"selfClose\":true,\"dsr\":[196,202,6,0]}' id=\"mwCA\"/>\n Instead
    it's reading = DIYAutoWiki Home <span typeof=\"mw:Entity\" id=\"mwCQ\">></span> Manufacturers <span
            typeof=\"mw:Entity\" id=\"mwCg\">></span> BMW <span typeof=\"mw:Entity\" id=\"mwCw\">></span> BMW 1 Series
    and 2 Series <span typeof=\"mw:Entity\" id=\"mwDA\">></span>* E81 (2007 -\n 2012) E88</p>\n\n<p
        data-x-data-parsoid='{\"dsr\":[319,448,0,0]}' id=\"mwDQ\">I have no clue why the asterisk and the \"E88\" are
    showing up. This has to be an\n error in my formatting and I can't figure it out.</p>\n\n<p
        data-x-data-parsoid='{\"dsr\":[450,544,0,0]}' id=\"mwDg\">Here is how I have it setup. Hopefully a fresh set of
    eyes can tell me where I\n am going wrong:</p>\n\n<p data-x-data-parsoid='{\"dsr\":[546,546,0,0]}' id=\"mwDw\"><br
        data-x-data-parsoid='{\"dsr\":[546,546,0,0]}' id=\"mwEA\"/></p>\n
<div class=\"mw-highlight
     mw-content-ltr\" dir=\"ltr\" data-x-typeof=\"mw:Extension/syntaxhighlight\" data-x-data-mw='{\"name\":\"syntaxhighlight\",\"attrs\":{\"source\":\"\",\"lang\":\"xml\"},\"body\":{\"extsrc\":\"\\n* Main Page @\\n* default   @ [[Main Page|DIYAutoWiki Home]] >\\n* Manufacturers @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] >\\n\\n\\n\\n----\\n\\n\\n\\n* Bmw @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|Bmw]] >\\n\\n\\n\\n* Bmw 1 series and 2 series @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > \\n\\n* E81 (2007 - 2012) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >  [[:Category:E81 (2007 - 2012)|E81 (2007 - 2012)]] >\\n\\n* E82 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >  [[:Category:E82 (2007 - 2013)|E82 (2007 - 2013)]] >\\n\\n* E87 (2004 - 2011) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:E87 (2004 - 2011)|E87 (2004 - 2011)]] >\\n\\n* E88 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:E88 (2007 - 2013)|E88 (2007 - 2013)]] >\\n\\n* F20 (2011 - 2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F20 (2011 - 2014)|F20 (2011 - 2014)]] >\\n\\n* F22 (2014) @ [[Main Page|DIYAutoWiki Home]] > 
[[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F22 (2014)|F22 (2014)]] >\\n\\n* F23 (2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F23 (2014)|F23 (2014)]] >\\n\\n\"}}' data-x-data-parsoid='{\"dsr\":[547,2841,2,2]}' data-x-about=\"#mwt3\" id=\"mwEQ\">\n
<pre typeof=\"mw:Extension/pre\" about=\"#mwt3\"
     data-mw='{\"name\":\"pre\",\"attrs\":{},\"body\":{\"extsrc\":\"* Main Page @\\n* default   @ [[Main Page|DIYAutoWiki Home]] &amp;gt;\\n* Manufacturers @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt;\\n\\n\\n\\n----\\n\\n\\n&lt;span class=\\\"c\\\">&amp;lt;!-- Begin BMW --&amp;gt;&lt;/span>\\n* Bmw @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|Bmw]] &amp;gt;\\n\\n\\n&lt;span class=\\\"c\\\">&amp;lt;!-- Begin BMW 1 Series &amp;amp; 2 Series --&amp;gt;&lt;/span>\\n* Bmw 1 series and 2 series @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|BMW]] &amp;gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &amp;gt;\\n\\n* E81 (2007 - 2012) @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|BMW]] &amp;gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &amp;gt;  [[:Category:E81 (2007 - 2012)|E81 (2007 - 2012)]] &amp;gt;\\n\\n* E82 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|BMW]] &amp;gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &amp;gt;  [[:Category:E82 (2007 - 2013)|E82 (2007 - 2013)]] &amp;gt;\\n\\n* E87 (2004 - 2011) @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|BMW]] &amp;gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &amp;gt; [[:Category:E87 (2004 - 2011)|E87 (2004 - 2011)]] &amp;gt;\\n\\n* E88 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|BMW]] &amp;gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &amp;gt; [[:Category:E88 (2007 - 2013)|E88 (2007 - 2013)]] &amp;gt;\\n\\n* F20 (2011 - 2014) @ [[Main Page|DIYAutoWiki Home]] 
&amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|BMW]] &amp;gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &amp;gt; [[:Category:F20 (2011 - 2014)|F20 (2011 - 2014)]] &amp;gt;\\n\\n* F22 (2014) @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|BMW]] &amp;gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &amp;gt; [[:Category:F22 (2014)|F22 (2014)]] &amp;gt;\\n\\n* F23 (2014) @ [[Main Page|DIYAutoWiki Home]] &amp;gt; [[:Category:Manufacturers|Manufacturers]] &amp;gt; [[:Category:Bmw|BMW]] &amp;gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &amp;gt; [[:Category:F23 (2014)|F23 (2014)]] &amp;gt;\\n&lt;span class=\\\"c\\\">&amp;lt;!-- End BMW 1 Series &amp;amp; 2 Series --&amp;gt;&lt;/span>\\n\"}}'
     id=\"mwEg\">* Main Page @\n* default   @ [[Main Page|DIYAutoWiki Home]] >\n* Manufacturers @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] >\n\n\n\n----\n\n\n&lt;span class=\"c\">&lt;!-- Begin BMW -->&lt;/span>\n* Bmw @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|Bmw]] >\n\n\n&lt;span class=\"c\">&lt;!-- Begin BMW 1 Series &amp; 2 Series -->&lt;/span>\n* Bmw 1 series and 2 series @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >\n\n* E81 (2007 - 2012) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >  [[:Category:E81 (2007 - 2012)|E81 (2007 - 2012)]] >\n\n* E82 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >  [[:Category:E82 (2007 - 2013)|E82 (2007 - 2013)]] >\n\n* E87 (2004 - 2011) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:E87 (2004 - 2011)|E87 (2004 - 2011)]] >\n\n* E88 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:E88 (2007 - 2013)|E88 (2007 - 2013)]] >\n\n* F20 (2011 - 2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F20 (2011 - 2014)|F20 (2011 - 2014)]] >\n\n* F22 (2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > 
[[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F22 (2014)|F22 (2014)]] >\n\n* F23 (2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F23 (2014)|F23 (2014)]] >\n&lt;span class=\"c\">&lt;!-- End BMW 1 Series &amp; 2 Series -->&lt;/span>\n</pre>\n</div>\n\n
<p data-x-data-parsoid='{\"dsr\":[2843,2967,0,0]}' id=\"mwEw\">Sorry if this wasn't the best way to format this. You
    don't want to know how\n long it took me to accomplish posting this lol.</p>\n<p id=\"mwFA\">
    &lt;/body>\n&lt;/html></p>

The difference is that Flow throws exceptions if errors are found via libxml_get_errors(). Still investigating... I'm thinking the error in parsing might be due to a change in VisualEditor in wmf.20. Changing VE to wmf.19 locally (with Flow at wmf.20) results in the problematic HTML getting parsed correctly.
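For readers unfamiliar with that pattern, here is a minimal sketch of "parse, then throw if libxml reported anything". It is not Flow's actual Conversion\Utils::createDOM(); the function name loadHtmlStrict and the use of RuntimeException are assumptions made for illustration, and which inputs libxml flags depends on the libxml2 version.

```php
<?php
/**
 * Hypothetical helper (not Flow code): parse HTML and throw if libxml
 * recorded any error or warning while doing so.
 */
function loadHtmlStrict( string $html ): DOMDocument {
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors( true );
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadHTML( $html );

    // Fetch whatever libxml recorded, then restore the previous state.
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    if ( $errors ) {
        $messages = array_map(
            static function ( LibXMLError $e ) {
                return trim( $e->message );
            },
            $errors
        );
        throw new RuntimeException( implode( '; ', $messages ) );
    }

    return $dom;
}
```

A message such as "htmlParseEntityRef: no name" would surface here as the exception text, much like the trace in the task description.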

Seems like the un-encoded ampersand in <!-- Begin BMW 1 Series & 2 Series --> and <!-- End BMW 1 Series & 2 Series --> in the syntax highlight data is causing the problem:

{"name":"syntaxhighlight","attrs":{"source":"","lang":"xml"},"body":{"extsrc":"\n* Main Page @\n* default   @ [[Main Page|DIYAutoWiki Home]] &gt;\n* Manufacturers @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt;\n\n\n\n----\n\n\n<!-- Begin BMW -->\n* Bmw @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|Bmw]] &gt;\n\n\n<!-- Begin BMW 1 Series & 2 Series -->\n* Bmw 1 series and 2 series @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|BMW]] &gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &gt; \n\n* E81 (2007 - 2012) @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|BMW]] &gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &gt;  [[:Category:E81 (2007 - 2012)|E81 (2007 - 2012)]] &gt;\n\n* E82 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|BMW]] &gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &gt;  [[:Category:E82 (2007 - 2013)|E82 (2007 - 2013)]] &gt;\n\n* E87 (2004 - 2011) @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|BMW]] &gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &gt; [[:Category:E87 (2004 - 2011)|E87 (2004 - 2011)]] &gt;\n\n* E88 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|BMW]] &gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &gt; [[:Category:E88 (2007 - 2013)|E88 (2007 - 2013)]] &gt;\n\n* F20 (2011 - 2014) @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|BMW]] &gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &gt; [[:Category:F20 (2011 - 2014)|F20 (2011 - 2014)]] &gt;\n\n* F22 (2014) @ [[Main 
Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|BMW]] &gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &gt; [[:Category:F22 (2014)|F22 (2014)]] &gt;\n\n* F23 (2014) @ [[Main Page|DIYAutoWiki Home]] &gt; [[:Category:Manufacturers|Manufacturers]] &gt; [[:Category:Bmw|BMW]] &gt; [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] &gt; [[:Category:F23 (2014)|F23 (2014)]] &gt;\n<!-- End BMW 1 Series & 2 Series -->\n"}}
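For comparison, this is roughly what a correctly encoded attribute looks like when built by hand. It is an illustrative sketch only: buildDataMwAttribute is a hypothetical helper, not something Parsoid, VisualEditor or Flow provides, and the JSON flags are an assumption.

```php
<?php
/**
 * Hypothetical helper: serialize a data-mw payload into a single-quoted
 * HTML attribute, entity-encoding every special character in the JSON.
 */
function buildDataMwAttribute( array $dataMw ): string {
    $json = json_encode( $dataMw, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE );

    // ENT_QUOTES encodes both quote styles, plus &, < and >.
    return "data-mw='" . htmlspecialchars( $json, ENT_QUOTES ) . "'";
}
```

With this encoding, a comment-like string inside extsrc stays as &lt;!-- ... &amp; ... --&gt; in the markup, so a later HTML parse never sees a bare ampersand.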
kostajh added a comment.EditedMar 6 2019, 3:57 PM

Minimal reproduction case, write this in wikitext then switch to VE in a new flow topic (preferably in a sandbox):

<syntaxhighlight lang="xml"> <!-- test= & tests--> </syntaxhighlight>

This will post to visualeditor API endpoint and return:

{"visualeditor":{"result":"success","content":"<div class=\"mw-highlight mw-content-ltr\" dir=\"ltr\" typeof=\"mw:Extension/syntaxhighlight\" about=\"#mwt3\" data-mw='{\"name\":\"syntaxhighlight\",\"attrs\":{\"lang\":\"xml\"},\"body\":{\"extsrc\":\"\\n&lt;!-- test &amp; tests -->\\n\"}}' id=\"mwAQ\"><pre><span></span><span class=\"c\">&lt;!-- test &amp; tests --></span>\n</pre></div><span about=\"#mwt3\">\n</span>"}}

When you click "Save post", ntcontent is set to:

<p+data-parsoid="{&quot;dsr&quot;:[0,4,0,0]}">test</p> <div+typeof="mw:Extension/syntaxhighlight"+data-mw="{&quot;name&quot;:&quot;syntaxhighlight&quot;,&quot;attrs&quot;:{&quot;lang&quot;:&quot;xml&quot;},&quot;body&quot;:{&quot;extsrc&quot;:&quot;\n<!--+test+&amp;+tests+-->\n&quot;}}"+class="mw-highlight+mw-content-ltr"+dir="ltr"+about="#mwt3"+data-parsoid="{&quot;dsr&quot;:[6,74,28,18]}"></div> <p+data-parsoid="{&quot;dsr&quot;:[75,88,0,0]}"><span+data-parsoid="{&quot;stx&quot;:&quot;html&quot;,&quot;dsr&quot;:[75,88,6,7]}"></span></p> <link+rel="mw:PageProp/Category"+href="./Category:Pages_with_syntax_highlighting_errors"+data-parsoid="{&quot;stx&quot;:&quot;simple&quot;,&quot;a&quot;:{&quot;href&quot;:&quot;./Category:Pages_with_syntax_highlighting_errors&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;Category:Pages+with+syntax+highlighting+errors&quot;},&quot;dsr&quot;:[89,139,null,null]}">

And a WikitextException is thrown.

This should not be a train blocker IMO, see this logstash entry from 2/27 on wmf.19.

I'm still investigating and will continue working on this issue, but I don't think it should hold up the train.

hashar added a comment.Mar 6 2019, 7:51 PM

@kostajh yeah it seems to be a very narrow corner case. I am dropping it from the list of blockers.

Thank you for stepping in so fast and for the analysis!

This is the consequence of a batshit insane bug in PHP's DOMDocument. It'll drop all entity encoding if your attribute value contains something that looks like an HTML comment:

> function roundTrip($html) { $dom = new DOMDocument(); $dom->loadHTML($html); return $dom->saveHTML($dom->getElementsByTagName('html')->item(0)); }

> echo roundTrip('<body><p data-foo="&lt;!- foo&amp;bar --&gt;">Foo&amp;Bar</p></body>');
<html><body><p data-foo="&lt;!- foo&amp;bar --&gt;">Foo&amp;Bar</p></body></html>

> echo roundTrip('<body><p data-foo="&lt;!-- foo&amp;bar --&gt;">Foo&amp;Bar</p></body>');
<html><body><p data-foo="<!-- foo&bar -->">Foo&amp;Bar</p></body></html>

And this then breaks reparsing:

> echo roundTrip(roundTrip('<body><p data-foo="&lt;!- foo&amp;bar --&gt;">Foo&amp;Bar</p></body>'));
<html><body><p data-foo="&lt;!- foo&amp;bar --&gt;">Foo&amp;Bar</p></body></html>

> echo roundTrip(roundTrip('<body><p data-foo="&lt;!-- foo&amp;bar --&gt;">Foo&amp;Bar</p></body>'));
PHP Warning:  DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 1 in /vagrant/mediawiki/maintenance/eval.php(78) : eval()'d code on line 1
PHP Stack trace:
PHP   1. {main}() /var/www/w/MWScript.php:0
PHP   2. require_once() /var/www/w/MWScript.php:98
PHP   3. eval() /vagrant/mediawiki/maintenance/eval.php:78
PHP   4. roundTrip() /vagrant/mediawiki/maintenance/eval.php(78) : eval()'d code:1
PHP   5. DOMDocument->loadHTML() /vagrant/mediawiki/maintenance/eval.php(78) : eval()'d code:1

Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 1 in /vagrant/mediawiki/maintenance/eval.php(78) : eval()'d code on line 1

Call Stack:
    0.0003     363696   1. {main}() /var/www/w/MWScript.php:0
    0.0070     419728   2. require_once('/vagrant/mediawiki/maintenance/eval.php') /var/www/w/MWScript.php:98
 2839.9539   17501680   3. eval('echo roundTrip(roundTrip('<body><p data-foo="&lt;!-- foo&amp;bar --&gt;">Foo&amp;Bar</p></body>'));;') /vagrant/mediawiki/maintenance/eval.php:78
 2839.9542   17501792   4. roundTrip() /vagrant/mediawiki/maintenance/eval.php(78) : eval()'d code:1
 2839.9543   17501904   5. DOMDocument->loadHTML() /vagrant/mediawiki/maintenance/eval.php(78) : eval()'d code:1

<html><body><p data-foo="<!-- foo&bar -->">Foo&amp;Bar</p></body></html>

You might be interested in T215000: Fill gaps in PHP DOM's functionality ... just in case the work that we do there can benefit you in some way.

It appears that we might be able to fix this by using $dom->saveXML( $node, LIBXML_NOEMPTYTAG ) instead of $dom->saveHTML( $node ). I'll try writing a script that tests this on all HTML documents we have in the production Flow DB to see if there are any unexpected differences this causes.
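A minimal sketch of that idea, using plain DOMDocument rather than Flow's helpers (the function name roundTripXmlSketch is made up for this example, and libxml warnings are simply discarded here):

```php
<?php
/**
 * Hypothetical variant of the roundTrip() helper from the earlier comment:
 * parse as HTML, but serialize the <html> node with saveXML().
 */
function roundTripXmlSketch( string $html ): string {
    // Keep libxml warnings out of the output for this demonstration.
    $previous = libxml_use_internal_errors( true );

    $dom = new DOMDocument();
    $dom->loadHTML( $html );

    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    $node = $dom->getElementsByTagName( 'html' )->item( 0 );

    // LIBXML_NOEMPTYTAG writes <p></p> rather than <p/> for empty elements.
    return $dom->saveXML( $node, LIBXML_NOEMPTYTAG );
}
```

Calling it on '<body><p data-foo="&lt;!-- foo&amp;bar --&gt;">Foo&amp;Bar</p></body>' should keep the ampersand in data-foo entity-encoded, because the XML serializer escapes attribute values without the comment special case.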

One difference is that the data-parsoid attribute gets single-quoted by saveHTML (because it detects the double quotes in the value) but not by saveXML. Missing out on the space savings is a little sad (though worth it if it fixes a data corruption bug like this one), but it also makes regular diffs between the output of saveXML and saveHTML useless, because these data-parsoid differences happen all over the place. I'll try a different approach instead.

Instead of comparing the output of saveHTML and saveXML directly, I decided to check whether their output was semantically equivalent.

function roundTripHtml($html) { $dom = Flow\Parsoid\ContentFixer::createDOM($html); return $dom->saveHTML($dom->getElementsByTagName('body')->item(0)); }
function roundTripXml($html) { $dom = Flow\Parsoid\ContentFixer::createDOM($html); return $dom->saveXML($dom->getElementsByTagName('body')->item(0)); }

foreach ( $flowRevisions as $content ) {
    assert( roundTripHtml( roundTripXml( $content ) ) === roundTripHtml( $content ) );
}

And this assertion passes for all 227k Flow revisions (that aren't wikitext or topic-title-wikitext) on mediawikiwiki.

I'll work on a patch to switch from saveHTML to saveXML tomorrow. We'll likely also have to reserialize old content, but we have to do that anyway for T209120: Upgrade Parsoid HTML stored in the StructuredDiscussions tables so I'll probably roll it in with that.

cscott added a subscriber: cscott.Mar 7 2019, 3:16 PM

@Catrope also look at the HTMLFormatter library; the mobile team already have figured out a bunch of weird workarounds for PHP's DOM bugs.

In an ideal world, we'd all switch to using Remex and a proper spec-compliant DOM library, but at the very least we could make an effort to share code & workarounds so we don't have to independently rediscover them. (T215000 has a bunch of weird PHP DOM "quirks" documented now, sigh.)

This is very useful, thank you! The HTMLFormatter library does not exhibit this bug, though I don't yet understand why, because it also uses DOMDocument.

...that would be because it short-circuits to using the input string when you don't ask for any manipulation. When you actually do try to do manipulation, HTMLFormatter is broken in much worse ways:

> function roundTripDom($html) { $dom = Flow\Parsoid\ContentFixer::createDOM($html); return $dom->saveHTML($dom->getElementsByTagName('body')->item(0)); }
> function roundTripFormatter($html) { $formatter = new HtmlFormatter\HtmlFormatter(HtmlFormatter\HtmlFormatter::wrapHTML($html)); $formatter->getDoc(); return $formatter->getText(); }

> $s =  '<body><p data-foo="&quot;&lt;!- foo&amp;bar --&gt;">Foo&amp;Bar</p></body>'; echo roundTripDom($s) . "\n" . roundTripFormatter($s);
<body><p data-foo='"&lt;!- foo&amp;bar --&gt;'>Foo&amp;Bar</p></body>
<p data-foo='"&lt;!- foo&amp;bar --&gt;'>Foo&amp;Bar</p>

> $s =  '<body><p data-foo="&quot;&lt;!-- foo&amp;bar --&gt;">Foo&amp;Bar</p></body>'; echo roundTripDom($s) . "\n" . roundTripFormatter($s);
<body><p data-foo='"<!-- foo&bar -->'>Foo&amp;Bar</p></body>
<p data-foo='"'>Foo&amp;Bar</p>

All this argues for all of us consolidating behind a common solution for HTML parsing and DOM manipulation on the PHP side. That could be Remex + the DOMCompat code that is currently in review ( https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/491892 ). One of the original suggestions @cscott made was to release this DOM compat code as a composer lib, but we just went with it being a Parsoid-internal compat layer for simplicity. But if there is broader immediate interest, perhaps we could make it a composer lib which everyone can build upon.

cscott added a comment.Mar 7 2019, 9:28 PM

The HTMLFormatter library itself is pretty general purpose and might be the basis of a more general lib, especially if it were updated (eventually) to use Remex and Zest (T217360: Replace libxml/xpath in HtmlFormatter with Remex/zest). I think Remex still has a performance problem to fix (T212543: RemexHtml DOM construction performance increases non-linearly wrt HTML size) which is probably going to prevent it from replacing DOMDocument::loadHTML in the immediate short term (sigh).

Having a library would be great. For now, I'm going to proceed with changing saveHTML() to saveXML(), because that's something that I've already confirmed results in no semantic change on the Flow HTML corpus. Longer term, we can port Flow's code to Remex or a new shared library, once the kinks get worked out there.

cscott added a comment.Mar 7 2019, 9:41 PM

We've got our own XMLSerializer for Parsoid, that is probably part of the long-term solution.

You still have to be a little bit careful about the content of comments to avoid the string --> appearing in them (that clever encoding algorithm is in VE and Parsoid), and if you serialize as XML you have to be sure to parse as XML as well in order to avoid missing newlines at the start of <pre> tags, alas. (See the warnings and examples at https://html.spec.whatwg.org/multipage/parsing.html#serialising-html-fragments ).

Change 495139 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/Flow@master] Conversion\Utils: Work around DOMDocument serialization bug

https://gerrit.wikimedia.org/r/495139

Because @cscott mentioned newlines in <pre>s, I verified that these are treated the same by both serialization methods:

> $s = "<pre>Foo\nBar\n</pre>"; echo roundTripDom($s) ."\n" . roundTripFormatter($s);
<body><pre>Foo
Bar
</pre></body>
<pre>Foo
Bar
</pre>
Tgr added a subscriber: Tgr.EditedMar 7 2019, 10:34 PM

XMLSerializer is a fair bit slower than native serialization. On a vagrant box, with the Obama article: DOMDocument::saveHTML: 0.032903, DOMDocument::saveXML: 0.015279, XMLSerializer::serialize: 0.445519 seconds.

Tgr added a comment.Mar 7 2019, 10:38 PM

Because @cscott mentioned pre newlines in <pre>s, I verified that these are treated the same by both serialization methods:

The difference is with leading newlines (which according to the HTML spec should be ignored, so they need to be doubled during serialization to roundtrip correctly).

Oh whoops, I did in fact test that but then copy-pasted the wrong test into my phab comment. Here's the right one:

> $s = "<pre>\nFoo\nBar\n</pre>"; echo roundTripDom($s) ."\n" . roundTripFormatter($s);
<body><pre>
Foo
Bar
</pre></body>
<pre>
Foo
Bar
</pre>

> $s = "<pre>\n\nFoo\nBar\n</pre>"; echo roundTripDom($s) ."\n" . roundTripFormatter($s);
<body><pre>

Foo
Bar
</pre></body>
<pre>

Foo
Bar
</pre>
cscott added a comment.EditedMar 7 2019, 10:40 PM

@Catrope: yes, what @Tgr said -- your test case should be "<pre>\n\nfoo\n\nbar\n\n</pre>". (The middle and trailing ones are just for completeness, it's really the leading newlines that would be mangled.)

Looks like you got it. Probably PHP doesn't bother to be HTML-spec-compliant with stripping the leading newline. (Which is kinda broken spec behavior anyway, to be fair.)
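For reference, the spec behaviour being discussed (an HTML parser drops one newline right after <pre>, so a spec-following serializer emits an extra one when the content starts with a newline) can be sketched like this. serializePreSketch is a hypothetical, text-only helper written for illustration; it is not part of PHP's DOM extension, Parsoid or Flow.

```php
<?php
/**
 * Hypothetical helper: serialize plain text as a <pre> element so that a
 * spec-compliant HTML parser reads back exactly the same text.
 */
function serializePreSketch( string $text ): string {
    // Escape &, < and > for use as element content.
    $escaped = htmlspecialchars( $text, ENT_NOQUOTES );

    // A parser ignores one newline directly after <pre>, so add a second
    // one when the content itself begins with a newline.
    $prefix = ( $text !== '' && $text[0] === "\n" ) ? "\n" : '';

    return '<pre>' . $prefix . $escaped . '</pre>';
}
```

Only leading newlines need this treatment; middle and trailing newlines pass through unchanged.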

Catrope added a comment.EditedMar 8 2019, 2:06 AM

Ugh, it turns out that saveXML() also changes tags that should be self-closing to tags that aren't. The test failures for my patch show changes like <img src="foo"> to <img src="foo"></img>, <meta typeof="mw:Placeholder"> to <meta typeof="mw:Placeholder"></meta>, etc. I might have to improvise a PHP version of this hack to make it do what I want, unless these tags don't cause problems.
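One possible shape for such a post-processing step is sketched below. It is an assumption, not necessarily the hack referred to above or what the eventual patch does: collapseVoidElements is a made-up name, and the element list is the set of HTML void elements.

```php
<?php
/**
 * Hypothetical post-processing step: drop the closing tags that
 * saveXML( $node, LIBXML_NOEMPTYTAG ) emits for HTML void elements.
 */
function collapseVoidElements( string $xml ): string {
    $void = 'area|base|br|col|embed|hr|img|input|link|meta|param|source|track|wbr';

    // Turns <img src="foo"></img> into <img src="foo">; other tags are untouched.
    return preg_replace( "#</(?:$void)>#i", '', $xml );
}
```

In saveXML() output a literal < inside text or attribute values is already escaped, so the pattern should only hit real closing tags; comments and CDATA sections would need separate care.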

hashar removed a subscriber: hashar.Mar 8 2019, 9:32 AM
hashar added a comment.Mar 8 2019, 9:35 AM

I am unsubscribing / muting this task since I know nothing about PHP DOM and don't feel I can help. I merely reported an error I spotted in the logs.

function roundTrip($html) {

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    return $dom->saveHTML(
        $dom->getElementsByTagName('html')->item(0)
    );
}

That is a nice trick :)

On a side note: fascinatingly, the content of this particular comment is not actually corrupted in the database. The error must have happened because the Flow code parsed it, serialized it, then parsed it again. @kostajh had said that we'd need to write code to handle stored content with badly encoded comments, but I suppose that would only apply to comments saved in the past few weeks. There might not be any that have encoding errors (I'll look into that next).

I used redirect=no to look at the LQT page, which has separate links to each post. I used the API to get the contents of each post, until I hit an error. That told me that the post ID of the broken post is s96a16f3z33jcx4e. From there:

catrope@mwmaint1002:~$ mwscript eval.php mediawikiwiki
> echo Flow\Model\UUID::create('s96a16f3z33jcx4e')->getHex()
052aba864a50524bdf01ae
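The conversion above can be approximated outside MediaWiki. The sketch below assumes that the post ID is a base-36 number and that the hex form is the same number in base 16, left-padded to 22 digits; that matches the output shown but is an assumption, not taken from Flow's source. alnumToHex is a hypothetical name, and the digit-by-digit arithmetic is there because PHP's built-in base_convert() loses precision on numbers this large.

```php
<?php
/**
 * Hypothetical stand-in for Flow\Model\UUID::create( $id )->getHex():
 * convert a base-36 string to zero-padded hex without big-number extensions.
 */
function alnumToHex( string $alnum, int $hexLen = 22 ): string {
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyz';
    // Base-16 digits of the running result, least significant first.
    $digits = [ 0 ];

    foreach ( str_split( strtolower( $alnum ) ) as $ch ) {
        $value = strpos( $alphabet, $ch );
        if ( $value === false ) {
            throw new InvalidArgumentException( "Invalid base-36 digit: $ch" );
        }
        // result = result * 36 + value, carried through each hex digit.
        $carry = $value;
        foreach ( $digits as $i => $d ) {
            $cur = $d * 36 + $carry;
            $digits[$i] = $cur % 16;
            $carry = intdiv( $cur, 16 );
        }
        while ( $carry > 0 ) {
            $digits[] = $carry % 16;
            $carry = intdiv( $carry, 16 );
        }
    }

    $hex = '';
    foreach ( array_reverse( $digits ) as $d ) {
        $hex .= dechex( $d );
    }

    return str_pad( ltrim( $hex, '0' ), $hexLen, '0', STR_PAD_LEFT );
}
```

The resulting hex string is what goes into unhex() in the flow_revision query.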


catrope@stat1006:~$ analytics-mysql flowdb --use-x1
mysql:research@dbstore1005.eqiad.wmnet [flowdb]> select * from flow_revision where rev_id=unhex('052aba864a50524bdf01ae') \G
...
                  rev_flags: utf-8,gzip,html,external
                rev_content: DB://cluster25/701165
...

catrope@mwmaint1002:~$ mwscript eval.php mediawikiwiki
> echo gzinflate(ExternalStore::fetchFromURL('DB://cluster25/701165'));
<body data-parsoid='{"dsr":[0,2967,0,0]}' lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body mw-body-content mediawiki" dir="ltr"><p data-parsoid='{"dsr":[0,90,0,0]}'>It's close to working, however not fully. I must be messing something up with the Syntax. </p>

<p data-parsoid='{"dsr":[92,317,0,0]}'>It should read = DIYAutoWiki Home > Manufacturers > BMW > BMW 1 Series and 2 Series > E81 (2007 - 2012)
<br data-parsoid='{"stx":"html","selfClose":true,"dsr":[196,202,6,0]}'/>
Instead it's reading = DIYAutoWiki Home > Manufacturers > BMW > BMW 1 Series and 2 Series >* E81 (2007 - 2012) E88</p>

<p data-parsoid='{"dsr":[319,448,0,0]}'>I have no clue why the asterisk and the "E88" are showing up. This has to be an error in my formatting and I can't figure it out.</p>

<p data-parsoid='{"dsr":[450,544,0,0]}'>Here is how I have it setup. Hopefully a fresh set of eyes can tell me where I am going wrong:</p>

<p data-parsoid='{"dsr":[546,546,0,0]}'><br data-parsoid='{"dsr":[546,546,0,0]}'/></p>
<div class="mw-highlight mw-content-ltr" dir="ltr" typeof="mw:Extension/syntaxhighlight" data-mw='{"name":"syntaxhighlight","attrs":{"source":"","lang":"xml"},"body":{"extsrc":"\n* Main Page @\n* default   @ [[Main Page|DIYAutoWiki Home]] >\n* Manufacturers @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] >\n\n\n\n----\n\n\n&lt;!-- Begin BMW -->\n* Bmw @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|Bmw]] >\n\n\n&lt;!-- Begin BMW 1 Series &amp; 2 Series -->\n* Bmw 1 series and 2 series @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > \n\n* E81 (2007 - 2012) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >  [[:Category:E81 (2007 - 2012)|E81 (2007 - 2012)]] >\n\n* E82 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >  [[:Category:E82 (2007 - 2013)|E82 (2007 - 2013)]] >\n\n* E87 (2004 - 2011) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:E87 (2004 - 2011)|E87 (2004 - 2011)]] >\n\n* E88 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:E88 (2007 - 2013)|E88 (2007 - 2013)]] >\n\n* F20 (2011 - 2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F20 (2011 - 2014)|F20 (2011 - 2014)]] >\n\n* F22 (2014) @ [[Main 
Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F22 (2014)|F22 (2014)]] >\n\n* F23 (2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F23 (2014)|F23 (2014)]] >\n&lt;!-- End BMW 1 Series &amp; 2 Series -->\n"}}' data-parsoid='{"dsr":[547,2841,2,2]}' about="#mwt3">
<pre>* Main Page @
* default   @ [[Main Page|DIYAutoWiki Home]] >
* Manufacturers @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] >



----


<span class="c">&lt;!-- Begin BMW --></span>
* Bmw @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|Bmw]] >


<span class="c">&lt;!-- Begin BMW 1 Series &amp; 2 Series --></span>
* Bmw 1 series and 2 series @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > 

* E81 (2007 - 2012) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >  [[:Category:E81 (2007 - 2012)|E81 (2007 - 2012)]] >

* E82 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] >  [[:Category:E82 (2007 - 2013)|E82 (2007 - 2013)]] >

* E87 (2004 - 2011) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:E87 (2004 - 2011)|E87 (2004 - 2011)]] >

* E88 (2007 - 2013) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:E88 (2007 - 2013)|E88 (2007 - 2013)]] >

* F20 (2011 - 2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F20 (2011 - 2014)|F20 (2011 - 2014)]] >

* F22 (2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F22 (2014)|F22 (2014)]] >

* F23 (2014) @ [[Main Page|DIYAutoWiki Home]] > [[:Category:Manufacturers|Manufacturers]] > [[:Category:Bmw|BMW]] > [[:Category:Bmw 1 series and 2 series|BMW 1 Series and 2 Series]] > [[:Category:F23 (2014)|F23 (2014)]] >
<span class="c">&lt;!-- End BMW 1 Series &amp; 2 Series --></span>
</pre></div>

<p data-parsoid='{"dsr":[2843,2967,0,0]}'>Sorry if this wasn't the best way to format this. You don't want to know how long it took me to accomplish posting this lol.</p></body>

Note that the data-mw attribute on the <div class="mw-highlight ..."> element contains correctly encoded comments, like &lt;!-- Begin BMW 1 Series &amp; 2 Series --> and &lt;!-- End BMW 1 Series &amp; 2 Series -->.
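
To make "correctly encoded" concrete, here is a small sketch using Python's standard-library HTML parser rather than the PHP DOMDocument code Flow uses, so it only illustrates the encoding and does not reproduce the exception. The snippet is a shortened, hand-made version of the div above. DataMwCollector is a made-up name.

```python
import json
from html.parser import HTMLParser


class DataMwCollector(HTMLParser):
    """Collect the decoded JSON payload of every data-mw attribute."""

    def __init__(self):
        super().__init__()
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-mw" and value is not None:
                # The parser has already resolved &lt; and &amp; in the value.
                self.payloads.append(json.loads(value))


# Shortened stand-in for the stored HTML: entities inside the attribute value.
snippet = (
    '<div class="mw-highlight" typeof="mw:Extension/syntaxhighlight" '
    "data-mw='{\"name\":\"syntaxhighlight\",\"body\":{\"extsrc\":"
    "\"&lt;!-- Begin BMW 1 Series &amp; 2 Series -->\"}}'>"
    "</div>"
)

collector = DataMwCollector()
collector.feed(snippet)
collector.close()
extsrc = collector.payloads[0]["body"]["extsrc"]
print(extsrc)
```

With the entities in place, the attribute decodes back to the comment the user originally typed, which is what the stored copy should look like.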

From LogStash, it looks like the only other URL with a WikitextException is mediawiki.org/wiki/Topic:Uitnkqacnfjiriga, where the HTML is:

<html><head></head><body>
<p data-parsoid='{"dsr":[0,176,0,0]}'>Solution that <a rel="mw:ExtLink" class="external text" href="http://avantec.se/2015/02/10/howto-make-iis-allow-colon-sign-in-request-url/" data-parsoid='{"targetOff":92,"contentOffsets":[92,110],"dsr":[14,111,78,1]}'>someone else found</a> for the problem: Just above <code data-parsoid='{"stx":"html","dsr":[140,171,6,7]}'>&lt;system.webServer&gt;</code> put:</p>
<div class="mw-highlight mw-content-ltr" dir="ltr" typeof="mw:Extension/source" about="#mwt3" data-parsoid='{"dsr":[177,444,2,2]}' data-mw='{"name":"source","attrs":{"lang":"xml"},"body":{"extsrc":"\n  ...\n  &lt;system.web&gt;\n    <!-- Default <,>,*,%,&,:,,? or %u003c,%u003e,%u002a,%u0025,%u0026,%u003a,%u005c,%u003f -->\n    &lt;httpRuntime requestPathInvalidCharacters=\"%u003c,%u003e,%u002a,%u0025,%u0026,%u005c,%u003f\" /&gt;\n  &lt;/system.web&gt;\n  ...\n"}}'><pre><span></span>  ...
  <span class="nt">&lt;system.web&gt;</span>
    <span class="c">&lt;!-- Default &lt;,&gt;,*,%,&amp;,:,,? or %u003c,%u003e,%u002a,%u0025,%u0026,%u003a,%u005c,%u003f --&gt;</span>
    <span class="nt">&lt;httpRuntime</span> <span class="na">requestPathInvalidCharacters=</span><span class="s">"%u003c,%u003e,%u002a,%u0025,%u0026,%u005c,%u003f"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/system.web&gt;</span>
  ...
</pre></div>
<span about="#mwt3">
</span>
</body></html>
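
In this second sample the data-mw attribute appears to carry the comment unencoded, including a raw "&" followed by a comma. A quick way to look for that pattern in other stored content might be something like the sketch below. The regex is only my approximation of what triggers "htmlParseEntityRef: no name" (an ampersand that cannot start an entity or a numeric reference), not libxml2's actual logic, and find_bare_ampersands is a hypothetical helper.

```python
import re

# Assumption: an "&" is suspicious when the next character cannot begin an
# entity name or a numeric character reference. This approximates, but does
# not replicate, the parser's own check.
BARE_AMPERSAND = re.compile(r"&(?![A-Za-z_:#])")


def find_bare_ampersands(html: str) -> list:
    """Return the offsets of ampersands that do not start a reference."""
    return [match.start() for match in BARE_AMPERSAND.finditer(html)]


# Shortened samples modelled on the two posts in this task.
encoded = "&lt;!-- Begin BMW 1 Series &amp; 2 Series -->"
unencoded = "<!-- Default <,>,*,%,&,:,,? or %u003c,%u003e -->"

if __name__ == "__main__":
    print(find_bare_ampersands(encoded))
    print(find_bare_ampersands(unencoded))
```

The encoded sample yields no hits, while the unencoded one flags the single "&" before the comma.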

Moving back to ready for development for this bit:

we'd need to write code to handle stored content with badly encoded comments, but I suppose that would only apply to comments saved in the past few weeks. There might not be any that have encoding errors (I'll look into that next).

kostajh removed kostajh as the assignee of this task.Mar 12 2019, 2:23 AM
kostajh removed a project: Patch-For-Review.

Change 495139 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Conversion\Utils: Work around DOMDocument serialization bug

https://gerrit.wikimedia.org/r/495139

@Catrope Actually, it looks like we don't need to do anything further here. Both URLs (mediawiki.org/wiki/Topic:Uitnkqacnfjiriga and https://www.mediawiki.org/wiki/Thread:Project:Support_desk/Extension:_BreadCrumbs_2_troubleshooting) load without exceptions, and inputting content like this in Flow works fine now too:

<syntaxhighlight lang="xml">
<!-- test & tests-->
</syntaxhighlight>