In exception.log we're getting a dozen or so exceptions a day from Parsoid/Utils.php complaining about "htmlParseEntityRef: no name" when it calls createDOM().
I think most or all of the calls come from Parsoid/Redlinker->apply(). The exceptions are triggered from various URLs: ?action=history on a Flow board, Special:Watchlist, Special:RecentChanges, and Special:RecentChangesLinked.
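For reference, the message itself comes from libxml2's HTML parser and can usually be reproduced outside Flow. The sketch below assumes createDOM() ends up in DOMDocument::loadHTML() (not confirmed here, though the '<?xml encoding=...' argument in the trace points that way) and feeds it a made-up string containing a bare ampersand. The exact messages depend on the libxml2 version.

```php
<?php
// Made-up input, not taken from the log: a bare "&" that is not followed
// by an entity name.
$html = '<?xml encoding="utf-8"?><p>Topic title & more</p>';

// Collect libxml errors in a buffer instead of emitting PHP warnings.
$previous = libxml_use_internal_errors( true );

$dom = new DOMDocument();
$dom->loadHTML( $html );

$messages = array();
foreach ( libxml_get_errors() as $error ) {
    // Each entry is a LibXMLError with level, code, message and line.
    $messages[] = trim( $error->message );
}
libxml_clear_errors();
libxml_use_internal_errors( $previous );

// Depending on the libxml2 build, this may include
// "htmlParseEntityRef: no name".
print_r( $messages );
```

If the bare "&" is written as "&amp;" the entity message should go away, which would fit unescaped text reaching the parser.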
What's confusing me is:
a) Some of these URLs are not even Flow boards. For example, the log entry below is from
https://www.mediawiki.org/wiki/Special:RecentChangesLinked/Wikimedia_engineering_report/2013/December/summary
Visiting this link generates an exception every time.
b) Do we need to parse redlinks when generating formatter lines?
I don't think any of these produce anything visible to the user; I guess the change line that triggers the exception is simply skipped.
A sample exception.log entry follows. Many of the 115 or so exceptions in May and June have this stack trace:
2014-06-11 22:03:30 mw1029 mediawikiwiki: [ffced3b6] /wiki/Special:RecentChangesLinked/Wikimedia_engineering_report/2013/December/summary Exception from line 198 of /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Parsoid/Utils.php: htmlParseEntityRef: no name
htmlParseStartTag: invalid element name
#0 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Parsoid/Redlinker.php(156): Flow\Parsoid\Utils::createDOM('<?xml encoding=...')
#1 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Parsoid/Controller.php(65): Flow\Parsoid\Redlinker->apply('Topic title <br...', Object(Title))
#2 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Parsoid/Controller.php(52): Flow\Parsoid\Controller->apply('Topic title <br...', Object(Title))
#3 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Templating.php(462): Flow\Parsoid\Controller->getContent(Object(Flow\Model\PostRevision))
#4 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Formatter/RevisionFormatter.php(83): Flow\Templating->getContent(Object(Flow\Model\PostRevision))
#5 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Formatter/RecentChanges.php(26): Flow\Formatter\RevisionFormatter->formatApi(Object(Flow\Formatter\RecentChangesRow), Object(OldChangesList))
#6 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/Hooks.php(194): Flow\Formatter\RecentChanges->format(Object(Flow\Formatter\RecentChangesRow), Object(OldChangesList))
#7 [internal function]: FlowHooks::onOldChangesListRecentChangesLine(Object(OldChangesList), '(<a href="/w/in...', Object(RecentChange), Array)
...
The problem with the entity refs has been fixed, but we should not be running this code in this case anyway. We could likely add another flag to the Serializer to indicate whether we want the content or not.
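A rough sketch of what such a flag could look like follows. Everything in it is hypothetical: the class name, setIncludeContent(), and the row keys are invented for illustration and are not Flow's actual Serializer or RevisionFormatter API. The point is only that callers building change lines could opt out, so the content renderer (and with it the redlink pass) is never invoked.

```php
<?php
// Hypothetical stand-in for a serializer/formatter; not Flow's real class.
class SketchRevisionFormatter {
    /** @var bool Whether formatApi() should render revision content */
    private $includeContent = true;

    /** @var callable Renders raw content; stands in for the Parsoid/redlink step */
    private $renderContent;

    public function __construct( $renderContent ) {
        $this->renderContent = $renderContent;
    }

    // Hypothetical flag: callers that only need metadata turn content off.
    public function setIncludeContent( $include ) {
        $this->includeContent = (bool)$include;
    }

    public function formatApi( array $row ) {
        $out = array( 'revisionId' => $row['revisionId'] );
        if ( $this->includeContent ) {
            // Only here would the expensive (and error-prone) parse happen.
            $out['content'] = call_user_func( $this->renderContent, $row['rawContent'] );
        }
        return $out;
    }
}

// Usage: count how often the renderer runs when content is switched off.
$calls = 0;
$render = function ( $raw ) use ( &$calls ) {
    $calls++;
    return htmlspecialchars( $raw );
};

$formatter = new SketchRevisionFormatter( $render );
$formatter->setIncludeContent( false );
$line = $formatter->formatApi( array(
    'revisionId' => 'abc',
    'rawContent' => 'Topic title & more',
) );
// $calls is still 0 and $line has no 'content' key.
```

Defaulting the flag to true keeps existing callers unchanged; only the recent changes and watchlist paths would need to switch it off.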
Version: master
Severity: normal