Performance: Post content is processed for recentchanges but never used.
Closed, Resolved. Public

Description

In exception.log we're getting a dozen or so exceptions a day from Parsoid/Utils.php complaining about "htmlParseEntityRef: no name" when it calls createDOM().

I think most or all of the calls come from Parsoid/Redlinker->apply(). The exception originates from various URLs: ?action=history on a Flow board, Special:Watchlist, Special:RecentChanges, and also Special:RecentChangesLinked.
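
For context, the two messages in the log entry are libxml2 HTML parser errors, and they are typically produced by a bare "&" or "<" in the markup being loaded. The following is a minimal sketch of how such errors can be collected in PHP. The helper name collectHtmlParseErrors() is hypothetical and this is not Flow's actual Utils::createDOM(); the exact messages also depend on the installed libxml2 version, so no output is shown.

```php
<?php
// Hypothetical helper for illustration only, not Flow's Utils::createDOM().
// Loads an HTML fragment and returns the libxml2 parser messages it produced.
function collectHtmlParseErrors(string $html): array {
    // Buffer libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    // The encoding prefix mirrors the '<?xml encoding=...' argument in the stack trace.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $messages;
}

// Unescaped '&' and '<' are the kind of input that many libxml2 releases
// report as "htmlParseEntityRef: no name" and
// "htmlParseStartTag: invalid element name".
print_r(collectHtmlParseErrors('Topic title & more < text'));

// The same text with entities escaped should not trigger the entity error.
print_r(collectHtmlParseErrors('Topic title &amp; more &lt; text'));
```

A caller that turns any collected message into an exception would fail on the first input, which matches the behaviour described in this report.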

What's confusing me is:

a) Some of these URLs are not even Flow boards, e.g. the following log entry is from
https://www.mediawiki.org/wiki/Special:RecentChangesLinked/Wikimedia_engineering_report/2013/December/summary
Visiting this link generates an exception every time.
b) Do we need to parse redlinks when generating formatter lines?

I don't think any of these produce anything visible to the user; I guess the change line that triggers the exception is simply skipped.

A sample exception.log entry follows. Many of the 115 or so exceptions in May and June have this stack trace:

2014-06-11 22:03:30 mw1029 mediawikiwiki: [ffced3b6] /wiki/Special:RecentChangesLinked/Wikimedia_engineering_report/2013/December/summary Exception from line 198 of /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Parsoid/Utils.php: htmlParseEntityRef: no name

htmlParseStartTag: invalid element name

#0 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Parsoid/Redlinker.php(156): Flow\Parsoid\Utils::createDOM('<?xml encoding=...')
#1 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Parsoid/Controller.php(65): Flow\Parsoid\Redlinker->apply('Topic title <br...', Object(Title))
#2 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Parsoid/Controller.php(52): Flow\Parsoid\Controller->apply('Topic title <br...', Object(Title))
#3 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Templating.php(462): Flow\Parsoid\Controller->getContent(Object(Flow\Model\PostRevision))
#4 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Formatter/RevisionFormatter.php(83): Flow\Templating->getContent(Object(Flow\Model\PostRevision))
#5 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/includes/Formatter/RecentChanges.php(26): Flow\Formatter\RevisionFormatter->formatApi(Object(Flow\Formatter\RecentChangesRow), Object(OldChangesList))
#6 /usr/local/apache/common-local/php-1.24wmf8/extensions/Flow/Hooks.php(194): Flow\Formatter\RecentChanges->format(Object(Flow\Formatter\RecentChangesRow), Object(OldChangesList))
#7 [internal function]: FlowHooks::onOldChangesListRecentChangesLine(Object(OldChangesList), '(<a href="/w/in...', Object(RecentChange), Array)
...

The problem with the entity refs has been fixed, but we should not be running this code in this case anyway: the post content is processed for the recentchanges line and then never used. We could likely add another flag to the Serializer to indicate whether we want the content or not.
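
A minimal sketch of the kind of flag suggested here, assuming a formatter that receives the expensive content step as a callable. The class SketchRevisionFormatter, the setIncludeContent() method, and the row keys are all hypothetical names chosen for illustration; they are not taken from Flow's code or from the merged patch.

```php
<?php
// All names here are hypothetical; this is not Flow's real formatter or serializer.
final class SketchRevisionFormatter {
    /** @var callable Expensive step, e.g. parsing HTML and fixing up redlinks. */
    private $contentLoader;

    /** Whether formatApi() should run the content loader at all. */
    private bool $includeContent = true;

    public function __construct(callable $contentLoader) {
        $this->contentLoader = $contentLoader;
    }

    public function setIncludeContent(bool $includeContent): void {
        $this->includeContent = $includeContent;
    }

    public function formatApi(array $row): array {
        $result = [
            'revisionId' => $row['revisionId'],
            'author' => $row['author'],
        ];
        // Only pay for content processing when the caller will use the result.
        if ($this->includeContent) {
            $result['content'] = ($this->contentLoader)($row);
        }
        return $result;
    }
}

// Usage: a recentchanges-style caller opts out of content processing.
$loaderCalls = 0;
$formatter = new SketchRevisionFormatter(
    function (array $row) use (&$loaderCalls): string {
        $loaderCalls++; // Count how often the expensive step actually runs.
        return strtoupper($row['rawContent']);
    }
);
$formatter->setIncludeContent(false);
$line = $formatter->formatApi([
    'revisionId' => 'r1',
    'author' => 'Example',
    'rawContent' => 'Topic title',
]);
var_dump(array_key_exists('content', $line), $loaderCalls);
```

Defaulting the flag to true keeps existing callers unchanged, so only the code paths that never display content need to opt out.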


Version: master
Severity: normal

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:20 AM
bzimport set Reference to bz66505.
bzimport added a subscriber: Unknown Object (MLST).
EBernhardson renamed this task from Flow Parsoid/Utils.php exceptions: "htmlParseEntityRef: no name" in createDom() for RedLInker to Performance: Post content is processed for recentchanges but never used. Dec 11 2014, 3:19 AM
EBernhardson triaged this task as Low priority.
EBernhardson updated the task description.
EBernhardson set Security to None.

Change 179060 had a related patch set uploaded (by EBernhardson):
Perf: Dont process post content when formatting recentchanges

https://gerrit.wikimedia.org/r/179060

Patch-For-Review

Change 179060 merged by jenkins-bot:
Perf: Dont process post content when formatting recentchanges

https://gerrit.wikimedia.org/r/179060