It's been a while since we've broken down where time is spent when parsing an article (e.g. when saving edits, or on page view if it was purged, recently modified, and on the first view after a cache expiry).
I've recorded an XHGui profile when copying a revision of Barack_Obama to Wikipedia:Sandbox. Below is a breakdown of POST request that saved the edit and the post-redirect GET request that displays the page after the page is saved.
Configuration:
- backend=mwdebug1001.eqiad.wmnet; profile; php7
Based on performing similar actions on non-debug severs, I estimate the numbers are about 2-3X inflated (e.g. due to being "colder" and having lesser hardware resources).
General observations
Firstly, I noticed that both requests parsed the article, which was unexpected given that the submit action is meant to populate this for the ParserCache to re-use on the page view. I'll leave investigation of this for another task. It's possible this generally works but failed due something about the debug server and/or my method of capturing the time profile.
Secondly, I noticed that even within the submit action, Parser was invoked twice for the article:
- 2x Parser::parse
- 2x WikitextContent::fillParserOutput
- 1x MediaWiki\Revision\RenderedRevision::getSlotParserOutputUncached (expected)
- 1x PageImages\Hooks\LinksUpdateHookHandler::getPageImageCandidates (unexpected)
- 2x WikitextContent::fillParserOutput
I'll file a separate task about that as well. But this means that for req 1, the numbers are additionally inflated by ~ 2X.
Req 1: POST Submit action
https://performance.wikimedia.org/xhgui/run/view?id=5dc4a7fc3f3dfa1d753f4f7a
- Total wall-clock time: 39 seconds.
- Pre-send: 25 seconds
- fillParserOutput: 21.1 seconds (total of being called twice)
- Post-send: 14 seconds
- DeferredUpdates::run: 14.0 seconds.
- Pre-send: 25 seconds
Req 2: GET Page view
https://performance.wikimedia.org/xhgui/run/view?id=5dc4a8013f3dfa2375668498
Total wall-clock time: 19 seconds.
- Pre-send: 18.3 seconds
- fillParserOutput: 18.2 seconds (called once)
- Post-send: 0.7 seconds
- DeferredUpdates::doUpdates: 0.17 second
Breakdown of Parse Time
This is based on the above three invocations. There are no significant differences between the three invocations as far as I know. There is some amount of caching for revision text and lower-level APCu-cached computations that may be re-used between some of them. For simplicity and transparency I'm keeping the timings separate but the overall distribution is similar between them.
Method | % | Inclusive wall time (Submit action, Page view) |
---|---|---|
WikitextContent::fillParserOutput | 100% | 21.1 s, 18.2 s |
⤷ Parser::parse | 100% | 21,132 ms, 18,241 ms |
… ⤷ Parser::internalParse | 83-85% | 18,038 ms, 15,830 ms |
… ⤷ Parser::internalParseHalfParsed | 12-13% | 2,660 ms, 2,395 ms |
Given the Parser involves recursion and lots of indirection, I'm breaking apart from the parent-child hierarchy below. Instead, listing directly several internal methods which have a lot of time spent in them. Hand-picked from the XHGui profiles linked above, after sorting by Inclusive wall time:
Method | % | Inclusive wall time (Submit action) | Call count |
---|---|---|---|
WikitextContent::fillParserOutput | 100% | 21.1 s | 1 |
… ScribuntoHooks::invokeHook | 51% | 10,841 ms | ~ 700x |
… Wikimedia\Rdbms\DatabaseMysqli::doQuery | 43% | 9,120 ms | 2567x |
… Parser::handleInternalLinks | 11% | 2,319 ms | ~ 580x |
… RemexDriver::tidy | 10% | 2,183 ms | 1 |
… Parser::statelessFetchTemplate | 7.8% | 1,656 ms | ~ 300x |
… Sanitizer::removeHTMLtags | 6.5% | 1,390 ms | ~ 500x |
… MemcachedPeclBagOStuff::doGetMulti | 5.9% | 1,241 ms | 1636x |
… Scribunto_LuaUstringLibrary::ustringMatch | 5.3% | 1,125 ms | ~ 28,800x |
… LinkHolderArray::replace | 4.7% | 1,050 ms | ~ 4 |
… Wikimedia\Rdbms\LoadBalancer::getForeignConnection | 4.2% | 884 ms | 378x |
… Wikimedia\Rdbms\LoadBalancer::reallyOpenConnection | 4.1% | 872 ms | 361x |
Method | % | Inclusive wall time (Page view) | Call count |
---|---|---|---|
WikitextContent::fillParserOutput | 100% | 18.2 s | 1 |
… ScribuntoHooks::invokeHook | 48% | 8,917 ms | ~ 700x |
… Parser::handleInternalLinks | 11% | 2,017 ms | ~ 580x |
… RemexDriver::tidy | 11% | 1,978 ms | 1 |
… Sanitizer::removeHTMLtags | 6.9% | 1,272 ms | ~ 500x |
… Parser::statelessFetchTemplate | 6.2% | 1,141 ms | ~ 300x |
… Scribunto_LuaUstringLibrary::ustringMatch | 5.9% | 1,076 ms | ~ 28,800x |
… MemcachedPeclBagOStuff::doGetMulti | 5.6% | 1,018 ms | 1374x |
… LinkHolderArray::replace | 3.2% | 607 ms | ~ 4 |
… Wikimedia\Rdbms\DatabaseMysqli::doQuery | 2.0% | 375 ms | 358x |
… Wikimedia\Rdbms\LoadBalancer::getForeignConnection | 0.03% | 7 ms | 25x |
… Wikimedia\Rdbms\LoadBalancer::reallyOpenConnection | 0.2% | 44 ms | 9x |