
Optimize DiscussionTools OutputPageBeforeHTML hook handler (2025)
Open, Needs Triage, Public

Description

Pardon me for asking another question here. I sometimes feel a bit annoyed by the time taken for the OutputPageBeforeHTML hook, as it can take more than 1s on some large talk pages (length alone might not be the problem, probably also related to the number of signatures), and the result is not cached. Is it inevitable by design, or already tracked somewhere in another ticket?

I don't think it can be cached, or at least not very effectively; the result depends on the current time (for the relative time display), and also on the user viewing it (for the state of topic subscriptions and presence of thanks buttons). A whole second is a lot slower than I would expect it to be though, even on large pages; I recorded a profile and there are some possible improvements.

Here's the profile: https://performance.wikimedia.org/excimer/profile/56e998919df45ca2, and screenshot zoomed into the DiscussionTools OutputPageBeforeHTML hook handler in case that page disappears:

image.png (1×1 px, 266 KB)

It indeed took a bit over a second. Two things strike me:

  • There is a separate HtmlHelper::modifyElements() pass inside postprocessTopicSubscription(). In commit 37856941cffb646067c3df1a7ef639ffe65a475c we've combined all other postprocessing into a single HtmlHelper::modifyElements() pass, but this one has to be done separately, since it prepares data for a single SQL query that fetches topic subscription status for every topic on the page. We don't use the output from this pass – maybe something can be done to reduce the overhead.
  • Inside the other HtmlHelper::modifyElements() pass, there are some very deep stack traces, which turn out to be parsing localisation messages repeatedly. I looked at the code and we already cache them in some places (e.g. 'discussiontools-replybutton' is only parsed once and then cached), but not others (e.g. 'thanks-button-thank').
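The second point above is a classic memoization opportunity. Here is a minimal standalone sketch of the pattern (illustrative only – the class, method names, and the simulated parse are not DiscussionTools' actual code): parse each message key at most once per pageview and reuse the parsed HTML for every topic on the page.

```php
<?php
// Illustrative sketch of per-pageview message memoization, not the actual
// DiscussionTools implementation: parse each message key at most once and
// reuse the parsed HTML for every comment/topic on the page.
class PageviewMessageCache {
	/** @var array<string,string> parsed HTML keyed by message key */
	private array $cache = [];
	private int $parseCount = 0;

	public function getParsed( string $key ): string {
		if ( !isset( $this->cache[$key] ) ) {
			$this->cache[$key] = $this->expensiveParse( $key );
		}
		return $this->cache[$key];
	}

	private function expensiveParse( string $key ): string {
		// Stand-in for wfMessage( $key )->parse(), which invokes the full
		// wikitext parser and dominates the flame graphs discussed above.
		$this->parseCount++;
		return "<span>parsed:$key</span>";
	}

	public function getParseCount(): int {
		return $this->parseCount;
	}
}

$cache = new PageviewMessageCache();
// Simulate formatting 400 topics, each needing the same two button labels.
for ( $i = 0; $i < 400; $i++ ) {
	$cache->getParsed( 'discussiontools-replybutton' );
	$cache->getParsed( 'thanks-button-thank' );
}
echo $cache->getParseCount(), "\n"; // 2 parses instead of 800
```

This only works for messages whose output is identical across all uses on the page, which is why the parameterless button labels were the first candidates.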

Event Timeline

For reference, here are the flame graphs I took on two user talk pages that contain >400 topics:
https://performance.wikimedia.org/excimer/profile/44477ee150bc5d46 (1.27s)
https://performance.wikimedia.org/excimer/profile/f5cfc8e261afbef1 (1.49s)
These requests were sent without cookies, so they don't include the postprocessTopicSubscription() part; only BatchModifyElements is involved.

Change #1189940 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/ReportIncident@master] Only parse DiscussionTools overflow menu message once per pageview

https://gerrit.wikimedia.org/r/1189940

Change #1189941 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] Only parse each overflow menu message once per pageview

https://gerrit.wikimedia.org/r/1189941

Change #1189942 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] Only parse each topic subscription message once per pageview

https://gerrit.wikimedia.org/r/1189942

Change #1189943 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] CommentFormatter: Avoid HTML serialization in postprocessTopicSubscription preprocessing

https://gerrit.wikimedia.org/r/1189943

I had a hard time measuring exactly how much this improves the situation… For some reason I was getting a lot of variance between pageviews. I don't want to spend a lot of time debugging my local setup for this, so I hope someone else will find out the numbers.

(OT) Wow, I really appreciate your diligence! Unfortunately I am not able to comment on the implementation details.

Copying my comment from code review:
this seems like complex setup to cache a message string, and I worry about having to copy this code to every new user of this hook. is there a way this could be provided by some DT class upstream?

One thing we could do is have the OverflowMenuItem return message keys. Then we can implement the caching in CommentFormatter.

(OT) Wow, I really appreciate your diligence!

Thanks :) I enjoy making performance improvements.

this seems like complex setup to cache a message string, and I worry about having to copy this code to every new user of this hook. is there a way this could be provided by some DT class upstream?
(…)
One thing we could do is have the OverflowMenuItem return message keys. Then we can implement the caching in CommentFormatter.

I didn't want to make breaking changes to the hook signature or the OverflowMenuItem class, but if avoiding that is not important, then we can probably do something that would be easier for the hook handlers.

I wouldn't want to limit it to only message keys, since then you give up the ability for these label messages to take parameters, but maybe we can allow either a message key or a MessageSpecifier and optimize the simple case.
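The "either a message key or a MessageSpecifier" idea could look roughly like this. This is a hypothetical sketch of the shape, not the merged implementation: the interface is a local stand-in for MediaWiki's MessageSpecifier, and only the parameterless case is memoized.

```php
<?php
// Hypothetical sketch (not the merged code): accept either a plain message
// key or a specifier carrying parameters, and memoize only the
// parameter-less case, the common one for overflow menu labels.
interface MessageSpecifier {
	public function getKey(): string;
	public function getParams(): array;
}

class LabelFormatter {
	/** @var array<string,string> parsed HTML keyed by message key */
	private array $cache = [];

	/** @param string|MessageSpecifier $msg */
	public function format( $msg ): string {
		if ( is_string( $msg ) ) {
			// Simple case: safe to cache per pageview.
			return $this->cache[$msg] ??= $this->parse( $msg, [] );
		}
		if ( $msg->getParams() === [] ) {
			return $this->cache[$msg->getKey()] ??= $this->parse( $msg->getKey(), [] );
		}
		// Parameterized messages may differ per topic; parse each time.
		return $this->parse( $msg->getKey(), $msg->getParams() );
	}

	private function parse( string $key, array $params ): string {
		// Stand-in for the real wikitext parse of the label message.
		return "<span>$key(" . implode( ',', $params ) . ")</span>";
	}
}

$formatter = new LabelFormatter();
echo $formatter->format( 'overflow-menu-edit' ), "\n";
```

Optimizing the simple case this way keeps parameters available for hook handlers that need them, without paying the repeated-parse cost for the ones that don't.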

Change #1189943 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] CommentFormatter: Avoid HTML serialization in postprocessTopicSubscription preprocessing

https://gerrit.wikimedia.org/r/1189943

I wouldn't want to limit it to only message keys, since then you give up the ability for these label messages to take parameters, but maybe we can allow either a message key or a MessageSpecifier and optimize the simple case.

Sounds good. The API is not used much at the moment so I'm ok with breaking changes to get it right.

Change #1189942 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Only parse each topic subscription message once per pageview

https://gerrit.wikimedia.org/r/1189942

Change #1189941 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Only parse each overflow menu message once per pageview, if possible

https://gerrit.wikimedia.org/r/1189941

It seems that the patches are not of much help for the two pages mentioned in T405135#11198828. Respectively:
https://performance.wikimedia.org/excimer/profile/52433f65c1598f86 (1.21s, ~unchanged)
https://performance.wikimedia.org/excimer/profile/5c7626da06afdb9e (1.64s, slower, but now there are a few more topics)

Yeah, that's a bit disappointing. There are still some messages that we parse repeatedly, because they take parameters that may vary (e.g. the "Latest comment: ..." subheadings). I suppose these could be cached too, depending on their parameters – it won't help on normal talk pages where the subheadings for each section are different, but it would help on pages like this where the majority of discussion topics have 1 comment, 1 person in discussion and were posted 1-15 years ago. The rest of the cost is just parsing and then serializing HTML, which we can't easily optimize.
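Caching by parameters, as suggested above, can be sketched as follows. This is illustrative only – the message key and class are hypothetical – but it shows why archive-like pages with many identical subheadings would benefit while ordinary talk pages would not.

```php
<?php
// Illustrative sketch (hypothetical key and class names): extend the
// per-pageview cache to parameterized messages by keying on the full
// (message key, parameters) tuple.
class ParamMessageCache {
	/** @var array<string,string> parsed HTML keyed by key + params */
	private array $cache = [];
	private int $parses = 0;

	public function getParsed( string $key, array $params ): string {
		$cacheKey = $key . '|' . json_encode( $params );
		if ( !isset( $this->cache[$cacheKey] ) ) {
			// Stand-in for the real wikitext parse of the subheading.
			$this->parses++;
			$this->cache[$cacheKey] = "<span>$key: " . implode( ', ', $params ) . '</span>';
		}
		return $this->cache[$cacheKey];
	}

	public function getParses(): int {
		return $this->parses;
	}
}

$cache = new ParamMessageCache();
// Archive-like page: 300 topics, most sharing the same subheading params.
for ( $i = 0; $i < 300; $i++ ) {
	$cache->getParsed( 'latest-comment-subheading', [ '1 comment', '1 person' ] );
}
$cache->getParsed( 'latest-comment-subheading', [ '5 comments', '3 people' ] );
echo $cache->getParses(), "\n"; // 2 distinct parameter tuples → 2 parses
```

On a normal talk page, where nearly every topic has a distinct (latest comment timestamp, author count) tuple, the hit rate would be near zero, so the extra hashing is only worthwhile for pages like the ones profiled here.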

Note also that these profiles look like logged-out page views, and the improvement should be more noticeable on logged-in page views (which display topic subscription and comment thanks buttons).

I wanted to compare the profiles for a logged-in page view too. Unfortunately I didn't write down which page revision I used for the profile before these changes, and they don't seem to record the timestamp.

I hope that these profiles I took just now: (1) (2) (3) (4) (of the first revision from 20 September) are comparable to this one I took when filing this task: (1) (I didn't take multiple samples at that time, which is also unfortunate). If so, the improvement for that page would be from ~1000ms spent in DiscussionTools postprocessing to ~800ms. I'm afraid I don't have more believable numbers; I could have been more scientific about measuring this.

Yes, those requests are sent using curl without cookies, because I don't want to profile lots of load.php/api.php stuff (with browser cache disabled), or hit the OutputPage::checkLastModified path due to the presence of If-Modified-Since. I might be weird in this regard.

From the profiling samples you posted, the time spent in the anonymous function BatchModifyElements.php(38) is roughly cut by 200-300ms, which should be quite good. The largest saving comes from the mw:dt-ellipsisbutton part (the overflow menu). It is not enabled on zhwiki by default yet, so it only concerns mobile users (section heading) and beta feature users. I should have been more scientific about knowing what I was complaining about.

Nevertheless, I wonder if CommentFormatter::postprocessReplyTool can be further optimized by avoiding instantiating and serializing the reply button each time. Besides, for the HTML DOM operations, does T393922 mean that Dom\HTMLDocument will be used in the distant future (when running PHP 8.4+)?
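For reference, PHP 8.4's native Dom\HTMLDocument is a spec-compliant HTML parser with CSS selector support. Whether and when DiscussionTools adopts it depends on MediaWiki's minimum supported PHP version, so the following is only a hypothetical usage sketch (the class name in the example is real PHP 8.4 API; the markup is made up):

```php
<?php
// Hypothetical sketch of PHP 8.4's native DOM API (Dom\HTMLDocument),
// guarded so it degrades gracefully on older PHP versions.
if ( PHP_VERSION_ID >= 80400 ) {
	$doc = Dom\HTMLDocument::createFromString(
		'<p><a class="ext-discussiontools-init-replylink">Reply</a></p>',
		LIBXML_NOERROR
	);
	// CSS selectors are supported natively, unlike the legacy DOMDocument API.
	foreach ( $doc->querySelectorAll( '.ext-discussiontools-init-replylink' ) as $link ) {
		$link->textContent = 'Reply';
	}
	echo $doc->saveHtml( $doc->body ), "\n";
} else {
	echo "Dom\\HTMLDocument requires PHP 8.4+\n";
}
```

Even with a faster parser, the parse/serialize round-trip itself remains, which is consistent with the observation above that the remaining cost "is just parsing and then serializing HTML".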

Another thought I have is to somehow move the results of mw:dt-commentcount and mw:dt-authorcount into the parser cache, when the page already depends on the UI language (to avoid exacerbating the cache key cardinality issue). On zhwiki, many pages' parser cache keys contain "userlang=" instead of "canonical", for reasons not known to me. However, I don't think talk pages are viewed often enough to make such a change worthwhile.

Change #1189940 merged by jenkins-bot:

[mediawiki/extensions/ReportIncident@master] Only parse DiscussionTools overflow menu message once per pageview

https://gerrit.wikimedia.org/r/1189940