Rendering tags dropdown is slow on Special:Contributions (via ChangeTags::getChangeTagList)
Open, Needs Triage, Public

Description

ChangeTags::getChangeTagList takes several tenths of a second on wikis with many tags (observed 0.5s on cswiki), and increases waiting time on Special:RecentChanges, Special:Contributions, etc., even though the contents of the dropdown menu rarely change.

Internally, ChangeTags::getChangeTagList calls ChangeTags::getChangeTagListSummary, which already uses the WAN cache. But it then performs up to two wikitext parses per tag, one for the label and one for the description (for most tags).

Original proposal:
We should investigate the splitting criteria and put it behind a cache.

Event Timeline

Change #1048847 had a related patch set uploaded (by Matěj Suchánek; author: Matěj Suchánek):

[mediawiki/core@master] Cache ChangeTags::getChangeTagList

https://gerrit.wikimedia.org/r/1048847

Dylsss subscribed.

Copying my description:

This is a heck of a lot of time spent and we can more than halve the request duration just by caching the return value.

History

In https://gerrit.wikimedia.org/r/c/mediawiki/core/+/525882 we started caching the summary, which is the unparsed version, but decided not to also cache the parsed version due to concerns that the two caches would get out of sync. However, caching this would be useful, and I think we can invalidate the dependent cache whenever the getChangeTagListSummary cache misses, using checkKeys.
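To illustrate the check-key idea, here is a minimal toy model in Python. It is not WANObjectCache: the class, the key names, and the `fake_parse` helper are all hypothetical, and a logical clock stands in for real timestamps. The sketch only shows the dependency: a regenerated summary touches a check key, and any parsed list stored before that touch is treated as stale.

```python
import itertools


class CheckKeyCache:
    """Toy in-memory cache with check-key invalidation (hypothetical, not WANObjectCache)."""

    def __init__(self):
        self._clock = itertools.count(1)  # logical clock, avoids timestamp ties
        self._values = {}                 # key -> (stored_at, value)
        self._checks = {}                 # check key -> tick of last touch

    def touch_check_key(self, check_key):
        self._checks[check_key] = next(self._clock)

    def delete(self, key):
        self._values.pop(key, None)

    def get_with_set_callback(self, key, callback, check_keys=()):
        entry = self._values.get(key)
        if entry is not None:
            stored_at, value = entry
            # Fresh only if every check key was last touched before the value was stored.
            if all(self._checks.get(ck, 0) < stored_at for ck in check_keys):
                return value
        value = callback()
        self._values[key] = (next(self._clock), value)
        return value


cache = CheckKeyCache()
parse_calls = []  # records every (expensive) rebuild of the parsed list


def fake_parse(wikitext):
    # Hypothetical stand-in for a wikitext parse: just strips italic markers.
    return wikitext.replace("''", "")


def load_summary():
    # Regenerating the summary invalidates everything derived from it.
    cache.touch_check_key("tags-summary-check")
    return [{"name": "mobile edit", "label": "''Mobile edit''"}]


def parsed_list(lang):
    summary = cache.get_with_set_callback("tags-summary", load_summary)

    def build():
        parse_calls.append(lang)
        return [fake_parse(tag["label"]) for tag in summary]

    return cache.get_with_set_callback(
        f"tags-parsed:{lang}", build, check_keys=["tags-summary-check"]
    )
```

In this model, calling `parsed_list("en")` twice parses only once; after `cache.delete("tags-summary")`, the next call regenerates the summary, touches the check key, and rebuilds the parsed list as well.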

It would be really nice if we could resolve this.

Change #1123083 had a related patch set uploaded (by Dylsss; author: Dylsss):

[mediawiki/core@master] Cache parsed change tag list getChangeTagList using WANCache

https://gerrit.wikimedia.org/r/1123083

Caching this is tricky, since the output of message parsing depends on the MessageLocalizer object, which is opaque to us. Splitting just based on the language code is fine in 99% of cases, but in the other 1%, it can depend on the current user's preferences, the current page's title, and possibly the phase of the moon.

I wonder if we could make some optimizations rather than adding caching. At first glance:

  • getChangeTagList() always parses all of the tag labels and descriptions, but buildTagFilterSelector() only uses the labels – by adding an option to omit descriptions, we could save 50% of run time (probably more than that, since descriptions tend to be more complex)
  • Many tag labels are plain text, but we still parse all of them as wikitext. If we could skip the parsing for those that have no markup, we could probably save lots of time. We could detect that by checking if wfEscapeWikiText( $tagInfo['label'] ) === $tagInfo['label']. (Also, many tag labels that aren't plain text should be: I've been working on that in T372175, but no one wants to review my patches.)
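The two ideas above can be sketched in Python as a toy model. This is not the MediaWiki implementation: `build_tag_list`, `needs_parse`, and the `parse` callable are hypothetical names, and the regular expression is only a rough approximation of the `wfEscapeWikiText( $s ) === $s` check, which covers more cases than this.

```python
import html
import re

# Rough, hypothetical approximation of "escaping would change the text":
# characters that commonly start wikitext or HTML markup, magic-word
# underscores, URL schemes, and list/indent markers at the start of a line.
_MARKUP = re.compile(r"""[\[\]{}<>'"|&=;]|__|://|^[#*:; ]""", re.MULTILINE)


def needs_parse(text):
    """Return True if the text might contain markup and must go through the parser."""
    return bool(_MARKUP.search(text))


def render(text, parse):
    # Plain text skips the expensive parse and is only HTML-escaped.
    return parse(text) if needs_parse(text) else html.escape(text)


def build_tag_list(tags, parse, with_descriptions=True):
    """Build the display list; `parse` is the expensive wikitext-to-HTML callable."""
    result = []
    for tag in tags:
        item = {"name": tag["name"], "label": render(tag["label"], parse)}
        if with_descriptions:
            # Callers that only need labels (such as a dropdown) can skip this.
            item["description"] = render(tag.get("description", ""), parse)
        result.append(item)
    return result
```

A caller that builds only the dropdown would pass `with_descriptions=False`, so the parser runs just for the labels that actually contain markup.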

I would only add a cache if these optimizations don't work out.

Change #1130301 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/core@master] ChangeTags: Optimize label and description parsing

https://gerrit.wikimedia.org/r/1130301

Krinkle renamed this task from Cache ChangeTags::getChangeTagList to Rendering tags dropdown many is slow on Special:Contributions (via ChangeTags::getChangeTagList). Mar 27 2025, 2:22 AM
Krinkle updated the task description.

"tenths of a second (observed 0.5s on cswiki)"

On enwiki it even takes two or three whole seconds.

Two profiles from https://en.wikipedia.org/wiki/Special:Contributions/ :

Both of these were after an unprofiled warmup on the same mwdebug1002 host (note that warmups on k8s-mwdebug don't work because subsequent requests don't route to the same pod, ref T276994)

Change #1130301 merged by jenkins-bot:

[mediawiki/core@master] ChangeTags: Optimize label and description parsing

https://gerrit.wikimedia.org/r/1130301

matmarex renamed this task from Rendering tags dropdown many is slow on Special:Contributions (via ChangeTags::getChangeTagList) to Rendering tags dropdown is slow on Special:Contributions (via ChangeTags::getChangeTagList). Mar 29 2025, 10:36 AM

Change #1133987 had a related patch set uploaded (by Reedy; author: Bartosz Dziewoński):

[mediawiki/core@REL1_43] ChangeTags: Optimize label and description parsing

https://gerrit.wikimedia.org/r/1133987

Change #1133987 merged by jenkins-bot:

[mediawiki/core@REL1_43] ChangeTags: Optimize label and description parsing

https://gerrit.wikimedia.org/r/1133987

For comparison, profiles after the change, taken using the same method:

getChangeTagList time is down to 0.3 seconds, a 10x improvement. But I'm not sure if that's good enough; it still seems about 10x slower than I would expect it to be…

The next obvious optimization would be somehow combining the SQL queries for link colors (page existence and disambiguation status). They are batched, but only within a single parse, and each tag label is a separate parse. This seems a bit more tricky than the previous change. Maybe we should consider adding a cache after all?

For the case where perf matters most, we don't even need to batch them, since Special:RecentChanges and Special:Contributions can't actually link the items in the tags dropdown. They're currently parsed by Message::parse and then immediately discarded by Sanitizer::stripAllTags.

The only reason these links receive that styling is that they're written as wikitext interface messages. And even a generic cache would not address the awkward "rich text to plain text" kludge.

My suggestion would be to split these interface messages into a plain text message and an (optional) link target. Then, for the dropdown case, the text is readily available without any database queries. For Special:Tags and changelist pagers, a LinkBatch should be easy to add as we'd know all the targets upfront.
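A rough sketch of that split, written as a self-contained Python model rather than MediaWiki code. The `TagLabel` type, both functions, and the `pages_exist` callable (which stands in for a single batched page-existence lookup) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class TagLabel:
    text: str                          # plain-text label, no markup
    link_target: Optional[str] = None  # optional page title to link to


def dropdown_labels(labels: Dict[str, TagLabel]) -> Dict[str, str]:
    """Dropdown case: the text is used directly, with no parsing and no lookups."""
    return {name: label.text for name, label in labels.items()}


def linked_labels(
    labels: Dict[str, TagLabel],
    pages_exist: Callable[[Iterable[str]], Dict[str, bool]],
) -> Dict[str, Tuple[str, Optional[str], Optional[bool]]]:
    """Listing case: all targets are known upfront, so one batched lookup suffices."""
    targets = sorted({l.link_target for l in labels.values() if l.link_target})
    exists = pages_exist(targets) if targets else {}
    return {
        name: (
            label.text,
            label.link_target,
            exists.get(label.link_target, False) if label.link_target else None,
        )
        for name, label in labels.items()
    }
```

The dropdown path never touches the lookup callable, while the listing path calls it once regardless of how many tags carry a link target.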