Remove "stub threshold" preference
Closed, Resolved, Public

Description

NOTE: The intent of this task was announced to the community in Tech News 26/2021. The consultation can be found at Performance_Dependent_User_Preferences. No objections were raised.

Summary:
The "stub threshold" option drastically degrades performance of page views for users who have it enabled. It also is a special case that introduces complexity and surprising behavior in the source code. The value of the feature, when working as intended, also seems dubious. It is probably best to simply remove the option.

Context:
The "stub threshold" option allows users to highlight links to articles that are smaller than a given size. The intent was originally to make it easier for editors to find articles that need attention in a field that they care about. Currently, seven values between fifty bytes and ten thousand bytes are supported (plus "disabled"). The highlighting of the links is implemented in the LinkRenderer class used by the wikitext parser.

Problems:
Usually, the HTML generated by the wikitext parser is cached for later re-use. This is done because parsing is relatively slow - depending on the size and complexity of the page, it can take several seconds. However, when the "stub threshold" option is used, this caching is disabled. This is necessary because it would not be feasible to update the cache whenever the size of a linked page changes. Also, if all of the eight supported values were in popular use, this would lead to eight times as many cache entries (cache fragmentation).

Because the stub threshold disables the use of the parser cache, all pages have to be parsed on the fly for users that have this option set. This makes loading the pages very slow.

Also, the mechanism that bypasses the parser cache is somewhat obscure and has led to confusion in the past. It is implemented in ParserOptions::isSafeToCache(), but the option itself is not mentioned there. Instead, bypassing the cache is a result of "stubthreshold" being counted as a "used" option but not being listed as a "cache varying" option. This is surprising for a parser option coming from a user preference.
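
The interaction described above can be modelled in a few lines. This is a minimal Python sketch, not MediaWiki's actual PHP code: the function name `is_safe_to_cache`, the option dictionaries, and the default value of 0 for "stubthreshold" are assumptions made for illustration. It shows how an option that is "used" but not "cache varying" makes the output uncacheable as soon as it differs from its default.

```python
def is_safe_to_cache(options, defaults, used_options, cache_varying):
    """Return True if output rendered with `options` may go into a shared cache.

    Simplified model: an option that was used during parsing is harmless if it
    is part of the cache key (cache varying) or still has its default value.
    """
    for name in used_options:
        if name in cache_varying:
            continue  # a separate cache entry exists per value of this option
        if options.get(name) != defaults.get(name):
            return False  # non-default value that the cache key cannot distinguish
    return True


# Hypothetical option sets, for illustration only.
DEFAULTS = {"stubthreshold": 0, "userlang": "en"}
CACHE_VARYING = {"userlang"}
USED = {"stubthreshold", "userlang"}

default_user = {"stubthreshold": 0, "userlang": "de"}
stub_user = {"stubthreshold": 500, "userlang": "de"}

print(is_safe_to_cache(default_user, DEFAULTS, USED, CACHE_VARYING))  # True
print(is_safe_to_cache(stub_user, DEFAULTS, USED, CACHE_VARYING))     # False
```

In this model nothing names "stubthreshold" explicitly. The cache bypass falls out of the two sets, which is why the behavior is easy to overlook when reading the code.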

Finally, the benefit of the option seems dubious. While providing a way to discover articles that need attention while reading is a nice idea, going simply by the size of the page is probably not the best mechanism. If there is sufficient demand for this feature, other options should be explored, perhaps based on categories or other metadata.

Usage:
An ad-hoc analysis among users of en.wikipedia.org who have edited in the last 30 days shows that about 0.25% of active editors have the "stub threshold" option set (roughly 500 out of 200,000). The median edit count among these users is 1675, pointing to a niche feature used by a few power users.

Solution:
There seems to be no feasible remedy for the performance issue mentioned above. A complete redesign of the feature would be necessary. Since the feature does not seem to be used much, it seems best to simply remove it from core. If desirable, the underlying need could be addressed by asynchronously loading metadata about linked pages to surface pages that need attention.

Event Timeline

daniel renamed this task from Remove stub threashold preference to Remove "stub threshold" preference. Jun 14 2021, 11:05 AM
daniel updated the task description.
BPirkle triaged this task as Medium priority. Jun 22 2021, 9:03 PM
BPirkle moved this task from Inbox to Feature Requests to Review on the Platform Engineering board.

Would be a good idea to get a count of the thresholds employed, separated into active and inactive user buckets.
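
One way such a count could be produced is sketched below in Python. Everything here is an assumption for illustration: the function name `count_thresholds`, the input shape (pairs of threshold value and days since the user's last edit), and the sample rows, which are made up and not real usage data. The 30-day cut-off mirrors the definition of "active" used in the task description.

```python
from collections import Counter


def count_thresholds(rows, active_days=30):
    """Count configured thresholds, split into active and inactive users.

    `rows` is an iterable of (threshold, days_since_last_edit) pairs.
    A threshold of 0 is treated as "disabled" and skipped.
    """
    counts = {"active": Counter(), "inactive": Counter()}
    for threshold, days_since_last_edit in rows:
        if threshold == 0:
            continue
        bucket = "active" if days_since_last_edit <= active_days else "inactive"
        counts[bucket][threshold] += 1
    return counts


# Made-up sample rows, for illustration only.
sample = [(500, 3), (500, 400), (1000, 12), (0, 1), (500, 29)]
result = count_thresholds(sample)
print(dict(result["active"]))    # {500: 2, 1000: 1}
print(dict(result["inactive"]))  # {500: 1}
```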

I think this one can be safely re-implemented as a user script / gadget (outside of the Rendering preferences) that collects all the links to articles on a page and makes an API request to determine whether each page's size is under or over the stub threshold. This is how a number of other things in a similar vein are implemented (for example, a script in Russian WP that shows links to all [FlaggedRevs] unreviewed pages in a different colour). If it puts a strain on MediaWiki, I definitely do not see it as useful enough to remain in the preferences.
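
The logic of such a script can be sketched as follows. A real gadget would be JavaScript running in the browser; this Python version only illustrates the steps and makes no network request. It assumes the Action API's `action=query` with `prop=info`, which reports each page's size in bytes as `length`, and the usual limit of 50 titles per request for regular clients. The helper names (`batch_titles`, `build_query`, `find_stubs`) and the sample response are hypothetical.

```python
def batch_titles(titles, batch_size=50):
    """Yield the titles in chunks small enough for one API request."""
    for start in range(0, len(titles), batch_size):
        yield titles[start:start + batch_size]


def build_query(titles):
    """Build Action API parameters asking for basic page info, including size."""
    return {
        "action": "query",
        "prop": "info",
        "titles": "|".join(titles),
        "format": "json",
        "formatversion": "2",
    }


def find_stubs(response, threshold):
    """Return titles of existing pages smaller than `threshold` bytes."""
    pages = response.get("query", {}).get("pages", [])
    return [
        page["title"]
        for page in pages
        if not page.get("missing") and page.get("length", 0) < threshold
    ]


# Hand-written sample in the shape of a formatversion=2 response.
sample_response = {
    "query": {
        "pages": [
            {"title": "A", "length": 320},
            {"title": "B", "length": 48213},
            {"title": "C", "missing": True},
        ]
    }
}
print(find_stubs(sample_response, 500))  # ['A']
```

Because the lookup happens after the page has been delivered, the cached HTML stays the same for every user, and only the link styling is applied on the client.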

Change 709245 had a related patch set uploaded (by Ppchelko; author: Ppchelko):

[mediawiki/core@master] WIP: Remove stub threshold feature

https://gerrit.wikimedia.org/r/709245

Change 709245 merged by jenkins-bot:

[mediawiki/core@master] Remove stub threshold feature

https://gerrit.wikimedia.org/r/709245

Strong oppose to this change. Please reactivate this option. Why do you change such things without consultation of the communities? Normally we need a vote for such changes.

JJMC89 added a subscriber: JJMC89.

Why do you change such things without consultation of the communities?

See the top of the task description. It was announced, and no objections were raised in the linked consultation.

Only a few people read the tech news. It was not properly announced in the projects. And of course no one could object to this when no one knew about it. You're taking the easy way out...

I'd think that local editors who read Tech News would bring it up with their communities if they thought it was relevant, and would let us know here (or on the relevant talk page) if there were concerns or objections. At least, that's what I would have done when I was a Wikipedia admin. How else could we communicate changes to all the projects in all the languages, with different communication channels and practices? Do you have an idea?

In reality, the flow of information does not work that well.

I understand that it is complicated to reach everyone. But maybe the devs could leave messages in the mainly used local notice boards of each project like the WMF does when they have important messages (maybe concentrated in a monthly report or so to prevent spamming). That would have more impact.

"I was not personally asked for my opinion" on software changes does not scale with millions of users. :) What you describe is exactly the purpose Tech News has served for many years now. If you care about changes which have a larger user impact and would like to join conversations earlier, then please follow Tech News. Thanks for your understanding.

@Chaddy To give a bit of background, as someone who works on trying to keep the communities informed, we get about equal amounts of "stop spamming us all the time" and "you never tell us anything". There are so many things happening in the Wikimedia universe (not just coming from the Wikimedia Foundation, but from other local or global initiatives) that it's incredibly time-consuming to keep track. So to solve this, we have, so to speak, tried to create specialised information workflows where we don't overwhelm the people who just want to edit an encyclopedia, but can inform everyone who wants to be kept up to date. This is, in comparison, a fairly minor change -- I do realise that's not true for everyone, that there are people who have depended on this, but it is not key to how the wikis function. I do wish Tech News was better integrated into the workflow of German Wikipedia (the English Wikipedia technical Village Pump, for example, has subscribed to it), but it's linked from Wikipedia:Kurier, people are aware of it (and use the content for technical updates on Wikipedia:Projektneuheiten), and if that's the level of updates that is desired, we can't force more on people. If there's a noticeboard or community page on German Wikipedia you feel should have the weekly technical updates, I'd be very happy if you added it there. We really don't want to take people by surprise, but we also don't want to make local pages unusable for communication in the local wiki community because they're filled with global announcements, nor do we want to annoy people who are editing in their spare time and want to prioritise other things than weekly technical updates. When something really big happens, we take extra measures to keep people informed, but there's always a cost in demanding editor time, in intruding on the local spaces (especially on smaller wikis) and so on.