EditWatchlist times out when watching 50k pages
Open, Needs Triage · Public · PRODUCTION ERROR

Description

Largish watchlists seem to cause time-outs. I'm currently watching about 50k photos on Commons, and I can't access Special:EditWatchlist any more. Is there a way that this could be fixed?

List of steps to reproduce (step by step, including full links if applicable):

What happens?:

  • Server timed out
  • The maximum request time of 60 seconds was exceeded.
  • [30a26a7d-dd24-4035-abb9-e04e2db7c2bc] 2022-05-28 18:40:00: Fatal exception of type "Wikimedia\RequestTimeout\RequestTimeoutException"

What should have happened instead?:

  • List should be displayed
  • At least give a link to the raw watchlist page?

Error
normalized_message
[{reqId}] {exception_url}   Wikimedia\RequestTimeout\RequestTimeoutException: The maximum execution time of {limit} seconds was exceeded
exception.trace
from /srv/mediawiki/php-1.39.0-wmf.13/vendor/wikimedia/request-timeout/src/Detail/ExcimerTimerWrapper.php(97)
#0 /srv/mediawiki/php-1.39.0-wmf.13/vendor/wikimedia/request-timeout/src/Detail/ExcimerTimerWrapper.php(72): Wikimedia\RequestTimeout\Detail\ExcimerTimerWrapper->onTimeout(integer)
#1 /srv/mediawiki/php-1.39.0-wmf.13/includes/libs/MapCacheLRU.php(114): Wikimedia\RequestTimeout\Detail\ExcimerTimerWrapper->Wikimedia\RequestTimeout\Detail\{closure}(integer)
#2 /srv/mediawiki/php-1.39.0-wmf.13/includes/cache/LinkCache.php(327): MapCacheLRU->set(string, array)
#3 /srv/mediawiki/php-1.39.0-wmf.13/includes/cache/LinkCache.php(526): LinkCache->addGoodLinkObjFromRow(TitleValue, stdClass, integer)
#4 /srv/mediawiki/php-1.39.0-wmf.13/includes/cache/LinkCache.php(418): LinkCache->getGoodLinkRow(integer, string, array, integer)
#5 /srv/mediawiki/php-1.39.0-wmf.13/includes/linker/LinkRenderer.php(425): LinkCache->addLinkObj(Title)
#6 /srv/mediawiki/php-1.39.0-wmf.13/includes/linker/LinkRenderer.php(236): MediaWiki\Linker\LinkRenderer->getLinkClasses(Title)
#7 /srv/mediawiki/php-1.39.0-wmf.13/includes/linker/LinkRenderer.php(158): MediaWiki\Linker\LinkRenderer->makeKnownLink(Title, NULL, array, array)
#8 /srv/mediawiki/php-1.39.0-wmf.13/includes/specials/SpecialEditWatchlist.php(716): MediaWiki\Linker\LinkRenderer->makeLink(Title)
#9 /srv/mediawiki/php-1.39.0-wmf.13/includes/specials/SpecialEditWatchlist.php(654): SpecialEditWatchlist->buildRemoveLine(Title, string)
#10 /srv/mediawiki/php-1.39.0-wmf.13/includes/specials/SpecialEditWatchlist.php(200): SpecialEditWatchlist->getNormalForm()
#11 /srv/mediawiki/php-1.39.0-wmf.13/includes/specials/SpecialEditWatchlist.php(164): SpecialEditWatchlist->executeViewEditWatchlist()
#12 /srv/mediawiki/php-1.39.0-wmf.13/includes/specialpage/SpecialPage.php(688): SpecialEditWatchlist->execute(integer)
#13 /srv/mediawiki/php-1.39.0-wmf.13/includes/specialpage/SpecialPageFactory.php(1415): SpecialPage->run(NULL)
#14 /srv/mediawiki/php-1.39.0-wmf.13/includes/MediaWiki.php(316): MediaWiki\SpecialPage\SpecialPageFactory->executePath(string, RequestContext)
#15 /srv/mediawiki/php-1.39.0-wmf.13/includes/MediaWiki.php(912): MediaWiki->performRequest()
#16 /srv/mediawiki/php-1.39.0-wmf.13/includes/MediaWiki.php(566): MediaWiki->main()
#17 /srv/mediawiki/php-1.39.0-wmf.13/index.php(50): MediaWiki->run()
#18 /srv/mediawiki/php-1.39.0-wmf.13/index.php(46): wfIndexMain()
#19 /srv/mediawiki/w/index.php(3): require(string)
#20 {main}

Event Timeline

Restricted Application added a subscriber: Aklapper.
Mike_Peel renamed this task from Watchlist to Watchlist times out when watching 50k pages. May 28 2022, 6:41 PM
Mike_Peel renamed this task from Watchlist times out when watching 50k pages to EditWatchlist times out when watching 50k pages.

As far as I know, this is not really a "bug". Server timing out is precisely the way to deal with expensive requests. @Mike_Peel Here's a handy link to your raw watchlist: c:Special:EditWatchlist/raw.

Aklapper changed the subtype of this task from "Bug Report" to "Production Error". May 28 2022, 9:03 PM
Aklapper added a project: Performance Issue.

This reminds me of T265347, T95888, T47380, T68212, T69123. And T41510.

I have had a similar problem on en.WP with about 19K items on my watchlist. Neither Edit nor Edit raw would work; both always timed out. Once that happens, there is no way to prune your watchlist apart from unwatching article by article, which is what I did probably thousands (literally) of times until, in a lucky moment, Edit raw finally opened and I zapped a lot of them. I understand why we need timeouts, as a server may genuinely be malfunctioning, but I think the timeouts should be relative to the size of the watchlist, or the watchlist should be retrieved in chunks, each of which can complete within the current timeout.

The raw watchlist loads fine for me, although I haven't tried saving it (it would be a lot of work to rebuild it manually if the saving went wrong!).

Tgr subscribed.

In theory having a paged UI for watchlists (split by namespace, then by title prefix) doesn't seem super hard. Not sure if it's more convenient than raw editing though.
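A minimal sketch of the paging idea, in Python rather than MediaWiki code, and only under assumptions: entries are modelled as (namespace id, title) pairs, and `next_page` and `iter_pages` are hypothetical names. Each page is found from the last entry already shown, so building one page costs the same no matter how large the whole watchlist is.

```python
import bisect
from typing import Iterator, List, Optional, Tuple

# A watchlist entry modelled as (namespace id, title); illustrative only.
Entry = Tuple[int, str]


def next_page(sorted_entries: List[Entry], after: Optional[Entry], limit: int) -> List[Entry]:
    """Return up to `limit` entries that sort strictly after `after`.

    `sorted_entries` must already be sorted by (namespace, title).
    """
    start = 0 if after is None else bisect.bisect_right(sorted_entries, after)
    return sorted_entries[start:start + limit]


def iter_pages(entries: List[Entry], limit: int) -> Iterator[List[Entry]]:
    """Yield successive pages, ordered by namespace and then by title."""
    ordered = sorted(entries)
    after: Optional[Entry] = None
    while True:
        page = next_page(ordered, after, limit)
        if not page:
            return
        yield page
        after = page[-1]  # the cursor is simply the last entry shown
```

Against a real database the same cursor would presumably become a condition on the namespace and title columns with a row limit, but that mapping is an assumption and not something verified against the watchlist schema here.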

Two big advantages of the normal EditWatchlist over raw are being able to see which items have been redirected or deleted, without having to check them manually.

> Two big advantages of the normal EditWatchlist over raw are being able to see which items have been redirected or deleted, without having to check them manually.

I guess this is because you see the red links (or not?)

I added 10k items to my watchlist on testwiki to play around with this a little, and I profiled the page load: https://performance.wikimedia.org/xhgui/run/view?id=62968ac1c0fce6c743114335

Looking through the profile, it looks like a bunch of time is spent in the MediaWiki LinkCache: https://performance.wikimedia.org/xhgui/run/symbol?id=62968ac1c0fce6c743114335&symbol=MediaWiki%5CPage%5CPageStore%3A%3AMediaWiki%5CPage%5C%7Bclosure%7D
Looking further up the tree, this ends up being in buildRemoveLine: https://github.com/wikimedia/mediawiki/blob/master/includes/specials/SpecialEditWatchlist.php#L654
So in essence, what makes this page expensive to render is that a link has to be rendered for every item, and each link needs a check to see whether the page exists.

> In theory having a paged UI for watchlists (split by namespace, then by title prefix) doesn't seem super hard. Not sure if it's more convenient than raw editing though.

This would indeed lead to slightly smaller lists of items to iterate over to generate these lines.

The only other option I can really see would be some form of batching of the lookups in the loop that builds the page:
https://github.com/wikimedia/mediawiki/blob/0bacedd339e2f335f46a114d045a3a168447b9ae/includes/specials/SpecialEditWatchlist.php#L651-L657

This would probably require some abstraction or alternative to https://github.com/wikimedia/mediawiki/blob/0bacedd339e2f335f46a114d045a3a168447b9ae/includes/page/PageStore.php#L139 that could be used (PageStore::getPageByName)
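To make the batching suggestion concrete, here is a small self-contained model in Python. It is not MediaWiki code: `FakePageTable`, `check_one_by_one` and `check_in_batches` are hypothetical names, and the batch size of 500 is an arbitrary assumption. It only shows how grouping existence checks reduces the number of round trips while returning the same answers.

```python
from typing import Dict, Iterable, List, Set


class FakePageTable:
    """Stand-in for a page table that counts how many queries it receives."""

    def __init__(self, existing: Iterable[str]) -> None:
        self._existing: Set[str] = set(existing)
        self.queries = 0

    def exists(self, title: str) -> bool:
        # One query per title: the pattern described in the per-line loop.
        self.queries += 1
        return title in self._existing

    def exists_many(self, titles: List[str]) -> Dict[str, bool]:
        # One query answers a whole batch of titles.
        self.queries += 1
        return {t: t in self._existing for t in titles}


def check_one_by_one(table: FakePageTable, titles: List[str]) -> Dict[str, bool]:
    """Look up every title with its own query."""
    return {t: table.exists(t) for t in titles}


def check_in_batches(table: FakePageTable, titles: List[str], batch_size: int = 500) -> Dict[str, bool]:
    """Look up titles in slices of `batch_size`, one query per slice."""
    result: Dict[str, bool] = {}
    for start in range(0, len(titles), batch_size):
        result.update(table.exists_many(titles[start:start + batch_size]))
    return result
```

With 1,200 titles, the one-by-one version issues 1,200 queries and the batched version issues 3, and both return the same mapping of title to existence.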

> This would probably require some abstraction or alternative to https://github.com/wikimedia/mediawiki/blob/0bacedd339e2f335f46a114d045a3a168447b9ae/includes/page/PageStore.php#L139 that could be used (PageStore::getPageByName)

Wouldn't LinkBatch cover what we need (red links and redirects)?

>> This would probably require some abstraction or alternative to https://github.com/wikimedia/mediawiki/blob/0bacedd339e2f335f46a114d045a3a168447b9ae/includes/page/PageStore.php#L139 that could be used (PageStore::getPageByName)

> Wouldn't LinkBatch cover what we need (red links and redirects)?

For normal pages the special page already uses a LinkBatch, but for pages in the File namespace each file is additionally queried explicitly, as needed by Title::isAlwaysKnown, to make foreign files blue/red.
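A rough way to see why this hits a watchlist of photos so hard, as a Python cost model rather than anything measured: assume every entry is covered by a batched lookup, and every File-namespace entry costs one extra lookup on top. `count_lookups` and the batch size of 500 are hypothetical; the namespace id 6 for File is MediaWiki's standard value.

```python
from typing import List, Tuple

NS_FILE = 6  # MediaWiki's namespace id for File: pages

Entry = Tuple[int, str]  # (namespace id, title); illustrative only


def count_lookups(entries: List[Entry], batch_size: int = 500) -> int:
    """Count lookups under the assumed model described above.

    All entries share batched lookups; each File entry adds one more.
    """
    batched = -(-len(entries) // batch_size)  # ceiling division
    per_file = sum(1 for ns, _ in entries if ns == NS_FILE)
    return batched + per_file
```

Under this model a watchlist of 50,000 articles needs 100 lookups, while 50,000 files need 50,100, so the batched part stops mattering once most entries are files.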