Page MenuHomePhabricator

Submitting Special:EditWatchlist/clear fails with HTTP 500 Error (timeout, OOM)
Closed, ResolvedPublic

Description

Hi,

I have nearly a million pages on my commons watchlist. Inbetween it often runs in "timeout".

And even if I try to clear my watchlist through https://commons.wikimedia.org/wiki/Special:EditWatchlist/clear
I get the error:

client side
Request from XX.XX.XX.XX via cp3062 frontend, Varnish XID 855093126
Upstream caches: cp3062 int
Error: 500, Internal Server Error at Wed, 22 Jan 2020 19:37:16 GMT

Can a Server-Admin please clear my watchlist directly in the Database please.

Thanks in advance.

JuTa


server side
2020-01-22T19:37:16

Allowed memory size of 692060160 bytes exhausted (tried to allocate 4096 bytes)

#0 /srv/mediawiki/php-1.35.0-wmf.15/includes/watcheditem/WatchedItemStore.php(639): unknown()
#1 /srv/mediawiki/php-1.35.0-wmf.15/includes/specials/SpecialEditWatchlist.php(345): WatchedItemStore->getWatchedItemsForUser()
#2 /srv/mediawiki/php-1.35.0-wmf.15/includes/specials/SpecialEditWatchlist.php(259): SpecialEditWatchlist->getWatchlist()
#3 /srv/mediawiki/php-1.35.0-wmf.15/includes/htmlform/HTMLForm.php(694): SpecialEditWatchlist->submitClear()
#4 /srv/mediawiki/php-1.35.0-wmf.15/includes/htmlform/HTMLForm.php(586): HTMLForm->trySubmit()
#5 /srv/mediawiki/php-1.35.0-wmf.15/includes/htmlform/HTMLForm.php(601): HTMLForm->tryAuthorizedSubmit()
#6 /srv/mediawiki/php-1.35.0-wmf.15/includes/specials/SpecialEditWatchlist.php(122): HTMLForm->show()
#7 /srv/mediawiki/php-1.35.0-wmf.15/includes/specialpage/SpecialPage.php(575): SpecialEditWatchlist->execute()
#8 /srv/mediawiki/php-1.35.0-wmf.15/includes/specialpage/SpecialPageFactory.php(611): SpecialPage->run()
#9 /srv/mediawiki/php-1.35.0-wmf.15/includes/MediaWiki.php(298): MediaWiki\Special\SpecialPageFactory->executePath()
#10 /srv/mediawiki/php-1.35.0-wmf.15/includes/MediaWiki.php(967): MediaWiki->performRequest()
#11 /srv/mediawiki/php-1.35.0-wmf.15/includes/MediaWiki.php(530): MediaWiki->main()
#12 /srv/mediawiki/php-1.35.0-wmf.15/index.php(46): MediaWiki->run()
#13 /srv/mediawiki/w/index.php(3): require()

Open issues that are related (but different):

The same issue, but previous iteration of it:

Event Timeline

Urbanecm subscribed.

There's not really a maintenance script we can use on this one, but tagging with Wikimedia-maintenance-script-run anyway for lack of better tag.

In theory, we can delete the rows using runBatchedQuery, but maybe creating a script purgeWatchlist.php that would do it is a better solution?

There's not really a maintenance script we can use on this one, but tagging with Wikimedia-maintenance-script-run anyway for lack of better tag.

In theory, we can delete the rows using runBatchedQuery, but maybe creating a script purgeWatchlist.php that would do it is a better solution?

This shouldn't be necessary.

The code should fire off a job queue job to do this, the timeout suggests it probably isn't (so this may be a MW core bug rather than specifically a site issue)

	private function clearUserWatchedItems( $current, $messageFor ) {
		$watchedItemStore = MediaWikiServices::getInstance()->getWatchedItemStore();
		if ( $watchedItemStore->clearUserWatchedItems( $this->getUser() ) ) {
			$this->successMessage = $this->msg( 'watchlistedit-' . $messageFor . '-done' )->parse();
			$this->successMessage .= ' ' . $this->msg( 'watchlistedit-' . $messageFor . '-removed' )
					->numParams( count( $current ) )->parse();
			$this->getUser()->invalidateCache();
		} else {
			$watchedItemStore->clearUserWatchedItemsUsingJobQueue( $this->getUser() );
			$this->successMessage = $this->msg( 'watchlistedit-clear-jobqueue' )->parse();
		}
	}

Which then in the first instance calls...

	/**
	 * Deletes ALL watched items for the given user when under
	 * $updateRowsPerQuery entries exist.
	 *
	 * @since 1.30
	 *
	 * @param UserIdentity $user
	 *
	 * @return bool true on success, false when too many items are watched
	 */
	public function clearUserWatchedItems( UserIdentity $user ) {
		if ( $this->countWatchedItems( $user ) > $this->updateRowsPerQuery ) {
			return false;
		}

		$dbw = $this->loadBalancer->getConnectionRef( DB_MASTER );
		$dbw->delete(
			'watchlist',
			[ 'wl_user' => $user->getId() ],
			__METHOD__
		);
		$this->uncacheAllItemsForUser( $user );

		return true;
	}

It should be using the job queue for anything over 100 rows...

reedy@deploy1001:~$ mwscript eval.php commonswiki
> var_dump( $wgUpdateRowsPerQuery );
int(100)

There's not really a maintenance script we can use on this one, but tagging with Wikimedia-maintenance-script-run anyway for lack of better tag.

In theory, we can delete the rows using runBatchedQuery, but maybe creating a script purgeWatchlist.php that would do it is a better solution?

I will try to make it. :)

Change 567469 had a related patch set uploaded (by Zoranzoki21; owner: Zoranzoki21):
[mediawiki/core@master] Create maintenance script for purge of user's watchlist

https://gerrit.wikimedia.org/r/567469

I can see what needs to be fixed for the special page as I'll take a look at this.

Thanks @Addshore. Guessing getWatchlist() is the bottleneck here?

I imagine it's trying to count 1M+ rows from the watchlist table

	/**
	 * @since 1.31
	 * @param UserIdentity $user
	 * @return int
	 */
	public function countWatchedItems( UserIdentity $user ) {
		$dbr = $this->getConnectionRef( DB_REPLICA );
		$return = (int)$dbr->selectField(
			'watchlist',
			'COUNT(*)',
			[
				'wl_user' => $user->getId()
			],
			__METHOD__
		);

		return $return;
	}

Thanks @Addshore. Guessing getWatchlist() is the bottleneck here?

Yup, I'm pretty sure it is getWatchlist

Yup, the count returns in well under a second.

addshore@deploy1001:~$ mwscript shell.php --wiki=commonswiki
Psy Shell v0.9.12 (PHP 7.2.26-1+0~20191218.33+debian9~1.gbpb5a340+wmf1 — cli) by Justin Hileman
>>> $m = \MediaWiki\MediaWikiServices::getInstance();
=> MediaWiki\MediaWikiServices {#208}
>>> $w = $m->getWatchedItemStore();
=> WatchedItemStore {#2880}
>>> $u = User::newFromName("JuTa")
=> User {#2881
     +mId: null,
     +mName: "JuTa",
     +mRealName: null,
     +mEmail: null,
     +mTouched: null,
     +mEmailAuthenticated: null,
     +mOptionsLoaded: false,
     +mFrom: "name",
     +mBlockedby: -1,
     +mHideName: null,
     +mOptions: null,
     +mBlock: null,
   }
>>> $w->countWatchedItems($u);
=> 1862518

I'll go and move some logic around in the special page so that getWatchlist is not needed.

Change 567831 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/core@master] When clearing don't load the watchlist if we must clear through a job

https://gerrit.wikimedia.org/r/567831

Should be able to abandon the maint script as it shouldn't be needed once the above fix is merged.
Also @JuTa if your okay with waiting until next week to try this out in the wild I won't bother clearing your watchlist now :)

I'm happy to wait. I created "my" workarround by reducing the count of "lines" in my watchlist preferences. Currently, which lesser lines I dont get any timeouts anymore.

Change 567469 abandoned by Zoranzoki21:
Create maintenance script for purge of user's watchlist

Reason:
Per Addshore's comment in task.

https://gerrit.wikimedia.org/r/567469

Krinkle renamed this task from Please clear my watchlist on commons to Submitting Special:EditWatchlist/clear times out and fiails with HTTP 500 Error.Jan 29 2020, 5:41 PM
Krinkle renamed this task from Submitting Special:EditWatchlist/clear times out and fiails with HTTP 500 Error to Submitting Special:EditWatchlist/clear fails with HTTP 500 Error (timeout, OOM).
Krinkle updated the task description. (Show Details)
Krinkle triaged this task as Medium priority.Jan 29 2020, 5:44 PM
Krinkle updated the task description. (Show Details)
Krinkle moved this task from Untriaged to Jan2020/1.35-wmf.14 on the Wikimedia-production-error board.
Krinkle subscribed.

Found the error in Logstash with more details:

2020-01-22T19:37:16

Allowed memory size of 692060160 bytes exhausted (tried to allocate 4096 bytes)

#0 /srv/mediawiki/php-1.35.0-wmf.15/includes/watcheditem/WatchedItemStore.php(639): unknown()
#1 /srv/mediawiki/php-1.35.0-wmf.15/includes/specials/SpecialEditWatchlist.php(345): WatchedItemStore->getWatchedItemsForUser()
#2 /srv/mediawiki/php-1.35.0-wmf.15/includes/specials/SpecialEditWatchlist.php(259): SpecialEditWatchlist->getWatchlist()
#3 /srv/mediawiki/php-1.35.0-wmf.15/includes/htmlform/HTMLForm.php(694): SpecialEditWatchlist->submitClear()
#4 /srv/mediawiki/php-1.35.0-wmf.15/includes/htmlform/HTMLForm.php(586): HTMLForm->trySubmit()
#5 /srv/mediawiki/php-1.35.0-wmf.15/includes/htmlform/HTMLForm.php(601): HTMLForm->tryAuthorizedSubmit()
#6 /srv/mediawiki/php-1.35.0-wmf.15/includes/specials/SpecialEditWatchlist.php(122): HTMLForm->show()
#7 /srv/mediawiki/php-1.35.0-wmf.15/includes/specialpage/SpecialPage.php(575): SpecialEditWatchlist->execute()
#8 /srv/mediawiki/php-1.35.0-wmf.15/includes/specialpage/SpecialPageFactory.php(611): SpecialPage->run()
#9 /srv/mediawiki/php-1.35.0-wmf.15/includes/MediaWiki.php(298): MediaWiki\Special\SpecialPageFactory->executePath()
#10 /srv/mediawiki/php-1.35.0-wmf.15/includes/MediaWiki.php(967): MediaWiki->performRequest()
#11 /srv/mediawiki/php-1.35.0-wmf.15/includes/MediaWiki.php(530): MediaWiki->main()
#12 /srv/mediawiki/php-1.35.0-wmf.15/index.php(46): MediaWiki->run()
#13 /srv/mediawiki/w/index.php(3): require()

So while it was slow, the immediate issue that caused it to fail was memory, not ellapsed time.

Change 567831 merged by jenkins-bot:
[mediawiki/core@master] When clearing don't load the watchlist if we must clear through a job

https://gerrit.wikimedia.org/r/567831