Page MenuHomePhabricator

Remove use of legacy page.page_restrictions field
Closed, ResolvedPublic

Description

In 2007 (MW 1.10), the page_restrictions table was introduced. Rows in this table take precedence over the old page.page_restrictions field (r19095).

Later that year, a maintenance script was introduced to migrate old values (28df3d07e6, r21455).

Today, 12 years later, we still require an extra query to page.page_restrictions when calling Title::getRestrictions(), in addition to the page_restrictions table.

Let's finished this forgotten migration and kill off that field.

Plan
  • Verify that the maintenance script still works. And update it to modern standards as needed in order to make it suitable for running on large WMF wikis.
  • Ensure the updater is automatically run by the database updater.
  • Run the maintenance script on all WMF wikis.
  • Remove all code for queries to the old field. This, in addition to an improved update script that nulls out the column, should fix T30751.
  • Add schema update that removes the field. (core: T35334, wmf: T60674)

I'm not suggesting we drop the column in prod as it seems low priority. The page tables are very large, and the maintenance script will already null-out this field. However, DBAs can decide what they want to do here.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Krinkle moved this task from Limbo to Perf recommendation on the Performance-Team (Radar) board.
Krinkle added a subscriber: aaron.

Marking as perf recommendation for the additional queries required currently.

To reduce the overhead of this in the interim, @aaron added page.page_restrictions to LinkCache (see 93b24207c1cb / https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/496391/).

Marostegui subscribed.

If the column isn't in used, I would prefer if we can remove it from tables.sql, merge it, and then get a ticket to get the schema change done in production (it doesn't have to be done right away, but at least created so we have it as a backlog).
Removing DBA (but I will remain subscribed) as there is not actionable for us here yet (until the schema change in prod needs to actually be done).

Thank you!

Krinkle triaged this task as Medium priority.Jul 18 2019, 10:13 PM

Change 537353 had a related patch set uploaded (by TK-999; owner: TK-999):
[mediawiki/core@master] Refactor the updateRestrictions.php maintenance script

https://gerrit.wikimedia.org/r/537353

Krinkle added a subscriber: WDoranWMF.

@WDoranWMF This was initially set for later I believe, but given Wikia has started work on it for us, perhaps CR could be provided earlier. Just leaving this here for now :)

@WDoranWMF This was initially set for later I believe, but given Wikia has started work on it for us, perhaps CR could be provided earlier. Just leaving this here for now :)

Just to clarify, I contributed this patch in a volunteer capacity :) If in the future I submit a patch as a direct outcome of my work, I will indicate it as appropriate on the relevant ticket and the patch itself.

Change 537353 merged by jenkins-bot:
[mediawiki/core@master] Refactor the updateRestrictions.php maintenance script

https://gerrit.wikimedia.org/r/537353

Krinkle updated the task description. (Show Details)

Ticked off the "maintenance script" and "run by installer" boxes. The above patch fixes the maintenance script. And the installer seems to already be running it as well. E.g. MysqlUpdater::doRestrictionsUpdate since MW 1.10.

Next step unlocked: Run the maintenance script on all WMF wikis.

Change 541601 had a related patch set uploaded (by TK-999; owner: TK-999):
[mediawiki/core@master] [DNM] Remove usages of legacy page.page_restrictions field

https://gerrit.wikimedia.org/r/541601

Change 581141 had a related patch set uploaded (by TK-999; owner: TK-999):
[mediawiki/core@master] Add schema change to make page.page_restrictions column nullable

https://gerrit.wikimedia.org/r/581141

Change 581141 merged by jenkins-bot:
[mediawiki/core@master] Add schema change to make page.page_restrictions column nullable

https://gerrit.wikimedia.org/r/581141

daniel subscribed.

Moving to "ready", since the next step seems to be "Run the maintenance script on all WMF wikis."

@Krinkle I don't see a second query being executed by Title::getRestrictions(). Has that been removed since the ticket was created?

@holger.knust Looking at the code now, I would expect the query happen via loadRestrictionsloadRestrictionsFromRowsLinkCache:

Title.php
	public function loadRestrictions( $oldFashionedRestrictions =,   ) {
		// …
		$loadRestrictionsFromDb = function ( IDatabase $dbr ) use ( $fname, $id ) {
			return iterator_to_array(
				$dbr->select(
					'page_restrictions',
					[ 'pr_type', 'pr_expiry', 'pr_level', 'pr_cascade' ],
					[ 'pr_page' => $id ],
					$fname
				)
			);
		};
		// …
		$rows = $loadRestrictionsFromDb( $dbr );
		$this->loadRestrictionsFromRows( $rows, $oldFashionedRestrictions );
	}

	public function loadRestrictionsFromRows( $rows, $oldFashionedRestrictions = null ) {
		// …
		if ( $this->mOldRestrictions === false ) { # always
			$linkCache = MediaWikiServices::getInstance()->getLinkCache();
			$linkCache->addLinkObj( $this ); # <!-- HERE
			$this->mOldRestrictions = $linkCache->getGoodLinkFieldObj( $this, 'restrictions' );
		}
		// …
	}

The LinkCache class then queries for page.page_protected as-needed. This code changed after this task was filed, in fact on the same day in 2019 (change 496391). Before that change, it was a direct sql query.

This change also made it so that the legacy page.page_restrictions field is in the default SELECT fields for the title cache, which means probably in most cases it no longer requires its own dedicated query, instead we load it upfront for all page table queries, which I guess was made sense as a trade-off since it's usually an empty string.

Aklapper added a subscriber: holger.knust.

Resetting deactivated assignee account.

Ladsgroup added a project: DBA.
Ladsgroup subscribed.

As an insult to the injury, this field actually has different type from production and core. I'm going to do this to be done with it...

Mentioned in SAL (#wikimedia-operations) [2022-05-09T06:23:43Z] <Amir1> start of updateRestrictions.php on s5 (T218446)

Ladsgroup updated the task description. (Show Details)

Change 790257 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/extensions/Wikibase@master] tests: Remove references to legacy page_restrictions field in page table

https://gerrit.wikimedia.org/r/790257

Change 790258 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] Remove references to legacy page_restrictions field in page table

https://gerrit.wikimedia.org/r/790258

Change 790259 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/extensions/Babel@master] tests: Remove references to legacy page_restrictions field in page table

https://gerrit.wikimedia.org/r/790259

Change 790257 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] tests: Remove references to legacy page_restrictions field in page table

https://gerrit.wikimedia.org/r/790257

Change 790258 merged by jenkins-bot:

[mediawiki/core@master] Remove references to legacy page_restrictions field in page table

https://gerrit.wikimedia.org/r/790258

Change 790259 merged by jenkins-bot:

[mediawiki/extensions/Babel@master] tests: Remove references to legacy page_restrictions field in page table

https://gerrit.wikimedia.org/r/790259

I will let it deploy next week and then I move on to create the schema change patch.

Ladsgroup moved this task from In progress to Done on the DBA board.