Page MenuHomePhabricator

Remove all references to the rev_text_id and ar_text_id fields
Closed, ResolvedPublic5 Estimated Story Points

Description

Any code that accesses rev_text_id and ar_text_id fields is unable to make use of MCR. In order to enable for MCR support, all references to these fields have to be removed from the code. They should be replaced with RevisionStore::getQueryInfo(), RevisionStore::newRevisionFromRow(), and RevisionRecord::getContent().

As a corollary, all references to text.old_id should be removed as well. If rev_text_id and ar_text_id are no longer there, there is no need to reference text.old_id (except in SqlBlobStore). A search for uses of the _text_id suffix in may be helpful to find code that needs fixing.

DOD: This task should be considered done when any code referencing rev_text_id or ar_text_id has either been removed or is gated on the MCR schema migration stage, in core and all extensions deployed by Wikimedia, as well as extensions bundled with MediaWiki releases. Other extensions and external tools may be left to be updated later.

Extensions to fix:

Related Objects

StatusSubtypeAssignedTask
Declineddchen
OpenNone
OpenNone
DuplicateNone
OpenNone
OpenNone
OpenNone
OpenNone
DuplicateNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
Resolvedppelberg
ResolvedKrinkle
OpenNone
OpenNone
OpenNone
OpenNone
StalledNone
OpenNone
OpenNone
StalledNone
OpenNone
Resolveddaniel
ResolvedCCicalese_WMF
Resolveddaniel
ResolvedNone
ResolvedNone
ResolvedCCicalese_WMF
Resolveddaniel
ResolvedPchelolo
Resolveddaniel
ResolvedBPirkle
ResolvedPchelolo
ResolvedClarakosi
Resolveddaniel
ResolvedPchelolo
ResolvedPchelolo
ResolvedPchelolo
ResolvedPchelolo
ResolvedPchelolo
Resolveddaniel
Resolveddaniel
Resolvedholger.knust
ResolvedPchelolo
OpenNone
Open Marostegui
ResolvedBstorm
Opendaniel

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I'm also seeing three few ungated ar_text_id references in core:

  • ApiQueryAllDeletedRevisions::run()
  • ApiQueryDeletedRevisions::run()
  • ApiQueryDeletedrevs::execute()

These all use ar_text_id to load content by joining the text table. I'll refactor that to use the newer approach.

The ar_text_id field shows up a number of other times in core code, but after looking at them individually, I think the rest are intentional, and the above three are the only problematic references remaining.

Unless I'm missing something, @Anomie already did the bulk of the work for the first two (ApiQueryAllDeletedRevisions and ApiQueryDeletedRevisions) in commit 07842be3 as part of T200568, particularly by adding ApiQueryRevisionsBase::extractSlotInfo().

ApiQueryDeletedrevs would require larger changes because, as Anomie noted in the commit message for 07842be3 , "ApiQueryDeletedrevs wasn't touched,
since it has been deprecated since 1.25 anyway." May I remove it? If not, may I no-op it?

I'm posting a change for just the first two bullet points. If I can remove or no-op ApiQueryDeletedrevs, I'll do that as a separate change.

Change 506325 had a related patch set uploaded (by BPirkle; owner: BPirkle):
[mediawiki/core@master] Remove referencs to db field ar_text_id

https://gerrit.wikimedia.org/r/506325

Anomie added a comment.EditedApr 25 2019, 3:34 PM

I don't think ApiQueryDeletedrevs will need larger changes, actually. Like the other accesses to old_id, the JOIN with text there was just to prefetch fields for a call to Revision::getRevisionText(). Removing the JOIN will just make the later getRevisionText() call have to do an extra DB query.

The call to Revision::getRevisionText() without having done the JOIN will result in a deprecation warning once we switch MCR to write-new (instead of write-both), though. But I don't think that needs to block this task. For that matter, I should run a query on the Analytics data to see whether we can just remove ApiQueryDeletedrevs entirely yet.

For that matter, I should run a query on the Analytics data to see whether we can just remove ApiQueryDeletedrevs entirely yet.

T221869: Remove deprecated ApiQueryDeletedRevs

Possibly still worth removing the ar_text_id uses to close out this task though.

Change 506573 had a related patch set uploaded (by BPirkle; owner: BPirkle):
[mediawiki/extensions/DeletePagesForGood@master] Remove references to db fields rev_text_id and ar_text_id

https://gerrit.wikimedia.org/r/506573

Change 506325 merged by jenkins-bot:
[mediawiki/core@master] Remove references to db field ar_text_id

https://gerrit.wikimedia.org/r/506325

Change 522228 had a related patch set uploaded (by BPirkle; owner: BPirkle):
[mediawiki/extensions/Duplicator@master] Only use the revision.rev_text_id if it exists.

https://gerrit.wikimedia.org/r/522228

I've uploaded patches for all issues identified in this task. In an effort to keep this task focused on ar_text_id and rev_text_id, I created several related tasks for things that deserve attention and which are related to MCR, but not directly to rev_text_id or ar_text_id:

T220884 (additional DeletePagesForGood extension updates)
T227836 (maintenance script related to DeletePagesForGood, but possibly also generically useful)
T227840 (additional Duplicator extension updates)

While it may be slightly disingenuous to push the additional work to other tasks and declare this task done, I think it does make sense to move the extension work that is not part of the MCR Schema migration critical path to more focused tasks. Focused tasks are good practice, and extension-specific tasks are more likely to be noticed by people interested in those specific extensions.

Change 524560 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Revision::getRevisionText(): support old schemas

https://gerrit.wikimedia.org/r/524560

It seems like our search for usages in extensions failed to take into account usages in joins. There is quite a bit of code that does this. Critically, the Translate extension heavily depends on rev_text_id in joins.

Here's the code search:
https://codesearch.wmflabs.org/extensions/?q=rev_text_id%7Car_text_id&i=nope&files=&repos=

Change 524593 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/extensions/Translate@master] Remove direct access to text table

https://gerrit.wikimedia.org/r/524593

Addshore removed a subscriber: Addshore.Jul 20 2019, 3:07 AM
daniel updated the task description. (Show Details)Jul 22 2019, 2:52 PM
daniel updated the task description. (Show Details)Jul 22 2019, 2:56 PM
daniel updated the task description. (Show Details)
daniel updated the task description. (Show Details)
daniel updated the task description. (Show Details)Jul 22 2019, 3:19 PM
Fjalapeno removed BPirkle as the assignee of this task.Jul 23 2019, 6:23 PM
Fjalapeno added a subscriber: BPirkle.
WDoranWMF raised the priority of this task from Medium to High.Jul 23 2019, 6:24 PM
WDoranWMF removed a project: Core Platform Team.
WDoranWMF lowered the priority of this task from High to Medium.Jul 24 2019, 9:46 PM
WDoranWMF moved this task from MCR to mop on the Core Platform Team board.Jul 26 2019, 6:38 PM
daniel updated the task description. (Show Details)Sep 19 2019, 6:28 PM
daniel updated the task description. (Show Details)Sep 19 2019, 9:06 PM

Change 538113 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/WikimediaMaintenance@master] Remove obsolete scripts that use rev_text_id.

https://gerrit.wikimedia.org/r/538113

daniel removed holger.knust as the assignee of this task.Sep 20 2019, 12:12 PM
daniel added a subscriber: holger.knust.

Change 538599 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/extensions/AbuseFilter@master] Use a BlobStore for stroing var dumps.

https://gerrit.wikimedia.org/r/538599

Change 538599 abandoned by Daniel Kinzler:
Use a BlobStore for storing var dumps.

Reason:
see I22cf698c5be77506727cbd227c67e037a5d89b5c

https://gerrit.wikimedia.org/r/538599

Change 442358 abandoned by Daniel Kinzler:
MCR WIP Remove references to rev_text_id and ar_text_id

Reason:
Obsolete,see I5ea972bb07ca1cfb3a2ad8ef120aef77e460745c

https://gerrit.wikimedia.org/r/442358

Change 524560 abandoned by Daniel Kinzler:
Revision::getRevisionText(): support old schemas

Reason:
per Anomie's comment

https://gerrit.wikimedia.org/r/524560

daniel closed this task as Resolved.Oct 28 2019, 8:43 PM
daniel claimed this task.

All subtasks are resolved. I just went over the remaining open patches for this task and abandoned a few obsolete ones. There are only two patches remaining open, one for the Duplicator extension and one for the DeletePagesForGood extension. Neither of these is supported by WMF (deployed or bundled).

So, it seems we are done here. Closing.

Change 538599 restored by Daimona Eaytoy:
Use a BlobStore for storing var dumps.

Reason:
This can be rebased on top of I4444cada720ab62d187f2dd0c4760697e465f2ff to remove any concern about back-compat stuff.

@daniel would you like to do that yourself, or should I revamp this patch and you/CPT will review?

https://gerrit.wikimedia.org/r/538599

Change 506573 abandoned by BPirkle:
Remove references to db fields rev_text_id and ar_text_id

Reason:
Extremely outdated

https://gerrit.wikimedia.org/r/506573