Page MenuHomePhabricator

Remove all references to the rev_text_id and ar_text_id fields
Open, NormalPublic5 Story Points

Description

Any code that accesses rev_text_id and ar_text_id fields is unable to make use of MCR. In order to enable for MCR support, all references to these fields have to be removed from the code. They should be replaced with RevisionStore::getQueryInfo(), RevisionStore::newRevisionFromRow(), and RevisionRecord::getContent().

As a corollary, all references to text.old_id should be removed as well. If rev_text_id and ar_text_id are no longer there, there is no need to reference text.old_id (except in SqlBlobStore). A search for uses of the _text_id suffix in may be helpful to find code that needs fixing.

DOD: This task should be considered done when any code referencing rev_text_id or ar_text_id has either been removed or is gated on the MCR schema migration stage, in core and all extensions deployed by Wikimedia, as well as extensions bundled with MediaWiki releases. Other extensions and external tools may be left to be updated later.

Extensions to fix:

  • Translate (T228675)
  • SpamBlacklist
  • EntitySchema (test only)
  • Flow (maintenance script)
  • ReplaceText (bundeled)
  • WikimediaMaintenance (used internally by wmf)

Related Objects

StatusAssignedTask
Declineddchen
OpenNone
OpenNone
DuplicateNone
OpenNone
ResolvedAbit
OpenNone
OpenNone
OpenNone
OpenNone
DuplicateNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
Resolvedppelberg
ResolvedKrinkle
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
ResolvedPchelolo
Resolveddaniel
ResolvedBPirkle
ResolvedPchelolo
ResolvedClarakosi
Resolveddaniel
ResolvedPchelolo
ResolvedPchelolo
ResolvedPchelolo
ResolvedPchelolo
OpenPchelolo
ResolvedPchelolo
Resolveddaniel
Resolveddaniel
Resolvedholger.knust
ResolvedPchelolo

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Per discussion in this task, I created new task T220450 for the additional Postgres changes (searchindex table).

Change 499013 merged by jenkins-bot:
[mediawiki/core@master] Remove references to field rev_text_id

https://gerrit.wikimedia.org/r/499013

WDoranWMF set the point value for this task to 1.Apr 15 2019, 3:15 PM

I still see a couple of usages of rev_text_id or ar_text_id in extensions that will need to be fixed[1]:

  • ActionDeletePagePermanently.php in DeletePagesForGood
  • Duplicator.page.php in Duplicator
  • bug-53687/fixOrphans.php in WikimediaMaintenance can probably be ignored

[1] https://codesearch.wmflabs.org/extensions/?q=-%3Erev_text_id%7C-%3Ear_text_id&i=nope&files=&repos=

One patch waiting for review, and a few things to do in extensions. Nearly done.

Change 500755 merged by jenkins-bot:
[mediawiki/core@master] Remove references to field rev_text_id

https://gerrit.wikimedia.org/r/500755

I'm also seeing three few ungated ar_text_id references in core:

  • ApiQueryAllDeletedRevisions::run()
  • ApiQueryDeletedRevisions::run()
  • ApiQueryDeletedrevs::execute()

These all use ar_text_id to load content by joining the text table. I'll refactor that to use the newer approach.

The ar_text_id field shows up a number of other times in core code, but after looking at them individually, I think the rest are intentional, and the above three are the only problematic references remaining.

Unless I'm missing something, @Anomie already did the bulk of the work for the first two (ApiQueryAllDeletedRevisions and ApiQueryDeletedRevisions) in commit 07842be3 as part of T200568, particularly by adding ApiQueryRevisionsBase::extractSlotInfo().

ApiQueryDeletedrevs would require larger changes because, as Anomie noted in the commit message for 07842be3 , "ApiQueryDeletedrevs wasn't touched,
since it has been deprecated since 1.25 anyway." May I remove it? If not, may I no-op it?

I'm posting a change for just the first two bullet points. If I can remove or no-op ApiQueryDeletedrevs, I'll do that as a separate change.

Change 506325 had a related patch set uploaded (by BPirkle; owner: BPirkle):
[mediawiki/core@master] Remove referencs to db field ar_text_id

https://gerrit.wikimedia.org/r/506325

Anomie added a comment.EditedApr 25 2019, 3:34 PM

I don't think ApiQueryDeletedrevs will need larger changes, actually. Like the other accesses to old_id, the JOIN with text there was just to prefetch fields for a call to Revision::getRevisionText(). Removing the JOIN will just make the later getRevisionText() call have to do an extra DB query.

The call to Revision::getRevisionText() without having done the JOIN will result in a deprecation warning once we switch MCR to write-new (instead of write-both), though. But I don't think that needs to block this task. For that matter, I should run a query on the Analytics data to see whether we can just remove ApiQueryDeletedrevs entirely yet.

For that matter, I should run a query on the Analytics data to see whether we can just remove ApiQueryDeletedrevs entirely yet.

T221869: Remove deprecated ApiQueryDeletedRevs

Possibly still worth removing the ar_text_id uses to close out this task though.

Change 506573 had a related patch set uploaded (by BPirkle; owner: BPirkle):
[mediawiki/extensions/DeletePagesForGood@master] Remove references to db fields rev_text_id and ar_text_id

https://gerrit.wikimedia.org/r/506573

Change 506325 merged by jenkins-bot:
[mediawiki/core@master] Remove references to db field ar_text_id

https://gerrit.wikimedia.org/r/506325

Change 522228 had a related patch set uploaded (by BPirkle; owner: BPirkle):
[mediawiki/extensions/Duplicator@master] Only use the revision.rev_text_id if it exists.

https://gerrit.wikimedia.org/r/522228

I've uploaded patches for all issues identified in this task. In an effort to keep this task focused on ar_text_id and rev_text_id, I created several related tasks for things that deserve attention and which are related to MCR, but not directly to rev_text_id or ar_text_id:

T220884 (additional DeletePagesForGood extension updates)
T227836 (maintenance script related to DeletePagesForGood, but possibly also generically useful)
T227840 (additional Duplicator extension updates)

While it may be slightly disingenuous to push the additional work to other tasks and declare this task done, I think it does make sense to move the extension work that is not part of the MCR Schema migration critical path to more focused tasks. Focused tasks are good practice, and extension-specific tasks are more likely to be noticed by people interested in those specific extensions.

Change 524560 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Revision::getRevisionText(): support old schemas

https://gerrit.wikimedia.org/r/524560

It seems like our search for usages in extensions failed to take into account usages in joins. There is quite a bit of code that does this. Critically, the Translate extension heavily depends on rev_text_id in joins.

Here's the code search:
https://codesearch.wmflabs.org/extensions/?q=rev_text_id%7Car_text_id&i=nope&files=&repos=

Change 524593 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/extensions/Translate@master] Remove direct access to text table

https://gerrit.wikimedia.org/r/524593

Addshore removed a subscriber: Addshore.Jul 20 2019, 3:07 AM
daniel updated the task description. (Show Details)Jul 22 2019, 2:52 PM
daniel updated the task description. (Show Details)Jul 22 2019, 2:56 PM
daniel updated the task description. (Show Details)
daniel updated the task description. (Show Details)
daniel updated the task description. (Show Details)Jul 22 2019, 3:19 PM
Fjalapeno removed BPirkle as the assignee of this task.Jul 23 2019, 6:23 PM
Fjalapeno added a subscriber: BPirkle.
WDoranWMF raised the priority of this task from Normal to High.Jul 23 2019, 6:24 PM
WDoranWMF removed a project: Core Platform Team.
WDoranWMF lowered the priority of this task from High to Normal.Jul 24 2019, 9:46 PM
WDoranWMF moved this task from MCR to mop on the Core Platform Team board.Jul 26 2019, 6:38 PM
WDoranWMF changed the point value for this task from 1 to 5.Tue, Sep 17, 4:28 PM
daniel updated the task description. (Show Details)Thu, Sep 19, 6:28 PM
daniel updated the task description. (Show Details)Thu, Sep 19, 9:06 PM

Change 538113 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/WikimediaMaintenance@master] Remove obsolete scripts that use rev_text_id.

https://gerrit.wikimedia.org/r/538113

daniel removed holger.knust as the assignee of this task.Fri, Sep 20, 12:12 PM
daniel added a subscriber: holger.knust.

Change 538599 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/extensions/AbuseFilter@master] Use a BlobStore for stroing var dumps.

https://gerrit.wikimedia.org/r/538599