
Bad rows in discussiontools_item_pages, discussiontools_item_revisions with itp_items_id, itr_itemid_id, itr_items_id equal to 0
Closed, Resolved (Public)

Description

On several production Wikimedia wikis, the discussiontools_item_pages and discussiontools_item_revisions tables contain rows where the itp_items_id, itr_itemid_id, or itr_items_id field is equal to 0. This should not be possible, as these fields are references to auto-increment fields that start at 1.

I ran into this today while examining the error in T315510#8937745, where I noticed that the existing row on dewiki causing the duplicate key error had itr_items_id=0:

select * from discussiontools_item_revisions where itr_itemid_id=3134897 and itr_revision_id=209812431;
itr_id | itr_itemid_id | itr_revision_id | itr_items_id | itr_parent_id | itr_transcludedfrom | itr_level | itr_headinglevel
3700243 | 3134897 | 209812431 | 0 | NULL | NULL | 0 | 2

On large wikis there are a few thousand affected rows.

select count(*) from discussiontools_item_pages where itp_items_id=0;
select count(*) from discussiontools_item_revisions where itr_itemid_id=0 or itr_items_id=0;
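
To try the two count queries without access to a production replica, here is a minimal sketch that runs them against an in-memory SQLite database. The schema is a cut-down assumption that contains only columns named in this task, the inserted rows are toy data, and `count_bad_rows` is a hypothetical helper name. The production tables live in MariaDB and have more columns.

```python
import sqlite3


def count_bad_rows(conn):
    """Count rows whose reference columns are 0 in both tables."""
    pages = conn.execute(
        "SELECT COUNT(*) FROM discussiontools_item_pages "
        "WHERE itp_items_id = 0"
    ).fetchone()[0]
    revisions = conn.execute(
        "SELECT COUNT(*) FROM discussiontools_item_revisions "
        "WHERE itr_itemid_id = 0 OR itr_items_id = 0"
    ).fetchone()[0]
    return {"item_pages": pages, "item_revisions": revisions}


# Assumed, simplified schema: only the columns mentioned in the task.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE discussiontools_item_pages (
    itp_items_id INTEGER NOT NULL
);
CREATE TABLE discussiontools_item_revisions (
    itr_id INTEGER PRIMARY KEY,
    itr_itemid_id INTEGER NOT NULL,
    itr_revision_id INTEGER NOT NULL,
    itr_items_id INTEGER NOT NULL
);
-- Toy rows: one bad row in item_pages, two bad rows in item_revisions.
INSERT INTO discussiontools_item_pages (itp_items_id) VALUES (1), (0), (2);
INSERT INTO discussiontools_item_revisions
    (itr_itemid_id, itr_revision_id, itr_items_id)
VALUES (3134897, 209812431, 0), (5, 100, 7), (0, 101, 8);
""")

bad_counts = count_bad_rows(conn)
print(bad_counts)
```

Returning the two counts separately keeps it visible which table is affected on a given wiki.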

Unfortunately some of the faulty entries are recent, so they're probably still being inserted. Once we discover what is causing this (probably something related to T323079 / T323080), we'll need to fix the existing rows. The maintenance script persistRevisionThreadItems.php can probably be adapted to this task (just delete everything for the affected revisions and re-process them).
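
The cleanup idea described above (delete everything for the affected revisions, then re-process them) could look roughly like the following sketch. It is not taken from persistRevisionThreadItems.php. It assumes a simplified schema with only the discussiontools_item_revisions columns named in this task, runs on toy data in SQLite, and `delete_affected_revisions` is a hypothetical helper. The re-processing step itself is left out, and the function only returns the revision IDs that would need it.

```python
import sqlite3


def delete_affected_revisions(conn):
    """Delete all rows of revisions that have at least one zero reference.

    Returns the affected revision IDs so a caller can re-process them.
    """
    rev_ids = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT itr_revision_id "
            "FROM discussiontools_item_revisions "
            "WHERE itr_itemid_id = 0 OR itr_items_id = 0 "
            "ORDER BY itr_revision_id"
        )
    ]
    # Remove every row of an affected revision, not only the bad ones,
    # so that re-processing starts from a clean state for that revision.
    with conn:
        conn.executemany(
            "DELETE FROM discussiontools_item_revisions "
            "WHERE itr_revision_id = ?",
            [(rev_id,) for rev_id in rev_ids],
        )
    return rev_ids


# Assumed, simplified schema and toy rows.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE discussiontools_item_revisions (
    itr_id INTEGER PRIMARY KEY,
    itr_itemid_id INTEGER NOT NULL,
    itr_revision_id INTEGER NOT NULL,
    itr_items_id INTEGER NOT NULL
);
INSERT INTO discussiontools_item_revisions
    (itr_itemid_id, itr_revision_id, itr_items_id)
VALUES
    (3134897, 209812431, 0),  -- bad row
    (3134898, 209812431, 12), -- good row in the same revision
    (5, 100, 7);              -- unaffected revision
""")

to_reprocess = delete_affected_revisions(db)
remaining = db.execute(
    "SELECT COUNT(*) FROM discussiontools_item_revisions"
).fetchone()[0]
print(to_reprocess, remaining)
```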

Event Timeline

Change 938390 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] ThreadItemStore: Look harder for invisible rows

https://gerrit.wikimedia.org/r/938390

Change 938391 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] Add PersistFixBadRowsWithZeroes maint script

https://gerrit.wikimedia.org/r/938391

Change 938390 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] ThreadItemStore: Look harder for invisible rows

https://gerrit.wikimedia.org/r/938390

Change 938391 abandoned by Bartosz Dziewoński:

[mediawiki/extensions/DiscussionTools@master] Add PersistFixBadRowsWithZeroes maint script

Reason:

Not needed

https://gerrit.wikimedia.org/r/938391

I abandoned the new maintenance script after @Ladsgroup suggested that we don't actually need to do anything about the existing bad rows. The bad rows indicate that the data was corrupted, but it has probably already been re-generated the next time the page was edited (or just parsed); if it hasn't, it can be re-generated by purging the page. Some data about old revisions of pages may be missing, but that's not a big deal, as we mostly care about the latest revision of every page (for example, our backfilling maintenance script also only processes latest revisions).

matmarex added a project: Skipped QA.

Instead of manual QA, we should check that bad rows are no longer being generated. I'm running a query now to check how many of them exist on each wiki. I'll re-run it in a few days, and hopefully the numbers will be the same.
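
The comparison between the two query runs can be sketched as below. The per-wiki numbers are made-up placeholders rather than real query results, and `wikis_with_new_bad_rows` is a hypothetical helper. It reports only wikis where the count went up, since that is the signal that bad rows are still being inserted.

```python
def wikis_with_new_bad_rows(before, after):
    """Return {wiki: increase} for wikis whose bad-row count grew."""
    increases = {}
    for wiki, count in after.items():
        previous = before.get(wiki, 0)
        if count > previous:
            increases[wiki] = count - previous
    return increases


# Placeholder numbers for illustration only.
first_run = {"dewiki": 10, "enwiki": 20, "frwiki": 5}
second_run = {"dewiki": 13, "enwiki": 20, "frwiki": 4}

regressions = wikis_with_new_bad_rows(first_run, second_run)
print(regressions)
```

An empty result would be the hoped-for outcome. A count can also go down if rows for a page are removed in the meantime, so an unchanged or lower count does not strictly prove that no new bad rows were written.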

Change 959370 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] ThreadItemStore: Fix rows with itr_items_id=0 corrupted by T339882

https://gerrit.wikimedia.org/r/959370

Change 959370 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] ThreadItemStore: Fix rows with itr_items_id=0 corrupted by T339882

https://gerrit.wikimedia.org/r/959370