Notifications that were part of a bundle that was read need a read timestamp to be set. Otherwise, they are all going to appear as unread.
Description
Event Timeline
Here's one way to do it:
- Change the current code to set the read timestamp on all elements of a bundle when marking as read (so we stop creating data that needs to be migrated)
- Set read_timestamp = bogus date on all bundle_base = 0 (because it's quick)
- (here the new dynamic bundling code can be merged and deployed)
- Loop through all read bundle bases, set their bundled notification's read timestamp to the base read timestamp (slow but it's the right data)
Change 292359 had a related patch set uploaded (by Sbisson):
Mark bundled notifications as read
Change 292365 had a related patch set uploaded (by Sbisson):
Maintenance: Backfill read timestamp on bundled notifications
- Change the current code to set the read timestamp on all elements of a bundle when marking as read (so we stop creating data that needs to be migrated)
https://gerrit.wikimedia.org/r/292359
This would need to be deployed to all wikis before moving on to the next step.
- Set read_timestamp = bogus date on all bundle_base = 0 (because it's quick)
- (here the new dynamic bundling code can be merged and deployed)
Not sure if this can be as quick as I was expecting. We need to only mark the notifications for which the bundle base is read. That's similar to the script in https://gerrit.wikimedia.org/r/292365. WE may be able to do steps 1 and 4 only.
- Loop through all read bundle bases, set their bundled notification's read timestamp to the base read timestamp (slow but it's the right data)
Change 292365 merged by jenkins-bot:
Maintenance: Backfill read timestamp on bundled notifications
As discussed on IRC, the maintenance script needs some changes to work more efficiently. As written, it issues many UPDATE queries that will change zero rows (because most base=1 rows do not have any base=0 rows associated with them), and each of those UPDATE queries involves a full table scan. What the script should do instead is query/compute a list of (user, event_id) pairs that need updates, because that combination uniquely identifies a notification row and UPDATE queries based on that will execute quickly.
Change 292550 had a related patch set uploaded (by Sbisson):
Add column alias support in BatchRowIterator
Change 292551 had a related patch set uploaded (by Sbisson):
Update only required rows
Change 293100 had a related patch set uploaded (by Sbisson):
[WIP] Use usb-select to target rows to update
Change 292551 abandoned by Catrope:
Update only required rows
Reason:
Superseded by https://gerrit.wikimedia.org/r/#/c/293100/
Change 293100 merged by jenkins-bot:
backfillReadBundles.php: Use sub-select to target rows to update
testwiki (14K notfiication rows) took <10s to update 2K rows. test2wiki (62K notification rows) took 14s to update 50K. mediawikiwiki (2.5M notification rows) took 2m50s to update 200K.
enwiki is not that much bigger than mediawikiwiki in this respect, it has about 14M rows of which about 65K need updating. So it sounds like it shouldn't take longer than half an hour to run this on enwiki, and no more than a few hours to run on all wikis.
The bad news is that the verification query (P3215#15008) still returns a nonzero number of rows on testwiki and test2wiki (I didn't check mediawikiwiki because it's big). But rerunning the script quickly prints "Updated 0 notifications".
This seems to be a false positive: inspecting the results does show what appear to be multiple different bundles (belonging to different users) with the same hash and display hash.
I'm guessing this could happen when multiple users are notified of the same series of events.
I ran this on enwiki on Friday and it took about 8 hours. Going to start running it on the rest of group2 now.
Change 294120 had a related patch set uploaded (by Sbisson):
Allow the primary link to set all bundled notifications as read
Change 294120 merged by jenkins-bot:
Allow the primary link to set all bundled notifications as read
Change 294339 had a related patch set uploaded (by Catrope):
Allow the primary link to set all bundled notifications as read
Change 294339 merged by jenkins-bot:
Allow the primary link to set all bundled notifications as read
Mentioned in SAL [2016-06-14T23:15:22Z] <ori@tin> Synchronized php-1.28.0-wmf.6/extensions/Echo: If07369cb1: Allow the primary link to set all bundled notifications as read (T136368) (duration: 00m 34s)
Mentioned in SAL [2016-06-21T12:49:35Z] <RoanKattouw> Running extensions/Echo/maintenance/backfillReadBundles.php on all Echo-enabled wikis for T136368
Done. Only took 16 minutes :) (to be fair, it was a rerun, we first ran it on June 13)