Replayed events are purged based on current timestamp
Closed, Resolved, Public

Description

After discussing with @Pchelolo and simulating locally, we discovered that purging uses the current time. When replaying an edit event from 22 hrs ago, speed, max inactivity and max age are therefore all derived from the current timestamp rather than from the event's own timestamp.

This is significant because we purge every 100 events, so most early events will be dropped on account of speed. Consider 10 edits in the first hour, purged at 22 hrs:
10 edits / (21 hrs * 60 minutes) is very different from 10 edits / 60 minutes :)

If a page is edited 10 times every hour, we'll lose many of the accumulated edits and the page won't show up as trending.
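To make the arithmetic above concrete, here is a minimal sketch. It assumes speed is simply edits divided by elapsed minutes; the service's real formula isn't shown in this task, and `edits_per_minute` is a hypothetical helper, not a function from trending-edits.

```python
HOUR = 3600  # seconds


def edits_per_minute(edit_count: int, first_edit_ts: float, reference_ts: float) -> float:
    """Edits per minute between the first edit and the chosen reference time.

    Assumption: speed = edits / elapsed minutes. Timestamps are in seconds.
    """
    elapsed_minutes = (reference_ts - first_edit_ts) / 60.0
    return edit_count / elapsed_minutes


# 10 edits during the first hour, measured against the last event's timestamp.
speed_event_time = edits_per_minute(10, 0, 1 * HOUR)

# The same 10 edits, measured against a clock that is 21 hrs past the first edit
# (the figure used in the description above).
speed_wall_clock = edits_per_minute(10, 0, 21 * HOUR)

# The wall-clock speed is 21 times lower for identical edit activity.
slowdown = speed_event_time / speed_wall_clock
```

With any speed threshold between those two values, the page survives when measured in event time and is dropped when measured against the current time.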

In addition, almost all early events will be dropped on account of inactivity: a replayed event from 22 hrs ago is treated as if the page had been inactive for 22 hrs, which is not true.

So when purging we should use the timestamp of the last event processed rather than the current time.
This is trivial to fix and should produce very different results.
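A rough sketch of the idea, not the actual patch: pass the reference timestamp into the purge check instead of reading the clock inside it. `PageStats`, `should_purge`, `purge` and all threshold values below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

HOUR = 3600  # seconds


@dataclass
class PageStats:
    title: str
    edits: int
    first_edit_ts: float  # seconds
    last_edit_ts: float   # seconds


def should_purge(page: PageStats, reference_ts: float, *,
                 min_speed: float, max_inactivity: float, max_age: float) -> bool:
    """Decide whether to drop a page, measuring everything from reference_ts."""
    age = reference_ts - page.first_edit_ts
    inactivity = reference_ts - page.last_edit_ts
    if age > max_age or inactivity > max_inactivity:
        return True
    # Assumed speed formula: edits per elapsed minute (at least one minute).
    elapsed_minutes = max(age / 60.0, 1.0)
    return page.edits / elapsed_minutes < min_speed


def purge(pages, reference_ts, **thresholds):
    """Keep only the pages that survive the purge check at reference_ts."""
    return [p for p in pages if not should_purge(p, reference_ts, **thresholds)]


# Illustrative thresholds only; the real configuration is not in this task.
thresholds = dict(min_speed=0.1, max_inactivity=2 * HOUR, max_age=24 * HOUR)
page = PageStats("Example", edits=10, first_edit_ts=0, last_edit_ts=1 * HOUR)

# Replay, purging relative to the last event seen (1 hr in): the page is kept.
kept_with_event_time = purge([page], 1 * HOUR, **thresholds)

# Same replay, purging relative to "now" (22 hrs in): dropped as inactive.
kept_with_wall_clock = purge([page], 22 * HOUR, **thresholds)
```

Taking `reference_ts` as a parameter also makes replays easy to simulate locally, since no real clock is involved.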

Restricted Application added a subscriber: Aklapper. Mar 9 2017, 11:46 PM
Jdlrobson moved this task from Backlog to Next up on the Trending-Service board. Mar 9 2017, 11:46 PM

Change 342152 had a related patch set uploaded (by Jdlrobson):
[mediawiki/services/trending-edits] Purge based on current timestamp

https://gerrit.wikimedia.org/r/342152

Change 342152 merged by Ppchelko:
[mediawiki/services/trending-edits] Purge based on current timestamp

https://gerrit.wikimedia.org/r/342152

Mentioned in SAL (#wikimedia-operations) [2017-03-10T00:24:38Z] <ppchelko@tin> Started deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136

Mentioned in SAL (#wikimedia-operations) [2017-03-10T00:31:56Z] <ppchelko@tin> Finished deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136 (duration: 07m 17s)

Mentioned in SAL (#wikimedia-operations) [2017-03-10T00:37:09Z] <ppchelko@tin> Started deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136

Mentioned in SAL (#wikimedia-operations) [2017-03-10T00:39:33Z] <ppchelko@tin> Finished deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136 (duration: 02m 23s)

Mentioned in SAL (#wikimedia-operations) [2017-03-10T00:48:22Z] <ppchelko@tin> Started deploy [trending-edits/deploy@1673068]: Replayed events are purged based on current timestamp T160136

Mentioned in SAL (#wikimedia-operations) [2017-03-10T00:54:46Z] <ppchelko@tin> Finished deploy [trending-edits/deploy@1673068]: Replayed events are purged based on current timestamp T160136 (duration: 06m 24s)

Jdlrobson closed this task as "Resolved". Mar 17 2017, 4:55 PM
Jdlrobson claimed this task.

This looks to be working, but the speed check means many articles still get purged after 5 or so hours. See T160127.