EventStream (page-links-change) is not accurate
Open, Needs Triage, Public, BUG REPORT

Description

Example Case on tawiki:

rev_id = 3354837
diff = https://ta.wikipedia.org/w/index.php?diff=3354837
dt = 2022-03-15T22:18:08Z

Full record:

{"$schema":"/mediawiki/page/links-change/1.0.0","meta":{"uri":"https://ta.wikipedia.org/wiki/%E0%AE%8E%E0%AE%A9%E0%AF%8D%E0%AE%B1%E0%AE%BF_%E0%AE%B2%E0%AE%BE%E0%AE%B0%E0%AE%A9%E0%AF%8D%E0%AE%9A%E0%AF%81_%E0%AE%A4%E0%AF%80%E0%AE%B5%E0%AF%81","request_id":"9dad5393-c2cb-48f8-9539-0b8175255392","id":"703c2fe6-ea21-4518-83de-de3005b88571","dt":"2022-03-15T22:18:08Z","domain":"ta.wikipedia.org","stream":"mediawiki.page-links-change","topic":"eqiad.mediawiki.page-links-change","partition":0,"offset":1534098987},"database":"tawiki","page_id":435849,"page_title":"?????_???????_????","page_namespace":0,"page_is_redirect":false,"rev_id":3354837,"performer":{"user_text":"InternetArchiveBot","user_groups":["bot","*","user","autoconfirmed"],"user_is_bot":true,"user_id":182654,"user_registration_dt":"2020-09-11T00:35:24Z","user_edit_count":81405},"added_links":[{"link":"/wiki/%25E0%25AE%2589%25E0%25AE%25A4%25E0%25AE%25B5%25E0%25AE%25BF:CS1_errors","external":false},{"link":"/wiki/%25E0%25AE%25AA%25E0%25AE%2595%25E0%25AF%2581%25E0%25AE%25AA%25E0%25AF%258D%25E0%25AE%25AA%25E0%25AF%2581:CS1_maint:_BOT:_original-url_status_unknown","external":false},{"link":"https://web.archive.org/web/20191017043815/http://allcodesindia.in/stdcode/andaman%2Band%2Bnicobar.php","external":true},{"link":"http://allcodesindia.in/stdcode/andaman%2Band%2Bnicobar.php","external":true},{"link":"https://web.archive.org/web/20170828015509/http://andssw1.and.nic.in/ecostat/basicstatPDF2013_14/1.Demogrpahy.pdf","external":true},{"link":"http://andssw1.and.nic.in/ecostat/basicstatPDF2013_14/1.Demogrpahy.pdf","external":true},{"link":"https://web.archive.org/web/20140323224041/http://www.india-codes.com/pincodes/a-n-islands-pin-code","external":true},{"link":"http://www.censusindia.gov.in/2011-VillageDirectory/Directory/short_code_rural_35.pdf","external":true}]}

Problem 1 = The date of the revision (Dec 29, 2021) does not match the date of the stream record (2022-03-15). Is there a way to know that revision 3354837 was made months earlier, even though EventStream reported it months later? I am also seeing duplicates, where EventStream reports a revision correctly on the date it occurred and then reports it again months later, as was the case here.
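
As a stopgap on the consumer side, repeated reports can be flagged by keying on (database, rev_id), and a late report can be flagged by comparing meta.dt against the revision's own timestamp. The sketch below is only an illustration under assumptions: the function names are hypothetical, the one-hour threshold is arbitrary, and the revision timestamp has to be fetched separately (for example from the MediaWiki API), since the record above does not appear to carry one.

```python
from datetime import datetime, timedelta


def parse_dt(value):
    """Parse timestamps shaped like "2022-03-15T22:18:08Z"."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def split_first_and_repeats(events):
    """Separate the earliest event per (database, rev_id) from later repeats.

    Assumes a repeated report carries the same database and rev_id
    as the original one, as in the case described above.
    """
    seen = set()
    first, repeats = [], []
    # Process in meta.dt order so the earliest report counts as the original.
    for event in sorted(events, key=lambda e: parse_dt(e["meta"]["dt"])):
        key = (event["database"], event["rev_id"])
        if key in seen:
            repeats.append(event)
        else:
            seen.add(key)
            first.append(event)
    return first, repeats


def is_late(event, rev_timestamp, max_lag=timedelta(hours=1)):
    """True if the event was emitted more than max_lag after the revision.

    rev_timestamp is supplied by the caller; max_lag is an arbitrary
    illustrative threshold, not a documented guarantee of the stream.
    """
    lag = parse_dt(event["meta"]["dt"]) - parse_dt(rev_timestamp)
    return lag > max_lag
```

This only detects the symptom. It does not explain why the stream emits the record again.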

Problem 2 = The links in the diff do not fully match the links in the stream record.
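
One thing that complicates the comparison: the internal links in the record above look percent-encoded twice (for example /wiki/%25E0%25AE%2589...), so they need decoding before they can be matched against titles in the diff. A minimal sketch of that normalization follows. The function names are hypothetical, and external links are left untouched on the assumption that their encoding is meaningful. A title that legitimately contains a percent sequence would be over-decoded by this approach.

```python
from urllib.parse import unquote


def decode_fully(link, max_rounds=5):
    """Percent-decode repeatedly until the string stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(link)
        if decoded == link:
            break
        link = decoded
    return link


def normalized_links(added_links):
    """Return (internal, external) sets of links from an added_links list.

    Internal links are fully decoded; external links are kept verbatim.
    """
    internal, external = set(), set()
    for entry in added_links:
        if entry["external"]:
            external.add(entry["link"])
        else:
            internal.add(decode_fully(entry["link"]))
    return internal, external
```

The resulting sets can then be compared with the links collected from the diff using ordinary set difference. This normalization is not claimed to account for the whole mismatch.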