
Missing updates due to recentchanges backfilling
Closed, ResolvedPublic

Description

I uploaded coordinates (P625) for Maldon (Q26272025) an hour and a half ago (diff).

But when I use WDQS to ASK whether it has coordinates (query), the answer is coming back FALSE -- even though WDQS says it was updated 30 seconds ago.

Similarly with ten other sets of coordinates I added at the same time. I can't spot anything wrong with my query, so is there something wrong with the database?
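For reference, the kind of check described above is a SPARQL ASK against the public WDQS endpoint. A minimal sketch (the exact query linked above may differ):

import requests

# Minimal sketch, not the exact query linked above: ask WDQS whether
# Maldon (Q26272025) has a coordinate location (P625) statement.
WDQS = "https://query.wikidata.org/sparql"
query = "ASK { wd:Q26272025 wdt:P625 ?coord }"

resp = requests.get(WDQS, params={"query": query, "format": "json"})
print(resp.json()["boolean"])  # this was returning False despite the saved edit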

Event Timeline

Restricted Application added a subscriber: Aklapper.

Looks like something weird is going on with the recentchanges stream. Looking at the relevant fragment:

{
type: "edit",
ns: 0,
title: "Q29002776",
pageid: 30662598,
revid: 469529345,
old_revid: 469529329,
rcid: 498001207,
timestamp: "2017-03-22T19:14:14Z"
},
{
type: "edit",
ns: 0,
title: "Q26272025",
pageid: 28223090,
revid: 469529353,
old_revid: 468296753,
rcid: 498001280,
timestamp: "2017-03-22T19:14:14Z"
},
{
type: "edit",
ns: 0,
title: "Q29002776",
pageid: 30662598,
revid: 469529346,
old_revid: 469529345,
rcid: 498001208,
timestamp: "2017-03-22T19:14:15Z"
},

We can see that the entry for Q26272025 has rcid 498001280, which means it was inserted much later than Q29002776 entries, even though the timestamp is the same. Looking at this:

{
type: "edit",
ns: 0,
title: "Q29002778",
pageid: 30662600,
revid: 469529418,
old_revid: 469529412,
rcid: 498001279,
timestamp: "2017-03-22T19:14:27Z"
}

Since its rcid (498001280) comes immediately after this entry's (498001279), the Q26272025 row appears to have been inserted a full 13 seconds later than its own timestamp suggests. The WDQS updater currently uses a backoff time of 10 seconds, which explains how it missed the update: by the time it was processing the rcstream for that time period, the record for Q26272025 did not exist yet, and by the time it was backpatched in, the updater was too far ahead to check back over that time window.
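For illustration, here is a rough sketch of the timestamp-window polling described here; the real updater's parameters and windowing differ, and apart from the 10-second backoff mentioned above everything in this sketch is an assumption:

import datetime
import requests

API = "https://www.wikidata.org/w/api.php"
BACKOFF = datetime.timedelta(seconds=10)  # value mentioned above; the rest of this sketch is assumed

def poll_window(last_polled):
    # Fetch recentchanges between last_polled and (now - BACKOFF).
    now = datetime.datetime.now(datetime.timezone.utc)
    window_end = now - BACKOFF
    params = {
        "action": "query",
        "list": "recentchanges",
        "format": "json",
        "rcdir": "newer",
        "rcstart": last_polled.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "rcend": window_end.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "rcprop": "title|ids|timestamp",
        "rclimit": 500,  # per-request cap, which is why the window cannot simply be made very wide
    }
    changes = requests.get(API, params=params).json()["query"]["recentchanges"]
    # A row backfilled later with an older timestamp (like Q26272025 above) lands
    # before the next poll's rcstart and is never seen again.
    return changes, window_end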

The problem here is that the only real way of reading the recentchanges stream for now is by timestamp. And it turns out entries can be backpatched into it with essentially no time limit, while making the window too wide makes it unworkable for the updater, since it can only load a maximum of 500 entries per request.

So we either need to find a way to reliably get updates without missing anything, or switch to some API that is not based on the recentchanges stream.

Ouch. That sounds quite nasty. The only thing I can think of from the user side that was perhaps slightly different about this set of edits is that they were made with QuickStatements while I already had a different QuickStatements run open and going in another window (a big batch of 10,000 corrections to the capitalization of item labels, which took about two and a half hours to run, all of which went through fine).

So, conceivably, could that have delayed some handshake in the writing of the coordinate statements? I don't have anything solid to support the idea; it's just a complete guess.

I suspect QS might be triggering it, since I've had other instances of the same problem with QS edits. However, as I see it this isn't QS's fault; it's the combination of the recentchanges stream's preferred access pattern being fetch-by-timestamp with the complete unreliability of its timeline.

The way to fix it now is to edit the entries again, or to manually update them (I can do that if you give me the list of IDs). But I'll be working on finding a better solution; I thought a 10-second backoff would be enough, but clearly it isn't.

Thanks for your speedy diagnosis. I've gone through and reverted and then un-reverted the ten edits by hand, so they are now fine.

Hope you can get a bit more rigour instilled into the recentchanges logging.

Smalyshev renamed this task from "Some data not uploaded to WDQS?" to "Missing updates due to recentchanges backfilling". Mar 29 2017, 6:15 PM
Smalyshev triaged this task as Medium priority.

Change 347565 had a related patch set uploaded (by Smalyshev):
[operations/puppet@production] Enable "trailing poller" functionality for production.

https://gerrit.wikimedia.org/r/347565

Change 347565 merged by Gehel:
[operations/puppet@production] Enable "trailing poller" functionality for production.

https://gerrit.wikimedia.org/r/347565

The trailing poller should fix it for now; if it's still not enough, please reopen with the newly affected IDs.
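For readers not familiar with it, the trailing-poller idea is roughly this: keep a second poll that lags well behind the live one and re-reads an already-processed window, so entries backfilled into that window are still picked up, deduplicated by rcid. A sketch under those assumptions; the actual implementation in the WDQS updater differs in detail:

import datetime

TRAIL_LAG = datetime.timedelta(minutes=2)  # assumed lag, long enough for late inserts
seen_rcids = set()                         # rcids already forwarded to the updater

def primary_pass(changes):
    # Live poll: forward every change not seen before.
    fresh = [c for c in changes if c["rcid"] not in seen_rcids]
    seen_rcids.update(c["rcid"] for c in fresh)
    return fresh

def trailing_pass(old_window_changes):
    # Same window fetched again TRAIL_LAG later: anything still unseen was
    # backfilled with an out-of-order rcid and would otherwise have been lost.
    late = [c for c in old_window_changes if c["rcid"] not in seen_rcids]
    seen_rcids.update(c["rcid"] for c in late)
    return late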