
skipped item IDs
Open, High, Public

Description

Nov 2020 update

So I looked at a small gap here:

So the skips don't even appear to be during a time of mass creations?
I think the best way to investigate this would probably be to add some fairly verbose logging around the entity creation flows in its own log channel?

Perhaps:

  • When Special:NewItem starts the process of creating a new item?
  • When wbeditentity starts the process of creating a new item?
  • When an ID is claimed in the IdGenerator

This should then allow us to progress toward seeing exactly which requests end up generating an ID but never complete / save an entity.
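A minimal sketch of what those three log points could look like. It is written in Python purely for illustration (Wikibase itself is PHP), and every name in it is an assumption: the channel name `entity_id_creation`, the `IdGenerator` stand-in, the `create_item` helper and the log line wording are all hypothetical, not existing Wikibase code.

```python
import logging

# Hypothetical dedicated channel; the task does not name one.
log = logging.getLogger("entity_id_creation")


class IdGenerator:
    """Toy stand-in for the real ID generator: hands out sequential numeric IDs."""

    def __init__(self, start: int = 0) -> None:
        self._last = start

    def claim(self, request_id: str) -> int:
        self._last += 1
        # Log point 3: an ID has been claimed.
        log.info("id-claimed request=%s id=Q%d", request_id, self._last)
        return self._last


def create_item(generator, request_id, entry_point, save):
    """Run one creation attempt; return the new numeric ID, or None on failure."""
    # Log points 1 and 2: a creation flow started (Special:NewItem or wbeditentity).
    log.info("creation-started request=%s entry=%s", request_id, entry_point)
    new_id = generator.claim(request_id)
    try:
        save(new_id)
    except Exception:
        # The ID is already consumed at this point, so this line marks a skipped ID.
        log.warning("creation-failed request=%s id=Q%d", request_id, new_id, exc_info=True)
        return None
    log.info("entity-saved request=%s id=Q%d", request_id, new_id)
    return new_id
```

Tagging every line with the same request identifier is the design choice that matters here: it lets a claimed ID be traced back to the request that claimed it, whether or not that request ever saved an entity.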

Original report:

After http://www.wikidata.org/wiki/Q50450 there are many skipped IDs. They're not deleted. Next page after Q50450 is Q50464.

What is going on?

Details

Reference
bz42362

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 12:52 AM
bzimport set Reference to bz42362.
bzimport added a subscriber: Unknown Object (MLST).
Merl added a comment. Nov 22 2012, 11:24 PM

Q50514 - Q50567 is missing, too.

It could be a Wikidata bug. "If an entry is posted via API and error occurs due to the interwiki conflict, new entry is not created. But new entry number is consumed and NEVER used again". These gaps can also be found in other bot edits.

It's true that IDs may be "swallowed" by unsuccessful attempts to create an item. But I don't see this as a problem. The IDs have no significance, they could just as well be chosen randomly instead of sequentially. The ID system may well be changed to something else entirely at some point, e.g. GUIDs.

I want to thank you for reporting this anyway, because it *might* have been an indication for Something Bad happening, like items just vanishing without a trace. But I don't see any evidence of that. So, closing WONTFIX.

From what I've heard, it's enough to visit Special:NewItem to "consume" an ID.
True, the only damage is that it hurts our milestones addiction. :)

(In reply to comment #4)

From what I've heard, it's enough to visit Special:NewItem to "consume" an
ID.
True, the only damage is that it hurts our milestones addiction. :)

That's not correct. But any attempt to create an entity will consume an ID, including failed attempts.
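To illustrate how a failed attempt can leave a gap, here is a toy model in Python. It is only a sketch under the assumption that the counter is advanced before validation and is never rolled back; `Store`, `ConflictError` and the sample sitelinks are made up for this example and do not describe the actual Wikibase code.

```python
class ConflictError(Exception):
    """Stands in for any failure while saving, e.g. a sitelink conflict."""


class Store:
    def __init__(self):
        self.counter = 0        # last ID handed out; never rolled back
        self.items = {}         # numeric ID -> label
        self.sitelinks = set()  # sitelinks already in use

    def create(self, label, sitelink):
        self.counter += 1       # the ID is consumed here...
        new_id = self.counter
        if sitelink in self.sitelinks:
            raise ConflictError(sitelink)  # ...and lost if the save fails
        self.sitelinks.add(sitelink)
        self.items[new_id] = label
        return new_id


store = Store()
store.create("Berlin", "enwiki:Berlin")
try:
    store.create("Berlin (duplicate)", "enwiki:Berlin")  # conflicts, ID 2 is burned
except ConflictError:
    pass
store.create("Paris", "enwiki:Paris")
print(sorted(store.items))  # [1, 3]
```

In this model, ID 2 never shows up as an item and was never deleted either, which matches the symptom in the original report.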

(In reply to comment #5)

That's not correct.

Ok, let me rephrase that: if that was true, it would really be a bug, and one that should be fixed. But before filing a bug report, please make very sure that IDs are actually being consumed without any attempt to create anything.

It is possible to create an item through a "visit" to a link, but then the "link" must use a POST request and contain a valid edit token for the user, so it must be issued through a JavaScript call.

Restricted Application added a project: Wikidata. Jan 28 2016, 6:10 PM
Restricted Application added a subscriber: StudiesWorld.
Bugreporter reopened this task as Open. Dec 3 2019, 5:03 PM
Bugreporter added a subscriber: Bugreporter.

Boldly reopening: ideally no QID should be "lost", and no database transaction should be made for a failed entity creation.

See another task for actual reason.

Bugreporter reopened this task as Open. Oct 2 2020, 3:09 PM

Reopen as a more general tracking task.

Reminder that this is still a thing:

I maintain a weekly updated page that tracks the number of skipped item IDs (among other things) at https://www.wikidata.org/wiki/User:MisterSynergy/itemstats. During the past week (diff), the QID counter increased by 586,750, from Q101579998 to Q102166748. However, 450,082 of those (76.7%) were skipped QIDs, and only around 130,000 QIDs are actual new items. This means that more than three out of four QIDs were skipped over the past week; it also means that the number of skipped QIDs grew by almost 7% within a single week, to more than 7 million in total.
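The weekly figures can be rechecked with a few lines of Python. The variable names are made up for this sketch, and it assumes that every QID in the range was either skipped or used for a new item.

```python
first_qid = 101_579_998   # Q101579998, counter at the start of the week
last_qid = 102_166_748    # Q102166748, counter at the end of the week
skipped = 450_082         # skipped QIDs reported for the week

issued = last_qid - first_qid      # QIDs handed out during the week
created = issued - skipped         # QIDs that became actual items (assumption above)
skipped_share = skipped / issued   # fraction of issued QIDs that were skipped

print(issued, created, f"{skipped_share:.1%}")  # 586750 136668 76.7%
```

The difference of 136,668 is in line with the "around 130,000" new items quoted above.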

I know that we will not run out of QIDs and that this does not seem to be a very serious issue, but in some sense a densely packed QID space still seems desirable. I thus suggest starting another investigation into how this can happen and how it might eventually be mitigated.

Urgh. Yeah, this is not good. I thought we had made major headway on this issue with T232620, but apparently that's not enough. Skipping over three quarters is _bad_. Adding this to camp so we can look at some more of the subtickets @Bugreporter created.
Thanks for creating the stats!

Another idea is to allow users to reclaim QIDs that were never used. I don't know how bad an idea that is.

Lydia_Pintscher triaged this task as High priority. Mon, Nov 23, 2:47 PM
Addshore added a comment. Edited Tue, Nov 24, 4:46 PM

So I looked at a small gap here:

So the skips don't even appear to be during a time of mass creations?
I think the best way to investigate this would probably be to add some fairly verbose logging around the entity creation flows in its own log channel?
Perhaps:

  • When Special:NewItem starts the process of creating a new item?
  • When wbeditentity starts the process of creating a new item?
  • When an ID is claimed in the IdGenerator

This should then allow us to progress toward seeing exactly which requests end up generating an ID but never complete / save an entity.
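Once such a log exists, finding the requests that claimed an ID without ever saving an entity could be as simple as the Python sketch below. The log line format (`<event> request=<id> id=Q<number>`), the event names and the sample lines are all invented for illustration; nothing here reflects an existing log channel.

```python
import re

# Hypothetical log line format: "<event> request=<request id> id=Q<number>"
LINE = re.compile(r"^(id-claimed|entity-saved) request=(\S+) id=(Q\d+)$")


def unsaved_claims(lines):
    """Return {qid: request_id} for IDs that were claimed but never saved."""
    claimed = {}
    saved = set()
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # ignore lines from other events
        event, request_id, qid = match.groups()
        if event == "id-claimed":
            claimed[qid] = request_id
        else:
            saved.add(qid)
    return {qid: req for qid, req in claimed.items() if qid not in saved}


# Made-up sample data, not taken from any real log.
sample = [
    "id-claimed request=a1 id=Q1",
    "entity-saved request=a1 id=Q1",
    "id-claimed request=b2 id=Q2",
    "id-claimed request=c3 id=Q3",
    "entity-saved request=c3 id=Q3",
]
print(unsaved_claims(sample))  # {'Q2': 'b2'}
```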

(I probably should have written this on T268625)

Addshore updated the task description. Tue, Nov 24, 4:47 PM
Addshore updated the task description.