Some items are in an inconsistent state
Closed, Duplicate · Public

Description

I was exploring wdt:P594 when I noticed that queries like the following do not return a stable result: the answer varies between true and false (probably depending on which server they hit):

ASK WHERE
{
  wd:Q18047295 wdt:P594 ?ensembl.
  wd:Q18047295 p:P594 ?statement .
  ?statement ps:P594 ?ensembl .
}

This is pretty weird, since that should be the standard Wikidata data pattern. It looks like the issue described in T112397 is happening again.

Edit: @Gstupp fixed wd:Q18047295 with a manual edit (see here: https://github.com/SuLab/scheduled-bots/issues/22)

Other possibly affected items:

SELECT * WHERE
{
  ?item wdt:P594 ?ensembl .
  FILTER NOT EXISTS {
    ?item p:P594 ?statement .
    ?statement ps:P594 ?ensembl . 
  }
}

A couple of questions:

  • Is this a general inconsistency in Wikidata, or does it affect only a subset of items (e.g., those updated by bots)?
  • Is there a way to detect, fix, and (even better) avoid these inconsistencies?
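
On the detection question, the query above only looks for truthy triples without a matching statement. A minimal sketch of the opposite check (best-rank statements that have no truthy triple) might look like the following. It assumes the prefixes predefined on the Wikidata Query Service and that best-rank statements are typed as wikibase:BestRank; it has not been run against the affected servers.

```sparql
# Sketch: best-rank P594 statements whose value has no matching truthy triple.
SELECT ?item ?statement ?ensembl WHERE
{
  ?item p:P594 ?statement .
  ?statement a wikibase:BestRank ;
             ps:P594 ?ensembl .
  FILTER NOT EXISTS { ?item wdt:P594 ?ensembl . }
}
```

If both checks come back empty, the two representations agree for P594 on the server that answered the query.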

Thank you guys!

Event Timeline

Restricted Application added a subscriber: Aklapper. · Oct 22 2018, 5:40 PM
SELECT * WHERE
{
  ?item wdt:P594 ?ensembl .
  FILTER NOT EXISTS {
    ?item p:P594 ?statement .
    ?statement ps:P594 ?ensembl . 
  }
}

In general, this query is satisfiable without any bugs, since wdt: only selects truthy statements. Add ?statement a wikibase:BestRank to make the full-statement version more equivalent to the truthy-triple version.
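
A sketch of that suggestion applied to the query quoted above, assuming the prefixes predefined on the Wikidata Query Service:

```sparql
# Truthy P594 triples with no matching best-rank statement.
SELECT * WHERE
{
  ?item wdt:P594 ?ensembl .
  FILTER NOT EXISTS {
    ?item p:P594 ?statement .
    # Restrict the full-statement side to best-rank statements,
    # which are the ones that truthy (wdt:) triples are derived from.
    ?statement a wikibase:BestRank ;
               ps:P594 ?ensembl .
  }
}
```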

May be caused by lag problems on wdq1003 or by the same reasons as T203646.

Smalyshev updated the task description. · Oct 23 2018, 6:54 AM
Floatingpurr updated the task description. · Oct 23 2018, 9:07 AM
Floatingpurr updated the task description.

@LucasWerkmeister, if I am not mistaken, there is no preferred rank on any item with wdt:P594, so we should always see that pattern.
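
One way to check the rank assumption would be to count P594 statements per rank. This is only a sketch, assuming the prefixes predefined on the Wikidata Query Service; it may be slow or time out on a property with this many statements.

```sparql
# Count P594 statements grouped by rank (normal, preferred, deprecated).
SELECT ?rank (COUNT(?statement) AS ?statements) WHERE
{
  ?item p:P594 ?statement .
  ?statement wikibase:rank ?rank .
}
GROUP BY ?rank
```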

@Smalyshev, is there a way to fix these cases?

Floatingpurr triaged this task as Normal priority. · Oct 23 2018, 9:22 AM
Floatingpurr updated the task description.
Tarrow added a subscriber: Tarrow. · Oct 23 2018, 9:33 AM

Perhaps it could also be related to updates that ran while the issues from T206743 were ongoing?

Something similar seems to be happening with this query:

SELECT distinct ?item ?itemLabel
WHERE 
{
  ?item wdt:P594 ?ensg .
  ?item wdt:P31|wdt:P279 wd:Q8054 .
  FILTER NOT EXISTS {?item wdt:P31|wdt:P279 wd:Q7187}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

It still returns 3 items I've just modified that shouldn't be there. Seems like we are experiencing stale data. Another case of inconsistent updates?

Jane023 added a subscriber: Jane023.

This appears to be the same issue I reported here. Hope this is just a question of waiting a few days for servers to be in sync, and not months for a bug fix.

I'm afraid it is not a question of time. Something odd probably is (or was) going on, but I have no clue what.

@Floatingpurr:

Something similar seems to be happening with this query:

This query produces nothing for me. Is that the expected result?

Hey @Smalyshev, yes, that is the expected result. When I first tried that query, I got 3 items that shouldn't have been there. I thought it was another effect of bad data sync. Now that query seems to be working fine. Unfortunately, this one keeps returning almost 700 items:

SELECT * WHERE
{
  ?item wdt:P594 ?ensembl .
  FILTER NOT EXISTS {
    ?item p:P594 ?statement .
    ?statement ps:P594 ?ensembl . 
  }
}

@Floatingpurr This is weird indeed, looks like some data is missing. Not sure why it happened, I'll research.

This is my query which does not return all items:

SELECT ?item ?catcode WHERE { 
  ?item p:P528 [ pq:P972 wd:Q53207781 ; ps:P528 ?catcode]. 
}

Example missing item: Q31158443. This item was created using the duplicate item gadget, so not sure if that matters.

EBjune added a subscriber: EBjune. · Nov 20 2018, 3:37 PM

@Jane023 that item shows up when I run that query. Can you run it again and see if it's still missing? There could have been a sync issue that has since resolved.

No, it doesn't show up for me. Also, the query should return 74 items and it only returns 69 for me. The same problem occurs when you run the query against Q58590616 instead: this time I get 78 results when it should be 80. So there seems to be a small group of problematic items that don't get counted each time the query runs.
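
To compare results across runs without scanning the full list, a count variant of the query could be used. This is a sketch that assumes the prefixes predefined on the Wikidata Query Service; swap in wd:Q58590616 for the second catalog.

```sparql
# Number of distinct items with a P528 catalog code qualified by the given catalog.
SELECT (COUNT(DISTINCT ?item) AS ?items) WHERE {
  ?item p:P528 [ pq:P972 wd:Q53207781 ; ps:P528 ?catcode ] .
}
```

If repeated runs return different counts, that would point to the servers behind the endpoint being out of sync rather than to the query itself.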

That is strange. I ran it again and Q31158443 is in my list of results, but the query only returns 71 results for me. I'll leave it to @Smalyshev to dig into it further.

@Jane023, @EBjune FYI: 70 rows here (with Q31158443)

Smalyshev moved this task from Backlog to Doing on the User-Smalyshev board. · Nov 20 2018, 10:11 PM