SPARQL shows redirect with original data months after merge
Closed, Resolved | Public | 3 Estimated Story Points | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  1. Merge an item in September 2024 (8 months ago).
  2. Query for items by coords: https://w.wiki/E322

What happens?:

The query shows both items (the one with the higher Q-number is a redirect).

What should have happened instead?:

Redirects should be purged from the graph and not appear in any SPARQL result, or at least they should have some status marking them as redirects.

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):
WD

Other information (browser name/version, screenshots, etc.):

obraz.png (297×789 px, 24 KB)

Event Timeline

Aklapper renamed this task from SPARQL shows redirct with original data months after merge to SPARQL shows redirect with original data months after merge. May 8 2025, 5:28 PM
Gehel subscribed.

The data and logs are probably too old, so a full investigation into the cause of the issue is unlikely to succeed. We should still reload that item to fix the data drift.

Gehel set the point value for this task to 3.

There is also a 2nd one in Poland if that helps:
https://www.wikidata.org/w/index.php?title=Q122261336&action=history

Though that's around the same time, so I guess it won't help that much.

Thanks for the report. I do indeed see two stale revisions for these items:

select * {
  VALUES ?stale_revision {1962416023 1969436471}
  ?item schema:version ?stale_revision
}

returning:

item            stale_revision
wd:Q121884648   1962416023
wd:Q122261336   1969436471
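
For anyone who wants to repeat this check on other items, here is a minimal Python sketch of the idea. It is not part of the actual tooling: the function names are hypothetical, and the mapping of latest revision IDs is assumed to come from some trusted source such as the MediaWiki API. The first helper rebuilds the SPARQL query above from a list of revision IDs; the second flags items whose revision in the graph is older than the latest known one.

```python
from typing import Dict, Iterable, List


def build_stale_revision_query(revisions: Iterable[int]) -> str:
    """Build a SPARQL query listing items whose schema:version is one of `revisions`."""
    # int() guards against anything other than numeric revision IDs.
    values = " ".join(str(int(r)) for r in revisions)
    return (
        "select * {\n"
        f"  VALUES ?stale_revision {{{values}}}\n"
        "  ?item schema:version ?stale_revision\n"
        "}"
    )


def find_stale_items(
    graph_revisions: Dict[str, int],
    latest_revisions: Dict[str, int],
) -> List[str]:
    """Return item IDs whose revision in the graph is older than the latest known one.

    Both arguments map an item ID to a revision ID. `latest_revisions` is an
    assumption of this sketch: it must be supplied by the caller.
    """
    return sorted(
        item
        for item, rev in graph_revisions.items()
        if item in latest_revisions and rev < latest_revisions[item]
    )
```

Calling build_stale_revision_query([1962416023, 1969436471]) yields the same query text as shown above.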

From the revision-create stream I can't find any events related to

  • Q122261336 on 2024-08-27
  • Q121884648 on 2024-09-11

Both

select * from mediawiki_revision_create where datacenter in ('eqiad', 'codfw') and year = 2024 and month=8 and day=27 and page_title='Q122261336'

and

select * from mediawiki_revision_create where datacenter in ('eqiad', 'codfw') and year = 2024 and month=9 and day=11 and page_title='Q121884648'

return no result in the event_sanitized database.
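
The two lookups differ only in the page title and the date, so they could be generated from one helper. The following Python sketch is illustrative only: the function name is hypothetical, and it simply reproduces the table, datacenter filter and partition columns used in the queries above. The title is restricted to Q-ids because the value is interpolated into the SQL text.

```python
import re
from datetime import date


def build_revision_create_query(page_title: str, day: date) -> str:
    """Build the revision-create lookup for one item on one day."""
    # Accept only item IDs such as "Q122261336" to keep the interpolation safe.
    if not re.fullmatch(r"Q[1-9][0-9]*", page_title):
        raise ValueError(f"not an item ID: {page_title!r}")
    return (
        "select * from mediawiki_revision_create "
        "where datacenter in ('eqiad', 'codfw') "
        f"and year = {day.year} and month = {day.month} and day = {day.day} "
        f"and page_title = '{page_title}'"
    )
```

For example, build_revision_create_query("Q122261336", date(2024, 8, 27)) produces a query equivalent to the first one above.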

I suspect that the event was missed...
Missed events are sadly a known issue, and there is no consensus yet on how best to mitigate them.

I reconciled these two items to fix the state.