
Lowercase QIDs returned by wikibase:mwapi
Closed, Resolved · Public

Description

Either I hadn't noticed this before, or it only happens with nlwiki. This query returns

wd:q2120834
wd:q1838936

instead of

wd:Q2120834
wd:Q1838936

Other results seem fine.

Event Timeline

Restricted Application added subscribers: PokestarFan, Aklapper.
Smalyshev triaged this task as Medium priority. Aug 6 2017, 4:55 PM

This is weird. I suspect it's how it is stored in the DB (maybe some buggy bot or extension?) but will check on Monday.

Looking at it closer, I also found some on eswiki:

wd:q11704619
wd:q5884721

Still, it's fairly rare.

mysql:wikiadmin@db1090 [nlwiki]> select * from page_props where pp_page = '1572528';
+---------+---------------+----------+------------+
| pp_page | pp_propname   | pp_value | pp_sortkey |
+---------+---------------+----------+------------+
| 1572528 | wikibase_item | q2120834 |       NULL |
+---------+---------------+----------+------------+

Looks like it's a database issue. I'll try to figure out where it comes from.
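To survey how widespread this is, rows like the one above can be filtered for non-canonical values. This is only a minimal sketch: it assumes the page_props rows have already been fetched into memory, and the `PageProp` tuple and `lowercase_item_rows` helper are hypothetical names, not part of MediaWiki or Wikibase.

```python
from typing import Iterable, List, NamedTuple


class PageProp(NamedTuple):
    """Hypothetical in-memory copy of one page_props row."""
    pp_page: int
    pp_propname: str
    pp_value: str


def lowercase_item_rows(rows: Iterable[PageProp]) -> List[PageProp]:
    """Return wikibase_item rows whose value is not already uppercase."""
    return [
        row
        for row in rows
        if row.pp_propname == "wikibase_item"
        and row.pp_value != row.pp_value.upper()
    ]


if __name__ == "__main__":
    sample = [
        PageProp(1572528, "wikibase_item", "q2120834"),  # row from the query above
        PageProp(1, "wikibase_item", "Q42"),             # made-up well-formed row
    ]
    for row in lowercase_item_rows(sample):
        print(row.pp_page, row.pp_value)
```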

Looks like this diff: https://nl.wikipedia.org/w/index.php?title=Coupe_Manier&diff=36039023&oldid=15798333

has it lowercase:

Robot: Verplaatsing van 1 interwikilinks. Deze staan nu op Wikidata onder d:q2120834
(English: "Robot: moving 1 interwiki link. These are now on Wikidata under d:q2120834")

Maybe the code of Addbot needs to be checked.

I think Addbot just read the same table as you.

Maybe it's just a really old entry: The edit for nl at Wikidata is from 2012.

Maybe it's an old issue and we should just clean up the tables? I'd like a word from @Addshore on this though to be sure.

A couple of hundred, or none, for a few other wikis.

At least that explains why I noticed it for nlwiki.

Maybe just a lot of pages that haven't been edited recently.

OK, looks like there are several issues at play here:

  1. Old edits have lowercase q-IDs. The new code doesn't produce them, but the old code did. This can probably be fixed by running a script over the affected rows.
  2. The pageprops API reports such IDs as-is, and this probably can't be changed without bad hacks that we don't want.
  3. WDQS gets its data from the pageprops API, but it could convert the IDs to uppercase when creating URLs in apiOutputItem, since at that point we know the value should be a valid ID.
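The idea in point 3 can be sketched roughly as follows. This is an illustration in Python, not the actual WDQS code (which lives in wikidata/query/rdf). The helper names `normalize_item_id` and `item_uri` are hypothetical, and the sketch assumes that only Q-prefixed item IDs need handling.

```python
import re

# Namespace that the "wd:" prefix expands to.
WD_ENTITY_PREFIX = "http://www.wikidata.org/entity/"

# Accept either case for the leading letter, followed by a positive integer.
_ITEM_ID = re.compile(r"[Qq][1-9][0-9]*")


def normalize_item_id(raw_id: str) -> str:
    """Return the uppercase form of an item ID (hypothetical helper).

    Raises ValueError when the value does not look like an item ID,
    so that arbitrary strings are never uppercased by accident.
    """
    candidate = raw_id.strip()
    if not _ITEM_ID.fullmatch(candidate):
        raise ValueError(f"not an item ID: {raw_id!r}")
    return candidate.upper()


def item_uri(raw_id: str) -> str:
    """Build the entity URI from an ID as returned by the pageprops API."""
    return WD_ENTITY_PREFIX + normalize_item_id(raw_id)


if __name__ == "__main__":
    for raw in ("q2120834", "q1838936", "Q5884721"):
        print(item_uri(raw))
```

Uppercasing only at the point where the URI is built leaves the API output untouched, which matches point 2; cleaning up the stored rows (point 1) would be a separate step.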

Change 370593 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[wikidata/query/rdf@master] Uppercase item ids received from services.

https://gerrit.wikimedia.org/r/370593

Change 370593 merged by jenkins-bot:
[wikidata/query/rdf@master] Uppercase item ids received from services.

https://gerrit.wikimedia.org/r/370593

Smalyshev claimed this task.

Thanks. It works. For the above and also some others.