
Clean up lowercase q-IDs from page_props table
Closed, Resolved (Public)

Description

As stated in T172642, there are a number of lowercase q-IDs (like q2120834) in the page_props tables of various wikis. The correct form of the ID is uppercase (Q2120834), and the lowercase form can cause problems in some situations, e.g. when integrating with other data sets. It would be nice to clean up such cases and store the proper uppercase IDs instead.

Event Timeline

Before moving forward with this we should see if it is still an issue.

select count(*) from page_props where pp_propname = 'wikibase_item' and left(pp_value,1) = 'q' limit 1;

I should run this on the analytics cluster for all wikis, but the SQL server there currently seems to be down...
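For anyone consuming page_props values outside the database, the same check can be sketched in Python. This is only an illustration, not something that was run for this task: the function names are hypothetical, and it assumes a wikibase_item value is a "Q" or "q" followed by digits, which is stricter than the query above (the query only looks at the first character).

```python
import re

# Assumption: an item ID is "Q" or "q" followed by a positive integer.
_QID = re.compile(r"^[Qq][1-9][0-9]*$")


def is_lowercase_qid(value: str) -> bool:
    """True for values like 'q2120834' (stricter than the SQL first-character check)."""
    return bool(_QID.match(value)) and value[0] == "q"


def normalize_qid(value: str) -> str:
    """Return the uppercase form of an item ID; leave any other value unchanged."""
    return value.upper() if _QID.match(value) else value
```

A consumer could call normalize_qid on each pp_value before joining against another data set, so that q2120834 and Q2120834 are treated as the same item.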

Addshore lowered the priority of this task from Medium to Low.
Addshore moved this task from Unsorted 💣 to Back Burner 🏛️ on the User-Addshore board.

I did:

addshore@mwmaint1002:~$ foreachwikiindblist wikidataclient sql.php --query "/*T172748-addshore-wikidataclient*/select * from page_props where pp_propname = 'wikibase_item' and left(pp_value,1) = 'q' limit 1;" > wikibase_item_page_props_lowercase_q_output

This told me that the following wikis still have some lowercase entity IDs in page_props:

  • bat_smgwiki
  • be_x_oldwiki
  • bewiki
  • eowiki
  • hsbwiki
  • kywiki
  • map_bmswiki
  • nds_nlwiki
  • nlwiki
  • scnwiki
  • skwiki
  • test2wiki
  • zh_classicalwiki
  • zh_min_nanwiki
  • zh_yuewiki

I wrote a bot to touch the pages and reparse them, and it fixed around half of them. The rest are either protected or contain something that AbuseFilter or the spam filter blocks. I don't know what to do with those...

Here is the number of rows that we are talking about:

1 @ bewiki
1 @ eowiki
6 @ hsbwiki
1 @ kywiki
2 @ scnwiki
1 @ skwiki
1 @ test2wiki
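As a quick tally of the figures above, a small Python sketch (the variable and function names are only illustrative) sums the per-wiki counts. It comes to 13 rows across 7 wikis.

```python
# Counts copied from the list above, one "rows @ wiki" entry per line.
COUNTS_TEXT = """\
1 @ bewiki
1 @ eowiki
6 @ hsbwiki
1 @ kywiki
2 @ scnwiki
1 @ skwiki
1 @ test2wiki
"""


def parse_counts(text: str) -> dict[str, int]:
    """Parse 'N @ wiki' lines into a {wiki: N} mapping, skipping blank lines."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        number, wiki = line.split("@")
        counts[wiki.strip()] = int(number)
    return counts


counts = parse_counts(COUNTS_TEXT)
total = sum(counts.values())
print(f"{total} rows across {len(counts)} wikis")
```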

We could just go through these, fix them manually, and resolve this task.

Fixed them all; see P8252.

  • bewiki
  • eowiki
  • hsbwiki
  • kywiki
  • scnwiki
  • skwiki
  • test2wiki