Many Wikipedias' Wikidata modules iterate over all entity claims when a Statement is looked up by property label
Open, Low, Public

Description

Yesterday I tried to enable Statement usage tracking on cawiki (which means we track exactly which Statement has been used, and not just that "all entity data" is used). While doing this I discovered that a great many usages on cawiki are added needlessly because of a performance bug in their Mòdul:Wikidata (https://ca.wikipedia.org/w/index.php?title=M%C3%B2dul_Discussi%C3%B3:Wikidata&oldid=18938979#Critical_performance_improvement).

The problematic code

		-- otherwise, iterate over all properties, fetch their labels and compare this to the given property name
		for k, v in pairs(entity.claims) do
			if mw.wikibase.label(k) == property then return v end
		end

can easily be replaced with

		property = mw.wikibase.resolvePropertyId(property)
		if not property then return end

		return entity.claims[property]
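
To make the difference concrete, here is a minimal sketch that runs outside MediaWiki. Everything in it is an assumption for illustration: the `mw` table is a small stand-in for the two `mw.wikibase` functions named above, the property IDs, labels and claims are made up, and `trackedClaims` only imitates usage tracking by recording which property keys are read.

```lua
-- Made-up property IDs and labels, for illustration only.
local LABELS = { P17 = "country", P31 = "instance of", P569 = "date of birth" }
local DATA = { P17 = { "claim A" }, P31 = { "claim B" }, P569 = { "claim C" } }

-- Minimal stand-in for the two mw.wikibase functions used in the task.
local mw = { wikibase = {} }
function mw.wikibase.label(id) return LABELS[id] end
function mw.wikibase.resolvePropertyId(labelOrId)
	if LABELS[labelOrId] then return labelOrId end
	for id, label in pairs(LABELS) do
		if label == labelOrId then return id end
	end
	return nil
end

-- Returns a claims proxy that records each property read in `touched`,
-- plus an iterator factory that visits properties in sorted order.
local function trackedClaims(data, touched)
	local proxy = setmetatable({}, { __index = function(_, id)
		touched[id] = true
		return data[id]
	end })
	local ids = {}
	for id in pairs(data) do ids[#ids + 1] = id end
	table.sort(ids)
	local function each()
		local i = 0
		return function()
			i = i + 1
			if ids[i] then return ids[i], proxy[ids[i]] end
		end
	end
	return proxy, each
end

local function countKeys(t)
	local n = 0
	for _ in pairs(t) do n = n + 1 end
	return n
end

-- Old approach: walk the properties and compare labels.
local function findByLoop(property)
	local touched = {}
	local _, each = trackedClaims(DATA, touched)
	for id, claims in each() do
		if mw.wikibase.label(id) == property then
			return claims, countKeys(touched)
		end
	end
	return nil, countKeys(touched)
end

-- New approach: resolve the label once, then read a single key.
local function findByResolve(property)
	local touched = {}
	local claims = trackedClaims(DATA, touched)
	local id = mw.wikibase.resolvePropertyId(property)
	if not id then return nil, 0 end
	return claims[id], countKeys(touched)
end
```

Both functions return the same claims for "date of birth", but the second value differs: the loop reads every property up to and including the match (all three here, since the match sorts last), while the resolved lookup reads exactly one.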

The problematic code is also on several other wikis: P6114

hoo created this task. Oct 12 2017, 9:03 PM
  • Easy/workaround - When code calls pairs on a pseudo table (entity.claims as here, and also entity.labels and entity.descriptions once T172914 is merged), we should probably work around it upstream, either in UsageAggregator (T178079) or in Lua (whenever pairs is called on entity.claims, count it as C.* instead of many rows).
    • wbc_entity_usage will not get overloaded with too many rows. This is just a workaround to avoid unintentional usage, as it still makes the EU inefficient (from the rc side).
  • Medium - Le Tour de Wikí 2017: go over all wikis and fix them (there is no central fix - T121470 T41610)
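
The "count it as C.* instead of many rows" idea from the first bullet could look roughly like the sketch below. This is not Wikibase code: `newUsageTracker` and its methods are hypothetical names, and the row strings simply reuse the `C.*` notation from the comment above.

```lua
-- Hypothetical tracker; names are illustrative, not Wikibase's.
local function newUsageTracker()
	local tracker = { statements = {}, allStatements = false }

	-- Called when a single property is read, e.g. entity.claims.P31.
	function tracker.trackStatement(id)
		if not tracker.allStatements then
			tracker.statements[id] = true
		end
	end

	-- Called when the whole claims table is iterated with pairs.
	function tracker.trackAllStatements()
		tracker.allStatements = true
		tracker.statements = {} -- per-property rows are now redundant
	end

	-- Rows that would be stored for the page, sorted for stable output.
	function tracker.rows()
		if tracker.allStatements then return { "C.*" } end
		local rows = {}
		for id in pairs(tracker.statements) do
			rows[#rows + 1] = "C." .. id
		end
		table.sort(rows)
		return rows
	end

	return tracker
end
```

With this shape, a page that iterates over every claim ends up with one row rather than one per property, which matches the stated goal of keeping wbc_entity_usage small; it does not make the usage itself any more precise.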

Change 383990 had a related patch set uploaded (by Eranroz; owner: Eranroz):
[mediawiki/extensions/Wikibase@master] Access to property by name

https://gerrit.wikimedia.org/r/383990

Change 383990 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] getAllStatements and access to property by name

https://gerrit.wikimedia.org/r/383990

This will go live this week. It would be good to measure the impact.

The above patch (383990) doesn't have any impact by itself. Wikis have to adopt it; it is just a convenience method for avoiding improper property usage.
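
For wikis adopting it, a module helper might look like the sketch below. It assumes, going by the patch title, that `entity:getAllStatements` accepts either a property ID or a property label; `p.statementsFor` is a hypothetical helper name, and `entity` is expected to be the object a module obtains from `mw.wikibase.getEntity()`.

```lua
local p = {}

-- Returns the statements for `property` (an ID such as "P31" or a label),
-- without iterating over entity.claims in the module itself.
function p.statementsFor(entity, property)
	if not entity or not property then
		return {}
	end
	return entity:getAllStatements(property)
end
```

Passing the entity in as an argument keeps the helper easy to exercise with a stub, and leaves the label-to-ID resolution to Wikibase rather than to a loop in the module.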

Ladsgroup lowered the priority of this task from High to Low. Fri, Feb 2, 2:59 AM

Given that we built T185693: Implement a (more liberal) usage aspect deduplicater (days: 3), this can't blow up the database anymore.