In PropertyInfoTableBuilder, replace the usage of wb_entity_per_page with using the page and redirect tables
Closed, Resolved (Public)


PropertyInfoTableBuilder uses wb_entity_per_page to list all properties. This can be done using the page table instead, based on the namespace defined by PropertyHandler.

Event Timeline

WMDE-leszek lowered the priority of this task from High to Medium. Apr 25 2017, 3:50 PM
WMDE-leszek moved this task from Proposed to Backlog on the Wikidata-Former-Sprint-Board board.

@Ladsgroup want to tackle this? I think it's the last blocker for killing the table!

Hmm, sure, but how are we going to rewrite the query? It joins on pi_property_id = epp_entity_id and also orders by epp_entity_id (an integer). It seems this needs a higher-level refactor :( unless there is something I'm missing.

@Ladsgroup That join is indeed problematic. But it's only used to detect "new" properties, which do not yet have an entry in wb_property_info. It's triggered when --rebuild-all is set to false (the default). There are several options:

  1. Remove the --rebuild-all option, and always rebuild all property info entries. There shouldn't be that many properties, so why not.
  2. Load the set of IDs from wb_property_info first, and just skip the known ones in PHP code. A list of a few thousand integers shouldn't be a problem to hold in memory.
  3. Change the join to use CONCAT( 'P', pi_property_id ) = page_title. This will trigger a filesort, but a) the query is only run manually, by a maintenance script, and b) the set of data is going to be small.
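
To make options 2 and 3 concrete, here is a rough sketch in Python against an in-memory SQLite database. It is not the actual PHP maintenance script. The function names, the demo data, and the namespace number 120 are assumptions for illustration only; the real code would take the namespace from the property handler. SQLite's `||` operator stands in for CONCAT, and the `page_is_redirect = 0` filter is an assumed way of skipping redirects.

```python
import sqlite3


def build_demo_db():
    """Create tiny stand-ins for the page and wb_property_info tables (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_namespace INTEGER,
            page_title TEXT,
            page_is_redirect INTEGER
        );
        CREATE TABLE wb_property_info (pi_property_id INTEGER PRIMARY KEY);
        INSERT INTO page VALUES
            (120, 'P2', 0), (120, 'P10', 0), (120, 'P31', 0),
            (120, 'P99', 1),  -- a redirect, should be ignored
            (0, 'Q42', 0);    -- an item page, wrong namespace
        INSERT INTO wb_property_info VALUES (2);
        """
    )
    return conn


def missing_ids_skip_in_code(conn, property_namespace):
    """Option 2: load the known IDs first, then skip them while scanning pages."""
    known = {row[0] for row in conn.execute(
        "SELECT pi_property_id FROM wb_property_info")}
    rows = conn.execute(
        "SELECT page_title FROM page"
        " WHERE page_namespace = ? AND page_is_redirect = 0",
        (property_namespace,),
    )
    # Titles look like 'P31'; drop the prefix to get the numeric ID.
    ids = (int(title[1:]) for (title,) in rows)
    # Sort as integers: ordering by page_title would put 'P10' before 'P2'.
    return sorted(i for i in ids if i not in known)


def missing_ids_concat_join(conn, property_namespace):
    """Option 3: let the database find pages with no matching property info row."""
    rows = conn.execute(
        "SELECT page_title FROM page"
        " LEFT JOIN wb_property_info ON 'P' || pi_property_id = page_title"
        " WHERE page_namespace = ? AND page_is_redirect = 0"
        " AND pi_property_id IS NULL",
        (property_namespace,),
    )
    return sorted(int(title[1:]) for (title,) in rows)


if __name__ == "__main__":
    demo = build_demo_db()
    print(missing_ids_skip_in_code(demo, 120))
    print(missing_ids_concat_join(demo, 120))
```

Both functions should report the same set of properties lacking an entry in wb_property_info. Option 2 keeps the SQL trivial at the cost of holding the ID set in memory; option 3 keeps the logic in one query but joins on a computed string, which is why it cannot use an index on the ID column.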

Change 350884 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/extensions/Wikibase@master] Drop use of entity_per_page from PropertyInfoTableBuilder

Change 350884 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Drop use of entity_per_page from PropertyInfoTableBuilder