
Investigate removing interdependence between page id and id of mediainfo item in a slot on that page
Closed, Resolved, Public

Description

If there's a MediaInfo item in a slot on the page with id <x>, then the MediaInfo item's id is M<x>, and the MediaInfo code depends on this being the case.

Sometimes the page id can change (apparently under certain circumstances during delete/restore, although we haven't been able to reproduce it), and that causes a fatal error if the page has a MediaInfo slot, because the ids no longer match.

Investigate removing the interdependence of the ids

Event Timeline

Cparle removed Cparle as the assignee of this task. Sep 3 2019, 4:16 PM
Cparle moved this task from Doing to To Do on the Structured-Data-Team-Current-Work board.
Cparle moved this task from To Do to Blocked on the Structured-Data-Team-Current-Work board.

WMDE is thinking about a different solution to the parent task atm ...

It seems like the dependency between the page_id and the MediaInfo item id is baked in at a very deep level

For example, here's the call chain for getting the MediaInfo item with id M1234

$entityId = new MediaInfoId( 'M1234' );
$entity = WikibaseRepo::getDefaultInstance()->getEntityLookup()->getEntity( $entityId );

calls

RevisionBasedEntityLookup::getEntity( $entityId );

calls

WikiPageEntityRevisionLookup.php::getEntityRevision( $entityId );

calls

WikiPageEntityMetaDataLookup.php::loadRevisionInformation( [ $entityId ] );

calls

MediaInfoEntityQuery.php::selectRows() to get rows from the page table, which creates a condition for the query using getConditionForEntityId() which sets 'page_id' => $entityId->getNumericId()

... or in other words, to retrieve the MediaInfo data for an item with id=M1234, it must be stored in a slot on a page with page_id=1234

Or, in other words, the id of the entity is irrelevant - what counts for retrieval is the id of the page
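
To make the coupling concrete, here is a minimal, self-contained sketch of the last step in the chain. It is not the real Wikibase or WikibaseMediaInfo code: SketchMediaInfoId and sketchConditionForEntityId() are hypothetical stand-ins that only mimic the behaviour described above, where the numeric part of the entity id is used directly as the page_id condition.

```php
<?php
// Hypothetical stand-in for MediaInfoId; the real class lives in WikibaseMediaInfo.
final class SketchMediaInfoId {
	/** @var string */
	private $serialization;

	public function __construct( string $serialization ) {
		// Accept only ids of the form M<positive integer>.
		if ( !preg_match( '/^M[1-9]\d*$/', $serialization ) ) {
			throw new InvalidArgumentException( "Invalid MediaInfo id: $serialization" );
		}
		$this->serialization = $serialization;
	}

	public function getNumericId(): int {
		// Strip the leading "M" and return the numeric part.
		return (int)substr( $this->serialization, 1 );
	}
}

// Mimics the condition described above: the entity id's numeric part
// is used as the page_id to select rows from the page table.
function sketchConditionForEntityId( SketchMediaInfoId $entityId ): array {
	return [ 'page_id' => $entityId->getNumericId() ];
}

$condition = sketchConditionForEntityId( new SketchMediaInfoId( 'M1234' ) );
```

In this sketch nothing other than the numeric part of the id reaches the query, so a page whose page_id changed would no longer be found under its old entity id.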

Unpacking this would require a lot of work, and a new way of storing relationships between entity ids and page ids (probably a new db table)
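
As a rough illustration of what such a mapping might look like, the sketch below uses an in-memory array in place of a database table. Everything in it is an assumption for illustration only: the class SketchEntityPageMap, its methods, and the page id 5678 do not exist in Wikibase or in this task.

```php
<?php
// Hypothetical in-memory stand-in for a mapping table (entity id => page id).
final class SketchEntityPageMap {
	/** @var int[] page ids keyed by entity id serialization */
	private $pageIdByEntityId = [];

	public function set( string $entityId, int $pageId ): void {
		$this->pageIdByEntityId[$entityId] = $pageId;
	}

	public function getPageId( string $entityId ): ?int {
		// Returns null when no mapping exists for the entity id.
		return $this->pageIdByEntityId[$entityId] ?? null;
	}
}

$map = new SketchEntityPageMap();
$map->set( 'M1234', 1234 );
// If a delete/restore assigned a new page id (5678 is made up),
// only the mapping row would change; the entity id stays M1234.
$map->set( 'M1234', 5678 );
$pageId = $map->getPageId( 'M1234' );
```

With a lookup like this, retrieval would go through the mapping rather than through getNumericId(), which is the part that would need the deep refactoring mentioned above.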

Cparle renamed this task from Remove interdependence between page id and id of mediainfo item in a slot on that page to Investigate removing interdependence between page id and id of mediainfo item in a slot on that page. Sep 5 2019, 2:27 PM

It seems like the dependency between the page_id and the MediaInfo item id is baked in at a very deep level

Yup

... or in other words, to retrieve the MediaInfo data for an item with id=M1234, it must be stored in a slot on a page with page_id=1234

Yup

Or, in other words, the id of the entity is irrelevant - what counts for retrieval is the id of the page

Unpacking this would require a lot of work, and a new way of storing relationships between entity ids and page ids (probably a new db table)

Yup.
It's probably easier / better to go down the route of ensuring page ids remain the same, or perhaps allowing old revisions of an entity to actually have a different entity ID. I'm not sure if that is too evil, especially if redirects exist for the old entity ids, which they should.
"previous entity ids" could also be held within the final entity, but that has the possibility of growing without control.

... or in other words, to retrieve the MediaInfo data for an item with id=M1234, it must be stored in a slot on a page with page_id=1234

I don’t quite see the problem yet, or perhaps I’m interpreting it differently… but as far as I can tell, if you consider the entity ID to change when the page ID changes, that still works.

That'd mean we'd (potentially) have to edit the json blob for slots when a page is deleted/restored

Also - I was under the impression that Wikibase ids were considered immutable

Cparle claimed this task.

The interdependence of the ids can't be changed without some very deep refactoring, and the parent bug is resolved, so resolving this