Update mediawiki-history subgraph-partitioner so that it uses [page/user]_id in addition to title/text
Open, Medium, Public

Description

Log events have carried page_id since ~2014. Subgraph partitioning helps rebuild history when page_id is not present, but it can also break lineage when the data is inconsistent: events with the same page_id end up in different subgraphs because a move event has been lost. There should be a way to solve this, either through hierarchical graph partitioning or with a two-step job (separate events that have a page_id from those that do not, and apply subgraph partitioning only to those without).
Note: This task was created after being mentioned in T213603.
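
To make the two-step idea from the description concrete, here is a minimal Python sketch. It is not the actual mediawiki-history code; the event shape (dicts with `page_id`, `title`, `new_title`) and the function name `partition_events` are assumptions made for illustration only. Events that carry a page_id are grouped directly by it, so a lost move event cannot split them; only the remaining events go through title-based subgraph grouping, done here with a simple union-find.

```python
from collections import defaultdict


def partition_events(events):
    """Hypothetical two-step partitioning.

    Step 1: events with a page_id are grouped by that id.
    Step 2: events without a page_id are grouped into title-connected
            subgraphs, where a move event links its old and new title.
    """
    by_id = defaultdict(list)
    without_id = []
    for ev in events:
        if ev.get("page_id") is not None:
            by_id[ev["page_id"]].append(ev)
        else:
            without_id.append(ev)

    # Union-find over titles, used only for events lacking a page_id.
    parent = {}

    def find(title):
        parent.setdefault(title, title)
        while parent[title] != title:
            parent[title] = parent[parent[title]]  # path halving
            title = parent[title]
        return title

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for ev in without_id:
        find(ev["title"])
        if ev.get("new_title") is not None:  # assumed marker of a move event
            union(ev["title"], ev["new_title"])

    by_subgraph = defaultdict(list)
    for ev in without_id:
        by_subgraph[find(ev["title"])].append(ev)

    return dict(by_id), list(by_subgraph.values())


# Example: the move A -> C for page 1 was lost, yet both events stay together.
events = [
    {"page_id": 1, "title": "A"},
    {"page_id": 1, "title": "C"},
    {"page_id": None, "title": "X", "new_title": "Y"},
    {"page_id": None, "title": "Y"},
    {"page_id": None, "title": "Z"},
]
id_groups, title_subgraphs = partition_events(events)
```

This sketch leaves one question open: pre-2014 events without a page_id would still need to be linked to the id-based groups they belong to, which is where a hierarchical approach might come in.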