
Provide a way to list edits affecting a specific query
Closed, Declined (Public)

Description

If a list is maintained manually, it is easy to track changes to the list with the "view history" function. But if the list is generated automatically, there is no easy way to do so.

If there were only triple additions (no removals), changes could be tracked as follows:

  1. Create a new table listing every RDF triple together with the revision in which that triple was added.
  2. To find the edits affecting a specific query, look up all triples related to the query and the revisions that added them. Then remove duplicate revisions and sort the remaining revisions by date. This gives the result we need.
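The two steps above can be sketched roughly as follows. This is only an illustration of the additions-only case, assuming an SQLite table; the table name `triple_revision`, the function names, and the sample triples, revision IDs, and timestamps are all made up for the example and do not correspond to anything in Wikibase or the query service.

```python
import sqlite3


def build_index(additions):
    """Step 1: store each added triple with the revision that added it.

    `additions` is an iterable of
    (subject, predicate, object, revision_id, timestamp) tuples.
    The schema is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triple_revision ("
        " subject TEXT, predicate TEXT, object TEXT,"
        " revision_id INTEGER, timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO triple_revision VALUES (?, ?, ?, ?, ?)", additions
    )
    return conn


def revisions_for_triples(conn, triples):
    """Step 2: collect the revisions that added any of the given triples.

    Duplicates are removed with a set, and the result is sorted by
    timestamp (ISO 8601 strings sort chronologically), oldest first.
    """
    found = set()
    for s, p, o in triples:
        rows = conn.execute(
            "SELECT revision_id, timestamp FROM triple_revision"
            " WHERE subject = ? AND predicate = ? AND object = ?",
            (s, p, o),
        )
        found.update(rows.fetchall())
    return sorted(found, key=lambda row: (row[1], row[0]))


# Made-up sample data: revision 101 adds two triples, revision 90 adds one.
additions = [
    ("Q1", "P31", "Q5", 101, "2017-01-02T00:00:00Z"),
    ("Q1", "P21", "Q6581097", 101, "2017-01-02T00:00:00Z"),
    ("Q3", "P31", "Q5", 90, "2016-12-01T00:00:00Z"),
    ("Q4", "P31", "Q5", 120, "2017-05-01T00:00:00Z"),
]
conn = build_index(additions)

# Triples assumed to be related to the query of interest.
related = [
    ("Q1", "P31", "Q5"),
    ("Q1", "P21", "Q6581097"),
    ("Q3", "P31", "Q5"),
]
history = revisions_for_triples(conn, related)
```

Here `history` lists revision 90 before revision 101, and revision 101 appears only once even though it added two of the related triples. The sketch does not say how the related triples of a query would be determined, and it does not cover removals.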

Open questions:

  1. How should triple removals be handled? Should we create a new table for all former triples, and how would we query them?
  2. How should this be integrated with the traditional article history?

Event Timeline

Restricted Application added a subscriber: Aklapper.

Frankly, I have no idea how we could do this, because each edit adds and removes lots of triples. I suspect it won't scale, at least with the present hardware and database, but even if it did, it would require keeping a shadow database with all triples since the beginning of history. It would also make querying harder, as we should only consider the triples that have "actual", not "historic", revisions. I'm not sure Blazegraph is really suited for such a thing.

Smalyshev moved this task from Incoming to Need investigation on the Wikidata-Query-Service board.
Gehel subscribed.

This sounds like a hugely complex problem to address. It sounds like the classic cache invalidation / propagation problem, with a bit of a termination problem on top. It is extremely unlikely that we will ever be able to address something of that scale of complexity.