As categories change, we need to update the contents of the graph database that hosts them. For this, we need to work out a mechanism for applying those updates.
Current thinking is:
- Every day, generate RDF for the updated categories, in the form of a SPARQL Update file.
- Load that file into Blazegraph once it has been created.
- Do this for each wiki that has the functionality enabled.
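The daily file could look roughly like what the sketch below produces. This is only an illustration under assumptions: the `CategoryChange` type, the `build_update` helper, and the `http://example.org/ontology#` predicate are hypothetical and are not the actual category vocabulary. The only things relied on are the standard SPARQL 1.1 Update forms `DELETE WHERE` and `INSERT DATA`, with operations separated by `;`. For each changed category it drops the existing triples and inserts the fresh ones.

```python
from dataclasses import dataclass, field

# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/ontology#"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


@dataclass
class CategoryChange:
    """One category whose data changed during the day (hypothetical shape)."""
    uri: str
    label: str
    parents: list[str] = field(default_factory=list)


def escape_literal(text: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_update(changes: list[CategoryChange]) -> str:
    """Build one SPARQL Update request covering all changed categories."""
    operations = []
    for change in changes:
        # Remove whatever is currently stored for this category.
        operations.append(f"DELETE WHERE {{ <{change.uri}> ?p ?o }}")
        # Insert the current state of the category.
        triples = [
            f'<{change.uri}> <{RDFS_LABEL}> "{escape_literal(change.label)}" .'
        ]
        for parent in change.parents:
            triples.append(f"<{change.uri}> <{EX}isInCategory> <{parent}> .")
        operations.append("INSERT DATA {\n  " + "\n  ".join(triples) + "\n}")
    # SPARQL 1.1 Update separates operations in one request with ';'.
    return " ;\n".join(operations)
```

Writing the returned string to a file gives one update file per wiki per day, which can then be submitted to the wiki's Blazegraph namespace.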
enwiki seems to have had 73662 category updates and 498 category creations on August 19th 2017. Similar numbers show up on other days. This seems to be a completely workable number to process daily. Moreover, many of the updates hit the same categories: the real number of distinct categories updated on enwiki seems to be around 25K/day.
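Since repeated updates to the same category only need to be applied once per day, the change list can be collapsed before generating the update. A minimal sketch, assuming the day's changes are available as `(category_title, timestamp)` pairs with ISO 8601 UTC timestamps (the input shape and the `distinct_categories` name are assumptions for illustration):

```python
def distinct_categories(events):
    """Collapse change events to one entry per category.

    `events` is an iterable of (category_title, timestamp) pairs.
    Returns a dict mapping each title to its latest timestamp.
    ISO 8601 UTC timestamps in the same format compare correctly as strings.
    """
    latest = {}
    for title, timestamp in events:
        if title not in latest or timestamp > latest[title]:
            latest[title] = timestamp
    return latest
```

The size of the returned dict is the number of categories that actually need to be regenerated for that day.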
On Commons, the numbers seem to be about 2-3x the enwiki figures for modifications and about 5x for creations. That still seems workable, and Commons is probably the upper bound of what we're going to get.