
The streaming updater consumer should log information when divergences are detected
Open, Needs Triage | Public | 1 Estimated Story Points

Description

As a maintainer of the rdf-streaming-updater, I want information to be logged when divergences are detected on patch application, so that I can more easily debug the cause of these divergences.

When applying an RDF patch to the triple store (Blazegraph), divergences may occur for the following reasons:

  • actual divergences: the state of the store is not what the Flink pipeline expects
  • false positives: some triples/literals are modified on the fly by Blazegraph (Unicode normalization, truncation of large values, precision changes). These are expected to affect between a couple and a dozen triples per hour.

Finding the cause of a non-negligible bump in the number of divergences is not straightforward today. Adding more logs to the streaming consumer would help such investigations.
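As a rough illustration of the kind of log line that could help, here is a minimal Python sketch. It assumes a divergence shows up as a mismatch between the number of triple mutations the pipeline expected and the number the store reports. That detection rule, the `PatchResult` type, its field names and the `log_divergence` helper are all hypothetical and are not taken from the actual consumer code.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("streaming_updater_consumer")


@dataclass(frozen=True)
class PatchResult:
    """Hypothetical summary of one applied patch (all field names are assumptions)."""
    entity_ids: tuple         # entities touched by the patch, e.g. ("Q42",)
    event_offset: int         # position of the patch in the input stream
    expected_mutations: int   # triples the pipeline expected to add or delete
    actual_mutations: int     # triples the store reports as actually changed


def log_divergence(result: PatchResult) -> bool:
    """Log a warning and return True when the store disagrees with the pipeline."""
    diverged = result.actual_mutations != result.expected_mutations
    if diverged:
        # Include enough context to trace the changes behind a bump of divergences.
        logger.warning(
            "Divergence detected: offset=%d entities=%s expected=%d actual=%d",
            result.event_offset,
            ",".join(result.entity_ids),
            result.expected_mutations,
            result.actual_mutations,
        )
    return diverged
```

Logging the stream offset and the entity IDs together would make it possible to go from a spike in the divergence count back to the individual changes. Whether those fields are available at the point where the patch is applied is something to confirm in the consumer itself.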

AC:

  • meaningful logs that make it possible to trace which changes are involved in a bump of divergences