As a maintainer of the rdf-streaming-updater, I want information to be logged when divergences are detected on patch application, so that I can more easily debug the cause of these divergences.
When applying an RDF patch to the triple store (Blazegraph), divergences may occur for the following reasons:
- actual divergences: the state of the store is not what the Flink pipeline expects
- false positives: some triples/literals are modified on the fly by Blazegraph (Unicode normalization, cutoff of large values, precision changes). These should amount to between a couple and a dozen triples per hour.
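To illustrate the Unicode normalization case, here is a minimal sketch of how a log consumer might flag a diverging literal as a likely false positive. It assumes the store normalizes to NFC, which this ticket does not state; the function name `is_normalization_false_positive` is hypothetical and not part of the streaming updater.

```python
import unicodedata


def is_normalization_false_positive(expected: str, stored: str, form: str = "NFC") -> bool:
    """Return True when two literals differ only by Unicode normalization.

    `form` defaults to NFC as an assumption; the form actually applied
    by the triple store would need to be confirmed.
    """
    if expected == stored:
        # Identical literals are not a divergence at all.
        return False
    return unicodedata.normalize(form, expected) == unicodedata.normalize(form, stored)


# "e" followed by a combining acute accent vs. the precomposed character.
print(is_normalization_false_positive("e\u0301", "\u00e9"))
```

A check like this would only cover one of the three false-positive causes listed above; value cutoff and precision changes would need their own heuristics.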
Finding the cause of a non-negligible bump in the number of divergences is not straightforward today. Adding more logs to the streaming consumer will help such investigations.
- meaningful logs that make it possible to trace which changes are involved in a bump in divergences
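As a rough sketch of what such logs could look like, the snippet below compares a patch against an in-memory set of triples and logs every diverging triple together with the change it belongs to. Everything here is an assumption made for illustration: the `Patch` shape, the `entity_id` and `revision` fields, the two divergence kinds, and the log format are hypothetical and do not reflect the actual consumer code or how Blazegraph reports mutations.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("divergence_sketch")

Triple = tuple[str, str, str]


@dataclass(frozen=True)
class Patch:
    # Hypothetical patch shape: the change it comes from plus its triples.
    entity_id: str
    revision: int
    added: frozenset[Triple]
    deleted: frozenset[Triple]


def find_divergences(store: set[Triple], patch: Patch) -> dict[str, list[Triple]]:
    """Return the triples whose state in the store contradicts the patch."""
    return {
        # Triples the patch wants to add but that are already in the store.
        "already_present": sorted(patch.added & store),
        # Triples the patch wants to delete but that are not in the store.
        "missing_on_delete": sorted(patch.deleted - store),
    }


def apply_patch(store: set[Triple], patch: Patch) -> dict[str, list[Triple]]:
    """Log each divergence with its originating change, then apply the patch."""
    divergences = find_divergences(store, patch)
    for kind, triples in divergences.items():
        for triple in triples:
            logger.warning(
                "divergence kind=%s entity=%s revision=%d triple=%s",
                kind, patch.entity_id, patch.revision, " ".join(triple),
            )
    store -= patch.deleted
    store |= patch.added
    return divergences
```

Logging the entity and revision next to each triple is the part that matters for tracing a bump: it lets divergences be grouped by change rather than only counted.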