
Extract more metrics from blazegraph sparql update response
Closed, Resolved · Public

Description

As of today, when we run an update against Blazegraph we only extract the total number of mutations performed.
But the current update process runs between 5 and 7 update queries in one request. Having numbers on how each query performed might be interesting, especially to answer some of the questions asked in T239687.

The goal is to make org.wikidata.query.rdf.tool.rdf.RdfRepository#syncFromChangesNonMerging report all the known metrics for each individual update query we send in this batch.
Because Blazegraph only returns HTML in response to an update, something similar to org.wikidata.query.rdf.tool.rdf.client.UpdateCountResponse will have to be added.
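
As a rough illustration of what such a response handler might do, here is a minimal sketch that pulls every `name=value` pair out of each `<p>` block of an HTML response. The class name `UpdateStatsSketch`, the sample response text, and the stat names in it (`totalElapsed`, `elapsed`, `whereClause`, `commitTime`, `mutationCount`) are assumptions made for the example; the real Blazegraph output and the handler merged for this task may look different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of wikidata/query/rdf.
final class UpdateStatsSketch {
    // Assumption: each update query (and the final commit) is reported in its own <p>...</p> block.
    private static final Pattern PARAGRAPH = Pattern.compile("<p>(.*?)</p>", Pattern.DOTALL);
    // Matches name=123 or name=123ms; the optional "ms" suffix is dropped.
    private static final Pattern STAT = Pattern.compile("(\\w+)=(\\d+)(?:ms)?");

    /** Returns one map of stat name to numeric value per <p> block that contains stats. */
    static List<Map<String, Long>> parse(String html) {
        List<Map<String, Long>> result = new ArrayList<>();
        Matcher paragraph = PARAGRAPH.matcher(html);
        while (paragraph.find()) {
            Map<String, Long> stats = new LinkedHashMap<>();
            Matcher stat = STAT.matcher(paragraph.group(1));
            while (stat.find()) {
                stats.put(stat.group(1), Long.parseLong(stat.group(2)));
            }
            if (!stats.isEmpty()) {
                result.add(stats);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative response only; the real format is an assumption here.
        String sample = "<html><body>"
                + "<p>totalElapsed=12ms, elapsed=10ms, whereClause=3ms</p>"
                + "<p>totalElapsed=7ms, elapsed=5ms, whereClause=1ms</p>"
                + "<hr><p>COMMIT: totalElapsed=25ms, commitTime=1576000000000, mutationCount=42</p>"
                + "</body></html>";
        for (Map<String, Long> stats : parse(sample)) {
            System.out.println(stats);
        }
    }
}
```

Keeping one map per block, rather than a single merged map, preserves the per-query breakdown that the task asks for. Keys that repeat across queries, such as `elapsed` in the sample, would otherwise overwrite each other.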

Event Timeline

Restricted Application added a subscriber: Aklapper.
dcausse assigned this task to Zbyszko.
dcausse triaged this task as Medium priority.
dcausse updated the task description.
dcausse added a subscriber: Zbyszko.

Change 557048 had a related patch set uploaded (by ZPapierski; owner: ZPapierski):
[wikidata/query/rdf@master] ResponseHandler that extract every single stat available from Blazegraph response

https://gerrit.wikimedia.org/r/557048

Change 562478 had a related patch set uploaded (by ZPapierski; owner: ZPapierski):
[wikidata/query/rdf@master] Smoke test to ensure query modification safety

https://gerrit.wikimedia.org/r/562478

Change 562478 merged by jenkins-bot:
[wikidata/query/rdf@master] Smoke test to ensure query modification safety

https://gerrit.wikimedia.org/r/562478

Change 557048 merged by jenkins-bot:
[wikidata/query/rdf@master] ResponseHandler that extract every single stat available from Blazegraph response

https://gerrit.wikimedia.org/r/557048