Improve parallelism in WDQS updater
Closed, Resolved, Public

Description

During investigation of update lag on wdqs, we can see:

  • Blazegraph updates are in part single-threaded (1 CPU at 100% utilization); there is not much we can do about that
  • During Blazegraph updates, the updater just waits synchronously

We can probably increase the parallelism in the updater to get better resource utilisation.
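
A minimal sketch of the idea, not the code from the related patches: while one batch is being written (the blocking, single-threaded step), the next batch is prepared on a background thread, so the updater no longer sits idle during the write. The names `run_pipeline`, `prepare` and `write` are hypothetical; `prepare` stands in for fetching and building the RDF for a batch, and `write` for the synchronous Blazegraph update.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(batches, prepare, write):
    """Overlap the preparation of batch N+1 with the blocking write of batch N.

    `prepare` and `write` are hypothetical callables supplied by the caller.
    Writes stay sequential and in order; only preparation runs in the background.
    """
    results = []
    batch_iter = iter(batches)
    try:
        first = next(batch_iter)
    except StopIteration:
        return results  # nothing to do

    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(prepare, first)
        for upcoming in batch_iter:
            ready = pending.result()                  # wait for the prepared batch
            pending = pool.submit(prepare, upcoming)  # start preparing the next one
            results.append(write(ready))              # blocking write runs meanwhile
        results.append(write(pending.result()))       # write the final batch
    return results
```

Keeping a single writer preserves the order of updates; the gain comes only from hiding preparation time behind the write, which matches the observation that the write itself cannot be sped up much.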

Details

Related Gerrit Patches:
operations/puppet : production [wdqs] enable async imports on wdqs1005 and wdqs2001
operations/puppet : production [wdqs] enable asynchronous imports on wdqs1004
operations/puppet : production [wdqs] enable asynchronous imports on wdqs1004
operations/puppet : production [wdqs] add async-import option
wikidata/query/rdf : master Add more parallelism to the updater
wikidata/query/rdf : master Collect more metrics from the Updater

Event Timeline

Gehel created this task. Nov 12 2019, 9:27 AM
Restricted Application added a project: Wikidata. Nov 12 2019, 9:27 AM
Restricted Application added a subscriber: Aklapper.

Change 551207 had a related patch set uploaded (by DCausse; owner: DCausse):
[wikidata/query/rdf@master] Collect 3 more metrics from the Updater

https://gerrit.wikimedia.org/r/551207

Change 551207 merged by jenkins-bot:
[wikidata/query/rdf@master] Collect more metrics from the Updater

https://gerrit.wikimedia.org/r/551207

Change 552262 had a related patch set uploaded (by DCausse; owner: DCausse):
[wikidata/query/rdf@master] Add more parallelism to the updater

https://gerrit.wikimedia.org/r/552262

Change 552262 merged by jenkins-bot:
[wikidata/query/rdf@master] Add more parallelism to the updater

https://gerrit.wikimedia.org/r/552262

dcausse claimed this task. Nov 25 2019, 2:53 PM
dcausse triaged this task as Medium priority.

Change 552835 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/puppet@production] [wdqs] add async-import option

https://gerrit.wikimedia.org/r/552835

Change 552836 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/puppet@production] [wdqs] enable asynchronous imports on wdqs1004

https://gerrit.wikimedia.org/r/552836

Change 552836 merged by Gehel:
[operations/puppet@production] [wdqs] enable asynchronous imports on wdqs1004

https://gerrit.wikimedia.org/r/552836

Change 552835 merged by Gehel:
[operations/puppet@production] [wdqs] add async-import option

https://gerrit.wikimedia.org/r/552835

Change 558526 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/puppet@production] [wdqs] enable asynchronous imports on wdqs1004

https://gerrit.wikimedia.org/r/558526

Change 558526 merged by Gehel:
[operations/puppet@production] [wdqs] enable asynchronous imports on wdqs1004

https://gerrit.wikimedia.org/r/558526

Change 559847 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/puppet@production] [wdqs] enable async imports on wdqs1005 and wdqs2001

https://gerrit.wikimedia.org/r/559847

Change 559847 abandoned by DCausse:
[wdqs] enable async imports on wdqs1005 and wdqs2001

Reason:
superseded by I898c5ddfaf15a34495316bf45dbda47e431e4877

https://gerrit.wikimedia.org/r/559847

Possibly relevant comment here: I believe there is also a plan to move to incremental updates (updating only the statements/triples that have changed). If so, it is probably important that any parallelism in updating be coordinated so that updates for the same item (Q value) are grouped together and done in the same process, so they don't clobber one another. Updates for separate items (different Q values) can be handled in parallel, as the associated RDF triples are independent (the subject of a triple is always the item, a statement on the item, or a further node derived from the item). Even without incremental updates, grouping updates on the same item could be a significant speed boost: under the current procedure of completely rewriting an item's triples, 5 updates for Q9999 can be collapsed into just the last one.
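
The grouping described in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not the updater's actual logic: change events are assumed to be `(item_id, revision)` pairs, and `collapse_updates` and `partition_by_item` are hypothetical helper names.

```python
import zlib


def collapse_updates(events):
    """Keep only the highest revision seen for each item.

    `events` is an iterable of (item_id, revision) pairs (an assumed format).
    Under a full-rewrite procedure, older revisions of an item are redundant.
    """
    latest = {}
    for item_id, revision in events:
        if item_id not in latest or revision > latest[item_id]:
            latest[item_id] = revision
    return latest


def partition_by_item(latest, workers):
    """Assign each item to exactly one worker bucket.

    A stable checksum of the item ID picks the bucket, so all updates for a
    given item always land on the same worker and cannot clobber one another.
    """
    buckets = [[] for _ in range(workers)]
    for item_id, revision in latest.items():
        index = zlib.crc32(item_id.encode("utf-8")) % workers
        buckets[index].append((item_id, revision))
    return buckets
```

For example, `collapse_updates([("Q9999", 1), ("Q42", 7), ("Q9999", 5), ("Q9999", 3)])` leaves one entry per item, with revision 5 for Q9999. Note that collapsing to the last revision only applies to the full-rewrite procedure; with incremental updates, intermediate changes would presumably have to be kept and applied in order within each bucket.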

Gehel closed this task as Resolved. Feb 26 2020, 4:14 PM