ipoid: Identify import process bottlenecks
Closed, Declined · Public

Description

Summary

The time ipoid takes to complete its daily import of the latest Spur data has recently increased: a single import now takes more than a day to finish.

We should investigate why this is so and identify any potential bottlenecks.

Technical notes

There's a fair amount of preexisting Node.js profiling tooling we could leverage, e.g. 0x, or perf directly.
It'd probably be easiest to start by trying to replicate the issue locally and, if that fails, to consult with SRE to see whether we could obtain profiles for a production import.
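
Alongside a sampling profiler, coarse per-phase timing can show whether the import spends its time in SQL, in parsing, or in sleeps between batches. The following is a minimal sketch under assumptions: `PhaseTimer` and the phase names are hypothetical and not part of ipoid, and the clock is injectable only so the helper can be exercised deterministically.

```typescript
// Hypothetical helper (not part of ipoid): accumulates wall-clock
// milliseconds per named phase of a long-running job.
type NowFn = () => number;

class PhaseTimer {
  private readonly totals = new Map<string, number>();
  private readonly now: NowFn;

  // Defaults to Date.now(); millisecond resolution is plenty for an
  // import that runs for hours.
  constructor(now: NowFn = () => Date.now()) {
    this.now = now;
  }

  // Time a synchronous phase and add its duration to the phase total.
  time<T>(phase: string, fn: () => T): T {
    const start = this.now();
    try {
      return fn();
    } finally {
      this.add(phase, this.now() - start);
    }
  }

  // Same as time(), for promise-returning phases such as a SQL batch
  // or the sleep between batches.
  async timeAsync<T>(phase: string, fn: () => Promise<T>): Promise<T> {
    const start = this.now();
    try {
      return await fn();
    } finally {
      this.add(phase, this.now() - start);
    }
  }

  private add(phase: string, ms: number): void {
    this.totals.set(phase, (this.totals.get(phase) ?? 0) + ms);
  }

  // Phases sorted by accumulated time, largest first.
  report(): Array<[string, number]> {
    return Array.from(this.totals.entries()).sort((a, b) => b[1] - a[1]);
  }
}
```

In a local run, each step of the import loop could be wrapped in `timer.timeAsync("sql", ...)` or `timer.timeAsync("sleep", ...)`, and `timer.report()` logged at the end. If one phase dominates, that narrows down where a 0x or perf profile is worth collecting.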

Acceptance criteria

  • The ipoid import process has been profiled and potential bottlenecks have been identified.

Event Timeline

Some notes for whoever has time to investigate this more:

  • We do our best to maintain parity with the day's feed and keep no historical data, so if feeds got bigger, they'll take more time to process, since batches have a sleep in between them.
    • Just something I'm curious about: do we have any stale data lingering in the table? We shouldn't, but a check of last_updated in the actor table should confirm this fairly quickly.
  • I think Grafana data only lasts 90 days? But we have a table, import_status, that has logs of all of our imports, so we could look at that to answer questions like "did feeds get bigger?" (more batches per import).
  • @Tchanders mentioned that at one point we shifted from async to sync processing. I have less of a recollection, but it may be about synchronous processing of batches, which I think we would have done to ensure that, when we split the SQL commands into batches, we weren't running dependent queries (e.g. delete/re-insert) out of order. We should investigate/revisit that decision.
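
To make the "bigger feeds plus a sleep between batches" point concrete, here is a rough back-of-the-envelope model. It is only a sketch: the field names and every number are illustrative assumptions, not values taken from ipoid or from import_status, and it assumes batches run strictly one after another with one sleep between each consecutive pair.

```typescript
// Illustrative model only: none of these names or numbers come from ipoid.
interface ImportShape {
  batches: number;         // number of batches in one import
  queryMsPerBatch: number; // assumed average time executing one batch
  sleepMsBetween: number;  // assumed pause between consecutive batches
}

interface ImportEstimate {
  totalMs: number;
  sleepMs: number;
  sleepShare: number; // fraction of the total spent sleeping, 0..1
}

function estimateImport(shape: ImportShape): ImportEstimate {
  // Sleeps sit between batches, so there is one fewer sleep than batches.
  const sleeps = Math.max(shape.batches - 1, 0);
  const sleepMs = sleeps * shape.sleepMsBetween;
  const totalMs = shape.batches * shape.queryMsPerBatch + sleepMs;
  return {
    totalMs,
    sleepMs,
    sleepShare: totalMs === 0 ? 0 : sleepMs / totalMs,
  };
}

// Made-up inputs: the same per-batch cost, with twice as many batches.
const smallFeed = estimateImport({ batches: 1000, queryMsPerBatch: 200, sleepMsBetween: 1000 });
const doubledFeed = estimateImport({ batches: 2000, queryMsPerBatch: 200, sleepMsBetween: 1000 });
console.log(smallFeed, doubledFeed);
```

Under these assumptions the total grows roughly linearly with the batch count, and most of it is sleep rather than query time. Plugging in real batch counts from import_status and the real sleep interval would show whether feed growth alone could explain the slowdown, or whether per-batch query time has also grown.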