
StateExtractionJob is too slow
Closed, Declined · Public · 5 Estimated Story Points

Description

As a user, when there is WDQS streaming updater maintenance I want less than 10 min of lag.

As a maintainer of the WDQS streaming updater, I want the StateExtractionJob to run in a reasonable amount of time so that I don't have to take the Flink streaming updater application down for too long.

As of today, running the state extraction job on an existing savepoint takes 6 hours. This is way too slow, and something is probably going wrong: writing a similar-size savepoint from CSV files takes less than one minute.

AC:

  • StateExtractionJob runs in less than 10 minutes for Wikidata

Event Timeline

MPhamWMF triaged this task as High priority. Jun 7 2021, 3:45 PM
MPhamWMF moved this task from Incoming to Scaling on the Wikidata-Query-Service board.
Gehel set the point value for this task to 5. Aug 30 2021, 3:50 PM
dcausse moved this task from incoming to in progress on the Wikidata board.
dcausse moved this task from in progress to incoming on the Wikidata board.
Gehel subscribed.

Known upstream issue, nothing left to do on our side. Hopefully this will be improved in a future Flink release.