
Estimate cirrus streaming updater's usage of MWAPI
Closed, Invalid | Public

Description

The current Search Update pipeline [FIXME:link to diagram] uses about half of the capacity of the JobRunners. When we deploy the new Search Update Pipeline, we'll be hitting the mw-api-int-async-ro API instead. Per T342252, the cirrus streaming updater will access MWAPI through mw-on-k8s. As mentioned in the linked ticket, we need to estimate the number of requests per second (RPS) the cirrus streaming updater will use. (See this dashboard for an example of a similar Flink app causing a 20x RPS increase during its backfill.)

Creating this ticket to:

  • Estimate the number of RPS the streaming updater will use.
  • Determine if there are other important usage metrics (payload size?) and estimate these as well.
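
As a rough starting point, the two estimates above could be sketched as a back-of-the-envelope calculation. This is only an illustration: the function name, its parameters, and every input value below are hypothetical placeholders, not measurements. The only figure taken from this ticket is the 20x multiplier, which comes from the similar Flink app's backfill and may not apply to the streaming updater.

```python
from dataclasses import dataclass


@dataclass
class LoadEstimate:
    steady_rps: float          # expected requests per second in normal operation
    backfill_rps: float        # expected requests per second during a backfill
    steady_bytes_per_s: float  # expected response payload volume in normal operation


def estimate_load(updates_per_second: float,
                  requests_per_update: float,
                  backfill_multiplier: float,
                  avg_payload_bytes: float) -> LoadEstimate:
    """Estimate MWAPI load from assumed per-update costs (all inputs are guesses)."""
    steady = updates_per_second * requests_per_update
    return LoadEstimate(
        steady_rps=steady,
        backfill_rps=steady * backfill_multiplier,
        steady_bytes_per_s=steady * avg_payload_bytes,
    )


if __name__ == "__main__":
    # Placeholder inputs: replace with observed values once rollout data exists.
    est = estimate_load(
        updates_per_second=100.0,   # hypothetical page update rate
        requests_per_update=1.0,    # hypothetical: one API fetch per update
        backfill_multiplier=20.0,   # from the similar Flink app's backfill
        avg_payload_bytes=50_000,   # hypothetical average response size
    )
    print(f"steady: {est.steady_rps:.0f} rps, "
          f"backfill: {est.backfill_rps:.0f} rps, "
          f"payload: {est.steady_bytes_per_s / 1e6:.1f} MB/s")
```

Real numbers for the update rate and payload size would need to come from the existing pipeline's metrics or from the first wikis enabled.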

Event Timeline

Met with @pfischer today about this topic. He pointed out that we're doing a phased rollout (gradually increasing the number of wikis), so we'll be able to estimate our resource usage much more accurately at that point.

As such, I'm removing the "current work" tag and we'll revisit this as we get closer to rollout.

Gehel triaged this task as High priority. Nov 3 2023, 10:29 AM
Gehel moved this task from Incoming to Ready for Work on the Data-Platform-SRE board.
bking moved this task from Ready for Work to Done on the Data-Platform-SRE board.

Upon further review, I'm declining this as invalid. We do need to track resource usage, but that shouldn't be limited to MWAPI. Other resource-related issues will turn up as we roll out in staging (T347075) and test a backfill (T350826), so we can add new subtasks off these tickets as needed.