
Issues in the dumps → mediawiki wikitext history → content gap metrics pipeline can significantly delay the movement metrics report
Open, Needs Triage, Public

Description

This task is primarily intended for documenting how Movement-Metrics is affected by problems with Dumps-Generation and mediawiki_wikitext_history. For that reason, it is not tagged with Data-Platform or Dumps-Generation.

SDS 2.6.2 (FY2023-24) has been focused on improving the delivery of the movement metrics report. Our critical path is as follows:

Before T357859, the knowledge gaps data took an average of 26 days to become available. Since then, the average has been 17 days.

| data interval | days to availability of knowledge gaps | notes |
| --- | --- | --- |
| 2023-09 | 23.1 | 1 day delay due to T342911 |
| 2023-10 | 25.5 | |
| 2023-11 | 27.9 | 4 day delay due to T342911 |
| 2023-12 | 26.1 | 2 day delay due to T342911 |
| 2024-01 | 26.9 | 1 day delay due to T342911, knowledge gaps job issue (T358613) |
| 2024-02 | 10.6 | First run skipping Wikidata to save time (T357859), 1 day delay due to T342911 |
| 2024-03 | 18.7 | Dumps generation issue, ultimately resolved by skipping Commons (T362454), 1 day delay due to T342911 |
| 2024-04 | 14.1 | Dumps generation issue (T364391) |
| 2024-05 | 23.8 | Major dumps generation issue (T365155) |
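
For anyone who wants to check the two averages quoted above against the table, here is a minimal Python sketch. The values are copied from the table. The split point is an assumption: it treats 2024-02, the first run that skipped Wikidata, as the first interval "after" T357859. The variable names are illustrative only and do not come from any pipeline code.

```python
from statistics import mean

# Days to availability of knowledge gaps, copied from the table in this task.
days_to_availability = {
    "2023-09": 23.1,
    "2023-10": 25.5,
    "2023-11": 27.9,
    "2023-12": 26.1,
    "2024-01": 26.9,
    "2024-02": 10.6,
    "2024-03": 18.7,
    "2024-04": 14.1,
    "2024-05": 23.8,
}

# Assumption: 2024-02 is the first data interval affected by T357859.
FIRST_INTERVAL_AFTER = "2024-02"

# YYYY-MM strings sort chronologically, so plain string comparison is enough.
before = [d for month, d in days_to_availability.items() if month < FIRST_INTERVAL_AFTER]
after = [d for month, d in days_to_availability.items() if month >= FIRST_INTERVAL_AFTER]

avg_before = mean(before)
avg_after = mean(after)

print(f"before T357859: {avg_before:.1f} days over {len(before)} intervals")
print(f"after T357859:  {avg_after:.1f} days over {len(after)} intervals")
```

Under that split, the averages come out to roughly 25.9 and 16.8 days, which round to the 26 and 17 days stated in the description.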

(raw data in spreadsheet)

Event Timeline

nshahquinn-wmf renamed this task from "Dumps and mediawiki_wikitext_history issues can significantly delay the movement metrics report" to "Issues in the dumps → mediawiki wikitext history → content gap metrics pipeline can significantly delay the movement metrics report". May 20 2024, 8:08 PM
nshahquinn-wmf updated the task description.