A full vertical slice of the system is successfully implemented.
Description
Related Objects
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T346147 Generate XML dumps for simplewiki
Resolved | | Milimetric | T335862 Implement job to generate Dump XML files
Resolved | | VirginiaPoundstone | T344690 [Spike] Quantify pages and revisions as relevant to dumps
Resolved | | Milimetric | T344691 [Spike] Understand how "large" pages (with lots of revisions) are problematic when writing XML to Hadoop
Resolved | | Milimetric | T344693 Understand Hadoop OutputFormat and how to solve the problem
Resolved | | Milimetric | T346378 Update XML dump generation code to use wmf_dumps.wikitext_raw_rc1 schema.
Event Timeline
Restricted Application added a subscriber: Aklapper. · 2023-09-12 13:31:47 (UTC+0)
VirginiaPoundstone triaged this task as Medium priority. · 2023-09-12 13:48:17 (UTC+0)
VirginiaPoundstone moved this task from Sprint Goals to Wormhole To Sprint 02 on the Data Products (Sprint 01) board. · 2023-10-03 13:47:16 (UTC+0)
VirginiaPoundstone edited projects, added Data Products (Sprint 02); removed Data Products (Sprint 01).
WDoranWMF moved this task from Backlog to In Process on the Data Products (Sprint 02) board. · 2023-10-03 13:52:30 (UTC+0)
VirginiaPoundstone raised the priority of this task from Medium to High. · 2023-10-17 12:26:00 (UTC+0)
WDoranWMF moved this task from Sprint Backlog to Teleport to Sprint 03 on the Data Products (Sprint 02) board. · 2023-10-24 12:56:13 (UTC+0)
WDoranWMF edited projects, added Data Products (Data Products (Sprint 03)); removed Data Products (Sprint 02).
VirginiaPoundstone lowered the priority of this task from High to Low. · 2023-10-30 19:40:07 (UTC+0)
WDoranWMF moved this task from Sprint Backlog to Wormhole to Sprint 04 on the Data Products (Data Products (Sprint 03)) board. · 2023-11-16 13:38:18 (UTC+0)
WDoranWMF edited projects, added Data Products (Data Products Sprint 04); removed Data Products (Data Products (Sprint 03)).
xcollazo closed subtask T335862: Implement job to generate Dump XML files as Resolved. · 2023-11-17 15:45:51 (UTC+0)
WDoranWMF moved this task from Sprint Backlog to Portal to Sprint 05 on the Data Products (Data Products Sprint 04) board. · 2023-12-11 17:06:20 (UTC+0)
WDoranWMF edited projects, added Data Products (Data Products Sprint 05); removed Data Products (Data Products Sprint 04).
WDoranWMF moved this task from Sprint Backlog to Back to main backlog on the Data Products (Data Products Sprint 05) board. · 2024-01-09 18:23:00 (UTC+0)
VirginiaPoundstone moved this task from Incoming to To be discussed on the Data Products board. · 2024-01-10 16:48:34 (UTC+0)
VirginiaPoundstone moved this task from To be discussed to Wikistats Backlog on the Data Products board. · 2024-03-18 19:52:04 (UTC+0)
VirginiaPoundstone moved this task from Wikistats Backlog to To be discussed on the Data Products board. · 2024-03-19 00:19:11 (UTC+0)
VirginiaPoundstone moved this task from To be discussed to Wikistats Backlog on the Data Products board. · 2024-03-25 16:51:27 (UTC+0)
VirginiaPoundstone moved this task from Wikistats Backlog to Pipelines Backlog on the Data Products board. · 2024-03-29 17:04:56 (UTC+0)
VirginiaPoundstone moved this task from Incoming to To be discussed/To be estimated on the Dumps 2.0 board. · 2024-05-14 13:36:24 (UTC+0)
lbowmaker moved this task from To be discussed/To be estimated to Backlog on the Dumps 2.0 board. · 2024-06-10 18:18:15 (UTC+0)