
Sqoop will fail on 2022-09-01 unless we fix templatelinks query
Closed, Resolved | Public | 3 Estimated Story Points | BUG REPORT

Description

The sqoop-whole-mediawiki process will fail for the next few months because the templatelinks table is being changed. Its schema is different, so we have to alter the Hive table and the Python logic/sqoop queries so the next run doesn't fail.

This has a hard deadline of 2022-08-30.

Along with this change, you'll need to sqoop the linktarget table, which the new ids in templatelinks reference as foreign keys. Without it, templatelinks isn't very useful. This may mean that the views on cloud and the puppet code both need updating to sqoop this new table.
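To illustrate why templatelinks is not very useful without linktarget, here is a minimal sketch of the join that recovers the template namespace and title from the new id. It runs against an in-memory SQLite database with made-up rows. The column names (`tl_from`, `tl_target_id`, `lt_id`, `lt_namespace`, `lt_title`) are assumed from the MediaWiki schema and are not stated in this task, and the helper names are hypothetical. This is not the refinery sqoop query.

```python
import sqlite3


def build_demo_db():
    """Create a toy database with the two tables (assumed column names, fake rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE linktarget (
            lt_id        INTEGER PRIMARY KEY,
            lt_namespace INTEGER,
            lt_title     TEXT
        );
        CREATE TABLE templatelinks (
            tl_from      INTEGER,  -- page that transcludes the template
            tl_target_id INTEGER   -- foreign key to linktarget.lt_id
        );
        INSERT INTO linktarget VALUES (1, 10, 'Infobox'), (2, 10, 'Citation_needed');
        INSERT INTO templatelinks VALUES (100, 1), (100, 2), (200, 1);
        """
    )
    return conn


def resolve_template_links(conn):
    """Join templatelinks to linktarget to get (page id, namespace, title) rows."""
    query = """
        SELECT tl.tl_from, lt.lt_namespace, lt.lt_title
        FROM templatelinks AS tl
        JOIN linktarget AS lt ON lt.lt_id = tl.tl_target_id
        ORDER BY tl.tl_from, lt.lt_title
    """
    return conn.execute(query).fetchall()


rows = resolve_template_links(build_demo_db())
for row in rows:
    print(row)
```

Without the linktarget rows, the `tl_target_id` values alone would not identify which template a page uses, which is why both tables need to be sqooped together.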

Event Timeline

Sounds like a subtask of T304979, or at least related.

Change 821312 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/refinery@master] Adapt to templatelinks schema changes

https://gerrit.wikimedia.org/r/821312

EChetty set the point value for this task to 3. (Aug 16 2022, 2:51 PM)
EChetty removed the point value for this task.
EChetty set the point value for this task to 3.
EChetty moved this task from Discussed (Radar) to Sprint 00 on the Data Pipelines board.
EChetty edited projects, added Data Pipelines (Sprint 00); removed Data Pipelines.

Dear person claiming this ticket: If you have questions about how to tackle this task, please consult @JAllemandou for guidance.

EChetty raised the priority of this task from High to Unbreak Now!. (Aug 24 2022, 2:54 PM)

(updating description to show that a new table needs to be sqooped to make sense of templatelinks going forward)

EChetty lowered the priority of this task from Unbreak Now! to High. (Aug 25 2022, 9:41 AM)

Change 826564 had a related patch set uploaded (by Joal; author: Joal):

[operations/puppet@production] Add linktarget to sqooped tables

https://gerrit.wikimedia.org/r/826564

Change 821312 merged by Joal:

[analytics/refinery@master] Adapt sqoop to templatelinks schema changes

https://gerrit.wikimedia.org/r/821312

Change 826564 merged by Btullis:

[operations/puppet@production] Add linktarget to sqooped tables

https://gerrit.wikimedia.org/r/826564

EChetty moved this task from Ready to Done on the Data Pipelines (Sprint 01) board.