Monthly Mediawiki Sqoop job failed
Closed, Resolved · Public · 5 Estimated Story Points

Description

During this month's run, some tables were not sqooped properly from the MediaWiki databases into HDFS. The code responsible for listing those tables failed to run because of a simple bug, so we have to work out what ran from the absence of _SUCCESS flags in the 2017-07 snapshot. This task has two parts:

  • fix the Python bug
  • figure out which jobs didn't run, and restart the sqoop
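
The second part depends on spotting directories that lack a _SUCCESS flag. A minimal sketch of that check follows, assuming the snapshot is visible as a local directory tree (for example through a mount or a copy) and that each sqooped unit lives in its own leaf directory. The layout and the function name `find_missing_success` are illustrative assumptions, not the refinery script's actual code.

```python
from pathlib import Path


def find_missing_success(snapshot_root):
    """Return sorted leaf directories under snapshot_root lacking a _SUCCESS file.

    A "leaf" is any directory without subdirectories. This layout is an
    assumption made for illustration, not the real snapshot structure.
    """
    root = Path(snapshot_root)
    missing = []
    for directory in [root, *root.rglob("*")]:
        if not directory.is_dir():
            continue
        # Skip intermediate directories; only leaves are expected to carry a flag.
        if any(child.is_dir() for child in directory.iterdir()):
            continue
        if not (directory / "_SUCCESS").exists():
            missing.append(str(directory.relative_to(root)))
    return sorted(missing)
```

Calling `find_missing_success` on a local copy of the snapshot directory would return the relative paths of the leaves that still need to be re-sqooped. Against HDFS itself, the same walk would have to go through HDFS tooling rather than `pathlib`.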

Event Timeline

Change 369988 had a related patch set uploaded (by Milimetric; owner: Milimetric):
[analytics/refinery@master] Fix failure in failed_jobs handling

https://gerrit.wikimedia.org/r/369988

Change 369988 merged by Ottomata:
[analytics/refinery@master] Fix failure in failed_jobs handling

https://gerrit.wikimedia.org/r/369988

The jobs that didn't run were identified in T165233#3498662, and the error appears to be related to database access. For now I'm moving this to Done and deploying the new version of the sqoop script. Since I don't think it makes sense to wait for the remaining wikis to be sqooped, I'm going to manually add the success flags in order to trigger the snapshot for this month.
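
Adding the flags by hand amounts to creating an empty _SUCCESS file in each directory that is missing one. A minimal sketch, assuming a local directory tree and a caller-supplied list of relative directories; the function name `add_success_flags` and its parameters are hypothetical, and on HDFS the equivalent step would go through HDFS tooling instead:

```python
from pathlib import Path


def add_success_flags(snapshot_root, relative_dirs):
    """Create an empty _SUCCESS file in each listed directory (hypothetical helper).

    Directories that do not exist are skipped rather than created, so a typo
    in the list cannot fabricate a "successful" output directory.
    Returns the list of flag paths that were newly created.
    """
    root = Path(snapshot_root)
    created = []
    for rel in relative_dirs:
        directory = root / rel
        if not directory.is_dir():
            continue
        flag = directory / "_SUCCESS"
        if not flag.exists():
            flag.touch()
            created.append(str(flag))
    return created
```

Skipping nonexistent directories is a deliberate choice in this sketch: the flag only marks output that is already present.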

Milimetric triaged this task as Medium priority. (Aug 4 2017, 3:06 PM)
Milimetric moved this task from Ready to Deploy to Done on the Analytics-Kanban board.
Nuria set the point value for this task to 5. (Aug 31 2017, 4:15 PM)