Maniphest T309987

Mediawiki History delayed 2022-05
Closed, Resolved (Public)
Assigned To

Authored By

	Milimetric
	Jun 6 2022, 2:44 PM

Tags

Referenced Files

None

Subscribers

Description

This month we're having multiple problems with the mw history data pipeline. The sqoop jobs that pull in the source data failed while trying to use views on the Cloud replica. These views were in turn broken due to changes in production.

Once the sqoop problems were fixed late last week, the mw history denormalize job itself failed twice with this cryptic error:

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: A shuffle map stage with indeterminate output was failed and retried. However, Spark cannot rollback the ShuffleMapStage 986 to re-process the input data, and has to fail this job. Please eliminate the indeterminacy by checkpointing the RDD before repartition and try again.
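For anyone puzzled by the message, here is a toy illustration in plain Python (not Spark, and not the actual denormalize job) of what "indeterminate output" means. It assumes a round-robin style repartition that assigns rows by arrival order, which is my understanding of the case Spark guards against. The helper names `round_robin` and `downstream_view` are made up for this sketch. If a retried stage sees its input in a different order, a consumer that mixes partitions from both attempts can lose or duplicate rows. Materializing the order once, as a checkpoint does, avoids that.

```python
# Toy model only: plain Python lists stand in for Spark partitions.

def round_robin(rows, num_partitions):
    """Assign rows to partitions by arrival position (hypothetical helper)."""
    parts = [[] for _ in range(num_partitions)]
    for position, row in enumerate(rows):
        parts[position % num_partitions].append(row)
    return parts


def downstream_view(first_attempt, retry):
    """Partition 0 was read from the first attempt, the rest from the retry."""
    return first_attempt[0] + [row for part in retry[1:] for row in part]


rows = list(range(10))

# Without a checkpoint: the retried stage sees its input in another order.
first = round_robin(rows, 4)
retry = round_robin(list(reversed(rows)), 4)
mixed = downstream_view(first, retry)
lost = sorted(set(rows) - set(mixed))
print("lost rows without checkpoint:", lost)

# With a checkpoint: the order is materialized once and both attempts read it.
checkpointed = list(reversed(rows))
first_cp = round_robin(checkpointed, 4)
retry_cp = round_robin(checkpointed, 4)
mixed_cp = downstream_view(first_cp, retry_cp)
lost_cp = sorted(set(rows) - set(mixed_cp))
print("lost rows with checkpoint:", lost_cp)
```

In the first case some rows appear twice and others are missing from the mixed view. Spark fails the job rather than risk this, which is why the message suggests checkpointing before the repartition.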

This task will track problems and their solutions until we get this month's snapshot deployed and all dependent jobs cleared.

Details

	Subject	Repo	Branch	Lines +/-
	Increase resources for history job	analytics/refinery	master	+5 -5
	Update mediawiki history pipeline	analytics/refinery	master	+8 -2

Related Objects

Mentioned In: T309421: May 2022 Wikimedia movement metrics

Event Timeline

Milimetric created this task. Jun 6 2022, 2:44 PM

Restricted Application added a subscriber: Aklapper. Jun 6 2022, 2:44 PM

Milimetric moved this task from Incoming (new tickets) to Datasets on the Data-Engineering board. Jun 6 2022, 3:17 PM

Mayakp.wiki subscribed. Jun 6 2022, 5:05 PM

Mayakp.wiki mentioned this in T309421: May 2022 Wikimedia movement metrics. Jun 6 2022, 6:59 PM

Milimetric renamed this task from Mediawiki History delayed 2022-06 to Mediawiki History delayed 2022-05. Jun 6 2022, 7:43 PM

Milimetric moved this task from In Progress to Done on the Data-Engineering-Kanban board. Jun 7 2022, 3:12 PM

Milimetric moved this task from Done to In Code Review on the Data-Engineering-Kanban board.

Change 803551 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/refinery@master] Increase resources for history job

https://gerrit.wikimedia.org/r/803551

gerritbot added a project: Patch-For-Review. Jun 7 2022, 3:14 PM

Change 805446 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/refinery@master] Update mediawiki history pipeline

https://gerrit.wikimedia.org/r/805446

Change 805446 merged by Joal:

[analytics/refinery@master] Update mediawiki history pipeline

https://gerrit.wikimedia.org/r/805446

Milimetric moved this task from In Code Review to Ready to Deploy on the Data-Engineering-Kanban board. Jun 15 2022, 1:51 PM

Milimetric moved this task from Ready to Deploy to Done on the Data-Engineering-Kanban board. Jun 28 2022, 5:28 PM

JArguello-WMF closed this task as Resolved. Jul 5 2022, 3:42 PM

Change 803551 abandoned by Milimetric:

[analytics/refinery@master] Increase resources for history job

Reason:

I was sure we deployed this; there must be another, similar change.

https://gerrit.wikimedia.org/r/803551

Maintenance_bot removed a project: Patch-For-Review. Sep 20 2022, 9:30 AM