job_1442877556644_0009 on the Wikimedia altiscale cluster
Description
Related Objects
Event Timeline
Comment Actions
Looks like we were running out of memory in the reducer. The job ran for about 36.5 hours before arriving at this failed state.
Job Name: org.wikimedia.wikihadoop.job.JsonRevisionsSortedPerPage$: MediaWikiRevisionXMLToJSONInputFormat(/user/halfak/stream... ID=1 (1/1)
User Name: halfak
Queue: default
State: FAILED
Uberized: false
Submitted: Tue Sep 29 23:08:31 UTC 2015
Started: Tue Sep 29 23:08:39 UTC 2015
Finished: Thu Oct 01 11:43:25 UTC 2015
Elapsed: 36hrs, 34mins, 45sec
Diagnostics: Task failed task_1442877556644_0009_r_000274. Job failed as tasks failed. failedMaps:0 failedReduces:1
Average Map Time: 29mins, 6sec
Average Reduce Time: 41mins, 50sec
Average Shuffle Time: 17mins, 15sec
Average Merge Time: 4sec
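If reducer memory is the culprit, the usual knobs are the reducer container size and its JVM heap. A minimal sketch of setting them from a job driver, assuming the stock Hadoop 2 property names; the values are illustrative, not something tuned for this job:

import org.apache.hadoop.conf.Configuration;

public class ReduceMemoryConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Standard Hadoop 2 properties; the values here are illustrative guesses.
        conf.setInt("mapreduce.reduce.memory.mb", 4096);     // YARN container size per reducer
        conf.set("mapreduce.reduce.java.opts", "-Xmx3276m"); // reducer JVM heap, kept below the container size
    }
}

If the job's main class goes through ToolRunner, the same properties could also be passed on the command line as -Dmapreduce.reduce.memory.mb=4096 — assuming, that is, that wikihadoop parses generic options, which I haven't checked.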
Comment Actions
Here's the command I ran:
hadoop jar ~/jars/wikihadoop-0.2.jar \
    org.wikimedia.wikihadoop.job.JsonRevisionsSortedPerPage \
    -i /user/halfak/streaming/enwiki-20150901/xml-bz2 \
    -o /user/halfak/streaming/enwiki-20150901/revdocs-bz2 \
    -r 2000
Comment Actions
Looked at the logs: it seems to be an interruption exception.
If so, chances are the issue comes from a timeout.
There is a parameter in the job that can be changed (its flag name contains a typo, see below); it defaults to 1800000 ms (30 minutes) and can be raised to 3600000 ms (1 hour).
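For reference, a minimal sketch of what that change amounts to in Hadoop terms — assuming the job's flag maps onto the standard mapreduce.task.timeout property, which I haven't verified against the wikihadoop source:

import org.apache.hadoop.conf.Configuration;

public class TaskTimeoutConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Assumption: the job's --task-tiemout flag ends up in Hadoop's standard
        // mapreduce.task.timeout property, i.e. how many milliseconds a task
        // attempt may go without reporting progress before it is killed.
        conf.setLong("mapreduce.task.timeout", 3600000L); // 1 hour instead of the job's 30-minute default
    }
}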
Also, the number of reducers could be bumped up a bit (2000 is not that big).
I'd like to see if the following run works:
hadoop jar ~/jars/wikihadoop-0.2.jar \
    org.wikimedia.wikihadoop.job.JsonRevisionsSortedPerPage \
    -i /user/halfak/streaming/enwiki-20150901/xml-bz2 \
    -o /user/halfak/streaming/enwiki-20150901/revdocs-bz2 \
    -r 5000 --task-tiemout 3600000
Let's talk about that today.
Comment Actions
I tested various memory settings; each run failed.
I finally rewrote the job using the core MapReduce API instead of Scrunch.
The job is still running, but no errors so far.
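For the record, the rough shape of a plain-MapReduce version of such a job — a minimal sketch with stub mapper/reducer classes, not the actual wikihadoop rewrite:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hedged sketch of a core-MapReduce driver; NOT the actual wikihadoop
// rewrite. The mapper/reducer bodies are placeholder stubs.
public class JsonRevisionsDriver {

    // Stub mapper: a real one would parse a revision and emit (pageId, revisionJson).
    static class RevisionMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(new Text("pageId"), value); // placeholder key extraction
        }
    }

    // Stub reducer: a real one would write one page's revisions, sorted.
    static class RevisionReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            for (Text v : values) {
                context.write(key, v);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "JsonRevisionsSortedPerPage (sketch)");
        job.setJarByClass(JsonRevisionsDriver.class);
        job.setMapperClass(RevisionMapper.class);
        job.setReducerClass(RevisionReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setNumReduceTasks(5000); // mirrors the -r flag from the command above
        FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. the xml-bz2 input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. the revdocs-bz2 output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The appeal of dropping down to the core API here is mostly control: every memory- and timeout-relevant property can be set explicitly on the Job, instead of going through whatever defaults the Scrunch layer applies.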