Investigate parallelizing the model makefile
Open, Low, Public

Assigned To

None

Authored By

	awight
	Jun 26 2017, 9:44 PM

Description

It might be a win to parallelize the portions of the makefile bound by remote resources, such as feature extraction.

Related Objects

Mentioned In: T170650: [Investigate] Hadoop integration for ORES training

Event Timeline

awight created this task. Jun 26 2017, 9:44 PM

Restricted Application added a subscriber: Aklapper. Jun 26 2017, 9:44 PM

Try running revscoring extract and checking top; extraction should already parallelize. However, not every step is parallelized, so this task may still be valid.

Halfak added projects: editquality-modeling, articlequality-modeling, draftquality-modeling, Performance Issue. Jun 29 2017, 2:55 PM

Restricted Application added a project: artificial-intelligence. Jun 29 2017, 2:55 PM

Halfak triaged this task as Low priority. Jun 29 2017, 2:55 PM

Halfak moved this task from Unsorted to Research & analysis on the Machine-Learning-Team board.

I passed the --extractors argument to the extraction step and it's working nicely. We should make N_CPUS a tunable variable in the makefile.

There might still be a small piece left: any steps that are not CPU-bound, e.g. pulling text extracts from the wikis, could run in parallel. When we have to rebuild all models, those steps could be churning in the background while we train models in serial as their datasets become ready.
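A minimal sketch of that idea, written in Python rather than make so it can be run on its own. It assumes the I/O-bound fetches can be started together in a thread pool while training stays serial and picks up each dataset as it finishes. The names fetch_dataset, train_model, build_all and max_fetchers are hypothetical stand-ins for illustration, not part of revscoring or the existing makefile.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


# Hypothetical stand-in for a network-bound step,
# e.g. pulling text extracts from a wiki.
def fetch_dataset(wiki):
    time.sleep(0.1)  # placeholder for remote I/O
    return wiki, [f"{wiki}-row-{i}" for i in range(3)]


# Hypothetical stand-in for a CPU-bound training step.
def train_model(wiki, rows):
    return f"{wiki}.model ({len(rows)} rows)"


def build_all(wikis, max_fetchers=4):
    """Fetch datasets concurrently; train one model at a time as each arrives."""
    models = {}
    with ThreadPoolExecutor(max_workers=max_fetchers) as pool:
        # All fetches start in the background immediately.
        futures = [pool.submit(fetch_dataset, w) for w in wikis]
        # Training runs serially in this thread, in completion order.
        for future in as_completed(futures):
            wiki, rows = future.result()
            models[wiki] = train_model(wiki, rows)
    return models


if __name__ == "__main__":
    for wiki, model in sorted(build_all(["enwiki", "dewiki", "frwiki"]).items()):
        print(wiki, "->", model)
```

Threads are used here only because the fetch step is assumed to be waiting on the network; the CPU-bound training stays in a single thread, matching the "train in serial" part of the comment above.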

awight mentioned this in T170650: [Investigate] Hadoop integration for ORES training. Jul 14 2017, 12:21 AM

Sumit subscribed. Jul 24 2017, 4:38 PM

awight unsubscribed. Mar 21 2019, 4:01 PM

Halfak edited projects, added Machine-Learning-Team (Research); removed Machine-Learning-Team. Apr 2 2019, 9:33 PM

Restricted Application edited projects, added Machine-Learning-Team; removed Machine-Learning-Team (Research). Apr 2 2019, 9:33 PM

Harej edited projects, added Machine-Learning-Team (Research); removed Machine-Learning-Team. Apr 3 2019, 4:33 AM

calbon removed a project: Machine-Learning-Team (Research). Sep 23 2020, 4:38 PM
