It might be a win to parallelize the portions of the makefile bound by remote resources, such as feature extraction.
Description
Related Objects
Event Timeline
Try running revscoring extract and checking top. It should parallelize. However, not every step is parallelized, so this task may still be valid.
I passed the --extractors argument to the extraction step and it's working nicely. We should make N_CPUS a tunable variable in the makefile.
One small piece may still be left: any steps that are not CPU-bound could run in parallel, e.g. pulling text extracts from the wikis. When we have to rebuild all models, those could be churning in the background while we train models serially as their datasets become ready.
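To make the idea concrete, here is a minimal Python sketch of that pattern, not how the makefile or revscoring actually does it. The names fetch_dataset, train_model, rebuild_all and max_fetchers are hypothetical stand-ins: the fetch simulates a step that waits on a remote resource, and training runs one model at a time in the main thread as each dataset finishes.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_dataset(wiki):
    """Hypothetical stand-in for a step bound by a remote resource."""
    time.sleep(0.01)  # simulate waiting on the network
    return [f"{wiki}-row-{i}" for i in range(3)]


def train_model(wiki, rows):
    """Hypothetical stand-in for a CPU-bound training step."""
    return f"{wiki}: trained on {len(rows)} rows"


def rebuild_all(wikis, max_fetchers=4):
    """Fetch datasets concurrently, then train serially as each one is ready."""
    results = []
    with ThreadPoolExecutor(max_workers=max_fetchers) as pool:
        # All fetches are submitted up front and run in background threads.
        futures = {pool.submit(fetch_dataset, wiki): wiki for wiki in wikis}
        # as_completed yields each future as soon as its dataset is ready,
        # so training happens one model at a time in the main thread.
        for future in as_completed(futures):
            wiki = futures[future]
            results.append(train_model(wiki, future.result()))
    return results


if __name__ == "__main__":
    for line in rebuild_all(["enwiki", "dewiki", "frwiki"]):
        print(line)
```

Threads are enough here because the background work is waiting on I/O rather than competing for CPU. The order of results depends on which fetch finishes first, so callers should not rely on it.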