
Use multithreading in test_model
Closed, Resolved · Public


This might be an unusual use case, but I found myself running test_model on a huge data set, and discovered that it's single-threaded. It would be nice to have a parallel job option.

Event Timeline

awight created this task. Jul 6 2017, 2:59 AM
Restricted Application added a subscriber: Aklapper. Jul 6 2017, 2:59 AM
awight added a comment. Jul 6 2017, 3:47 AM
This comment was removed by awight.
awight added a comment. Jul 6 2017, 4:32 AM

I took a look at this to see if it'd be easy, and I found that test() is built into Model. Unfortunately, that means it doesn't benefit from the more sophisticated threading support in ScoreProcessor. Intriguingly, Model#test is never overridden by any subclasses. This makes me think that we can split out model testing into its own component.

On a less dramatic note, it's not clear how to figure out a good default number of CPU workers for model testing. If we're calling from cv_train, we want just 1 test thread per fold, but for a standalone test_model we would want to default to the machine's number of cores. Perhaps our scoring tools should share a worker pool, and we give all CPU workers equal priority. This would let us queue up runs en masse, and make efficient use of the hardware.
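
To make the worker-count question concrete, here is a rough sketch of what a standalone testing helper might look like. It is not revscoring's actual API: `default_workers`, `score_observations`, and the `nested` flag are all hypothetical names made up for illustration. It assumes the per-observation scoring function can be called independently for each observation. It uses threads for simplicity; if scoring is CPU-bound pure Python, swapping in `ProcessPoolExecutor` would probably be needed to get a real speedup.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def default_workers(nested=False):
    """Pick a default worker count (hypothetical helper).

    nested=True stands for "called from inside a cv_train fold", where
    the folds are already parallel and one test thread per fold is enough.
    """
    if nested:
        return 1
    # os.cpu_count() can return None, so fall back to a single worker.
    return os.cpu_count() or 1


def score_observations(score, observations, workers=None, nested=False):
    """Apply `score` to every observation, returning results in input order."""
    if workers is None:
        workers = default_workers(nested)
    if workers <= 1:
        # Skip the pool entirely when only one worker is requested.
        return [score(ob) for ob in observations]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(score, observations))
```

A standalone test_model run would call `score_observations(score, observations)` and get one worker per core, while cv_train would pass `nested=True` for each fold. A shared pool across tools, as suggested above, would replace the per-call executor with one long-lived executor passed in by the caller.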

Halfak triaged this task as Low priority. Jul 20 2017, 3:12 PM
Halfak moved this task from Untriaged to New development on the Scoring-platform-team board.
Restricted Application added a subscriber: TerraCodes. Nov 1 2017, 5:20 PM

@awight: good first bug tasks are self-contained, non-controversial issues with a clear approach and should be well-described with pointers to help the new contributor. Given the current short task description and your own comment in T169843#3410396 I'm removing the good first bug tag. Please re-add the tag once the task description has been polished and provides sufficient information for a new contributor. Thanks!

awight removed a subscriber: awight. Mar 21 2019, 4:01 PM
Halfak closed this task as Resolved. Mar 28 2019, 5:26 PM