Investigate runtime of tune with high number of estimators
Closed, Declined (Public)

Description

Running tune with 200 estimators in the config seems to hang indefinitely.

Traceback when the tune script is killed:

2018-03-16 04:19:11,498 DEBUG:revscoring.utilities.tune -- Cross-validated GradientBoosting with n_estimators=100, max_depth=5, max_features="log2", learning_rate=0.1 in 79.197 minutes: pr_auc.macro=0.754
^CProcess ForkPoolWorker-4:
Process ForkPoolWorker-3:
Process ForkPoolWorker-6:
Process ForkPoolWorker-12:
Traceback (most recent call last):
  File "/home/codezee/ai/venv/bin/revscoring", line 11, in <module>
Process ForkPoolWorker-5:
Process ForkPoolWorker-2:
Process ForkPoolWorker-13:
Process ForkPoolWorker-7:
    load_entry_point('revscoring==2.2.0', 'console_scripts', 'revscoring')()
  File "/home/codezee/ai/venv/lib/python3.5/site-packages/revscoring-2.2.0-py3.5.egg/revscoring/revscoring.py", line 51, in main
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 342, in get
    with self._rlock:
  File "/usr/lib/python3.5/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
Traceback (most recent call last):

Profiling with individual cv_train calls yielded the following for 200 estimators with 2 folds:

real    143m18.515s
user    188m31.496s
sys     0m42.880s

The fitness measures were:
pr_auc (micro=0.815, macro=0.783)
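
To put the timing figures above in perspective, here is a minimal Python sketch that converts the `time` output to seconds and computes how much CPU time was used per second of wall-clock time. The helper name `to_seconds` is made up for this illustration and is not part of revscoring. A ratio close to 1 would suggest that, on average, little more than one core was busy during the run, even though the traceback shows pool workers numbered up to 13. That reading is only one possible interpretation and is not confirmed anywhere in this task.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '143m18.515s' to seconds.

    Hypothetical helper for this illustration only.
    """
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError("unrecognised time stamp: %r" % stamp)
    return int(match.group(1)) * 60 + float(match.group(2))


# Figures copied from the profiling run reported above
# (200 estimators, 2 folds).
real = to_seconds("143m18.515s")
user = to_seconds("188m31.496s")
sys_time = to_seconds("0m42.880s")

# CPU seconds consumed per wall-clock second; a rough proxy for
# the average number of busy cores during the run.
cpu_per_wall = (user + sys_time) / real

print("wall clock: %.1f min" % (real / 60))
print("CPU time:   %.1f min" % ((user + sys_time) / 60))
print("CPU / wall: %.2f" % cpu_per_wall)
```

The same helper can be reused on timings from runs with other estimator counts to check how the runtime scales, provided the fold count and the rest of the configuration are kept the same.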

Event Timeline

Sumit renamed this task from "Drafttopic estimators take very less time but tune hangs up forever" to "Drafttopic estimators take very less time to train but tune hangs up forever". Mar 21 2018, 2:33 PM

I wonder if you could figure out where the hangup is happening by adding "--debug" to the tune utility call.

> I wonder if you could figure out where the hangup is happening by adding "--debug" to the tune utility call.

Added to the description.

Sumit renamed this task from "Drafttopic estimators take very less time to train but tune hangs up forever" to "Investigate runtime of tune with high number of estimators". Mar 21 2018, 2:50 PM