Tuning with 200 estimators in the config seems to hang forever.
Traceback when the tune script is killed:
2018-03-16 04:19:11,498 DEBUG:revscoring.utilities.tune -- Cross-validated GradientBoosting with n_estimators=100, max_depth=5, max_features="log2", learning_rate=0.1 in 79.197 minutes: pr_auc.macro=0.754
^CProcess ForkPoolWorker-4:
Process ForkPoolWorker-3:
Process ForkPoolWorker-6:
Process ForkPoolWorker-12:
Traceback (most recent call last):
  File "/home/codezee/ai/venv/bin/revscoring", line 11, in <module>
Process ForkPoolWorker-5:
Process ForkPoolWorker-2:
Process ForkPoolWorker-13:
Process ForkPoolWorker-7:
    load_entry_point('revscoring==2.2.0', 'console_scripts', 'revscoring')()
  File "/home/codezee/ai/venv/lib/python3.5/site-packages/revscoring-2.2.0-py3.5.egg/revscoring/revscoring.py", line 51, in main
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 342, in get
    with self._rlock:
  File "/usr/lib/python3.5/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
Traceback (most recent call last):
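The debug line above gives a sense of the search space: a single combination at n_estimators=100 already took ~79 minutes to cross-validate. As a rough illustration (the actual tuning config is not reproduced here, and the real parameter values may differ), a grid over those same hyperparameters multiplies into many such runs:

# Hypothetical hyperparameter grid, reconstructed from the parameter names in
# the debug line above; the real tuning config may use different values.
from sklearn.model_selection import ParameterGrid

grid = ParameterGrid({
    "n_estimators": [100, 200],
    "max_depth": [3, 5, 7],
    "max_features": ["log2"],
    "learning_rate": [0.01, 0.1],
})

# Each combination is cross-validated separately, so one ~79-minute fit at
# n_estimators=100 multiplies out to many hours of work across the grid.
print(len(grid), "parameter combinations to cross-validate")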
Profiling individual cv_train calls yielded the following timings for 200 estimators with 2 folds:
real    143m18.515s
user    188m31.496s
sys     0m42.880s
The fitness measures were:
pr_auc (micro=0.815, macro=0.783)
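For comparison, the cv_train timing above can be approximated outside of revscoring with a plain scikit-learn 2-fold cross-validation of a GradientBoostingClassifier at 200 estimators. This is a minimal sketch on synthetic binary data; the real feature matrix, labels, and revscoring's pr_auc scoring will differ, so only the order of magnitude is comparable:

# Minimal sketch approximating the profiled run: 2-fold cross-validation of
# GradientBoosting with 200 estimators, timed, scored with average precision
# (the sklearn analogue of PR-AUC). Synthetic data stands in for the real
# observations, so the absolute numbers will not match the report above.
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=20000, n_features=80, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=200, max_depth=5, max_features="log2", learning_rate=0.1)

start = time.time()
scores = cross_val_score(model, X, y, cv=2, scoring="average_precision")
print("pr_auc per fold:", scores)
print("elapsed: %.1f minutes" % ((time.time() - start) / 60))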