I'm running a test notebook to explore the sklearn operations on PAWS following a 20newsgroups example.
The notebook runs fine until I get to the grid search code designed to compute optimal parameters for the learning function. The GridSearchCV() call is passed n_jobs=-1, which requests as many parallel jobs as there are available CPUs. When I reach this call in the notebook it begins executing but never completes (i.e. the cell stays marked with a * and never gets an execution number).
gs_clf = GridSearchCV(text_clf, parameters, n_jobs=-1)
gs_clf = gs_clf.fit(twenty_train.data, twenty_train.target)
gs_clf.best_score_
I'm trying to find out how to debug this step in PAWS. I've looked at the Kubernetes worker health to see if I can detect activity via a load spike. This doesn't provide any useful data, afaik.
I have downloaded and run the notebook on a local resource (4-core) and the above steps execute successfully, though the grid search does take about 2 minutes to run. The code on PAWS doesn't ever complete even after many minutes.
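Since n_jobs=-1 asks for one job per detected CPU, one check I can think of is to compare how many CPUs the notebook kernel believes it has on PAWS versus locally. The snippet below is only a rough sketch using the standard library; the helper name visible_cpus is my own (hypothetical), and it is an untested assumption on my part that a containerized kernel might report the host's core count rather than its own limit.

```python
import os


def visible_cpus():
    """Report CPU counts as seen by this Python process (hypothetical helper)."""
    counts = {"os_cpu_count": os.cpu_count()}  # may be None if undetermined
    # os.sched_getaffinity is only available on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        counts["affinity"] = len(os.sched_getaffinity(0))
    else:
        counts["affinity"] = None
    return counts


if __name__ == "__main__":
    print(visible_cpus())
```

If the numbers reported on PAWS turn out to be much larger than what the pod is realistically allowed to use, rerunning the grid search with n_jobs=1 would be my next test to see whether the parallel workers are what is hanging.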
Are there any hints or suggestions for debugging this type of problem in PAWS?