editquality Makefile rules don't work for revscoring tune
Closed, Resolved (Public)

Description

For example, the dewiki tuning rule fails and only prints the usage message:

cat datasets/dewiki.autolabeled_revisions.w_cache.20k_2015.json | \
revscoring tune \
	config/classifiers.params.yaml \
	editquality.feature_lists.dewiki.reverted \
	reverted_for_damage \
	roc_auc.labels.true \
	--label-weight "true=10" \
	--pop-rate "true=0.049775581219426095" \
	--pop-rate "false=0.950224418780574" \
	--center --scale \
	--cv-timeout=60 \
	--debug > tuning_reports/dewiki.reverted.md
Usage:
        tune <params-config> <features> <label> <statistic>
             [-w=<lw>]... [-r=<lp>]...
             [--labels=<labels>]
             [--minimize]
             [--observations=<path>]
             [--folds=<num>]
             [--report=<path>]
             [--processes=<num>]
             [--cv-timeout=<mins>]
             [--verbose] [--debug]
Makefile:442: recipe for target 'tuning_reports/dewiki.reverted.md' failed
make: *** [tuning_reports/dewiki.reverted.md] Error 1
make: *** Deleting file 'tuning_reports/dewiki.reverted.md'

Event Timeline

Still not working:

(p3)ladsgroup@ores-misc-01:~/editquality$ make tuning_reports/hrwiki.reverted.md 
cat datasets/hrwiki.autolabeled_revisions.w_cache.20k_2017.json | \
revscoring tune \
	config/classifiers.params.yaml \
	editquality.feature_lists.hrwiki.reverted \
	reverted_for_damage \
	roc_auc.labels.true \
	--label-weight "true=10" \
	--pop-rate "true=0.07927353670258512" \
	--pop-rate "false=0.9207264632974149" \
	--center --scale \
	--cv-timeout=60 \
	--debug > tuning_reports/hrwiki.reverted.md
2017-09-01 10:56:55,557 INFO:revscoring.utilities.tune -- Reading feature values & labels...
2017-09-01 10:58:36,425 DEBUG:revscoring.utilities.tune -- Starting up multiprocessing pool (processes=8)
2017-09-01 10:58:36,482 WARNING:revscoring.utilities.tune -- Model sklearn.linear_model.LogisticRegression does not have a train() method.
2017-09-01 10:58:36,483 WARNING:revscoring.utilities.tune -- Model sklearn.naive_bayes.BernoulliNB does not have a train() method.
2017-09-01 10:58:36,483 WARNING:revscoring.utilities.tune -- Model sklearn.naive_bayes.GaussianNB does not have a train() method.
2017-09-01 10:58:36,483 WARNING:revscoring.utilities.tune -- Model sklearn.ensemble.GradientBoostingClassifier does not have a train() method.
2017-09-01 10:58:36,484 WARNING:revscoring.utilities.tune -- Model sklearn.ensemble.RandomForestClassifier does not have a train() method.
2017-09-01 10:58:36,484 INFO:revscoring.utilities.tune -- Running gridsearch for 0 model/params pairs ...
2017-09-01 10:58:36,484 INFO:revscoring.utilities.tune -- # Top scoring configurations
2017-09-01 10:58:36,488 INFO:revscoring.utilities.tune -- 
| model   | roc_auc.labels.true   | params   |
||
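
The log above warns that each configured class lacks a train() method and then runs the grid search over 0 model/params pairs, which is why the report table is empty. A minimal sketch of that kind of filter follows. It assumes the check is a simple attribute test, and `WrappedModel`, `RawEstimator` and `usable_models` are hypothetical names for illustration, not revscoring's actual code:

```python
import logging

logger = logging.getLogger("tune_sketch")


class WrappedModel:
    """Hypothetical stand-in for a model class that exposes train()."""

    def train(self, values_labels):
        return {}


class RawEstimator:
    """Hypothetical stand-in for a bare estimator that only has fit()."""

    def fit(self, X, y):
        return self


def usable_models(candidates):
    """Keep only the classes that expose a callable train() method."""
    usable = []
    for cls in candidates:
        if callable(getattr(cls, "train", None)):
            usable.append(cls)
        else:
            # Mirrors the wording of the warning seen in the log.
            logger.warning("Model %s does not have a train() method.",
                           cls.__name__)
    return usable


if __name__ == "__main__":
    # With only bare estimators configured, nothing is left to tune.
    print(len(usable_models([RawEstimator])))
```

If the filter works roughly like this, pointing the params config at classes that do provide train() should make the number of model/params pairs non-zero again.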

Another problem: the run now produces results, but then crashes with a TypeError:

2017-09-05 21:27:29,177 INFO:revscoring.utilities.tune --
| model                  |   roc_auc.labels.true | params                                                                        |
|:-----------------------|----------------------:|:------------------------------------------------------------------------------|
| GradientBoosting       |                0.9249 | max_features="log2", n_estimators=700, learning_rate=0.01, max_depth=7        |
| GradientBoosting       |                0.9246 | max_features="log2", n_estimators=300, learning_rate=0.1, max_depth=3         |
| GradientBoosting       |                0.9244 | max_features="log2", n_estimators=700, learning_rate=0.01, max_depth=5        |
| GradientBoosting       |                0.9241 | max_features="log2", n_estimators=500, learning_rate=0.1, max_depth=3         |
| GradientBoosting       |                0.9239 | max_features="log2", n_estimators=500, learning_rate=0.01, max_depth=7        |
| GradientBoosting       |                0.9237 | max_features="log2", n_estimators=300, learning_rate=0.1, max_depth=7         |
| RandomForestClassifier |                0.9236 | criterion="entropy", max_features="log2", n_estimators=80, min_samples_leaf=3 |
| GradientBoosting       |                0.9235 | max_features="log2", n_estimators=100, learning_rate=0.1, max_depth=5         |
| RandomForestClassifier |                0.9235 | criterion="entropy", max_features="log2", n_estimators=80, min_samples_leaf=7 |
| GradientBoosting       |                0.9234 | max_features="log2", n_estimators=700, learning_rate=0.1, max_depth=3         |
Traceback (most recent call last):
  File "/home/ladsgroup/p3/bin/revscoring", line 9, in <module>
    load_entry_point('revscoring==2.0.5', 'console_scripts', 'revscoring')()
  File "/home/ladsgroup/p3/lib/python3.4/site-packages/revscoring-2.0.5-py3.4.egg/revscoring/revscoring.py", line 51, in main
    module.main(sys.argv[2:])
  File "/home/ladsgroup/p3/lib/python3.4/site-packages/revscoring-2.0.5-py3.4.egg/revscoring/utilities/tune.py", line 150, in main
    processes, cv_timeout, verbose)
  File "/home/ladsgroup/p3/lib/python3.4/site-packages/revscoring-2.0.5-py3.4.egg/revscoring/utilities/tune.py", line 226, in run
    param_statistics.sort(key=lambda v: v[1], reverse=maximize)
TypeError: unorderable types: NoneType() < NoneType()
Makefile:1716: recipe for target 'tuning_reports/hrwiki.reverted.md' failed
make: *** [tuning_reports/hrwiki.reverted.md] Error 1
make: *** Deleting file 'tuning_reports/hrwiki.reverted.md'
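
The TypeError above comes from sorting on `v[1]` when more than one entry has `None` in that position, since Python 3 refuses to compare `None` with `None`. One possible workaround, as a sketch only, is to sort the scored entries and append the unscored ones afterwards. `sort_param_statistics` is a hypothetical helper, and the assumption that each entry is a `(params, statistic)` pair is inferred from the traceback, not from the revscoring source:

```python
def sort_param_statistics(param_statistics, maximize=True):
    """Sort (params, statistic) pairs by statistic.

    Pairs whose statistic is None are kept, in their original order,
    after all scored pairs, so None is never compared with anything.
    """
    scored = [ps for ps in param_statistics if ps[1] is not None]
    unscored = [ps for ps in param_statistics if ps[1] is None]
    scored.sort(key=lambda v: v[1], reverse=maximize)
    return scored + unscored


if __name__ == "__main__":
    # Illustrative values only; not taken from a real tuning run.
    stats = [("a", 0.92), ("b", None), ("c", 0.95), ("d", None)]
    for params, statistic in sort_param_statistics(stats):
        print(params, statistic)
```

This only avoids the crash. It does not explain why some statistics are missing, which still needs to be investigated separately.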