It's beautiful in its own way, but this 3 kloc [[ https://github.com/wiki-ai/editquality/blob/master/Makefile | makefile ]] is fragile and repetitive. It's hard to see what is variation and what is repeated. Large numbers of models with small variations are a good candidate for another level of automation.
Redesign how these builds happen. Some alternatives:
# Make is certainly nice to have in the stack because it handles dependencies well. Pure Make would be chafing, however, and we would want to at least look into pattern rules and other Make tricks to reduce the amount of boilerplate required for each model.
# Code-generate the makefile, starting with declarative data about each model, rendered into make using templates.
# Directly execute the build tools under a thin custom workflow, reading from declarative configuration and logging actions.
The second option sounds most appealing to me, but I'll look into the Make pattern rules as a first step towards generalization.
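As a rough illustration of the code-generation option, here is a minimal sketch that renders one `cv_train` rule from a small dictionary using Python's `string.Template`. The dictionary keys (`wiki`, `label`, `target`, `dataset`, `params`) and the helper `render_model_rule` are hypothetical names made up for this example; the `revscoring cv_train` arguments are copied from the existing fawiki rules rather than from any documented interface.

```python
import json
from string import Template

# Hypothetical template for one model rule. "$$" renders a literal "$" so that
# Make variables such as $(test_statistics) survive the substitution.
RULE_TEMPLATE = Template(
    "models/$wiki.$label.gradient_boosting.model: \\\n"
    "\t\t$dataset\n"
    "\tcat $dataset | \\\n"
    "\trevscoring cv_train \\\n"
    "\t\trevscoring.scorer_models.GradientBoosting \\\n"
    "\t\teditquality.feature_lists.$wiki.$label \\\n"
    "\t\t$target \\\n"
    "\t\t--version=$$(${label}_major_minor).0 \\\n"
    "$params"
    "\t\t$$(test_statistics) \\\n"
    "\t\t--balance-sample-weight \\\n"
    "\t\t--center --scale > \\\n"
    "\tmodels/$wiki.$label.gradient_boosting.model\n"
)


def render_model_rule(model):
    """Render one Makefile rule from a declarative model description."""
    # json.dumps keeps numbers bare and quotes strings, matching the
    # existing -p 'max_features="log2"' style.
    params = "".join(
        "\t\t-p '{}={}' \\\n".format(name, json.dumps(value))
        for name, value in model["params"].items()
    )
    return RULE_TEMPLATE.substitute(
        wiki=model["wiki"],
        label=model["label"],
        target=model["target"],
        dataset=model["dataset"],
        params=params,
    )


# Values taken from the existing fawiki damaging rule.
fawiki_damaging = {
    "wiki": "fawiki",
    "label": "damaging",
    "target": "damaging",
    "dataset": "datasets/fawiki.labeled_revisions.w_cache.20k_2015.json",
    "params": {
        "max_depth": 7,
        "learning_rate": 0.01,
        "max_features": "log2",
        "n_estimators": 700,
    },
}

print(render_model_rule(fawiki_damaging))
```

Note that the reverted model trains against `reverted_for_damage` rather than its own label, which is why `target` is kept separate from `label` in this sketch.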
Maybe we can also simplify some of the variations between models, i.e. maybe this is a case of updating stale boilerplate, and many models would build well under default assumptions? Take an inventory of the variations.
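One possible way to take that inventory is to scan the makefile for the `-p 'name=value'` flags on each model rule and count how many models share each combination. This is only a sketch: the function names are made up, the sample text is abbreviated from the fawiki rules, and it assumes that rule headers start in column 0 while recipe and continuation lines are indented.

```python
import re
from collections import Counter

# A model rule header, e.g. "models/fawiki.damaging.gradient_boosting.model:".
TARGET_RE = re.compile(r"^models/(\S+)\.model:")
# A hyperparameter flag, e.g. -p 'max_depth=7'.
PARAM_RE = re.compile(r"-p '([^=']+)=([^']*)'")


def inventory(makefile_text):
    """Map each model target to the -p hyperparameters found in its rule."""
    params_by_model = {}
    current = None
    for line in makefile_text.splitlines():
        # Any unindented, non-empty line starts a new rule (or a comment).
        if line and not line[0].isspace():
            match = TARGET_RE.match(line)
            current = match.group(1) if match else None
            if current is not None:
                params_by_model[current] = {}
        if current is not None:
            for name, value in PARAM_RE.findall(line):
                params_by_model[current][name] = value
    return params_by_model


def count_variations(params_by_model):
    """Count how many models use each distinct hyperparameter combination."""
    return Counter(
        tuple(sorted(params.items())) for params in params_by_model.values()
    )


# Abbreviated copies of two fawiki rules, for illustration only.
SAMPLE = "\n".join([
    "models/fawiki.damaging.gradient_boosting.model: \\",
    "\t\tdatasets/fawiki.labeled_revisions.w_cache.20k_2015.json",
    "\tcat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \\",
    "\trevscoring cv_train \\",
    "\t\t-p 'max_depth=7' \\",
    "\t\t-p 'learning_rate=0.01' \\",
    "\t\t-p 'max_features=\"log2\"' \\",
    "\t\t-p 'n_estimators=700' > \\",
    "\tmodels/fawiki.damaging.gradient_boosting.model",
    "",
    "models/fawiki.goodfaith.gradient_boosting.model: \\",
    "\t\tdatasets/fawiki.labeled_revisions.w_cache.20k_2015.json",
    "\tcat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \\",
    "\trevscoring cv_train \\",
    "\t\t-p 'max_depth=7' \\",
    "\t\t-p 'learning_rate=0.01' \\",
    "\t\t-p 'max_features=\"log2\"' \\",
    "\t\t-p 'n_estimators=700' > \\",
    "\tmodels/fawiki.goodfaith.gradient_boosting.model",
])

for params, count in count_variations(inventory(SAMPLE)).items():
    print(count, dict(params))
```

Running the same functions over the full makefile text would show which parameter sets are common enough to become defaults and which are true per-model overrides.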
Here are the make rules for the three models on fawiki:
```
############################# Persian Wikipedia ################################

datasets/fawiki.human_labeled_revisions.20k_2015.json:
	./utility fetch_labels \
		https://labels.wmflabs.org/campaigns/fawiki/6/ > \
	datasets/fawiki.human_labeled_revisions.20k_2015.json

datasets/fawiki.labeled_revisions.20k_2015.json: \
		datasets/fawiki.human_labeled_revisions.20k_2015.json
	cat datasets/fawiki.human_labeled_revisions.20k_2015.json | \
	./utility autolabel --host=https://fa.wikipedia.org \
		--trusted-groups=sysop,oversight,bot,rollbacker,checkuser,abusefilter,bureaucrat,flow-bot \
		--trusted-edits=1000 \
		--verbose > \
	datasets/fawiki.labeled_revisions.20k_2015.json

datasets/fawiki.labeled_revisions.w_cache.20k_2015.json: \
		datasets/fawiki.labeled_revisions.20k_2015.json
	cat datasets/fawiki.labeled_revisions.20k_2015.json | \
	revscoring extract \
		editquality.feature_lists.fawiki.reverted \
		editquality.feature_lists.fawiki.damaging \
		editquality.feature_lists.fawiki.goodfaith \
		--host https://fa.wikipedia.org \
		--verbose > \
	datasets/fawiki.labeled_revisions.w_cache.20k_2015.json

datasets/fawiki.sampled_revisions.2.20k_2015.json:
	wget -qO- http://quarry.wmflabs.org/run/59580/output/0/json-lines?download=true > \
	datasets/fawiki.sampled_revisions.2.20k_2015.json

datasets/fawiki.autolabeled_revisions.2.20k_2015.json: \
		datasets/fawiki.sampled_revisions.2.20k_2015.json
	cat datasets/fawiki.sampled_revisions.2.20k_2015.json | \
	./utility autolabel --host=https://fa.wikipedia.org \
		--trusted-groups=sysop,oversight,bot,rollbacker,checkuser,abusefilter,bureaucrat,flow-bot \
		--trusted-edits=1000 \
		--verbose > \
	datasets/fawiki.autolabeled_revisions.2.20k_2015.json

tuning_reports/fawiki.reverted.md: \
		datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
	cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
	revscoring tune \
		config/classifiers.params.yaml \
		editquality.feature_lists.fawiki.reverted \
		reverted_for_damage \
		--cv-timeout=60 \
		--debug > \
	tuning_reports/fawiki.reverted.md

models/fawiki.reverted.gradient_boosting.model: \
		datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
	cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
	revscoring cv_train \
		revscoring.scorer_models.GradientBoosting \
		editquality.feature_lists.fawiki.reverted \
		reverted_for_damage \
		--version=$(reverted_major_minor).0 \
		-p 'max_depth=7' \
		-p 'learning_rate=0.01' \
		-p 'max_features="log2"' \
		-p 'n_estimators=700' \
		$(test_statistics) \
		--balance-sample-weight \
		--center --scale > \
	models/fawiki.reverted.gradient_boosting.model

tuning_reports/fawiki.damaging.md: \
		datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
	cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
	revscoring tune \
		config/classifiers.params.yaml \
		editquality.feature_lists.fawiki.damaging \
		damaging \
		--cv-timeout=60 \
		--debug > \
	tuning_reports/fawiki.damaging.md

models/fawiki.damaging.gradient_boosting.model: \
		datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
	cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
	revscoring cv_train \
		revscoring.scorer_models.GradientBoosting \
		editquality.feature_lists.fawiki.damaging \
		damaging \
		--version=$(damaging_major_minor).0 \
		-p 'max_depth=7' \
		-p 'learning_rate=0.01' \
		-p 'max_features="log2"' \
		-p 'n_estimators=700' \
		$(test_statistics) \
		--balance-sample-weight \
		--center --scale > \
	models/fawiki.damaging.gradient_boosting.model

tuning_reports/fawiki.goodfaith.md: \
		datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
	cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
	revscoring tune \
		config/classifiers.params.yaml \
		editquality.feature_lists.fawiki.goodfaith \
		goodfaith \
		--cv-timeout=60 \
		--debug > \
	tuning_reports/fawiki.goodfaith.md

models/fawiki.goodfaith.gradient_boosting.model: \
		datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
	cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
	revscoring cv_train \
		revscoring.scorer_models.GradientBoosting \
		editquality.feature_lists.fawiki.goodfaith \
		goodfaith \
		--version=$(goodfaith_major_minor).0 \
		-p 'max_depth=7' \
		-p 'learning_rate=0.01' \
		-p 'max_features="log2"' \
		-p 'n_estimators=700' \
		$(test_statistics) \
		--balance-sample-weight \
		--center --scale > \
	models/fawiki.goodfaith.gradient_boosting.model

fawiki_models: \
		models/fawiki.reverted.gradient_boosting.model \
		models/fawiki.damaging.gradient_boosting.model \
		models/fawiki.goodfaith.gradient_boosting.model

fawiki_tuning_reports: \
		tuning_reports/fawiki.reverted.md \
		tuning_reports/fawiki.damaging.md \
		tuning_reports/fawiki.goodfaith.md
```
Compare against one potential declarative form:
```
-
  # We cascade these default values, deferring to each wiki's configuration.
  database: default
  scorer_model: GradientBoosting
  cv_train_params:
    learning_rate: 0.01
    max_depth: 5
    max_features: log2
    n_estimators: 700
  trusted_edit_count: 1000
  # FIXME: There's a lot I don't understand about how we're using "needs_review".
  include_unreviewed: false
-
  database: fawiki
  models:
    - label: reverted
      # Override one hyperparameter. We could also have a fawiki.default config node :-/
      cv_train_params:
        max_depth: 7
    - label: damaging
      cv_train_params:
        max_depth: 7
    - label: goodfaith
      # TODO: revscoring cv_train params vary slightly. Do we want to preserve that?
      cv_train_params:
        max_depth: 7
  wikilabels_campaign:
    https://labels.wmflabs.org/campaigns/fawiki/6/
  sampling_query:
    # TODO: comment about this query, what is it and what does it do. Annoying that the output doesn't permalink to the input.
    - name: sample2.20k_2015
      url: http://quarry.wmflabs.org/run/59580/output/0/json-lines?download=true
  trusted_groups:
    - sysop
    - oversight
    - bot
    - flow-bot
    - rollbacker
    - checkuser
    - abusefilter
    - bureaucrat
```
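To make the cascading concrete, here is a minimal sketch of how defaults, wiki-level settings, and per-model overrides might be merged. The dictionaries are abbreviated from the declarative form above and written as plain Python so the example has no YAML dependency; `deep_merge`, `resolve_model`, and `param_flags` are hypothetical helpers, not existing revscoring or editquality functions.

```python
import copy
import json


def deep_merge(base, override):
    """Return a copy of base with override applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_model(defaults, wiki, label):
    """Cascade defaults -> wiki-level settings -> the model entry for `label`."""
    wiki_level = {key: value for key, value in wiki.items() if key != "models"}
    model = next(m for m in wiki["models"] if m["label"] == label)
    return deep_merge(deep_merge(defaults, wiki_level), model)


def param_flags(config):
    """Turn the resolved cv_train_params into -p name=value arguments."""
    flags = []
    for name, value in config["cv_train_params"].items():
        flags += ["-p", "{}={}".format(name, json.dumps(value))]
    return flags


# Abbreviated from the declarative form above.
DEFAULTS = {
    "database": "default",
    "scorer_model": "GradientBoosting",
    "cv_train_params": {
        "learning_rate": 0.01,
        "max_depth": 5,
        "max_features": "log2",
        "n_estimators": 700,
    },
    "trusted_edit_count": 1000,
}

FAWIKI = {
    "database": "fawiki",
    "models": [
        {"label": "reverted", "cv_train_params": {"max_depth": 7}},
        {"label": "damaging", "cv_train_params": {"max_depth": 7}},
        {"label": "goodfaith", "cv_train_params": {"max_depth": 7}},
    ],
}

resolved = resolve_model(DEFAULTS, FAWIKI, "damaging")
print(resolved["cv_train_params"])
print(param_flags(resolved))
```

With this kind of merge, a model entry only needs to list the values that differ from the defaults, which is also what would make stale per-model boilerplate easy to spot.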