[Epic] Implement code generation for model makefile maintenance
Closed, Resolved (Public)

Description

It's beautiful in its own way, but this 3 kloc makefile is fragile and repetitive. It's hard to see what is variation and what is repeated. Large numbers of models with small variations are a good candidate for another level of automation.

Redesign how these builds happen. Some alternatives:

  1. Make is certainly nice to have in the stack because it handles dependencies well. Pure Make would chafe, however, so we would want to at least look into pattern rules and other Make tricks to reduce the boilerplate required for each model.
  2. Code-generate the makefile, starting with declarative data about each model, rendered into make using templates.
  3. Directly execute the build tools under a thin custom workflow, reading from declarative configuration and logging actions.

The second option sounds most appealing to me.
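
As a rough illustration of the second option, here is a minimal sketch that renders tuning-report rules from declarative data. It assumes Python's standard-library `string.Template`; the dictionary layout, the `TUNE_RULE` template and the `render_tuning_rules` helper are hypothetical names made up for this example, modelled on the fawiki tuning-report rules quoted in this task. A real implementation might well use a fuller template engine.

```python
from string import Template

# Hypothetical template for one tuning-report rule. Recipe lines start with a
# tab, as Make requires. A literal "$" in the makefile would be written "$$".
TUNE_RULE = Template(
    "tuning_reports/$wiki.$model.md: \\\n"
    "\t\tdatasets/$wiki.labeled_revisions.w_cache.$sample.json\n"
    "\tcat datasets/$wiki.labeled_revisions.w_cache.$sample.json | \\\n"
    "\trevscoring tune \\\n"
    "\t\tconfig/classifiers.params.yaml \\\n"
    "\t\teditquality.feature_lists.$wiki.$model \\\n"
    "\t\t$label \\\n"
    "\t\t--cv-timeout=60 \\\n"
    "\t\t--debug > \\\n"
    "\ttuning_reports/$wiki.$model.md\n"
)


def render_tuning_rules(wiki_config):
    """Render one tuning-report rule per model listed in wiki_config."""
    rules = []
    for model in wiki_config["models"]:
        rules.append(TUNE_RULE.substitute(
            wiki=wiki_config["database"],
            sample=wiki_config["sample"],
            model=model["name"],
            # The label defaults to the model name unless overridden.
            label=model.get("label", model["name"]),
        ))
    return "\n".join(rules)


# Hypothetical declarative description of the fawiki models.
fawiki = {
    "database": "fawiki",
    "sample": "20k_2015",
    "models": [
        {"name": "reverted", "label": "reverted_for_damage"},
        {"name": "damaging"},
        {"name": "goodfaith"},
    ],
}

makefile_text = render_tuning_rules(fawiki)
print(makefile_text)
```

The point of the sketch is that the only per-model inputs are the wiki, the sample name, the model name and the label; everything else lives once in the template.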

Maybe we can also simplify some of the variations between models, i.e. maybe this is a case of updating stale boilerplate, and many models build well under default assumptions? Take an inventory of the variations.
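
One way to take such an inventory, sketched in Python under the assumption that each model's build parameters have already been extracted into a dictionary. The `inventory` helper and the input layout are hypothetical; the values are the `cv_train` settings from the fawiki rules in this task.

```python
from collections import defaultdict


def inventory(variants):
    """Split parameters into those shared by every variant and those that differ."""
    values = defaultdict(set)
    for params in variants.values():
        for key, value in params.items():
            values[key].add(value)

    common, varying = {}, {}
    for key in sorted(values):
        present_everywhere = all(key in params for params in variants.values())
        if present_everywhere and len(values[key]) == 1:
            # Same value in every variant: a candidate for a shared default.
            common[key] = next(iter(values[key]))
        else:
            # Record who uses which value (None where the key is absent).
            varying[key] = {name: params.get(key) for name, params in variants.items()}
    return common, varying


shared = {"max_depth": 7, "learning_rate": 0.01,
          "max_features": "log2", "n_estimators": 700}
fawiki_models = {
    "reverted": dict(shared, label="reverted_for_damage"),
    "damaging": dict(shared, label="damaging"),
    "goodfaith": dict(shared, label="goodfaith"),
}

common, varying = inventory(fawiki_models)
print("common:", common)
print("varying:", varying)
```

For the three fawiki models, only the label ends up in `varying`, which is the kind of result that would suggest the rest can move into defaults.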

Here are the make rules for the three models on fawiki:

############################# Persian Wikipedia ################################
datasets/fawiki.human_labeled_revisions.20k_2015.json:
    ./utility fetch_labels \
        https://labels.wmflabs.org/campaigns/fawiki/6/ > \
    datasets/fawiki.human_labeled_revisions.20k_2015.json

datasets/fawiki.labeled_revisions.20k_2015.json: \
        datasets/fawiki.human_labeled_revisions.20k_2015.json
    cat datasets/fawiki.human_labeled_revisions.20k_2015.json | \
    ./utility autolabel --host=https://fa.wikipedia.org \
        --trusted-groups=sysop,oversight,bot,rollbacker,checkuser,abusefilter,bureaucrat,flow-bot \
        --trusted-edits=1000 \
        --verbose > \
    datasets/fawiki.labeled_revisions.20k_2015.json

datasets/fawiki.labeled_revisions.w_cache.20k_2015.json: \
        datasets/fawiki.labeled_revisions.20k_2015.json
    cat datasets/fawiki.labeled_revisions.20k_2015.json | \
    revscoring extract \
        editquality.feature_lists.fawiki.reverted \
        editquality.feature_lists.fawiki.damaging \
        editquality.feature_lists.fawiki.goodfaith \
        --host https://fa.wikipedia.org \
        --verbose > \
    datasets/fawiki.labeled_revisions.w_cache.20k_2015.json

datasets/fawiki.sampled_revisions.2.20k_2015.json:
    wget -qO- 'http://quarry.wmflabs.org/run/59580/output/0/json-lines?download=true' > \
    datasets/fawiki.sampled_revisions.2.20k_2015.json

datasets/fawiki.autolabeled_revisions.2.20k_2015.json: \
        datasets/fawiki.sampled_revisions.2.20k_2015.json
    cat datasets/fawiki.sampled_revisions.2.20k_2015.json | \
    ./utility autolabel --host=https://fa.wikipedia.org \
        --trusted-groups=sysop,oversight,bot,rollbacker,checkuser,abusefilter,bureaucrat,flow-bot \
        --trusted-edits=1000 \
        --verbose > \
    datasets/fawiki.autolabeled_revisions.2.20k_2015.json

tuning_reports/fawiki.reverted.md: \
        datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
    cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
    revscoring tune \
        config/classifiers.params.yaml \
        editquality.feature_lists.fawiki.reverted \
        reverted_for_damage \
        --cv-timeout=60 \
        --debug > \
    tuning_reports/fawiki.reverted.md

models/fawiki.reverted.gradient_boosting.model: \
        datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
    cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
    revscoring cv_train \
        revscoring.scorer_models.GradientBoosting \
        editquality.feature_lists.fawiki.reverted \
        reverted_for_damage \
        --version=$(reverted_major_minor).0 \
        -p 'max_depth=7' \
        -p 'learning_rate=0.01' \
        -p 'max_features="log2"' \
        -p 'n_estimators=700' \
        $(test_statistics) \
        --balance-sample-weight \
        --center --scale > \
    models/fawiki.reverted.gradient_boosting.model

tuning_reports/fawiki.damaging.md: \
        datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
    cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
    revscoring tune \
        config/classifiers.params.yaml \
        editquality.feature_lists.fawiki.damaging \
        damaging \
        --cv-timeout=60 \
        --debug > \
    tuning_reports/fawiki.damaging.md

models/fawiki.damaging.gradient_boosting.model: \
        datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
    cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
    revscoring cv_train \
        revscoring.scorer_models.GradientBoosting \
        editquality.feature_lists.fawiki.damaging \
        damaging \
        --version=$(damaging_major_minor).0 \
        -p 'max_depth=7' \
        -p 'learning_rate=0.01' \
        -p 'max_features="log2"' \
        -p 'n_estimators=700' \
        $(test_statistics) \
        --balance-sample-weight \
        --center --scale > \
    models/fawiki.damaging.gradient_boosting.model

tuning_reports/fawiki.goodfaith.md: \
        datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
    cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
    revscoring tune \
        config/classifiers.params.yaml \
        editquality.feature_lists.fawiki.goodfaith \
        goodfaith \
        --cv-timeout=60 \
        --debug > \
    tuning_reports/fawiki.goodfaith.md

models/fawiki.goodfaith.gradient_boosting.model: \
        datasets/fawiki.labeled_revisions.w_cache.20k_2015.json
    cat datasets/fawiki.labeled_revisions.w_cache.20k_2015.json | \
    revscoring cv_train \
        revscoring.scorer_models.GradientBoosting \
        editquality.feature_lists.fawiki.goodfaith \
        goodfaith \
        --version=$(goodfaith_major_minor).0 \
        -p 'max_depth=7' \
        -p 'learning_rate=0.01' \
        -p 'max_features="log2"' \
        -p 'n_estimators=700' \
        $(test_statistics) \
        --balance-sample-weight \
        --center --scale > \
    models/fawiki.goodfaith.gradient_boosting.model

fawiki_models: \
        models/fawiki.reverted.gradient_boosting.model \
        models/fawiki.damaging.gradient_boosting.model \
        models/fawiki.goodfaith.gradient_boosting.model

fawiki_tuning_reports: \
        tuning_reports/fawiki.reverted.md \
        tuning_reports/fawiki.damaging.md \
        tuning_reports/fawiki.goodfaith.md

Compare against one potential declarative form:

- defaults:
      # We cascade defaults, deferring to each wiki's configuration.
      scorer_model: GradientBoosting
      cv_train_params:
          learning_rate: 0.01
          max_depth: 5
          max_features: log2
          n_estimators: 700

      trusted_edit_count: 1000

      # FIXME: There's a lot I don't understand about how we're using "needs_review".
      include_unreviewed: false

-
    database: fawiki

    models:

        # Override one hyperparameter.
        - defaults:
              cv_train_params:
                  max_depth: 7

        - reverted
        - damaging
        - label: goodfaith
          # This is not really the case, but I wanted to show what further overrides would look like.
          cv_train_params:
              max_depth: 6
          
    wikilabels_campaign:
        sample: sample2.20k_2015
        url: https://labels.wmflabs.org/campaigns/fawiki/6/

    sampling_query:
        # TODO: comment about this query, what is it and what does it do.  Annoying that the output doesn't permalink to the input.
        - name: sample2.20k_2015
          url: http://quarry.wmflabs.org/run/59580/output/0/json-lines?download=true

    trusted_groups:
        - sysop
        - oversight
        - bot
        - flow-bot
        - rollbacker
        - checkuser
        - abusefilter
        - bureaucrat
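
The cascading defaults in the declarative form above could be resolved roughly as follows. This is a sketch, not the project's implementation: `deep_merge` and `resolve_model` are hypothetical helpers, and the configuration is written as Python dictionaries rather than parsed from YAML so that the example needs only the standard library.

```python
import copy


def deep_merge(base, override):
    """Return a new dict in which nested keys from `override` win over `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_model(global_defaults, wiki_defaults, model_overrides):
    """Apply the cascade: global defaults, then wiki defaults, then per-model overrides."""
    return deep_merge(deep_merge(global_defaults, wiki_defaults), model_overrides)


# Values taken from the declarative example above.
GLOBAL_DEFAULTS = {
    "scorer_model": "GradientBoosting",
    "cv_train_params": {
        "learning_rate": 0.01,
        "max_depth": 5,
        "max_features": "log2",
        "n_estimators": 700,
    },
    "trusted_edit_count": 1000,
}
FAWIKI_DEFAULTS = {"cv_train_params": {"max_depth": 7}}
GOODFAITH_OVERRIDES = {"cv_train_params": {"max_depth": 6}}

damaging = resolve_model(GLOBAL_DEFAULTS, FAWIKI_DEFAULTS, {})
goodfaith = resolve_model(GLOBAL_DEFAULTS, FAWIKI_DEFAULTS, GOODFAITH_OVERRIDES)

print(damaging["cv_train_params"]["max_depth"])   # 7, from the fawiki defaults
print(goodfaith["cv_train_params"]["max_depth"])  # 6, from the per-model override
```

Merging nested dictionaries key by key, rather than replacing `cv_train_params` wholesale, is what lets a wiki override a single hyperparameter while inheriting the rest.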

Event Timeline

awight updated the task description.

@Ladsgroup identified an important side benefit of doing this work: it will give us the opportunity to evaluate the different ways we generate data sets for each wiki, and to decide which are the correct ones going forward.

Two processes that we follow:

  1. Label the "needs_review" edits
  2. Label a balanced sample of "needs_review" edits and edits not flagged "needs_review"

We started with (1) and then switched to (2) before realizing that was a bad idea and switching back to (1). So some of the wikis work in (2)'s pattern while most work under (1)'s pattern. We can probably formalize and parameterize these two patterns. (1) and (2) will need to be carefully explained.

Halfak renamed this task from "Bury horrors of the editquality makefile" to "[Spec] Bury horrors of the editquality makefile". Jul 20 2017, 3:26 PM
Halfak added a project: editquality-modeling.
Halfak moved this task from Unorganized to Ideas on the Machine-Learning-Team board.
awight renamed this task from "[Spec] Bury horrors of the editquality makefile" to "Investigate code generation for model makefile maintenance". Jul 22 2017, 8:15 AM
awight updated the task description.

TODO: the fawiki example above needs to be reworked, now that I understand more about what we vary. Something like, "labeled revisions come from the wikilabels output, and nothing else gets mixed in."

@Halfak @Ladsgroup This seemed like a fruitful project, and my prototype is c. 50% complete. Is there a good time to reprioritize?

I'm a big fan of getting the whole thing more streamlined. I can probably pick this up starting two weeks from now.

@Ladsgroup Cool, I'd love to be involved. See the editquality#templating branch. I think it's worth talking through the design ahead of time, whenever you feel like it.

awight renamed this task from "Investigate code generation for model makefile maintenance" to "Implement code generation for model makefile maintenance". Jan 18 2018, 7:28 PM

https://github.com/wiki-ai/editquality/pull/119

I used this opportunity to remove reverted model materials :P

https://github.com/wiki-ai/editquality/pull/120
If we get this merged, 26 wikis will be automated and 15 wikis will remain manual. I'll reduce that number in another batch.

awight renamed this task from "Implement code generation for model makefile maintenance" to "[Epic] Implement code generation for model makefile maintenance". Mar 19 2018, 7:15 PM
awight added a project: Epic.

At this stage, I would consider it done; anything after this is just improvements (which IMO should have their own tickets).