Mon, Aug 21
/me likes @demon's post.
Admins are interested in opening the discussion, and would like to see a demo of what ORES can accomplish. https://hi.wikipedia.org/wiki/सदस्य_वार्ता:Hindustanilanguage#Reaching_out_for_help_with_ORES
Sun, Aug 13
@Halfak Found a Hindi word list which is ready for review: https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service/Word_lists/hi
I left a note on the User talk page for @hindustanilanguage, this discussion is currently at an early, introductory phase.
@Mahir256: Thank you for the correction!
I realized after creating this task that we already have a Tamil "reverted" model, so closing as invalid.
Sat, Aug 12
@Isarra It's just that a simple change would get us out of the danger zone here. I see what you're saying about how my mind is going straight to the gutter, which is really my problem, though after self-reflection I'll admit that neo-Nazis beating people up in Charlottesville in real life is fueling the fire.
perhaps "cuisses de grenouille" ("frog legs" in French)?
Fri, Aug 11
Thu, Aug 10
Tue, Aug 8
Mon, Aug 7
o/ Looking more stable than ever these days, though!
Collab-Scoring meeting minutes resolving this task are recorded in https://www.mediawiki.org/w/index.php?title=Topic:Tvsudsw9odbah87e&topic_showPostId=tvsudsx3arjn0tsa#flow-post-tvsudsx3arjn0tsa
@Zache: nudge--we're hoping to get your opinion on the question above. Just spot-check the data in that .json.bz2 file, and let us know if you're confident that approvals are roughly as good as the Wiki Labels output.
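For a spot-check like the one requested above, something along these lines could work. This is only a sketch: the file name and record fields are placeholders, and the snippet builds a tiny stand-in dump so it runs on its own; in practice you would point it at the real .json.bz2 file.

```python
import bz2
import itertools
import json

# Build a tiny stand-in dump so the snippet is self-contained.
# In practice, skip this and open the real .json.bz2 file instead.
sample = [{"rev_id": 1, "approved": True}, {"rev_id": 2, "approved": False}]
with bz2.open("approvals.json.bz2", "wt") as f:
    for rec in sample:
        f.write(json.dumps(rec) + "\n")

# Spot-check: print the first few JSON-lines records for manual review.
records = []
with bz2.open("approvals.json.bz2", "rt") as f:
    for line in itertools.islice(f, 5):
        records.append(json.loads(line))
        print(records[-1])
```

Eyeballing a handful of records this way is usually enough to confirm the approvals look roughly comparable to the Wiki Labels output before committing to a full comparison.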
TODO: the fawiki example above needs to be reworked, now that I understand more about what we vary. Something like, "labeled revisions come from the wikilabels output, and nothing else gets mixed in."
Sun, Aug 6
We can get pretty close to correct, or at least better than now. Agreed that the UI flag is a good place to start, but I'm pretty sure we'll be showing the "r" to some people who don't use it (since that's preferable to it being missing when it's in use).
Sat, Aug 5
Fri, Aug 4
Thu, Aug 3
Work is described in more detail here:
@Zache We would love it if you weighed in with how you would like to proceed. Our second experiment showed a slight drop in fitness that we can't fully explain, but @Halfak is considering mixing the Flagged Revs approval set into our training and test data for fiwiki anyway.
Wed, Aug 2
I have some more results--allow me to muddle through an attempt at interpreting them.
- This model catches more of the non-damaging edits.
Tue, Aug 1
This script gives us 310k rows in the desired format, but in this form it will only work on "stat" machines. It needs to be tweaked to run on Quarry and to be granted temporary-table privileges on a new database.
There are many more approval logs than I had realized at first. log_params was only serialized beginning in December 2016, and when we relax the serialized data match on log_params, there are about 320k rows to work with. I'll try to include this data and parse both the legacy and new format params.
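Handling both log_params formats might look roughly like this. It assumes the newer rows (December 2016 onward) are PHP-serialized arrays and the legacy rows are bare newline-separated values, which is how MediaWiki's logging table typically stores them; the string-pair extraction is deliberately naive, and the placeholder keys for legacy rows are invented, so real rows should be inspected before relying on it.

```python
import re

def parse_log_params(blob):
    """Split a log_params blob into a dict, handling both the legacy
    newline-separated format and the newer PHP-serialized format.

    NOTE: this is a sketch; the "param0"-style keys for legacy rows
    are placeholders, and only string=>string pairs are extracted
    from serialized rows.
    """
    if blob.startswith("a:"):
        # New-style rows are PHP-serialized arrays. Naively pull out
        # adjacent string pairs of the form s:<len>:"<key>";s:<len>:"<val>";
        pairs = re.findall(r's:\d+:"([^"]*)";s:\d+:"([^"]*)";', blob)
        return dict(pairs)
    # Legacy rows are bare values separated by newlines; assign
    # positional placeholder keys.
    return {"param%d" % i: v for i, v in enumerate(blob.split("\n"))}
```

With that in place, both eras of approval logs can be funneled through one code path before building the training rows.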
I'm doing another iteration of this experiment, addressing the critiques that came up:
- Omit approvals where more than one revision was approved.
- Omit approvals which were later reverted for being damaging.
- Omit approvals by users "SeulojaBot" and "Zache".
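The three filters above can be sketched as a single pass over the approval records. Every field name here ("approved_revids", "was_reverted", "approver") is a hypothetical stand-in for whatever the real dataset uses:

```python
# Users whose approvals are excluded from the training set.
EXCLUDED_USERS = {"SeulojaBot", "Zache"}

def keep(approval):
    """Apply the three exclusion criteria to one approval record."""
    if len(approval["approved_revids"]) > 1:    # multi-revision approval
        return False
    if approval["was_reverted"]:                # later reverted as damaging
        return False
    if approval["approver"] in EXCLUDED_USERS:  # known bulk approvers
        return False
    return True

# Toy data exercising each filter; only the first record survives.
approvals = [
    {"approved_revids": [101], "was_reverted": False, "approver": "SomeUser"},
    {"approved_revids": [102, 103], "was_reverted": False, "approver": "SomeUser"},
    {"approved_revids": [104], "was_reverted": True, "approver": "SomeUser"},
    {"approved_revids": [105], "was_reverted": False, "approver": "Zache"},
]
filtered = [a for a in approvals if keep(a)]
```

Keeping the criteria in one predicate makes it easy to count how many rows each filter removes, which helps when explaining a fitness change between experiment iterations.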
Fri, Jul 28
If we do this again, noting one minor thing I messed up: We should have thrown out multi-revision approvals, but I never added that to my query.
Thu, Jul 27
Wed, Jul 26
Compared with the output of revscoring model_info models/fiwiki.damaging.gradient_boosting.model, the model trained on flagged_revisions found fewer of the damaging edits.
make models/fiwiki.damaging_w_flaggedrevs.gradient_boosting.model
revscoring test_model \
    models/fiwiki.damaging_w_flaggedrevs_wo_testinfo.gradient_boosting.model \
    damaging \
    --observations=datasets/fiwiki.labeled_revisions_testing.w_cache.5k_2016.json \
    > models/fiwiki.damaging_w_flaggedrevs.gradient_boosting.model
2017-07-26 18:22:38,669 INFO:revscoring.utilities.test_model -- Testing model...
ScikitLearnClassifier
 - type: GradientBoosting
 - params: max_features="log2", min_samples_leaf=1, loss="deviance", subsample=1.0, scale=true, max_leaf_nodes=null, random_state=null, balanced_sample=false, center=true, presort="auto", init=null, min_samples_split=2, max_depth=5, learning_rate=0.01, balanced_sample_weight=true, min_weight_fraction_leaf=0.0, n_estimators=700, warm_start=false, verbose=0
 - version: 0.0.1
 - trained: 2017-07-25T20:50:13.806134
@demon Thanks! I've updated the description.
@demon We're fine with deploying from WMF production repos, I'm sure we can figure something out to push mirrored code or just make these the masters for deployment. In other words, we're not trying to deploy directly from GitHub.
Tue, Jul 25
@Ladsgroup I see that all the subtasks are complete--should we resolve the epic?
fwiw, downgrading is a decent workaround: