
Supporting ORES on Huggle
Closed, ResolvedPublic

Description

The ORES extension is now activated on some wikis.
Please add the possibility to filter recent changes with the &hidenondamaging=1 filter, like this link.
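
For illustration only (the original link was not preserved in this thread), the ORES extension's filter is a URL parameter on Special:RecentChanges, along these lines:

```
https://fa.wikipedia.org/wiki/Special:RecentChanges?hidenondamaging=1
```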

Event Timeline

I think Huggle already has basic support. @Halfak Can you elaborate?

So it has a bug. I captured screenshots of fa.wikipedia's recent changes and of Huggle; Huggle doesn't show ORES's suggestions.

425238435_8544_11364924436185237860.jpg (450×800 px, 111 KB)

425429542_8528_13288756835134326309.jpg (450×800 px, 54 KB)

Hi

The basic support means it's able to get the score and display it. This requires configuration on the project since the last version of Huggle; see https://en.wikipedia.org/wiki/Wikipedia:Huggle/Config#Prediction for an example.

Petrb triaged this task as Medium priority. Jul 10 2016, 11:15 AM

If you add this to config page of huggle on your wiki, it will start evaluating ORES scores as well:

// ORES see meta.wikimedia.org/wiki/Objective_Revision_Evaluation_Service
ores-enabled:true
ores-supported:true
ores-url:https://ores.wmflabs.org/scores/
ores-amplifier:220

The higher the amplifier, the more weight the ORES score will have in Huggle.
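
A hypothetical sketch of how an ORES damaging probability could be folded into Huggle's edit score via the ores-amplifier setting. The actual Huggle formula is not shown in this thread, so treat this as illustrative only; `ores_component` is an invented name, and the centering on 0.5 is an assumption.

```python
# Hypothetical: map an ORES damaging probability to a signed score
# component, scaled by the ores-amplifier config value. Not Huggle's
# actual formula -- just an illustration of amplifier-style weighting.

def ores_component(damaging_probability, amplifier=220):
    """Map a damaging probability in [0, 1] to a signed score component.

    Probabilities above 0.5 push the edit's score up (more suspicious),
    probabilities below 0.5 push it down; the amplifier scales how much
    weight ORES carries relative to Huggle's own heuristics.
    """
    return round((damaging_probability - 0.5) * amplifier)

# Abbreviated shape of an old-style ORES API response for one revision:
sample = {"damaging": {"probability": {"true": 0.78, "false": 0.22}}}
p = sample["damaging"]["probability"]["true"]
print(ores_component(p))  # 62 with the default amplifier of 220
```

With this scheme a confidently benign edit (say p = 0.066) would contribute a negative component, which matches the sign convention visible in the scoring log later in this thread.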

I don't really know how "hidenondamaging" works; can you elaborate on that?

As @Petrb says, ORES plays a minor role in Huggle's predictions. I think that it's likely that we could out-perform Huggle's rule-based scoring system if we went head-to-head. Petrb, can you produce a dataset of huggle scores (with and without ORES involved) if I give you a sample of revision IDs that we have human labels for?

Hi, I wouldn't exactly say it's minor. Right now ORES already has a serious impact on prediction (I successfully identified many vandalism edits basically only thanks to ORES, as there was nothing else triggering alerts for them).

What you ask for is possible; I can create a debugging feed into which I put old revisions for processing. But keep in mind that doing so may produce completely different scores than when I get these revisions the moment they are made, which is more typical for Huggle, as it processes the stream in real time.

The reason for this is that Huggle evaluates the user as well; if someone's edit was reverted and the user was warned, the score would be much higher for such an edit. So if you give me a list of revision IDs that are bad, and which were likely already reverted and whose authors are now considered bad editors (they have warnings or are blocked), then Huggle will give them a higher score than it would have before these edits were reverted, when their authors were unknown users with blank talk pages.

So maybe it would be more efficient to do it the other way around: have Huggle evaluate some edits from the real-time RC feed and then run ORES on these as well. That, however, could have a similar side effect on the ORES side :)

The best comparison would probably be to simply run Huggle with ORES and without ORES simultaneously, which is also possible, but on random revisions from the real-time feed rather than on a precompiled list.

BTW, what exactly is the point of this experiment? I believe you that ORES is more accurate than Huggle's internal scoring system; that's why I implemented it into Huggle in the first place :)

One more thing: this task is about supporting ORES in Huggle. We already do that, so basically the reason why I didn't close this task as resolved is that I still see room for improvement.

Some time ago I had a videoconference with Joe Matazzoni and @Quiddity; part of it was also about ORES (we were basically looking for ways to make vandalism fighting more user / newcomer friendly, so that retention of new users is not so badly impacted by tools like Huggle).

Part of this talk was about implementing this "human labels" thing into Huggle as well, but from what I understood, this new thing is available only on some non-EN wikis right now. If I could have more information, I would happily start implementing it, but right now I don't know much about these "human labels" or whatever that is.

So to make it clear, the current status is: Huggle does support the ORES scoring mechanism. That is, it looks up the ORES score for every single edit if enabled per wiki (see https://en.wikipedia.org/wiki/Wikipedia:Huggle/Config#Prediction) and adds this score to the total score of the edit. This "total score" is the combined score from all "scoring providers", which at this moment are only Huggle itself, ORES, and eventually ClueBot, which seems dead.

No other features of ORES are supported, but mostly because I am not even aware of them.
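
A minimal sketch of the "total score" combination described above: each enabled scoring provider contributes a number, and the edit's total score is their sum. The provider names and values are illustrative (taken from the log format in this thread), not Huggle's actual data structures.

```python
# Sketch of combining per-provider scores into one overall score for an
# edit. Values here follow the sign convention visible in the log below:
# negative means less suspicious.

def total_score(provider_scores):
    """Sum the contributions of all enabled scoring providers."""
    return sum(provider_scores.values())

# One edit as scored by two providers:
edit_scores = {"huggle": -903, "ores": -229}
print(total_score(edit_scores))  # -1132
```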

@Halfak

Out of curiosity, here are some data for comparison:

Sun Oct 2 17:36:18 2016  SCORING: edit 742245972 huggle - 440 ores - 186
Sun Oct 2 17:36:18 2016  SCORING: edit 742245970 huggle - 903 ores - 229
Sun Oct 2 17:36:17 2016  SCORING: edit 742245969 huggle - 1033 ores 171
Sun Oct 2 17:36:13 2016  SCORING: edit 742245966 huggle - 987 ores 1
Sun Oct 2 17:36:12 2016  SCORING: edit 742245965 huggle 271 ores 251
Sun Oct 2 17:36:12 2016  SCORING: edit 742245964 huggle - 1000 ores - 122
Sun Oct 2 17:36:08 2016  SCORING: edit 742245941 huggle 161 ores 131
Sun Oct 2 17:36:02 2016  SCORING: edit 742245946 huggle - 1104 ores - 196
Sun Oct 2 17:36:02 2016  SCORING: edit 742245949 huggle - 52 ores - 72
Sun Oct 2 17:36:02 2016  SCORING: edit 742245947 huggle - 509 ores - 257
Sun Oct 2 17:35:59 2016  SCORING: edit 742245945 huggle - 31 ores - 51
Sun Oct 2 17:35:59 2016  SCORING: edit 742245943 huggle 1125 ores 137
Sun Oct 2 17:35:58 2016  SCORING: edit 742245942 huggle 164 ores 144
Sun Oct 2 17:35:56 2016  SCORING: edit 742245928 huggle 19 ores 49
Sun Oct 2 17:35:56 2016  SCORING: edit 742245939 huggle 159 ores 129
Sun Oct 2 17:35:54 2016  SCORING: edit 742245934 huggle - 538 ores - 236
Sun Oct 2 17:35:52 2016  SCORING: edit 742245931 huggle - 604 ores - 206
Sun Oct 2 17:35:47 2016  SCORING: edit 742245921 huggle 92 ores 72
Sun Oct 2 17:35:44 2016  SCORING: edit 742245915 huggle - 2 ores - 10
Sun Oct 2 17:35:43 2016  SCORING: edit 742245913 huggle 1216 ores 186
Sun Oct 2 17:35:43 2016  SCORING: edit 742245914 huggle 120 ores 90
Sun Oct 2 17:35:43 2016  SCORING: edit 742245911 huggle 808 ores 62
Sun Oct 2 17:35:41 2016  SCORING: edit 742245906 huggle 5 ores - 15
Sun Oct 2 17:35:40 2016  SCORING: edit 742245897 huggle - 462 ores - 160
Sun Oct 2 17:35:40 2016  SCORING: edit 742245904 huggle - 1726 ores - 186
Sun Oct 2 17:35:38 2016  SCORING: edit 742245902 huggle - 118 ores - 138
Sun Oct 2 17:35:38 2016  SCORING: edit 742245899 huggle 28 ores 30
Sun Oct 2 17:35:37 2016  SCORING: edit 742245898 huggle 143 ores 123
Sun Oct 2 17:35:37 2016  SCORING: edit 742245894 huggle 221 ores 201
Sun Oct 2 17:35:34 2016  SCORING: edit 742245890 huggle - 2123 ores - 237
Sun Oct 2 17:35:31 2016  SCORING: edit 742245885 huggle - 837 ores - 253
Sun Oct 2 17:35:30 2016  SCORING: edit 742245881 huggle 57 ores 37
Sun Oct 2 17:35:29 2016  SCORING: edit 742245880 huggle - 122 ores - 142
Sun Oct 2 17:35:29 2016  SCORING: edit 742245878 huggle 11 ores - 9
Sun Oct 2 17:35:28 2016  SCORING: edit 742245871 huggle 221 ores 201
Sun Oct 2 17:35:27 2016  SCORING: edit 742245874 huggle - 6 ores - 4
Sun Oct 2 17:35:23 2016  SCORING: edit 742245869 huggle - 700 ores - 260
Sun Oct 2 17:35:23 2016  SCORING: edit 742245867 huggle 228 ores 208
Sun Oct 2 17:35:21 2016  SCORING: edit 742245864 huggle 222 ores 202
Sun Oct 2 17:35:20 2016  SCORING: edit 742245852 huggle 223 ores 193
Sun Oct 2 17:35:20 2016  SCORING: edit 742245861 huggle - 250 ores - 236
Sun Oct 2 17:35:20 2016  SCORING: edit 742245860 huggle - 1962 ores - 270
Sun Oct 2 17:35:18 2016  SCORING: edit 742245856 huggle - 1313 ores - 141
Sun Oct 2 17:35:15 2016  SCORING: edit 742245853 huggle - 73 ores - 93
Sun Oct 2 17:35:14 2016  SCORING: edit 742245850 huggle 8 ores 4
Sun Oct 2 17:35:12 2016  SCORING: edit 742245837 huggle 285 ores 65
Sun Oct 2 17:35:09 2016  SCORING: edit 742245840 huggle 186 ores - 234
Sun Oct 2 17:35:09 2016  SCORING: edit 742245841 huggle - 1156 ores - 260
Sun Oct 2 17:35:07 2016  SCORING: edit 742245838 huggle - 833 ores - 253
Sun Oct 2 17:35:05 2016  SCORING: edit 742245836 huggle 184 ores 154
Sun Oct 2 17:35:04 2016  SCORING: edit 742245834 huggle - 185 ores - 205
Sun Oct 2 17:35:04 2016  SCORING: edit 742245835 huggle - 212 ores - 194
Sun Oct 2 17:35:04 2016  SCORING: edit 742245833 huggle - 527 ores - 245
Sun Oct 2 17:35:01 2016  SCORING: edit 742245823 huggle 386 ores 166
Sun Oct 2 17:35:00 2016  SCORING: edit 742245826 huggle 362 ores 242
Sun Oct 2 17:34:58 2016  SCORING: edit 742245828 huggle - 1341 ores - 273
Sun Oct 2 17:34:57 2016  SCORING: edit 742245818 huggle 144 ores 114
Sun Oct 2 17:34:56 2016  SCORING: edit 742245822 huggle - 831 ores - 253
Sun Oct 2 17:34:55 2016  SCORING: edit 742245821 huggle - 885 ores - 255
Sun Oct 2 17:34:55 2016  SCORING: edit 742245820 huggle - 584 ores - 258
Sun Oct 2 17:34:51 2016  SCORING: edit 742245813 huggle - 637 ores - 209
Sun Oct 2 17:34:50 2016  SCORING: edit 742245810 huggle 144 ores - 76

Hey @Petrb, it seems like there might be something strange with this data.

E.g., let's take a few observations and look at the ORES predictions.

Sun Oct 2 17:36:18 2016  SCORING: edit 742245970 huggle - 903 ores - 229
Sun Oct 2 17:36:17 2016  SCORING: edit 742245969 huggle - 1033 ores 171
Sun Oct 2 17:36:13 2016  SCORING: edit 742245966 huggle - 987 ores 1

These don't scale to the three values (229, 171, 1) in any way. Just in case you are using the old "reverted" model, I checked on that too:

Honestly, not that much different.

Hey, this is just because the output I generated has a really bad format; there is an extra space after the "-" that got there somehow when I was pasting it into my editor. It's a minus :)

So those scores are negative, meaning they do correspond to your percentages:

-229 = 6.6%
+171 = 78%
+1 = 42%

I've been working from an etherpad. See https://etherpad.wikimedia.org/p/ores_huggle_comparison

I'm about half-way through manually labeling edits.

I am just wondering, can we close this task? Huggle /does/ support ORES, and has for a long time. If there is a specific feature of ORES that isn't implemented in Huggle, then that would need to be specified in the task description; right now I have absolutely no idea what to fix.

Petrb claimed this task.

In case there is some ORES feature that needs to be implemented, please give me the specs, and either reopen this or make a new task. Thanks!