Live test of ORES extension
Closed, ResolvedPublic

Description

I'll need @awight to give me steps to install and test on vagrant.

This card is done when the live system can be tested. Filtering by score and version updates can be demo'd.

Event Timeline

Halfak claimed this task.
Halfak raised the priority of this task from to Needs Triage.
Halfak updated the task description.
Halfak moved this task to Parked on the Machine-Learning-Team (Active Tasks) board.
Halfak added subscribers: Halfak, awight.

Thanks for making this task!

The steps to install are:

# Clone the ORES extension into mediawiki/extensions/ORES
git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/ORES.git extensions/ORES

# Load the ORES extension using either the new or the old method. For the new method, add this line to LocalSettings.php:
wfLoadExtension( 'ORES' );

# If you want to point to a fake revscoring server--necessary if you want to score things from your own database--then set the config to point to your server's base URL.
$wgOresBaseUrl = 'http://localhost:9000/';

# Run the maintenance script to populate model versions:
mwscript extensions/ORES/maintenance/CheckModelVersions.php

Now, you should be able to edit a page, then view your local wiki's Special:RecentChanges page. Nothing will appear unless the revision score has at least an 80% predicted probability of being reverted. If this threshold is tripped, you should see the letter "R" to the left of the revision and a tooltip will have more details.
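As a rough illustration of the flagging rule described above, here is a minimal Python sketch. It assumes the rule is a plain "at or above the threshold" comparison; the function name `recent_changes_flag` and its default of 0.8 are hypothetical and are not taken from the extension's PHP code.

```python
def recent_changes_flag(revert_probability: float, threshold: float = 0.8) -> str:
    """Return "R" when the predicted revert probability reaches the threshold.

    Hypothetical helper: the real extension implements this in PHP.
    """
    if not 0.0 <= revert_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # "At least 80%" is read here as an inclusive comparison.
    return "R" if revert_probability >= threshold else ""


if __name__ == "__main__":
    for p in (0.5, 0.8, 0.93):
        print(p, repr(recent_changes_flag(p)))
```

Lowering the threshold argument in this sketch makes more revisions pick up the flag.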

@Halfak
Can you say more about what "filtering by score" should look like in the Recent Changes view? It just dawned on me that the hide logic I ported is not what we wanted--it hides the flag, whereas now I see that we should be hiding any non-"damaging" rows. Is that right?

@awight Yeah. That's right. We'll likely want to hide the whole row.

I'm trying to test in both localhost and here but there are some issues:

  • It's not showing any scores
    • It seems ores_classification is not being populated under any circumstances (maybe the hooks are not working)
    • ores-model is populated
  • "Hide reverted" should be renamed to something else
  • There is no "r" in the legend of RC

This is as far as I can say for now. I'd really appreciate it if @awight could take a look at this. If you want, I can add him (or @Legoktm or @Halfak) to the service group.

Currently it shows "Hide revert predictions" instead of just "Hide revert".

Sorry, I meant "hidereverted=1" in the URL.

@Ladsgroup
Great, thank you for doing this QA! It sounds like you're using the head of master, please use the tip of the branch for review instead: https://gerrit.wikimedia.org/r/#/c/258042/ -- or if we can get these patches merged that's even better.

Once I'm added to the labs project, I can check the other configuration. You'll need to point at an ORES server that's scoring the test wiki, or fake it out by manipulating the revision.rev_id autoincrement and wiki ID to pull scores for production wiki content.

Okay, great. I made the changes and it's working now, but I couldn't produce any high-probability edits to check it in depth. I'll make some edits with bots tomorrow morning.

I tried to add you but I couldn't find your username. For future reference, do you have an account in Tools?

Thanks again for your great work

My Labs username is just "Awight".

The easiest workaround, now that it sounds like you have data in ores_classification, is to set $wgOresDamagingThreshold to something lower.

In that case, I agree (see also comments on https://gerrit.wikimedia.org/r/#/c/256641/2/includes/Hooks.php,unified).

I changed $wgOresDamagingThreshold to 0.3 and nothing changed. Also, the table is still empty:

MariaDB [s52782__wikidb_p]> select * from ores_classification limit 5;
Empty set (0.02 sec)

I made about 140 edits today; the db is still empty and there is no "r" beside the edits.

Hey folks. I just got the pseudo-model deployed to staging. Check out http://ores-staging.wmflabs.org/scores/testwiki/reverted/

We should be able to get it on the prod cluster tomorrow. Woot!

Okay, that's a step behind where I imagined we were. The tricky thing about running a test instance is that you'll need an ORES server that knows about your test wiki's database, and to be configured to score new revisions. For example, if your test database is called testwiki, then when you go to http://ores.wmflabs.org/scores/testwiki/?models=reverted|wp10&revids=123 you should see a score.
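For checking by hand whether an ORES server knows about a given wiki, a small Python sketch can build the same kind of URL as the example above. The helper name `score_url` is hypothetical; the path layout and the `models` / `revids` parameters are copied from the URL quoted in this comment, and nothing here is taken from the extension itself.

```python
from urllib.parse import urlencode


def score_url(base_url, wiki_id, models, rev_ids):
    """Build a scores URL shaped like the one quoted in the thread (hypothetical helper)."""
    query = urlencode(
        {
            "models": "|".join(models),
            "revids": "|".join(str(rev_id) for rev_id in rev_ids),
        },
        safe="|",  # keep the pipe separators readable instead of %7C
    )
    return f"{base_url.rstrip('/')}/scores/{wiki_id}/?{query}"


if __name__ == "__main__":
    print(score_url("http://ores.wmflabs.org", "testwiki", ["reverted", "wp10"], [123]))
    # prints http://ores.wmflabs.org/scores/testwiki/?models=reverted|wp10&revids=123
```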

I've been taking a shortcut for local testing, where I call my local database "enwiki" (actually, I hack includes/GlobalFunctions.php wfWikiId() to return "enwiki"), and then fast-forward the revision IDs to something that exists in enwiki, e.g. insert into revision set rev_id=577208657, so now when I create new revisions I'll be pulling scores for actual edits on enwiki.

I see I'm in the Labs ORES project now, so I can poke at our configuration later today. Thanks!

We'll have a model for testwiki deployed to ores.wmflabs.org within 24 hours. It will work against any wiki by taking the last two digits in the rev_id, reversing them and converting them to a probability.

E.g. for any wiki, rev_id=2875639 will have a 93% probability of being reverted. We do zero padding, so rev_id=9 will have a 90% probability.

This reversing is to make sure that we have a wider range of scores for a smaller block of rev_ids (0-9 gets 0-90%, 10-19 gets 1-91%, etc.)
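The digit-reversal rule can be sketched in a few lines of Python. This is only an illustration of the description above, not the deployed model's code, and the function name `pseudo_revert_probability` is made up for the example.

```python
def pseudo_revert_probability(rev_id: int) -> float:
    """Reverse the last two digits of rev_id (zero-padded) and read them as a percentage."""
    if rev_id < 0:
        raise ValueError("rev_id must be non-negative")
    last_two = f"{rev_id % 100:02d}"  # zero-pad, so 9 becomes "09"
    return int(last_two[::-1]) / 100  # "09" -> "90" -> 0.9


if __name__ == "__main__":
    for rev_id in (2875639, 9, 10, 19):
        print(rev_id, pseudo_revert_probability(rev_id))
    # 2875639 0.93
    # 9 0.9
    # 10 0.01
    # 19 0.91
```

To get a revision with a chosen score on a test wiki, it is enough to pick a rev_id whose last two digits are the desired percentage written backwards.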

New models are up on ores.wmflabs.org

http://ores.wmflabs.org/scores/testwiki/reverted/?revids=1|2|3|4|5|6|7|8|9

{
  "1": {
    "prediction": false,
    "probabilities": {
      "false": 0.9,
      "true": 0.1
    }
  },
  "2": {
    "prediction": false,
    "probabilities": {
      "false": 0.8,
      "true": 0.2
    }
  },
  "3": {
    "prediction": false,
    "probabilities": {
      "false": 0.7,
      "true": 0.3
    }
  },
  "4": {
    "prediction": false,
    "probabilities": {
      "false": 0.6,
      "true": 0.4
    }
  },
  "5": {
    "prediction": false,
    "probabilities": {
      "false": 0.5,
      "true": 0.5
    }
  },
  "6": {
    "prediction": true,
    "probabilities": {
      "false": 0.4,
      "true": 0.6
    }
  },
  "7": {
    "prediction": true,
    "probabilities": {
      "false": 0.30000000000000004,
      "true": 0.7
    }
  },
  "8": {
    "prediction": true,
    "probabilities": {
      "false": 0.19999999999999996,
      "true": 0.8
    }
  },
  "9": {
    "prediction": true,
    "probabilities": {
      "false": 0.09999999999999998,
      "true": 0.9
    }
  }
}
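A response shaped like the one above can be filtered by score with a short Python sketch. The sample below is a trimmed copy of that response; the helper name `revisions_at_or_above` is hypothetical, and the inclusive comparison is an assumption about how a threshold would be applied.

```python
import json

# Trimmed copy of the response shown above (rev IDs 3, 7, 8 and 9 only).
SAMPLE_RESPONSE = """
{
  "3": {"prediction": false, "probabilities": {"false": 0.7, "true": 0.3}},
  "7": {"prediction": true, "probabilities": {"false": 0.30000000000000004, "true": 0.7}},
  "8": {"prediction": true, "probabilities": {"false": 0.19999999999999996, "true": 0.8}},
  "9": {"prediction": true, "probabilities": {"false": 0.09999999999999998, "true": 0.9}}
}
"""


def revisions_at_or_above(response_text: str, threshold: float) -> list:
    """Return the rev IDs whose "true" probability is at or above the threshold."""
    scores = json.loads(response_text)
    return sorted(
        int(rev_id)
        for rev_id, score in scores.items()
        if score["probabilities"]["true"] >= threshold
    )


if __name__ == "__main__":
    print(revisions_at_or_above(SAMPLE_RESPONSE, 0.8))  # prints [8, 9]
```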

I just checked, and our toollabs wiki currently has wiki ID s52782__wikidb_p. It sounds like the easiest way to continue testing is if I write a patch that overrides how mw-ext-ORES gets the wiki ID, to report testwiki. I wouldn't want that patch in production, so I'll give it a crazy commit summary...

@Halfak
The extension uses the damaging model now. Please add that model to your test data generator, or just replace reverted.

@Ladsgroup
Please apply the https://gerrit.wikimedia.org/r/#/c/258916/ patch, and remove any overrides on $wgOresBaseUrl, so that it points to the default ores.wmflabs.org.

@awight: I have been doing exactly this from the beginning, both on my localhost and on the ORES test wiki, but I used enwiki since there wasn't a testwiki at that point. I'll change it to testwiki and see what happens.

Not working, db is still empty.

Okay, I just added @awight and @Halfak to the tools project. What you need to do right now is:

ssh awight@tools.wmflabs.org
become ores
cd public_html/mediawiki

Then you can do whatever you want.
The database is also easily accessible by running (after "become ores"):

sql local
use s52782__wikidb_p;

Change 258916 had a related patch set uploaded (by Awight):
Add an config variable to override the wiki ID

https://gerrit.wikimedia.org/r/258916

Status:
We need to take the following steps before full functionality is available on the test box.

@Ladsgroup
I'm on the same page now--I updated to the patches above, made an additional livehack to only fetch the "reverted" model, and still nothing happens. I'm trying to track down the error logs for FetchScoreJob, with no luck so far.

Maybe I can blame the Grid job queue, somehow? At least, error_log() called from inside FetchScoreJob does not go to ~/error.log.

Maybe using this helps us but I don't know which logger should be used.

@awight I have an idea. I have this problem on my localhost too. What should I do to log errors and see what happens? You can reproduce it too by cloning again (from scratch).

Nope, I have no idea. Ask @bd808 as our resident vagrant expert? The grid engine hopefully has nothing to do with this - if you're using gridengine to do anything mediawiki related may god have mercy on your soul.

Change 259211 had a related patch set uploaded (by Awight):
[WIP] Role for the ORES extension

https://gerrit.wikimedia.org/r/259211

Thanks for all the great work, @Ladsgroup @Halfak @yuvipanda!
Stuff is happening here: http://mw-revscoring.wmflabs.org/wiki/Special:RecentChanges

I had to cheat out the new "ores" vagrant role, but the only real sleight-of-hand was to checkout bleeding-edge unreviewed patches on mw-core and the ORES extension.

Now that I see the results, I believe it will be a little annoying for users to see the "This edit needs review." in the "r" and not be able to do anything to actually mark the page as "reviewed", so that the "r" disappears from the edit (as happens for the unpatrolled "!" mark). There should be a way to mark the items as either a false or a true positive, so that the "r" is removed once that happens (and eventually the false/true positive data can be used by ORES developers to improve the models)

Also, I don't think people will like having two things to mark for each edit (patrolled + reviewed).

@He7d3r
+1, We need to design more complete rules for calculating the "r" flag.

I'm a little concerned that we should draw a line somewhere for the initial MVP, but I think you're right that this annoyance is on the wrong side of that line.

What about simply needs_review = ores_is_predicted and !rc_patrolled?
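Spelled out as a Python sketch, the proposed rule looks like this. The parameter names mirror the ones in the proposal; how the extension would actually obtain those two values is not covered here.

```python
def needs_review(ores_is_predicted: bool, rc_patrolled: bool) -> bool:
    """Flag an edit only while ORES predicts damage and nobody has patrolled it yet."""
    return ores_is_predicted and not rc_patrolled


if __name__ == "__main__":
    for predicted in (False, True):
        for patrolled in (False, True):
            print(predicted, patrolled, needs_review(predicted, patrolled))
```

Under this rule, patrolling an edit would clear its flag without a separate "reviewed" action.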

Could you set $wgUseRCPatrol = true; for that wiki so we can get a better understanding of what it will look like in the real world (of wikis using RCpatrol)?

Maybe we can just change the wording of the tooltip, considering that all edits need review. What is different about the edits marked with an "r" is that they are more likely to be damaging than the others, so maybe the tooltip should just suggest that users review these first. E.g. something along the lines of "This edit has [medium|high] priority for patrollers" or "Reviewing this edit is a high priority" or "Prioritize this edit when reviewing recent changes".

Another issue I encountered too:
When I'm logged out I can see the "r", but when I'm logged in with grouping disabled it is not shown, and when I enable grouping it works again.
Long story short: it doesn't work when grouping is disabled.

Maybe @yuvipanda knows the answer to this question?

The question:

Where would I find logs covering any exceptions thrown from a MediaWiki background job?

Assuming that this is running under MediaWiki-Vagrant, log output from jobs should end up in /srv/mediawiki-vagrant/logs/mediawiki-runJobs.log and/or /srv/mediawiki-vagrant/logs/mediawiki-wiki-debug.log unless they are fatal errors in which case they would be reported by HHVM at /var/log/hhvm/error.log inside the VM.

(The hhvm log location is a mis-feature that I will soon correct to place the logs into /srv/mediawiki-vagrant/logs/hhvm/error.log to match a local MW-Vagrant install.)

@bd808
Thank you! These logs were exactly what I needed--fyi, the real problem is that we had been developing on Tool Labs and outside of vagrant, so nothing was set up. Once we moved to mediawiki-vagrant on Labs, these logs showed up, etc. It was a breeze to debug.

awight set Security to None.

Change 258916 merged by jenkins-bot:
Add an config variable to override the wiki ID

https://gerrit.wikimedia.org/r/258916

Change 259211 merged by jenkins-bot:
Role for the ORES extension

https://gerrit.wikimedia.org/r/259211