
Explore OSM integration for ORES
Open, Low, Public

Description

I (@Halfak) talked to some OSM folks about how they are building something that is very similar to ORES. Here are my notes:

* Working with OSM for a while -- engineering priorities.
* Engineer.  Started working on building dev applications for validation.  Focusing on detection and ML stuff.
* Engineer.  Recently started working on OSM.  Validation.
* Thinking about validation and tools for about a year.  Manual labeling.  Rule based things.

-----

25,000 changesets per day.

Review changesets as they happen using tools. 
 - How long does it take to review the average changeset?
 - How do people coordinate reviews?  (Think patrolled flag)
  - "My area" using a geo filter
  - Are there areas that are under-patrolled?
  - Watchlists?  
   - Bounding box --> RSS feed (low adoption, bad user experience)
  - Tools are outside the OSM infrastructure 
  - Have some cron jobs that look for constraint violations --> micro-tasking managers
  - OSM discourages automatic edits
 - How many people do this and do they need rights/permissions?
 - What are y'all doing with ML?  How have you formalized the problem?
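
To make the "My area" geo filter from the notes above concrete, here is a minimal sketch of selecting changesets whose bounding box overlaps a reviewer's area. The `BBox` and `Changeset` classes and the `in_my_area` helper are hypothetical names made up for illustration; they are not part of OSM tooling or revscoring, and the sketch ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BBox:
    """Hypothetical bounding box in degrees (not an OSM or revscoring type)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        # Assumption: neither box crosses the antimeridian.
        return not (
            other.min_lon > self.max_lon
            or other.max_lon < self.min_lon
            or other.min_lat > self.max_lat
            or other.max_lat < self.min_lat
        )


@dataclass(frozen=True)
class Changeset:
    """Hypothetical, stripped-down changeset record."""
    id: int
    user: str
    bbox: BBox


def in_my_area(changesets: Iterable[Changeset], area: BBox) -> List[Changeset]:
    """Keep only the changesets whose bounding box overlaps the given area."""
    return [c for c in changesets if c.bbox.intersects(area)]


# Made-up example data: one changeset inside the area, one far away.
my_area = BBox(min_lon=-93.4, min_lat=44.8, max_lon=-93.0, max_lat=45.1)
recent = [
    Changeset(1, "mapper_a", BBox(-93.3, 44.9, -93.2, 45.0)),
    Changeset(2, "mapper_b", BBox(13.3, 52.4, 13.5, 52.6)),
]
to_review = in_my_area(recent, my_area)
```

A filter like this could feed a per-reviewer queue or feed; how the results are delivered (RSS or otherwise) is a separate question.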

My general sense is that they're building something that is so like ORES it's absurd. I think we should explore creating a "changeset oriented" feature tree and a revscoring.extractors.osm.Extractor.
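
To give a feel for what "changeset oriented" features and an extractor might look like, here is a self-contained sketch. It does not use revscoring's actual API: the `Changeset`, `Feature`, and `OSMExtractor` classes, the feature names, and the "new user" threshold of 10 changesets are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Changeset:
    """Hypothetical changeset summary; field names are illustrative only."""
    id: int
    user_changeset_count: int
    nodes_created: int
    nodes_modified: int
    nodes_deleted: int
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Feature:
    """A named function from a changeset to a numeric value."""
    name: str
    compute: Callable[[Changeset], float]


def nodes_touched(c: Changeset) -> float:
    return float(c.nodes_created + c.nodes_modified + c.nodes_deleted)


def deleted_ratio(c: Changeset) -> float:
    # Guard against division by zero for empty changesets.
    return c.nodes_deleted / max(nodes_touched(c), 1.0)


# Illustrative feature list; the threshold of 10 is an arbitrary assumption.
FEATURES: List[Feature] = [
    Feature("changeset.nodes_touched", nodes_touched),
    Feature("changeset.deleted_ratio", deleted_ratio),
    Feature("changeset.has_comment", lambda c: float("comment" in c.tags)),
    Feature("user.is_new", lambda c: float(c.user_changeset_count < 10)),
]


class OSMExtractor:
    """Hypothetical extractor: fetches a changeset, then computes features."""

    def __init__(self, fetch: Callable[[int], Changeset]):
        # `fetch` stands in for whatever data source would be used in practice.
        self.fetch = fetch

    def extract(self, changeset_id: int, features: List[Feature]) -> List[float]:
        changeset = self.fetch(changeset_id)
        return [f.compute(changeset) for f in features]


# Made-up in-memory data source for the example.
store = {
    42: Changeset(42, user_changeset_count=3, nodes_created=2,
                  nodes_modified=3, nodes_deleted=5, tags={"comment": "fix"}),
}
extractor = OSMExtractor(store.__getitem__)
values = extractor.extract(42, FEATURES)
```

The point of the sketch is only the shape: features are declared independently of where the data comes from, so the same scoring code could sit on top of either a MediaWiki or an OSM data source.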

We might also want to refactor the whole library so that the mediawiki-specific bits are moved elsewhere.

  • revscoring --> (scoring, mwscores, osmscores)

Event Timeline

Halfak moved this task from Unsorted to New development on the Machine-Learning-Team board.

At Wikimania, @Halfak wrote up some notes about revscoring’s current architecture and how it could be modified to accommodate OpenStreetMap changesets and features.

Mapbox’s data tooling team has been working on a vandalism detection system reminiscent of ORES/revscoring called Gabbar. That’s probably the tool mentioned in the original post that could be folded into revscoring.