
Feature injection does not appear to work in ores-legacy
Open, Needs Triage | Public | 1 Estimated Story Points

Event Timeline

We currently don't support feature injection. What is the use case for it? From our traffic analysis, this is not a feature that is really used. ORES is being deprecated, so we'd like to keep the set of features we maintain as small as possible.

  1. People use this functionality to make recommendations with ORES. E.g. the WikiEdu outreach dashboard uses feature injection to make predictions about hypothetical future states of an article, so it can recommend to student editors the changes that are most likely to raise article quality.
  2. It's very useful as an auditing tool. E.g. "Why did I get this prediction?" It allows you to ask "Would I still have gotten this prediction if the editor was registered? If they had been editing a talk page?" Etc. Counterfactuals and their use in model auditing are well documented. (See https://scholar.google.com/scholar?hl=en&as_sdt=0%2C48&q=counterfactuals+auditing+models&btnG= for several papers in this space)
  3. Predicting the quality of an edit before it is saved. This was used in experiments with Tor editors -- allowing them to submit a *proposed* edit and have that proposed edit scored before it is saved. E.g. https://mako.cc/academic/tran_etal-tor_users_wikipedia-DRAFT.pdf
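
For anyone who has not used it, the counterfactual requests in item 2 could be built roughly as below. This is a minimal sketch, not the documented client: the base URL, the `features` flag, the `feature.<name>=<JSON value>` parameter shape, and the feature name `feature.revision.user.is_anon` are my assumptions about the ORES v3 scores endpoint, and the revision ID is a made-up placeholder. The sketch only builds URLs and sends no requests, and per this task ores-legacy may ignore the injected parameters.

```python
import json
from urllib.parse import urlencode

# Assumption: path layout of the ORES v3 scores endpoint.
BASE_URL = "https://ores.wikimedia.org/v3/scores"


def build_injection_url(context, rev_id, model, injected):
    """Build a scoring URL that overrides the given feature values.

    `injected` maps feature names to Python values; each value is
    JSON-encoded (False -> "false") before being URL-encoded.
    """
    # Assumption: "features" asks the service to echo feature values back.
    params = {"features": "true"}
    for name, value in injected.items():
        params[name] = json.dumps(value)
    return f"{BASE_URL}/{context}/{rev_id}/{model}?{urlencode(params)}"


# Hypothetical revision ID and feature name, for illustration only.
REV_ID = 123456
ANON_FEATURE = "feature.revision.user.is_anon"

# Same edit, scored as if saved by a registered vs. an unregistered editor.
registered_url = build_injection_url("enwiki", REV_ID, "damaging", {ANON_FEATURE: False})
anon_url = build_injection_url("enwiki", REV_ID, "damaging", {ANON_FEATURE: True})

print(registered_url)
print(anon_url)
```

Fetching both URLs and comparing the returned probabilities is the whole audit: everything except the injected value stays the same between the two requests.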

The rationale behind this and other features is documented in https://upload.wikimedia.org/wikipedia/commons/a/a9/ORES_-_Lowering_Barriers_with_Participatory_Machine_Learning_in_Wikipedia.pdf

Relevant extract:

Proc. ACM Hum.-Comput. Interact., Vol. 4, No. CSCW2, Article 148. Publication date: October 2020.

5.2.1 Dependency injection. When we originally developed ORES, we designed our feature engineering strategy based on a dependency injection framework (https://en.wikipedia.org/wiki/Dependency_injection). A specific feature used in prediction (e.g., number of references) depends on one or more datasources (e.g. article text). Many different features can depend on the same datasource. A model uses a sampling of features in order to make predictions. A dependency solver allowed us to efficiently and flexibly gather and process the data necessary for generating the features for a model — initially a purely technical decision.

After working with ORES’ users, we received requests for ORES to generate scores for edits before they were saved, as well as to help explore the reasons behind some of the predictions. After a long consultation, we realized we could provide our users with direct access to the features that ORES used for making predictions and let those users inject features and even the datasources they depend on. A user can gather a score for an edit or article in Wikipedia, then request a new scoring job with one of those features or underlying datasources modified to see how the prediction would change. For example, how does ORES differently judge edits from unregistered (anon) vs registered editors? Figure 5 demonstrates two prediction requests to ORES with features injected.

"damaging": {
  "score": {
    "prediction": false,
    "probability": {
      "false": 0.938910157824447,
      "true": 0.06108984217555305
    }
  }
}

(a) Prediction with anon = false injected

"damaging": {
  "score": {
    "prediction": false,
    "probability": {
      "false": 0.9124151990561908,
      "true": 0.0875848009438092
    }
  }
}

(b) Prediction with anon = true injected
Fig. 5. Two “damaging” predictions about the same edit are listed for ORES. In one case, ORES scores the prediction assuming the editor is unregistered (anon) and in the other, ORES assumes the editor is registered.

Figure 5a shows that ORES’ “damaging” model concludes the edit is not damaging with 93.9% confidence. Figure 5b shows the prediction if the edit were saved by an anonymous editor. ORES would still conclude that the edit was not damaging, but with less confidence (91.2%). By following a pattern like this, we better understand how ORES prediction models account for anonymity with practical examples. End users of ORES can inject raw text of an edit to see the features extracted and the prediction, without making an edit at all.
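
To make the mechanism in the extract concrete, here is a toy dependency solver with injection. It is only a sketch of the idea and not ORES' or revscoring's actual implementation: `solve`, the `DEPENDENCIES` table, and the names `article_text`, `ref_count`, and `char_count` are all hypothetical. Features declare which datasources they need, the solver resolves them recursively, and any value supplied up front (a feature or a datasource) is used instead of being computed.

```python
def solve(target, dependencies, cache):
    """Resolve `target`, preferring injected values already in `cache`."""
    if target in cache:
        # Injected (or previously computed) value: skip computation.
        return cache[target]
    func, needs = dependencies[target]
    args = [solve(name, dependencies, cache) for name in needs]
    value = func(*args)
    cache[target] = value
    return value


# Hypothetical datasource and features: name -> (function, dependency names).
DEPENDENCIES = {
    "article_text": (lambda: "Some text.<ref>A</ref><ref>B</ref>", []),
    "ref_count": (lambda text: text.count("<ref>"), ["article_text"]),
    "char_count": (lambda text: len(text), ["article_text"]),
}

# No injection: the feature is computed from the real datasource.
baseline = solve("ref_count", DEPENDENCIES, {})

# Inject the feature itself: the datasource is never consulted.
with_feature = solve("ref_count", DEPENDENCIES, {"ref_count": 5})

# Inject the datasource: every feature depending on it is recomputed
# from the hypothetical text (scoring an edit that was never saved).
with_source = solve("ref_count", DEPENDENCIES, {"article_text": "No references here."})

print(baseline, with_feature, with_source)
```

Injecting at the datasource level is what makes the "score an edit before it is saved" use case possible, since all dependent features follow from the proposed text.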

I checked in with @Sage_Wiki_Ed who said they are no longer using feature injection in their dashboard.

calbon set the point value for this task to 1. (Nov 2 2023, 7:02 PM)
calbon moved this task from In Progress to Ready To Go on the Machine-Learning-Team board.
calbon triaged this task as Medium priority. (Nov 2 2023, 7:26 PM)
calbon raised the priority of this task from Medium to Needs Triage. (Dec 20 2023, 3:27 PM)
calbon moved this task from Ready To Go to Backlog/ORES Migration on the Machine-Learning-Team board.