Introduce ORES rvprop
Closed, ResolvedPublic
Authored By

	Ladsgroup
	Aug 22 2016, 9:09 PM

Description

See T122689#1939440

We need to use ORES in several places, but most importantly as a property of revisions (rvprop). Most queries today return only the usual revision data; if a user adds "oresscore" to rvprop, the result should also include an extra JSON part:
"oresscore": {
    "damaging": {
        "true" : 0.4320,
        "false": 0.5680
    }
}
Once we agree on the design, implementing it is easy, given that everything is stored in the ores_classification table.
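
As a rough illustration of the proposal, here is a minimal Python sketch of how a client might build such a request and read the score back. The endpoint URL, the helper names, and the sample revision are assumptions for illustration only; "oresscore" is the value proposed in this task, and the parameter as finally shipped may differ.

```python
from urllib.parse import urlencode

# Assumed endpoint, used only for illustration.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_revisions_query(rev_ids, include_ores=True):
    """Build a prop=revisions query URL, optionally asking for ORES scores."""
    props = ["ids", "timestamp"]
    if include_ores:
        # "oresscore" is the rvprop value proposed in this task.
        props.append("oresscore")
    params = {
        "action": "query",
        "prop": "revisions",
        "revids": "|".join(str(r) for r in rev_ids),
        "rvprop": "|".join(props),
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def damaging_probability(revision):
    """Return the 'damaging: true' probability, or None if no score is present."""
    return revision.get("oresscore", {}).get("damaging", {}).get("true")


# Hypothetical revision object shaped like the JSON part in the description.
sample_revision = {
    "revid": 12345,
    "oresscore": {"damaging": {"true": 0.4320, "false": 0.5680}},
}

if __name__ == "__main__":
    print(build_revisions_query([12345]))
    print(damaging_probability(sample_revision))
```

A client that does not request the property gets no "oresscore" key, so damaging_probability() returns None rather than raising.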

Details

	Subject	Repo	Branch	Lines +/-
	Action API integration for ORES	mediawiki/extensions/ORES	master	+730 -10
	API: Add hooks for ApiQueryBase's query and row-processing	mediawiki/core	master	+102 -12

Related Objects

Status	Assigned	Task
Resolved	None	T143895 [Epic] Implement ORES service proxy in api.php
Resolved	Anomie	T143614 Introduce ORES rvprop
Resolved	Anomie	T147939 Add ability to hook into WatchedItemQueryService and ApiQueryWatchlist

Event Timeline

Ladsgroup created this task. Aug 22 2016, 9:09 PM

Restricted Application added a subscriber: Aklapper. Aug 22 2016, 9:09 PM

Anomie subscribed. Aug 23 2016, 1:54 PM

Ladsgroup moved this task from Backlog to Prioritized on the MediaWiki-extensions-ORES board. Aug 24 2016, 8:25 PM

Halfak added a parent task: T143895: [Epic] Implement ORES service proxy in api.php. Aug 25 2016, 2:13 PM

Halfak triaged this task as Medium priority. Aug 25 2016, 2:29 PM

Halfak moved this task from Unsorted to New development on the Machine-Learning-Team board.

Bianjiang subscribed. Aug 29 2016, 11:35 PM

Halfak mentioned this in T122689: [Discuss] api.php integration with ORES. Aug 30 2016, 5:31 PM

Anomie updated the task description. Sep 19 2016, 6:31 PM

One open question is what to do for revisions whose scores are not already saved in the database:

- Return no scores for those revisions
- Fetch scores for those revisions
  - How many would be sane to fetch in one API request? We'd need to choose a limit that won't take too long to execute.
  - If there are more than that many revisions that need fetching, is it worth the added complexity of scheduling jobs to hopefully load-and-cache the remaining revisions before the client submits the continuation?

We might answer the question differently by endpoint, for example list=recentchanges and list=watchlist might return no score (to avoid hundreds of clients all fetching scores for the same just-created revision before the FetchScoreJob runs) while the others fetch.
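
To make the options above concrete, here is a minimal Python sketch of one way the decision could be structured. It is not the extension's code: the function name, the ScorePlan type, and the fetch_limit parameter are hypothetical, and the choice of which modules skip fetching simply follows the example given in this comment.

```python
from dataclasses import dataclass, field

# Modules that, per the example above, might return no score instead of fetching.
NO_FETCH_MODULES = {"recentchanges", "watchlist"}


@dataclass
class ScorePlan:
    cached: list = field(default_factory=list)     # scores already in the database
    fetch_now: list = field(default_factory=list)  # fetch during this API request
    deferred: list = field(default_factory=list)   # hand to a background job
    unscored: list = field(default_factory=list)   # returned without a score


def plan_scores(module, rev_ids, cached_ids, fetch_limit):
    """Split the requested revisions according to the policy sketched above."""
    plan = ScorePlan()
    missing = []
    for rev_id in rev_ids:
        if rev_id in cached_ids:
            plan.cached.append(rev_id)
        else:
            missing.append(rev_id)

    if module in NO_FETCH_MODULES:
        # Avoid many clients fetching the same just-created revision.
        plan.unscored = missing
    else:
        # Fetch up to the limit now; the rest could be scheduled so that
        # they are hopefully cached before the client continues.
        plan.fetch_now = missing[:fetch_limit]
        plan.deferred = missing[fetch_limit:]
    return plan


if __name__ == "__main__":
    print(plan_scores("revisions", [1, 2, 3, 4, 5], {2}, fetch_limit=2))
    print(plan_scores("watchlist", [1, 2, 3], {2}, fetch_limit=2))
```

Keeping the policy in one function makes the per-module choice easy to change once the design is agreed.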

In T143614#2652771, @Anomie wrote:

How many would be sane to fetch in one API request? We'd need to choose a limit that won't take too long to execute.

For the service endpoint, 50 is our safest bet, especially for recent changes and the watchlist, because the data is probably already stored in the Redis cache of the ORES service.

If there are more than that many revisions that need fetching, is it worth the added complexity of scheduling jobs to hopefully load-and-cache the remaining revisions before the client submits the continuation?

We already have an abstraction layer to trigger jobs of scoring and storing revisions. We can use it.

In T143614#2664477, @Ladsgroup wrote:

We already have an abstraction layer to trigger jobs of scoring and storing revisions. We can use it.

Although I note FetchScoreJob only does one revision at a time. BTW, what is the 'precache' parameter it passes to ORES in the one case it's currently triggered?

In T143614#2664820, @Anomie wrote:

Although I note FetchScoreJob only does one revision at a time.

We can work on it and make it accept more than one, I'm guessing it won't be hard.

BTW, what is the 'precache' parameter it passes to ORES in the one case it's currently triggered?

It lets the ORES service identify the source of a request; precache marks requests sent when an edit is made. So we don't need precache in rvprop or other API modules.

In T143614#2665385, @Ladsgroup wrote:

In T143614#2664820, @Anomie wrote:

Although I note FetchScoreJob only does one revision at a time.

We can work on it and make it accept more than one, I'm guessing it won't be hard.

Looking at it closer, it accepts multiple revids for the revid parameter without any issue. Nothing in the job actually depends on being passed only one revid.
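
Since the job accepts several revids at once, the revisions left over from an API request could be grouped into a few jobs rather than one job per revision. The following Python sketch only illustrates that batching idea; the real job is part of the PHP extension, and batch_revids, batch_size, and the shape of the parameter dictionary beyond the "revid" key are assumptions.

```python
def batch_revids(rev_ids, batch_size):
    """Group revision IDs into job parameter dicts, each with a list under 'revid'."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        {"revid": rev_ids[start:start + batch_size]}
        for start in range(0, len(rev_ids), batch_size)
    ]


if __name__ == "__main__":
    # Each dict stands in for the parameters of one hypothetical job.
    for params in batch_revids([101, 102, 103, 104, 105], batch_size=2):
        print(params)
```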

Change 313830 had a related patch set uploaded (by Anomie):
API: Add hooks for ApiQueryBase's query and row-processing

https://gerrit.wikimedia.org/r/313830

Change 313831 had a related patch set uploaded (by Anomie):
Action API integration for ORES

https://gerrit.wikimedia.org/r/313831

Change 313830 merged by jenkins-bot:
API: Add hooks for ApiQueryBase's query and row-processing

https://gerrit.wikimedia.org/r/313830

Ladsgroup edited projects, added Machine-Learning-Team (Active Tasks); removed Machine-Learning-Team. Oct 7 2016, 1:13 PM

Ladsgroup moved this task from Parked to Completed on the Machine-Learning-Team (Active Tasks) board. Oct 7 2016, 1:15 PM

Halfak closed this task as Resolved. Oct 11 2016, 11:51 PM

Halfak claimed this task.