
Restructure ORES response format to make room for metadata
Closed, Resolved · Public

Description

See https://github.com/wiki-ai/ores/issues/24

We need to be able to include various bits of metadata about the scoring model when returning scores.

Event Timeline

Halfak created this task. · Dec 10 2015, 1:23 AM
Halfak raised the priority of this task from to Needs Triage.
Halfak updated the task description. (Show Details)
Halfak moved this task to Active on the Scoring-platform-team (Current) board.
Halfak added a subscriber: Halfak.
Restricted Application added subscribers: StudiesWorld, Aklapper. · Dec 10 2015, 1:23 AM

I like the "meta" version that Helder proposes, but we'd also need to be able to include the info in multi-model mode. Here's what I imagine:

/scores/enwiki?models=damaging|goodfaith&revids=123456788|123456789&model_info=version|trained|stats
{
  "model_info": {
    "damaging": {
      "version": "0.0.4",
      "trained": 1234567890.00,
      "stats": {
        "auc": 0.8937382,
        "accuracy": 0.8892833
      }
    },
    "goodfaith": {
      "version": "0.0.4",
      "trained": 1234567890.00,
      "stats": {
        "auc": 0.9201838,
        "accuracy": 0.756293
      }
    }
  },
  "scores": {
    "123456789": {
      "damaging": {
        "prediction": false, ...
      },
      "goodfaith": {
        "prediction": true, ...
      }
    },
    "123456788": {
      "damaging": {
        "prediction": true, ...
      },
      "goodfaith": {
        "prediction": true, ...
      }
    }
  }
}
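For concreteness, here's a minimal client sketch of that request using the Python requests library. The host name is a placeholder and the field access is an assumption; only the path and parameters follow the proposal above:

import requests

# Hypothetical ORES host; only the endpoint shape follows the proposal.
BASE = "https://ores.example.org"

response = requests.get(
    BASE + "/scores/enwiki",
    params={
        "models": "damaging|goodfaith",
        "revids": "123456788|123456789",
        "model_info": "version|trained|stats",
    },
)
doc = response.json()

# Model metadata and per-revision scores arrive as separate top-level fields.
print(doc["model_info"]["damaging"]["version"])
print(doc["scores"]["123456789"]["damaging"]["prediction"])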

I think that it would then make sense to include the "scores" sub-field in a simple request. E.g.:

/scores/enwiki/damaging/123456788
{
  "scores": {
    "123456788": {
      "damaging": {
        "prediction": true, ...
      }
    }
  }
}

One thing I am struggling with is changing the output format for ORES. If we do this, I'd like to do it only once, since we'll be asking our users to update their code when we deploy. It would also be nice if we could have a transition period. I imagine implementing a function that formats scoring results based on a version number, e.g. ?format=v2, with the default staying "v1" for some transition period. This function would look like this:

def format_output(scorer_models, scores, model_info, version="v1"):
    """
    Formats a JSON blob of scores.

    :Parameters:
        scorer_models : `set`
            A set of scorer models used in generating the scores.
        scores : `dict`
            A JSONable dictionary of scores by revision ID and model.
        model_info : `set`
            A set of model information keys that should be included
            (e.g. "version", "trained" and/or "stats")
        version : `str`
            An output format version to apply. "v1" excludes any
            `model_info`. "v2" returns a more complex JSON document
            format that can contain `model_info`.
    """
    ...

It would return a JSON blob ready for output. If version is set to "v1", the response will not contain the meta information. If it is set to "v2", the response will look like the one above.
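For illustration, a minimal sketch of what that function's body might look like. It assumes each scorer model exposes a `name` attribute and an `info` dict, which is a guess at the internals rather than the actual ORES code:

def format_output(scorer_models, scores, model_info, version="v1"):
    # "v1" keeps the current flat format and drops metadata entirely.
    if version == "v1":
        return scores
    if version != "v2":
        raise ValueError("Unknown output format version: {0}".format(version))
    # "v2" wraps the scores and adds a sibling "model_info" field.
    doc = {"scores": scores}
    if model_info:
        doc["model_info"] = {
            # Assumed attributes: `name` and an `info` dict on each model.
            model.name: {key: model.info.get(key) for key in model_info}
            for model in scorer_models
        }
    return doc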

Halfak set Security to None.

It'd be nice for developer usability and cacheability to have the version parameter be part of the REST url for the common case of accessing things like /scores/enwiki/damaging/123456788. E.g. as /v2/scores/enwiki/damaging/123456788, or elsewhere in the url.

What benefits does putting the output format in the path give? It seems that what you are imagining is a wholly versioned API where paths might change. We're not planning to change the path here, but this sounds like a good strategy for enabling path changes in the future.

@Ladsgroup and I worked this out: https://etherpad.wikimedia.org/p/ores_response_structure

It seems that we like the "Deeper merged trees" proposal. I'll copy-paste it here:

  • "warnings" []
    • "type"
      • "Something bad happened!"
    • "message"
      • "This is my message
  • "notice" []
    • "type"
      • "model_deployment_scheduled"
    • "message"
      • "2016-02-17T12:34:56Z"
  • "scores"
    • <context>
      • <model>
        • "version"
        • "scores"
          • <rev_id>
            • {score|error}
        • "info"
          • {model_info}

A scoring request:

/scores/<context>/<model>/<rev_id>/
{"scores": {"<context>": {"<model>": {"version": "<version>", "scores": {"<rev_id>":  {... score or error ...}}}}}}

A model request:

/scores/<context>/<model>/
{"scores":{"<context>":{"<model>": {... model_info ...}}}}
Halfak assigned this task to Ladsgroup. · Mar 15 2016, 8:38 PM
Halfak moved this task from Active to Done on the Scoring-platform-team (Current) board.
He7d3r closed this task as Resolved. · Mar 15 2016, 10:26 PM
He7d3r added a subscriber: He7d3r.

Is it resolved then?

Yup. I usually do 'em in batches, but please feel free to resolve once tasks make it to the (Done) column on the Scoring-platform-team board if you beat me to it. :)