
Initial Task API Spec
Closed, Resolved · Public

Description

Context

This is "dummy" API spec for us to comment and discuss over for the task API which will enable image recommendations for both Android and Growth.

  • As a mobile reader (familiar with editing on my device)
    • When I am reading an article with no images
    • I want to see any image(s) that could be used to illustrate the article,
    • so that I can both gain a better understanding of the topic and contribute to helping others who read the article in the future.

Scenarios

Provide a list of tasks of type "add image" for the "Jane Goodall" page on English Wikipedia

URI: https://api.wikimedia.org/tasks/v1/wikipedia/en/task
GET Request
The following parameters could be inlined as query parameters (see the sketch after the payload); just formatting as a request payload for now

{
  "page_title": "Jane Goodall",
  "type": "add_image",
  "limit": 10 //default set to 100?
}
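
For comparison, the same request with the parameters inlined as query parameters might look like this (a sketch only; parameter names are not settled):

GET https://api.wikimedia.org/tasks/v1/wikipedia/en/task?page_title=Jane%20Goodall&type=add_image&limit=10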

Response

{
  "tasks":[
    {
      "id": "36f270fd-8f87-4ce9-ba03-19488fb86843",
      "type": "add_image",
      "page":{
        "id": 9458,
        "title": "Jane Goodall",
        "wiki": "enwikipedia"
      },
      "images": [
        {
          "wiki": "commonswiki",
          "file": "File:Jane Goodall at TEDGlobal 2007.jpg"
        },
        {
          "wiki": "commonswiki",
          "file": "File:Jane Goodall 2015.jpg"
        },
        {
          "wiki": "commonswiki",
          "file": "File:Jane Goodall at RS Hungary.JPG"
        }
      ]
    }
  ]
}
User adds the suggested image to the page and completes the task

URI: https://api.wikimedia.org/tasks/v1/wikipedia/en/task/{task_id}
POST Request

{
  "decision": <"Rejected", "Accepted", "Skipped">
  "user": ?
  "reason":
}
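
A purely illustrative, filled-in body for this accept scenario (whether a user field belongs here at all is an open question below; all values are hypothetical):

{
  "decision": "Accepted",
  "user": "ExampleUser",
  "reason": null
}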

Response

200
User skips the task with no reason provided

URI: https://api.wikimedia.org/tasks/v1/wikipedia/en/task/{task_id}
POST Request

{
  "decision": <"Rejected", "Accepted", "Skipped">
  "user": ?
  "reason":
}
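
Illustratively, the skip-with-no-reason body for this scenario might be (values hypothetical):

{
  "decision": "Skipped",
  "user": "ExampleUser",
  "reason": null
}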

Response

200

Acceptance Criteria

  • Agree upon what our initial API spec will be for the Android implementation of image recommendations using the Task API

Event Timeline

I'd hoped to reply today with some spec details, but as I continued reading various docs, I kept seeing additional things to consider. So instead, I'll post a random-ish collection of my current thoughts.

Here are some assumptions I'm making. I'm listing these for discussion, in case anyone disagrees, or has other assumptions to add.

  1. the Task API will be a REST API exposed via api.wikimedia.org (it may or may not also be exposed in other ways, depending on implementation details)
  2. tasks will be project-specific (we'll display tasks for English Wikipedia or French Wikipedia, but not tasks spanning both)
  3. the Task API will be agnostic as to task types (we won't have separate "links" and "image" endpoints)

These assumptions all fit well with the suggested base URL of https://api.wikimedia.org/tasks/v1/wikipedia/en/
Of course, that's an example, and we could in theory eventually have URLs like https://api.wikimedia.org/tasks/v1/wikibooks/fr/

I expect that for requests that specify a particular page or pages, we'll use page titles instead of page IDs. Maybe there's somewhere that we already query by page ID, but I'm not familiar with it; I usually see us using titles.

One requirements document says this:

Bot writer sends in a parameter for target wiki and receives a list of unillustrated pages on that wiki OR provides a set of articles they would like to work on (this could also possibly be something like all articles in a category)

This is for a "later" implementation, and I don't expect to support all that in the first version. But I would like the design of the API endpoint to be expandable to do all that. I see a few implications from that requirements sentence:

  • the task feed can be filtered by task type(s)
  • the task feed can be filtered by page title(s)
  • the task feed can be filtered by categor(y/ies)

It'd probably be a good idea for clients to be able to specify a maximum number of results to return. (The server should probably also enforce its own maximum, and ignore any attempts by the client to exceed that.)
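
As a concrete but entirely hypothetical sketch, those filters plus a client-supplied limit might combine as query parameters:

GET https://api.wikimedia.org/tasks/v1/wikipedia/en/task?type=add_image,add_link&category=Primatologists&limit=10

with the server clamping limit to its own maximum (say, 100) no matter what the client asks for.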

From that same doc, I see these sentences:

Depending on the bot writer, unillustrated articles but also potentially adding images to underillustrated articles via galleries at the end of the article.

Ideally, we'd have some sort of confidence score that bot writers can use to fine tune image match quality per wiki. Cebuano, which is basically 100% bot written, may be okay with "Meh" matches but a wiki like Swedish could require higher quality.

We will need to be able to restrict image matches to certain sources, i.e. only to matches that come from the Wikidata image or the Commons category, but not to those that come from matching text strings.

These imply that we may need to send per-task-type filtering parameters. In other words, it isn't sufficient to say "give me image recommendations", we need a way for clients to say things like "give me image recommendations from Wikidata with a confidence score of at least 80%".
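
One hypothetical way to express that is to namespace filter parameters by task type (names are assumptions only):

GET https://api.wikimedia.org/tasks/v1/wikipedia/en/task?type=add_image&add_image.source=wikidata&add_image.min_confidence=0.8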

I have no data to support this, but it feels unreasonable for the feed to supply full details of every task in the feed. So I'm expecting we'll want a details endpoint, likely on a per-page basis.
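
If so, a per-page details endpoint might look something like this (shape entirely speculative):

GET https://api.wikimedia.org/tasks/v1/wikipedia/en/page/Jane_Goodall/task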

We'll also want an endpoint allowing clients to report back how the user actioned a task (accepted, rejected, skipped, etc.).

  1. the Task API will be a REST API exposed via api.wikimedia.org (it may or may not also be exposed in other ways, depending on implementation details)
  2. tasks will be project-specific (we'll display tasks for English Wikipedia or French Wikipedia, but not tasks spanning both)
  3. the Task API will be agnostic as to task types (we won't have separate "links" and "image" endpoints)

+1

I expect that for requests that specify a particular page or pages, we'll use page titles instead of page IDs. Maybe there's somewhere that we already query by page ID, but I'm not familiar with it; I usually see us using titles.

Have replaced page_id with page_title in our request

the task feed can be filtered by task type(s)
the task feed can be filtered by page title(s)
the task feed can be filtered by categor(y/ies)

Agreed, these sound like v(n) features

It'd probably be a good idea for clients to be able to specify a maximum number of results to return. (The server should probably also enforce its own maximum, and ignore any attempts by the client to exceed that.)

I've put down a limit param to include. To be clear, this sets the maximum number of tasks returned per request.
I imagine there is a v(n) feature to also provide a limit on the number of images returned per "add-image" type task

These imply that we may need to send per-task-type filtering parameters. In other words, it isn't sufficient to say "give me image recommendations", we need a way for clients to say things like "give me image recommendations from Wikidata with a confidence score of at least 80%".

Given we've decided not to have unique endpoints per task type (e.g. /tasks/add_image or /tasks/add_link), I am wondering how we might provide this cascade of filtering in a way that is intuitive to the user.

I have no data to support this, but it feels unreasonable for the feed to supply full details of every task in the feed. So I'm expecting we'll want a details endpoint, likely on a per-page basis.

When we hear the requirement of "give me a list of all unillustrated articles", I imagine this could be done in a few ways:

  1. The Search API provides articles, filtered by has_image=false
  2. Assuming that all "add-image" tasks are only for unillustrated articles, one could query all tasks and filter by type "add-image". In this case I'd imagine a single endpoint that gives a full-fidelity task feed
  3. The task feed is queried for "add-image" type tasks and returns a list of pages:
[
  {
    "page": "Cat",
    "tasks": [
      {
        "id": <UUID>,
        "type": "add-image"
      },
      {
        "id": <UUID>,
        "type": "add-image"
      }
    ]
  }
]

and then, when a user finds the page they want to claim, the client will query with the given task_id to get the full-fidelity details for the task.
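
That follow-up query could just be the per-task GET already sketched in the description, returning the full task object including the candidate images, e.g.:

GET https://api.wikimedia.org/tasks/v1/wikipedia/en/task/{task_id}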

We'll also want an endpoint allowing clients to report back how the user actioned a task (accepted, rejected, skipped, etc.).

I've added a guess at what this endpoint might look like under "Send users' decision and/or reason on a given task"

Given we've decided not to have unique endpoints per task type (e.g. /tasks/add_image or /tasks/add_link), I am wondering how we might provide this cascade of filtering in a way that is intuitive to the user.

This could be actions as a query param? GET /tasks?actions=add_image,add_link

For the Send users' decision and/or reason on a given task section, what would happen when a user accepts a task and then fails to complete it? Say they get to the page, can't find an image within a few minutes, give up, and navigate away. Would that task be marked as "accepted", and would we then need to determine a set of rules for sending it back to an incomplete status?

Additional question (just from my lack of background on the project!): what creates a task, and how? I understand they are going to be stored in ElasticSearch, but do they come from a Search API query? Or somewhere else?

Also, I think it might be helpful to have a user story, similar to the one in the current Context section of this task, for the "give me all the tasks" scenario, so we can draw a clear line between the two distinct situations.

From the "stuff we have to decide" perspective, there's a little confusion in our current discussion regarding REST URLs. The task description currently contains a url like this, for POSTing a user's decision regarding a task:

https://api.wikimedia.org/tasks/v1/wikipedia/en/{task_id}

In that URL, "tasks" is the component (meaning this endpoint is part of the "Tasks" API). Maybe that's the name we keep, maybe it isn't. But in that URL, "tasks" is not the resource. In fact, that URL contains no resource at all. Relevant link: T232485: RFC: Core REST API namespace and version

For REST, we'll need to decide what we want our resources to be (that's one of the early defining steps in creating a REST API). If we decided to have a "task" resource, the url would be something like:

https://api.wikimedia.org/tasks/v1/wikipedia/en/task/{task_id}

However, that assumes we have task ids that are unique across all task types, and I'm not certain we'll have that. Or that "task" is even a resource that we want. It makes intuitive sense, but I don't yet know if our data will support it.

Finally, as a bit of a rabbit trail, there's always a little tension about whether resource names should be singular (task) or plural (tasks). Different people argue different merits. I have no interest in a religious war, and no strong opinions. However, I note that we seem to have trended toward singular in Core REST so far, using resources like "page" (instead of "pages") and "revision" (instead of "revisions"). You can see examples at:

https://gerrit.wikimedia.org/g/mediawiki/core/+/afc51ddd4261fabe1b623529de487baef8e6b1bd/includes/Rest/coreRoutes.json

Given we've decided not to have unique endpoints per task type (e.g. /tasks/add_image or /tasks/add_link), I am wondering how we might provide this cascade of filtering in a way that is intuitive to the user.

This could be actions as a query param? GET /tasks?actions=add_image,add_link

Yep, query params are great for filtering (and pagination, if we end up needing to do that anywhere). Offhand, I think I'd personally prefer to call it "type" instead of "actions", but hopefully naming becomes clearer as we learn more.

For the Send users' decision and/or reason on a given task section, what would happen when a user accepts a task and then fails to complete it? Say they get to the page, can't find an image within a few minutes, give up, and navigate away. Would that task be marked as "accepted", and would we then need to determine a set of rules for sending it back to an incomplete status?

I'm curious about this as well (and have raised the question elsewhere). It implies a level of tracking that I'm not sure we'll have, at least initially.

Additional question (just from my lack of background on the project!): what creates a task, and how? I understand they are going to be stored in ElasticSearch, but do they come from a Search API query? Or somewhere else?

I think it is possible that different tasks are created/stored in different ways. We're starting with Image Recommendations (which is great!) but we're trying to think in terms of a generic Task API. I'm envisioning the Task API as a middle layer that unites task-related data from potentially disparate data sources into one format that's convenient and consistent for clients to access. In other words, it may turn out that Image Recommendation tasks are created/stored in one way, but some other sort of task that we haven't even thought about yet is created/stored in a radically different way. The job of the Task API in this context is to know how to talk to whatever different backends are involved and make those differences transparent to clients. If it turns out that we can't achieve that, then there doesn't seem to be much point in a Task API, and we should instead just make a bunch of specific APIs for different task types. But I'm optimistic that we can.

As for where Image Recommendations gets its data, I thought I had a fuzzy understanding, but now I'm a little lost. The "Image Matching Algorithm" (formerly known as "Miriam's Algorithm") seems like the primary and most reliable source. But I've seen discussion of fallback to MediaSearch. And I'm not clear how much of this is pregenerated and how much can occur in real time. Or how any of this data is stored. I've read back through the MediaSearch Integration Meeting Notes (https://docs.google.com/document/d/1u-Cv_MebulU_6-3fZyTIdL7in7WFx1oEO6cBgqOy_wQ/edit) and while I've expanded my technical vocabulary, I'm still not sure exactly what we're doing in the backend. Notable (and slightly paraphrased by me) quotes from that document:

    • We will have one API that combines results from Miriam’s algorithm with results from Media Search.
    • Different clients may need to specify different precision requirements. Bot clients will need a higher precision than user-facing clients, because they don't have a human sanity check on their edits.
    • While we've talked about storing recommendations outside of search, it may be possible to store directly in search. Growth's use case requires recommendations to be fast but is agnostic on storage location.
    • We may want a reverse property in Wikidata
    • We may or may not want to use either the Commons index or a new specific index.
    • Whenever you add a new field to Commons you have to reindex and that takes a couple weeks and is a pain.

Disclaimer: I could be wrong about some or all of that. I'd be happy to be corrected.

Cool, thanks @BPirkle! I gathered a lot of the same conclusions from watching the meeting recording. A couple more things I was wondering about:

The job of the Task API in this context is to know how to talk to whatever different backends are involved and make those differences transparent to clients.

In the case of the Image Recommendation specifically, we can say for certain this will be an ElasticSearch index, right, whether it is the existing Commons ElasticSearch instance or a separate, slimmer index?

Different clients may need to specify different precision requirements. Bot clients will need a higher precision than user-facing clients, because they don't have a human sanity check on their edits.

From the meeting recording I was under the impression that it's not certain whether there will be a precision measure for MediaSearch queries, only for the Image Matching Algorithm. Another question I have: if we are querying ElasticSearch, is that precision score based on the ElasticSearch "score" or something else?

We may want a reverse property in Wikidata

Do we know what happens if a given item doesn't exist in Wikidata?

The job of the Task API in this context is to know how to talk to whatever different backends are involved and make those differences transparent to clients.

In the case of the Image Recommendation specifically, we can say for certain this will be an ElasticSearch index, right, whether it is the existing Commons ElasticSearch instance or a separate, slimmer index?

I'm pretty confident that some ElasticSearch index will be involved in some way.

Different clients may need to specify different precision requirements. Bot clients will need a higher precision than user-facing clients, because they don't have a human sanity check on their edits.

From the meeting recording I was under the impression that it's not certain whether there will be a precision measure for MediaSearch queries, only for the Image Matching Algorithm. Another question I have: if we are querying ElasticSearch, is that precision score based on the ElasticSearch "score" or something else?

Yeah, IIRC there was some discussion of "precision" vs "source". I'm not sure if there will be a good way to combine those into one filter, or if we'll need multiple filters. It may be that bot clients will need a way to completely exclude the MediaSearch recommendations, and we'll need some sort of "source" filter for that. But then again, if we want to strictly use a numerical confidence score, maybe MediaSearch can just never assign itself a confidence higher than some threshold, giving bot clients a way to conveniently exclude its recommendations. The downside is that this feels hackish and requires clients to have more knowledge of the implementation than I'd like.
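
To make the two alternatives concrete (parameter names and the cap value are assumptions, not decisions):

// Alternative 1: an explicit source filter that excludes MediaSearch
GET https://api.wikimedia.org/tasks/v1/wikipedia/en/task?type=add_image&source=wikidata,commons_category

// Alternative 2: a confidence threshold only, relying on MediaSearch never
// assigning itself a confidence above some cap (e.g. 0.8)
GET https://api.wikimedia.org/tasks/v1/wikipedia/en/task?type=add_image&min_confidence=0.9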

We may want a reverse property in Wikidata

Do we know what happens if a given item doesn't exist in Wikidata?

Heck, I'm not certain what happens if it DOES exist. ;-)

sdkim updated the task description.

I think trying to predict all the possible criteria for selecting tasks (category, ORES topic, page list, etc.) won't work well in practice; there are just too many potential use cases. It would be preferable to commit to a CirrusSearch-backed implementation and expose (part of) the search query as a parameter. Or, make the API optional as a means of identifying a list of articles, and as an alternative allow clients to use the normal search API (with a search keyword for selecting a certain type of recommendation, which will happen / is happening anyway) and "trade in" the result list for a list of recommendations (in which case the Task API's GET endpoint would take a list of page titles, or maybe page IDs).
For Growth features the latter would be more convenient: we already have a search-API-based framework, and it would be easy to plug a new kind of search query into it, whereas mixing in a different API would be much harder.
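
A sketch of that "trade in" flow, assuming a search keyword for recommendation-bearing pages (the keyword and the page_titles parameter are illustrative only):

// 1. Select candidate pages via the normal search API
GET /w/api.php?action=query&list=search&srsearch=hasrecommendation:image

// 2. Trade the result list in for recommendations from the Task API
GET https://api.wikimedia.org/tasks/v1/wikipedia/en/task?page_titles=Cat|Dog&type=add_image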


Making the user a parameter in the POST endpoint doesn't seem useful: when would that differ from the user who is submitting the request? Relatedly, how would web clients use api.wikimedia.org? POST requests need to be authenticated, and there won't be any session cookie on that domain. Exposing the API over the individual wiki domains shouldn't be optional, IMO.


I have no data to support this, but it feels unreasonable for the feed to supply full details of every task in the feed. So I'm expecting we'll want a details endpoint, likely on a per-page basis.

Relatedly, what's the relationship between a page and a task? Can a page have many tasks (of the same type) which need to be presented together to the user? For example, for link recommendation tasks the ML service returns a list of recommended links and their positions in the article, and the user will receive those recommendations as a list and needs to decide about each one before saving the page. If the "task" here is the list of all those links, then the POST endpoint would have to look different (the user can accept some of the links and reject others). If each link is a separate task, then 1) having to make an API query for each one will not be convenient; 2) when using the GET endpoint to fetch a list of pages to work on, I want to get a fixed number of pages, not a fixed number of tasks; 3) if the internal implementation is based on CirrusSearch, that returns a list of pages, and converting that to a list of tasks in the API in a performant way seems challenging.

So IMO it would be better to either have a GET endpoint which gives a list of pages which have tasks, plus another GET endpoint which takes a page or set of pages and returns a list of tasks for each; or declare that one page = one task, but then the POST endpoint needs a task-type-specific body format. (Which would mean it's not really possible for the API to apply some sort of uniform processing to all task types. Although that might well be the case anyway. It depends on what that endpoint is actually supposed to do, which I think hasn't really been discussed so far.)
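
A sketch of the first option's endpoint pair (paths and parameters hypothetical):

// Pages that have tasks of a given type
GET https://api.wikimedia.org/tasks/v1/wikipedia/en/page?type=add_link

// Tasks for a specific page or set of pages
GET https://api.wikimedia.org/tasks/v1/wikipedia/en/task?page_titles=Cat|Dog&type=add_link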

We'll also want an endpoint allowing clients to report back how the user actioned a task (accepted, rejected, skipped, etc.).

FWIW our planned MVP POST endpoint for GrowthExperiments link recommendations is here.

Other than the cardinality issue (we submit user decisions about many links at once), the differences are that 1) it takes a revision ID instead of a task ID, and 2) it takes another revision ID, which is the edit the user just made (in case the recommendation was accepted). The second might be a useful addition to the Task API (though of course it really depends on what exactly the API is supposed to do with the user's decision; one could also take the approach that sending the user's decision is itself the mechanism by which the edit happens, although I think that would invite a lot of trouble).

About the first: the current API signature commits the implementation to storing metadata about each recommendation, instead of just fetching it on the fly (as the POST endpoint probably needs to be able to tie the task ID back to a page). I wonder if that could lead to unneeded complexity. (What are the storage needs? When would tasks be purged?)
The other possible approach would be to identify recommendations by a (revision ID, task type) pair, which would work well for link and image recommendations (assuming the one-task-per-page approach above), although I can imagine future recommendation types where it wouldn't.
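
For illustration, a decision POST keyed by a (revision ID, task type) pair rather than a task ID might look like this (endpoint and field names hypothetical):

POST https://api.wikimedia.org/tasks/v1/wikipedia/en/decision

{
  "revision_id": 989231694,
  "task_type": "add_link",
  "decision": "accepted",
  "edit_revision_id": 989231802
}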


Boldly closing this old task since we've moved beyond this initial POC and into production with the image suggestions pipeline. Feel free to reopen if needed.