Recommendation API public end points
Closed, Resolved, Public

Description

For the Recommendation API service to be accessible to clients, we need to expose it publicly via RESTBase. I propose to expose it as https://{domain}/api/rest_v1/data/recommendation/translation/{from_lang}{/seed_article} to stay consistent with the current (and future) REST API hierarchy.

Event Timeline

Yup yup, @schana but that PR only defines the properties of the end point. We need to decide where in the hierarchy to put it. Once we do, I will take over the PR and make the necessary changes.

There's some related discussion in T147420, and it came down to this:

I think we're all happy with these APIs living under the /api/rest_v1/ endpoint and having a tag to group them together.

Beyond that, I'll defer to Services' judgement as to where it makes sense to put it to keep the REST hierarchy consistent.

@schana as discussed, before implementing this path, I'd like you to make an assessment as to

  1. whether different types of recommendations that we're working on will be able to use a consistent path/endpoint
  2. how we expect to support versioning to flag changes in the algorithm (i.e. no change to the endpoints but an improved algorithm)

Also: can you comment on the status of swagger documentation for the API? T144421
Even if this is not explicitly required, I'd like to see it done as part of the official announcement of the RESTbase version of the API.

@DarTar These points have already been considered. Different types of recommendations will slot into the path as https://{domain}/api/rest_v1/data/recommendation/{recommendation_type}/{params...}. Algorithm versioning, and whatever other metadata is relevant, can be supported by adding that data to the response. Swagger documentation will necessarily be present for integration into RESTBase; see the pull request or the service's spec for how that will look.
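
As a rough illustration of the versioning approach described above, a response could carry algorithm metadata next to the results. This is only a sketch: the `algorithm` object, its `name` and `version` fields, the sample titles, and the helper `build_response` are assumptions made for illustration, not the service's actual schema. Only the `items` list label is taken from this task.

```python
import json


def build_response(articles, algorithm_name, algorithm_version):
    """Wrap recommendations with metadata describing how they were produced.

    Hypothetical shape: only the "items" key comes from the discussion;
    the "algorithm" object and its fields are illustrative assumptions.
    """
    return {
        "algorithm": {"name": algorithm_name, "version": algorithm_version},
        "items": [{"title": title} for title in articles],
    }


# Placeholder titles and version string, purely for demonstration.
response = build_response(["Example_A", "Example_B"], "translation", "1.0")
print(json.dumps(response, indent=2))
```

With this kind of shape, a client could detect an improved algorithm by comparing the version field while the endpoint path stays the same.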

Change 369446 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[operations/puppet@production] RESTBase: Add the Recommendation API URI

https://gerrit.wikimedia.org/r/369446

PR #849 exposes publicly the service's end point.

Change 369568 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/restbase/deploy@master] Config: Add the Recommendation API service's URI

https://gerrit.wikimedia.org/r/369568

@DarTar the pull request has been updated if you wanted to weigh in on the wording: https://github.com/wikimedia/restbase/pull/849

Change 369684 had a related patch set uploaded (by Nschaaf; owner: Nschaaf):
[mediawiki/services/recommendation-api@master] Return list of 'items' and make domain dynamic

https://gerrit.wikimedia.org/r/369684

Change 369684 merged by Mobrovac:
[mediawiki/services/recommendation-api@master] Return list of 'items' and make domain dynamic

https://gerrit.wikimedia.org/r/369684

@DarTar @schana I reviewed the public end point and Pchelolo's comments. We should stay away from "translation" as a term for the kind of service we're offering. We're offering articles for "creation" and sometimes this involves translating an article, sometimes creating it from scratch.

@Pchelolo has a comment about using "item" instead of "article" for consistency. I'm not sure this can work, since creating an "item" in the sense of an article can be confused with creating an item on Wikidata.

@leila I think framing it as articles for creation would be confusing since it requires a pair of source and target languages. If a monolingual person is using it to find articles to create, this endpoint will be unable to help them. With respect to the "items" versus "articles," lists of articles are consistently returned under the "items" label throughout RESTBase. We should stay consistent with this established practice. The endpoint's documentation specifies that it is returning a list of articles.

Change 369446 merged by Filippo Giunchedi:
[operations/puppet@production] RESTBase: Add the Recommendation API URI

https://gerrit.wikimedia.org/r/369446

@leila I think framing it as articles for creation would be confusing since it requires a pair of source and target languages. If a monolingual person is using it to find articles to create, this endpoint will be unable to help them.

The point of this service is to help people find what articles to create. Translation is one way of doing that. This aligns nicely with future plans for expanding the scope of the API on our end, and also with our messaging about what can be done with the API when we communicate it to tool developers. We want people to focus on creation, not translation specifically, when they interact with and use the API.

With respect to the "items" versus "articles," lists of articles are consistently returned under the "items" label throughout RESTBase. We should stay consistent with this established practice. The endpoint's documentation specifies that it is returning a list of articles.

The collision with the Wikidata naming system will come back to us down the line, but that's something we can change across the RESTBase system when/if it happens. Skipping this.

Change 369568 merged by Mobrovac:
[mediawiki/services/restbase/deploy@master] Config: Add the Recommendation API service's URI

https://gerrit.wikimedia.org/r/369568

Mentioned in SAL (#wikimedia-operations) [2017-08-03T18:15:07Z] <mobrovac@tin> Started deploy [restbase/deploy@65af18d]: Expose the recommendation API publicly and activate hiwikiversity - T170877 T168765

Mentioned in SAL (#wikimedia-operations) [2017-08-03T18:23:40Z] <mobrovac@tin> Finished deploy [restbase/deploy@65af18d]: Expose the recommendation API publicly and activate hiwikiversity - T170877 T168765 (duration: 08m 33s)

While the public end point is now live, it is marked as unstable, which means we are (still) free to make changes to it. For posterity, the Research team is concerned with the translation path segment (as indicated by @leila in the comments above). Quoting @DarTar from the email:

[...] the team feels we should use the term "article-creation" instead of "translation" in the endpoint, if possible.

This is for a number of reasons

  • while the API does require a pair of languages, it was not designed for translations, and it's not going to be used exclusively for them. In fact, many community members rely on it to identify gaps and to create corresponding articles in the target language, regardless of the actual content of the source.
  • this may also facilitate the future productization of APIs we're currently designing and testing (such as article-expansion).
  • we don't exclude the possibility of recommender systems that apply to other types of resources (e.g. sections, wikidata items, commons files), so making sure the endpoint is unambiguous is also important. On this note, Nathaniel felt that having the endpoint structured as a verb+resource name would be more sensible (e.g. "create-article", "expand-article", "create-session"), but I defer to you on best practices across RESTBase.

I think it makes sense to move away from the term translation even though currently the API is about translations (in the sense that machine translation still needs human intervention). Thank you for thinking long-term and taking into account future expansions of the API, that helps to guide the discussion a lot.

In that context, the usual RB way is to use a combination of nouns and verbs, but as separate path elements. In this concrete case, I would propose to go with https://{domain}/api/rest_v1/data/recommendation/article/create/{from_lang}{/seed_article}. While article/create might suggest that an actual resource creation will take place, that is defeated by the end point's method (GET). Going with such an initial outline allows for easy generalisation in the form https://{domain}/api/rest_v1/data/recommendation/{resource_type}/{recommendation_action}/{<other_params>}, where resource_type can take the values article, section, session, etc., and recommendation_action can be create, expand, etc. I think it provides enough flexibility and room for future expansions. What do you think @DarTar @leila @schana ?
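
To make the generalised form concrete, here is a minimal sketch of how a client might assemble such URLs. The helper `recommendation_url` and the example domain and parameter values are hypothetical; only the path layout comes from the proposal above.

```python
from urllib.parse import quote

BASE = "https://{domain}/api/rest_v1/data/recommendation"


def recommendation_url(domain, resource_type, action, *params):
    """Build .../{resource_type}/{recommendation_action}/{params...}.

    Hypothetical client-side helper, not part of RESTBase.
    """
    segments = [resource_type, action, *params]
    # Percent-encode each segment so that a title containing "/" or a space
    # stays within a single path element.
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    return BASE.format(domain=domain) + "/" + path


# Example values (domain, language code, title) are placeholders.
print(recommendation_url("en.wikipedia.org", "article", "create", "fr"))
print(recommendation_url("en.wikipedia.org", "article", "create", "fr", "Paris"))
```

The same helper would cover other resource types and actions (for example `section` and `expand`) without any change, which is the flexibility the proposed layout aims for.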

The collision with the Wikidata naming system will come back to us down the line, but that's something we can change across the RESTBase system when/if it happens. Skipping this.

I would like to hear your arguments for this opinion, as I am not familiar enough with your future plans for Wikidata items to either contradict or confirm your doubts. What I can say at this point is that having a uniform way of exposing information and content publicly not only makes things easier for consumers (as they don't have to wonder what the name of a list is depending on the end point), but also helps minimise errors on our side (missed corrections, wrong definitions, etc.).

While article/create might suggest that an actual resource creation will take place, that is defeated by the end point's method (GET).

Using nouns instead of verbs for resources should avoid the suggestion of a "create" action taking place. Example: https://{domain}/api/rest_v1/data/recommendation/article/creation/{from_lang}{/seed_article}

This is also more in line with the REST philosophy of using nouns for resources and limiting the verb vocabulary to a few standard verbs (GET, PUT, POST, DELETE, etc.).

@mobrovac thanks for looking into this with us. Your recommendation looks good to me. And I'm fine with replacing "create" with "creation" per GWicke's last comment. On our end, and as far as I can see, that doesn't create a limitation, and if it makes things more consistent on your end, let's go with it. I'd wait for @schana and @DarTar to chime in as well, just to make sure everyone is on board. Thanks.

@leila @mobrovac @GWicke the point about verbs is well taken and I really like the proposed schema:

https://{domain}/api/rest_v1/data/recommendation/{resource_type}/{recommendation_action}/{<other_params>}

+1 on my end. Thanks, folks.

https://{domain}/api/rest_v1/data/recommendation/article/creation/{from_lang}{/seed_article}

I think this looks good, but I think we may have to rethink it or add another qualifier when there are different types of article creation recommendations.

Could you expand on this, please? What do you propose?

@leila @schana please note the swagger description of the API still says "Recommends articles to be translated from the source to the domain language.", "the list of articles recommended for translation". This will need to be updated along with the endpoint path.

+1. FWIW, I'm still waiting on @schana to respond before putting up a PR and changing the route officially.

Since there are different ways that article creation recommendations can be used (translation, finding content coverage gaps, etc.), there may be different algorithms used to surface these recommendations in the future. Hiding the specific implementation behind an encompassing endpoint will make it less clear how the results are being generated.

  • Example: how do we distinguish these endpoints and inform the user of the implementation details?
    • specific to translation:
      • /article/creation/{from_lang}{/seed_article}
    • specific to content gap discovery:
      • /article/creation{/seed_article}
    • specific to some other type:
      • /article/creation/{foo}{/bar}

This isn't an issue now since there's only the one type, but there may be more in the future.

OK, I see. In light of this, it seems to me it would be best to introduce the recommendation type after all, since there is no way for the RB router to distinguish /{from_lang}{/seed_article} from {/seed_article} (the second example). And even if there were, the routes would be ambiguous and unclear.
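
The ambiguity can be shown with a small sketch. This is a simplified model of template matching, not the actual RESTBase router: the helper `template_to_regex` is hypothetical and supports only the two template forms used in this thread, applied to the part of the path after /article/creation.

```python
import re


def template_to_regex(template):
    """Convert a simplified URI template into a compiled regex.

    Supports only "/{name}" (required segment) and "{/name}" (optional
    segment). This is an illustrative model, not RESTBase's router.
    """
    pattern = ""
    for optional, required in re.findall(r"\{/(\w+)\}|/\{(\w+)\}", template):
        if optional:
            pattern += rf"(?:/(?P<{optional}>[^/]+))?"
        else:
            pattern += rf"/(?P<{required}>[^/]+)"
    return re.compile("^" + pattern + "$")


# The two suffixes from the examples above.
translation = template_to_regex("/{from_lang}{/seed_article}")
content_gap = template_to_regex("{/seed_article}")

# A single trailing segment matches both templates, so a router cannot
# tell whether "Paris" is a language code or a seed article.
path = "/Paris"
print(translation.match(path).groupdict())
print(content_gap.match(path).groupdict())
```

Both patterns accept the same one-segment path but bind it to different parameters, which is why an explicit recommendation type segment is needed to tell the routes apart.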

So, what do folks say about https://{domain}/api/rest_v1/data/recommendation/article/creation/{recommendation_type}{<rec_arguments>} ? So the first end point to go out would be https://{domain}/api/rest_v1/data/recommendation/article/creation/translation/{from_lang}{/seed_article}.

This isn't an issue now since there's only the one type, but there may be more in the future.

If we know there will (or might) be new types, ignoring that fact now will cause us a lot of pain in the future.

@mobrovac I think the endpoint you suggested (https://{domain}/api/rest_v1/data/recommendation/article/creation/translation/{from_lang}{/seed_article}) looks good for the translation recommender. It also looks good for cases where we may want to recommend similar articles (https://{domain}/api/rest_v1/data/recommendation/article/creation/related/{seed_article}).

We also have cases where we recommend only sections (rather than articles) for creation. In that case, we'd be swapping the /article/creation/ part with /section/creation/, right? If this change makes sense, then I think we should go with your suggestion and modify the end point to suit specific types of recommendations.

Thank you @bmansurov for the confirmation. PR #939 addresses this. You should expect it to be live early next week.

We also have cases where we recommend only sections (rather than articles) for creation. In that case, we'd be swapping the /article/creation/ part with /section/creation/, right?

Correct. Having the exact type spelled out in the API route allows us to easily create new ones for all possible types.
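
For illustration, once the type is part of the path, each route can be identified by its first three segments. The sketch below assumes a hypothetical route table and helper (`ROUTES`, `resolve`); the `translation` and `related` entries mirror the examples in this thread, and nothing here reflects how RESTBase is actually configured.

```python
# Hypothetical route table:
# (resource_type, action, recommendation_type) -> parameter names, in order.
ROUTES = {
    ("article", "creation", "translation"): ["from_lang", "seed_article"],
    ("article", "creation", "related"): ["seed_article"],
}


def resolve(path):
    """Resolve a path below .../data/recommendation to (route key, params).

    Returns None when the route is unknown or the parameter count does not
    fit. Trailing parameters are treated as optional in this sketch.
    """
    segments = [segment for segment in path.split("/") if segment]
    key, values = tuple(segments[:3]), segments[3:]
    names = ROUTES.get(key)
    if names is None or not values or len(values) > len(names):
        return None
    return key, dict(zip(names, values))


print(resolve("/article/creation/translation/fr/Paris"))
print(resolve("/article/creation/related/Paris"))
print(resolve("/section/creation/translation/fr"))  # not registered in this sketch
```

Adding a section-level recommender would then amount to registering another key such as `("section", "creation", ...)`, without touching the existing article routes.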

Mentioned in SAL (#wikimedia-operations) [2018-01-10T12:55:37Z] <mobrovac@tin> Started deploy [restbase/deploy@a2aabfb]: API: add top-by-country, change recommendation route, fix duplicates in onthisday - T181520 T170877 T175974

Mentioned in SAL (#wikimedia-operations) [2018-01-10T13:03:37Z] <mobrovac@tin> Finished deploy [restbase/deploy@a2aabfb]: API: add top-by-country, change recommendation route, fix duplicates in onthisday - T181520 T170877 T175974 (duration: 08m 00s)

mobrovac edited projects, added Services (done); removed Services (doing).

Deployed, resolving.