Add linkrecommendation documentation to api.wikimedia.org
Closed, ResolvedPublic

Description

The Add-Link service (linkrecommendation service) has two endpoints that would be nice to document on api.wikimedia.org, since the service is available through the gateway.

Some questions:

  • How can we add the documentation to the gateway?
  • After the initial documentation is added, what's the process to keep the documentation up to date as the endpoints may evolve over time?
  • Our application has auto-generated API documentation using Swagger. Should this be accessible via the api-gateway?

Event Timeline

Hi @kostajh, Thanks for opening this task!

Unfortunately, the API Portal doesn't integrate with Swagger at this time. So in order to add the linkrecommendation endpoints to the Portal, we'll need to manually create API docs in wikitext. I'm happy to help with this.

For updating the docs, this would again be a manual process. You could either make the updates yourself directly in the Portal (I've added you to the editors group), or you can open a task in Phabricator and I can do it. We're definitely looking into ways we can automate this in the future, but for now these workflows are all manual.

I'm seeing that the endpoint in the Gateway is: https://api.wikimedia.org/service/linkrecommendation/v0/linkrecommendations/{wiki_id}/{page_title}. (For example: https://api.wikimedia.org/service/linkrecommendation/v0/linkrecommendations/cswiki/Lipsko) Is this correct? (I see there's an open issue for the 504 error.) Is this for Wikipedia only? How is the wiki_id parameter formed for non-Wikipedias (example: cswiktionary)?

Thanks for the info @apaskulin! I'll have a go at adding the docs.

I'm seeing that the endpoint in the Gateway is: https://api.wikimedia.org/service/linkrecommendation/v0/linkrecommendations/{wiki_id}/{page_title}. (For example: https://api.wikimedia.org/service/linkrecommendation/v0/linkrecommendations/cswiki/Lipsko) Is this correct? (I see there's an open issue for the 504 error.) Is this for Wikipedia only? How is the wiki_id parameter formed for non-Wikipedias (example: cswiktionary)?

Yes, that's right. The service is currently for Wikipedias only although in the future it may work for other wikis (e.g. mediawiki.org, wikitech, wiktionaries, etc), and in that case you'd use the non-Wikipedia wiki ID in the URL, for example https://api.wikimedia.org/service/linkrecommendation/v0/linkrecommendations/cswiktionary/těrcha
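As a rough illustration of the gateway URL pattern confirmed above, the following Python sketch assembles a request URL from a wiki ID and a page title. The function name `gateway_url` is hypothetical, and percent-encoding the title is an assumption about how non-ASCII titles such as těrcha would be sent; the thread does not specify it.

```python
from urllib.parse import quote

# Base path as quoted in this thread; it may change if the URL scheme is revised.
BASE = "https://api.wikimedia.org/service/linkrecommendation/v0/linkrecommendations"


def gateway_url(wiki_id: str, page_title: str) -> str:
    """Build a GET URL of the form {BASE}/{wiki_id}/{page_title}."""
    # safe="" also encodes "/" so the title stays a single path segment (assumption).
    return f"{BASE}/{quote(wiki_id, safe='')}/{quote(page_title, safe='')}"


print(gateway_url("cswiki", "Lipsko"))
print(gateway_url("cswiktionary", "těrcha"))
```

The second call is only meaningful if the service is later extended beyond Wikipedias, as described above.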

Do you have a preference for a different URL scheme we should use?

(I see there's an open issue for the 504 error.)

Yes, that issue is T276217: Use Envoy for making GET requests to lang.wikipedia.org/api.php. It should be fixed soon.

I'll have a go at adding the docs.

Sounds good! I'm happy to step in and add the docs if needed, just let me know. This doc has some info that should be helpful.

Do you have a preference for a different URL scheme we should use?

The other APIs in the Portal use:

https://api.wikimedia.org/{namespace}/{version}/{project}/{language code}/...

# For example:
https://api.wikimedia.org/core/v1/wikipedia/en/page/Earth

My preference would be for:

https://api.wikimedia.org/linkrecommendation/v0/wikipedia/{language code}/linkrecommendations/{page_title}

# For example:
https://api.wikimedia.org/linkrecommendation/v0/wikipedia/cs/linkrecommendations/Lipsko
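To make the two patterns above concrete, here is a minimal Python sketch that joins the segments `{namespace}/{version}/{project}/{language code}/...` into a URL. The helper name `portal_url` is hypothetical, and encoding each segment individually is an assumption rather than documented Portal behavior.

```python
from urllib.parse import quote


def portal_url(namespace: str, version: str, project: str, language: str, *path: str) -> str:
    """Join Portal-style URL segments, percent-encoding each one."""
    segments = [namespace, version, project, language, *path]
    return "https://api.wikimedia.org/" + "/".join(quote(s, safe="") for s in segments)


# Existing Core API example from the thread.
print(portal_url("core", "v1", "wikipedia", "en", "page", "Earth"))
# Proposed link recommendation layout from the thread.
print(portal_url("linkrecommendation", "v0", "wikipedia", "cs", "linkrecommendations", "Lipsko"))
```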

I'll have a go at adding the docs.

Sounds good! I'm happy to step in and add the docs if needed, just let me know. This doc has some info that should be helpful.

Actually, yes that would be great if you have time. I could help with editing after a first pass. If you don't have time that's OK as well, I can get to it later this week or next week.

Thanks @apaskulin. One thing that's not clear to me is how this would work for testwiki (would the URL be /v0/wikipedia/test) or simplewiki (/v0/wikipedia/simple) as simple and test aren't really language codes. Also, what would a request for mediawiki.org or wikitech.wikimedia.org look like in this example? Would the project name be mediawiki and wikitech, and then the language code would always be en for wikitech (I don't think there are translations there) while for mediawiki you might ask for /v0/mediawiki/de/Growth to get https://www.mediawiki.org/wiki/Growth/de ?

Is that actually language code, or wiki domain? E.g. is no.wikipedia.org /wikipedia/no/ or /wikipedia/nb/?

Actually, yes that would be great if you have time. I could help with editing after a first pass. If you don't have time that's OK as well, I can get to it later this week or next week.

Sure thing! I'll start as soon as the 504 clears up.

One thing that's not clear to me is how this would work for testwiki (would the URL be /v0/wikipedia/test) or simplewiki (/v0/wikipedia/simple) as simple and test aren't really language codes.

That's correct: https://api.wikimedia.org/core/v1/wikipedia/simple/page/Earth and https://api.wikimedia.org/core/v1/wikipedia/test/page/Main_Page

Also, what would a request for mediawiki.org or wikitech.wikimedia.org look like in this example? Would the project name be mediawiki and wikitech, and then the language code would always be en for wikitech (I don't think there are translations there) while for mediawiki you might ask for /v0/mediawiki/de/Growth to get https://www.mediawiki.org/wiki/Growth/de ?

For mediawiki.org and wikitech, you need to omit the language code from the request (same for Commons and any other multilingual projects). For example: https://api.wikimedia.org/core/v1/mediawiki/page/Main_Page. For content with multiple languages on multilingual wikis, I don't believe there is any way to request that via the API, but you can get a specific language using the subpage, for example: https://api.wikimedia.org/core/v1/mediawiki/page/API%3AMain_page%2Fde
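The rules in this reply (omit the language code for multilingual projects, and address a translation through its percent-encoded subpage title) can be sketched in Python as follows. The function name `core_page_url` is hypothetical; the URL layout is taken from the examples in this thread.

```python
from typing import Optional
from urllib.parse import quote


def core_page_url(project: str, title: str, language: Optional[str] = None) -> str:
    """Build a Core API page URL; leave `language` unset for multilingual projects."""
    parts = ["core", "v1", project]
    if language is not None:
        parts.append(language)
    # safe="" encodes ":" and "/" so a subpage title stays one path segment.
    parts += ["page", quote(title, safe="")]
    return "https://api.wikimedia.org/" + "/".join(parts)


print(core_page_url("mediawiki", "Main_Page"))
print(core_page_url("mediawiki", "API:Main_page/de"))
print(core_page_url("wikipedia", "Earth", language="en"))
```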

Is that actually language code, or wiki domain? E.g. is no.wikipedia.org /wikipedia/no/ or /wikipedia/nb/?

That's correct: It's more of a subdomain or project code. So yes, it would be /wikipedia/no/, but actually I just tried /wikipedia/nb/ and it redirects to the MediaWiki Core REST API. I didn't know about that behavior.

Is that actually language code, or wiki domain? E.g. is no.wikipedia.org /wikipedia/no/ or /wikipedia/nb/?

It's probably better named as wiki domain as a parameter. If we had an Accept-language header, I'd expect to pass a language code that way.
@apaskulin I wonder if we could update our URL schema to reflect https://api.wikimedia.org/{namespace}/{version}/{project}/{domain code}/... so that we don't exclude non-language-specific wikis?

Also, what would a request for mediawiki.org or wikitech.wikimedia.org look like in this example? Would the project name be mediawiki and wikitech, and then the language code would always be en for wikitech (I don't think there are translations there) while for mediawiki you might ask for /v0/mediawiki/de/Growth to get https://www.mediawiki.org/wiki/Growth/de ?

I am not certain, but I believe mediawiki domains are not supported when requesting in this way.

Change 673484 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[research/mwaddlink@main] Output valid domains to query for on error

https://gerrit.wikimedia.org/r/673484

Change 673497 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[research/mwaddlink@main] [WIP] Modify URL structure for service

https://gerrit.wikimedia.org/r/673497

Change 673484 merged by jenkins-bot:
[research/mwaddlink@main] Output valid domains to query for on error

https://gerrit.wikimedia.org/r/673484

Change 673497 merged by jenkins-bot:
[research/mwaddlink@main] Modify URL structure for service

https://gerrit.wikimedia.org/r/673497

@apaskulin The 504/503 issue is resolved. We're still working out some issues with the Swagger UI implementation, though. In the meantime, https://api.wikimedia.org/service/linkrecommendation/apispec_1.json documents the GET and POST endpoints for the service, including the path parameters, query parameters, and response types. It also includes bare-bones example requests.

Realistically most people using this service would use the GET endpoint, so I think that would be the priority to document.

I made a start at the documentation in https://api.wikimedia.org/wiki/API_reference/Service/Link_Recommendation but haven't added any top level links. Could you please let me know which pieces of documentation you'd be able to add and what you'd like the Growth team to contribute?

Thanks!

kostajh added a project: Add-Link.
kostajh moved this task from Backlog to In progress on the Add-Link board.

Change 675196 had a related patch set uploaded (by Alex Paskulin; author: Alex Paskulin):
[research/mwaddlink@main] docs: Fix link to wikitext docs

https://gerrit.wikimedia.org/r/675196

Thanks for creating the wiki pages, @kostajh! I've reviewed and made a few edits. The docs are now at https://api.wikimedia.org/wiki/API_reference/Service/Link_recommendation (I've opened a patch to fix the link in the swagger spec.)

Based on the response from the API, I've listed the project and language support as:

  • wikipedia/ar
  • wikipedia/bn
  • wikipedia/cs
  • wikipedia/en
  • wikipedia/fr
  • wikipedia/simple
  • wikipedia/vi

To add the page for the POST endpoint, I'll need some extra information:

  • What is the behavior of the POST endpoint?
  • Does it require an OAuth 2.0 token? Does it support any other authentication methods?
  • Does it return a 200 on success? Does it return a response body?
  • Anything else we should note about this endpoint?

Change 675196 merged by jenkins-bot:
[research/mwaddlink@main] docs: Fix link to wikitext docs

https://gerrit.wikimedia.org/r/675196

Thanks for creating the wiki pages, @kostajh! I've reviewed and made a few edits. The docs are now at https://api.wikimedia.org/wiki/API_reference/Service/Link_recommendation (I've opened a patch to fix the link in the swagger spec.)

Merged, thanks! Will deploy it later this week most likely. In the meantime, I've added a redirect on-wiki.

Based on the response from the API, I've listed the project and language support as:

  • wikipedia/ar
  • wikipedia/bn
  • wikipedia/cs
  • wikipedia/en
  • wikipedia/fr
  • wikipedia/simple
  • wikipedia/vi

Yes, that looks correct

To add the page for the POST endpoint, I'll need some extra information:

  • What is the behavior of the POST endpoint?

It's identical to the GET endpoint but it's stateless and can operate on a specific revision of the text. The GET endpoint results in an API call to the MediaWiki API to get the wikitext, revision ID / page ID data that is used by the service to return results. With the POST endpoint, you need to supply all of that info (wikitext source, revision ID, and page ID) yourself.

In practice, when using a script that has ready access to the wikitext for a page you want to process, you'll probably use the POST endpoint; otherwise you can use the GET endpoint.

The only currently known user of the POST endpoint is the refreshLinkRecommendations.php script, which runs in local developer environments, beta cluster wikis, and production.
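As a sketch of what "supplying all of that info yourself" might look like, the Python snippet below serializes the three inputs named above into a JSON body. The field names `wikitext`, `revid`, and `pageid` are assumptions for illustration only; the service's published spec (apispec_1.json) is the place to check the real names.

```python
import json


def build_post_payload(wikitext: str, revision_id: int, page_id: int) -> str:
    """Serialize the data the POST endpoint needs (field names are hypothetical)."""
    payload = {
        "wikitext": wikitext,   # wikitext source of the revision to analyze
        "revid": revision_id,   # revision ID the wikitext belongs to
        "pageid": page_id,      # page ID of the article
    }
    return json.dumps(payload, ensure_ascii=False)


body = build_post_payload("Lipsko je město v Německu.", 12345, 678)
print(body)
```

Because the caller provides the wikitext, the service would not need to call the MediaWiki API, which matches the stateless behavior described above.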

  • Does it require an OAuth 2.0 token? Does it support any other authentication methods?

It does not.

  • Does it return a 200 on success? Does it return a response body?

It returns the same output as the GET endpoint.

  • Anything else we should note about this endpoint?

With the above clarifications hopefully it's more clear, but please let me know if you have questions and I'll try to answer them. Thanks!

Thanks for creating the wiki pages, @kostajh! I've reviewed and made a few edits. The docs are now at https://api.wikimedia.org/wiki/API_reference/Service/Link_recommendation (I've opened a patch to fix the link in the swagger spec.)

Merged, thanks! Will deploy it later this week most likely. In the meantime, I've added a redirect on-wiki.

Oops, the redirect adds an extra menu item. I now realize why you deleted the page instead of adding a redirect.

@apaskulin do you mind deleting that page (again) as it looks like I don't have the rights to do that? Sorry about that!

Thanks for your responses, @kostajh! I've added the POST endpoint to the docs. If there's anything incorrect, feel free to edit. If everything looks good to you, I think this task can be resolved. (I deleted the redirect page; no worries! I should figure out a way to make the sidebar less picky...)

In my testing, I found that the POST endpoint does require an OAuth token. I believe this requirement is being applied at the API Gateway level. There's a similar issue being explored in T275571: Make short description API usable in WMF apps.

Thanks for your responses, @kostajh! I've added the POST endpoint to the docs. If there's anything incorrect, feel free to edit. If everything looks good to you, I think this task can be resolved. (I deleted the redirect page; no worries! I should figure out a way to make the sidebar less picky...)

Looks good. Thanks!

In my testing, I found that the POST endpoint does require an OAuth token. I believe this requirement is being applied at the API Gateway level. There's a similar issue being explored in T275571: Make short description API usable in WMF apps.

You're right, the POST endpoint does require an OAuth token, thanks for flagging that.
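Given that the POST endpoint requires an OAuth token at the gateway, a request could be prepared roughly as in the Python sketch below. It assumes the token is sent as an `Authorization: Bearer` header, which is the usual OAuth 2.0 convention but is not spelled out in this thread; the function name, the example URL, and the payload are placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_authorized_post(url: str, payload: dict, access_token: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request carrying a bearer token."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the gateway accepts a standard OAuth 2.0 bearer token.
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


req = build_authorized_post("https://example.org/endpoint", {"example": True}, "YOUR_TOKEN")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out so the sketch has no network dependency.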