Update Site module description in OpenAPI spec
Closed, Resolved | Public | 1 Estimated Story Points

Description

Access to the Sitemap endpoints was recently restricted (see: T407122: [5.2.5 Milestone] Introduce API Gateway access controls on sitemap endpoints). Although the Sitemap documentation is already hidden from the REST Sandbox since not all API users are eligible to access it, users may still discover the spec through the specs module and discovery endpoint. Because of this, we want to have a slightly more descriptive header for the API, which directs users towards resources with information about what types of users may access the endpoints, and how to get in contact to join our trusted bot program.

Conditions of acceptance

Update the description from:

Information about the site as a whole, such as sitemaps

To:

Provides information about Wikimedia project sites, including sitemaps. To prevent abusive scraping and ensure fair use of infrastructure, access to endpoints in this API is restricted to specific user groups. For more information about who can access this API, see https://wikitech.wikimedia.org/wiki/CDN/Backend_api/Sitemap_access or contact bot-traffic@wikimedia.org.

Implementation details

The Site module is not visible in the REST Sandbox. To see the generated spec, the easiest way is to access it through the Discovery endpoint within the Specs module (example in test wiki below, but the generated spec should be the same for all wikis):

NOTE: A similar message to this description is presented to users if/when a 403 error is returned:
image.png (1,308×1,016 px, 112 KB)

Event Timeline

HCoplin-WMF set the point value for this task to 1.

Change #1271691 had a related patch set uploaded (by KineticPelagic; author: KineticPelagic):

[mediawiki/core@master] sitemap: Clarify module description in OpenAPI spec

https://gerrit.wikimedia.org/r/1271691

The site module is part of core, which means that it is present on all third-party wikis. The suggested text would therefore be misleading and out of place on those wikis.

The message in core should remain generic. If we really need to override it for Wikimedia, we can do so at other levels. I've never done it, but I think that the WikimediaMessages extension can do this globally.

Fair point, @BPirkle.

Alternatively, we could just update the text itself to something more generic. I've taken a crack at it below:

Provides information about MediaWiki sites, including sitemaps.

To prevent abusive scraping and ensure fair use of infrastructure for Wikimedia Foundation hosted projects, access to endpoints in this API is restricted to specific user groups. For more information about who can access this API for Wikimedia projects, see https://wikitech.wikimedia.org/wiki/CDN/Backend_api/Sitemap_access or contact bot-traffic@wikimedia.org.

I still see your point where it's a little wonky to include the secondary message for all MediaWiki instances, but I also don't necessarily want to add a ton of complexity to override the message. If it's easy to use WikimediaMessages or if it's already done, cool; if it's going to be more than like a day, let's talk about it. I could see a future where we may want to have Wikimedia specific descriptions for other Core endpoints, in which case it might be a helpful reusable pattern. I'm struggling to come up with concrete examples right now though, which suggests it might not really be worth it?

I remain concerned about including wikitech links and wikimedia emails in the generic message. I realize you qualified it, but I'm not sure people would pick up on that. You and I know what a "Wikimedia project" is, but someone just calling an API on a website might not - a lot of folks wouldn't even know who Wikimedia is.

I also feel like it's worth learning the override pattern (assuming, as you say, the effort level isn't unreasonable). As one example, I can imagine us wanting to send out more specific messages with core-generated 429s. The current message in core just says "A rate limit was exceeded. Please try again later." Authentication failures might be another example. A glance at WikimediaMessages suggests it won't be all that hard. Hua Szu and I talked a bit about it late on Thursday, and I'm happy to help next week if needed.
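The override idea discussed here can be pictured as a key lookup with a generic fallback. The sketch below is a conceptual Python model only, not the WikimediaMessages implementation: the message keys, the two dictionaries, and `resolve_message` are hypothetical names chosen for illustration, and the override text is shortened from the proposal in the task description.

```python
# Conceptual model of a site-specific message override with a generic fallback.
# All keys and names are hypothetical; this is not MediaWiki code.

# Generic texts that ship with core and apply to every wiki.
CORE_MESSAGES = {
    "site-module-description": "Information about the site as a whole, such as sitemaps",
    "rate-limit-exceeded": "A rate limit was exceeded. Please try again later.",
}

# Texts that would only exist when a Wikimedia-specific layer is loaded.
WIKIMEDIA_OVERRIDES = {
    "site-module-description": (
        "Provides information about Wikimedia project sites, including sitemaps."
    ),
}


def resolve_message(key: str, overrides_enabled: bool) -> str:
    """Return the override for `key` when enabled and defined, else the core text."""
    if overrides_enabled and key in WIKIMEDIA_OVERRIDES:
        return WIKIMEDIA_OVERRIDES[key]
    return CORE_MESSAGES[key]
```

Under this model a third-party wiki (`overrides_enabled=False`) always sees the generic core text, and a message without an override, such as the rate-limit text, is unchanged everywhere.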

Cool beans. And agreed -- if the implementation isn't crazy, that does seem like the more complete solution, and one that we could potentially take advantage of elsewhere.

Change #1275422 had a related patch set uploaded (by KineticPelagic; author: KineticPelagic):

[mediawiki/extensions/WikimediaMessages@master] sitemap: Override module description in OpenAPI spec for Wikimedia

https://gerrit.wikimedia.org/r/1275422

Thank you both for this discussion, which helps us to make a correct and clear solution for our users.

Please see the most recent Gerrit change.

Below is how I tested that the change achieves our desired results, including the expectation that WMF-specific behavior is isolated to the WikimediaMessages extension.

Test Plan Completed

Before these steps, I mounted the local extension code via docker-compose.override.yml.

1. Wikimedia wiki behavior

Setup:

  • Enabled WikimediaMessages extension in LocalSettings.php:

wfLoadExtension( 'WikimediaMessages' );

  • Rebuilt localisation cache:

php maintenance/rebuildLocalisationCache.php

Test:

  • Accessed REST endpoint:

/w/rest.php/specs/v0/module/site%2Fv1

Expected result:

  • "description" field contained Wikimedia-specific message, with access restriction and contact information text
  • Screenshot:
    T418195_sitemap_module_openapi_spec_wikimediamessages_wmf_wikis_overridden_description.png (3,836×2,106 px, 541 KB)

Result:
✔ Validated that override applied via WikimediaMessages


2. Third-party wiki behavior and regression check

Setup:

  • Disabled WikimediaMessages extension in LocalSettings.php:

# wfLoadExtension( 'WikimediaMessages' );

  • Rebuilt localisation cache again:

php maintenance/rebuildLocalisationCache.php

Test:

  • Accessed same REST endpoint:

/w/rest.php/specs/v0/module/site%2Fv1

Expected result:

  • "description" field value fell back to intentionally generic core MediaWiki message
  • Screenshot:
    T418195_sitemap_module_openapi_spec_wikimediamessages_third_party_preserved_description.png (3,840×2,098 px, 472 KB)

Result:
✔ Validated that we preserved core fallback behavior
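The two manual checks above could also be scripted. The following is a minimal sketch rather than part of the actual test plan: it assumes the module description sits in the standard OpenAPI `info.description` field, takes the base URL of the local wiki as a parameter, and uses the contact address as a hypothetical marker for the Wikimedia-specific text.

```python
import json
import urllib.request

# Path taken from the test plan; the base URL of the wiki is supplied by the caller.
SPEC_PATH = "/w/rest.php/specs/v0/module/site%2Fv1"

# Hypothetical marker: text expected only in the Wikimedia-specific description.
WIKIMEDIA_MARKER = "bot-traffic@wikimedia.org"


def fetch_spec(base_url: str) -> dict:
    """Download and parse the module spec from a wiki (performs a network call)."""
    with urllib.request.urlopen(base_url.rstrip("/") + SPEC_PATH) as response:
        return json.load(response)


def module_description(spec: dict) -> str:
    """Read the description, assuming the OpenAPI `info.description` location."""
    return spec.get("info", {}).get("description", "")


def has_wikimedia_override(spec: dict) -> bool:
    """True when the description contains the Wikimedia-specific marker text."""
    return WIKIMEDIA_MARKER in module_description(spec)
```

With WikimediaMessages enabled, `has_wikimedia_override(fetch_spec(base_url))` would be expected to return `True`. With the extension disabled and the localisation cache rebuilt, it would be expected to return `False`.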


Note: I updated the WikimediaMessages documentation along the way to accurately reflect what I had to do.

Excellent, and thanks for the documentation updates!

@HCoplin-WMF, do we want the link in the description to be clickable? Right now in the patch it isn't, but we've established precedent for markdown in these messages.

If you do want clickable, do you want the full link as visible text, like:
https://wikitech.wikimedia.org/wiki/CDN/Backend_api/Sitemap_access

Or just the page name, like:
CDN/Backend_api/Sitemap_access

Let's make it clickable and follow the same naming convention as the page. Please make the email address a mailto: link as well. Thanks for the due diligence here!
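As an illustration of the requested formatting, here is a small Python sketch that builds the two markdown links. It reads "same naming convention as the page" as using the page name for the visible text, which is an interpretation. `markdown_link` is a hypothetical helper, and the exact wording of the merged message is not reproduced here.

```python
def markdown_link(text: str, target: str) -> str:
    """Format a markdown inline link with the given visible text and target."""
    return f"[{text}]({target})"


PAGE_URL = "https://wikitech.wikimedia.org/wiki/CDN/Backend_api/Sitemap_access"
CONTACT = "bot-traffic@wikimedia.org"

# Visible text is the page name, i.e. everything after "/wiki/" in the URL.
page_name = PAGE_URL.split("/wiki/", 1)[1]

page_link = markdown_link(page_name, PAGE_URL)
mail_link = markdown_link(CONTACT, f"mailto:{CONTACT}")

print(page_link)
print(mail_link)
```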

Thank you, @HCoplin-WMF and @BPirkle! I updated the Gerrit change.

Change #1275422 merged by jenkins-bot:

[mediawiki/extensions/WikimediaMessages@master] sitemap: Override module description in OpenAPI spec for Wikimedia

https://gerrit.wikimedia.org/r/1275422

Change #1271691 abandoned by KineticPelagic:

[mediawiki/core@master] sitemap: Clarify module description in OpenAPI spec

Reason:

We decided to use a different approach, as shown in Gerrit change https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaMessages/+/1275422 - merged git hash 598e507f8fe8c740b1659a09641e600e9077619b.

https://gerrit.wikimedia.org/r/1271691