Associated hypothesis
This work supports a FY24-25 Q2 hypothesis associated with KR5.1 (NOTE: Specific hypothesis number to be updated later, following Q2 planning):
Description
We have started generating OpenAPI specifications for the MediaWiki REST API. These specs will streamline documentation management by reducing the dependency on humans to manually update the documentation when the code changes. Using industry-standard OpenAPI definitions also unlocks new developer experiences powered by those definitions (e.g. the SwaggerUI interactive documentation portal) and enables us to validate our public documentation against actual code behavior, so that we: a) know that the documentation accurately reflects the actual behavior of the endpoint, and b) reduce the risk of accidental breaking changes by applying API spec validation as part of the CI process.
This body of work will move us towards completing the spec definition and surfacing the SwaggerUI interactive documentation across Wikimedia projects. Currently, the REST API Sandbox page is only available on the test wiki site. Although we are starting specifically with the MediaWiki REST APIs, we would also like to establish patterns for other teams to follow when creating and managing APIs.
Problem statement
Before we can fully realize the value of OpenAPI definitions, we need a mechanism for including and validating responses in the documentation. Responses are not included in our generated specs because they are not currently structured in code in a reusable way. We will fill this gap by defining the expected responses as JSON objects, which will live alongside the generated specs and can be referenced by them. Defining the expected responses as JSON objects rather than as PHP arrays in code also makes the specs easier to understand and easier to reference in other areas of the documentation or in interactive experiences.
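As a rough illustration of what "reusable objects referenced by response definitions" could look like, the following TypeScript sketch models one shared object schema and one response schema that points at it through `$ref`. The schema names (`PageObject`, `pageResponse`), their fields, and the `resolveRefs` helper are hypothetical and chosen only for illustration; they are not the actual MediaWiki definitions. The helper does not guard against circular references.

```typescript
// Hypothetical JSON-schema-like type; only the keywords used below are modeled.
type Schema = {
  type?: string;
  required?: string[];
  properties?: { [name: string]: Schema };
  items?: Schema;
  $ref?: string;
};

// Reusable object definitions, stored once and shared by many responses.
// Names and fields are assumptions for illustration only.
const components: { [name: string]: Schema } = {
  PageObject: {
    type: "object",
    required: ["id", "title"],
    properties: {
      id: { type: "integer" },
      title: { type: "string" },
    },
  },
};

// A response body schema that references the shared object instead of repeating it.
const pageResponse: Schema = {
  type: "object",
  required: ["page"],
  properties: {
    page: { $ref: "#/components/schemas/PageObject" },
  },
};

// Replace every local "#/components/schemas/<Name>" reference with its definition,
// returning a new schema and leaving the input untouched.
function resolveRefs(schema: Schema, defs: { [name: string]: Schema }): Schema {
  if (schema.$ref !== undefined) {
    const name = schema.$ref.replace("#/components/schemas/", "");
    const target = defs[name];
    if (target === undefined) {
      throw new Error(`Unknown schema reference: ${schema.$ref}`);
    }
    return resolveRefs(target, defs);
  }
  const resolved: Schema = { ...schema };
  if (schema.properties !== undefined) {
    const resolvedProperties: { [name: string]: Schema } = {};
    for (const [key, child] of Object.entries(schema.properties)) {
      resolvedProperties[key] = resolveRefs(child, defs);
    }
    resolved.properties = resolvedProperties;
  }
  if (schema.items !== undefined) {
    resolved.items = resolveRefs(schema.items, defs);
  }
  return resolved;
}

// The same response schema with the shared object expanded in place.
const inlined = resolveRefs(pageResponse, components);
```

Keeping the shared object in one place means a change to it propagates to every response that references it, while a resolved copy can still be produced for tools that expect fully expanded definitions.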
In addition, we do not currently have a mechanism that automatically validates that the documentation reflects reality. This makes our APIs more difficult to use: breaking changes may be introduced accidentally, and documentation updates may lag behind changes in behavior, leading to a confusing and frustrating experience for developers adopting the MediaWiki REST API. Once we have reusable object models and expected responses, we can build processes into the CI pipelines to automatically capture and flag discrepancies between the documented responses and real-world endpoint behavior. This will improve both the stability of our APIs and the developer experience.
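The kind of discrepancy check described above can be sketched in a few lines of TypeScript. This is a deliberately minimal, hand-rolled comparison covering only a handful of schema keywords, written to show the idea; it is not the validator this work will use, and the `ResponseSchema` type, the `findDiscrepancies` function, and the sample `documented` schema are all hypothetical. Note one design choice in this sketch: fields returned by the endpoint but absent from the documentation are reported as problems, which a real validator may or may not do depending on its configuration.

```typescript
// Hypothetical, simplified schema type covering only the keywords checked below.
type ResponseSchema = {
  type: "object" | "array" | "string" | "integer" | "number" | "boolean";
  required?: string[];
  properties?: { [name: string]: ResponseSchema };
  items?: ResponseSchema;
};

// Return a list of human-readable differences between a documented schema
// and an actual response body. An empty list means no drift was detected.
function findDiscrepancies(
  value: unknown,
  schema: ResponseSchema,
  path: string = "$"
): string[] {
  if (schema.type === "object") {
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      return [`${path}: expected object`];
    }
    const record = value as { [key: string]: unknown };
    const documented: { [name: string]: ResponseSchema } = schema.properties ?? {};
    const problems: string[] = [];
    // Documented as required, but absent from the actual response.
    for (const name of schema.required ?? []) {
      if (!(name in record)) {
        problems.push(`${path}.${name}: documented as required but missing`);
      }
    }
    // Present in the actual response: either undocumented, or checked recursively.
    for (const [name, child] of Object.entries(record)) {
      const childSchema = documented[name];
      if (childSchema === undefined) {
        problems.push(`${path}.${name}: returned but not documented`);
      } else {
        problems.push(...findDiscrepancies(child, childSchema, `${path}.${name}`));
      }
    }
    return problems;
  }
  if (schema.type === "array") {
    if (!Array.isArray(value)) {
      return [`${path}: expected array`];
    }
    const itemSchema = schema.items;
    if (itemSchema === undefined) {
      return [];
    }
    const problems: string[] = [];
    value.forEach((item: unknown, index: number) => {
      problems.push(...findDiscrepancies(item, itemSchema, `${path}[${index}]`));
    });
    return problems;
  }
  if (schema.type === "integer") {
    return Number.isInteger(value) ? [] : [`${path}: expected integer`];
  }
  // The remaining types map directly onto JavaScript's typeof results.
  return typeof value === schema.type ? [] : [`${path}: expected ${schema.type}`];
}

// Illustrative documented schema (field names are assumptions).
const documented: ResponseSchema = {
  type: "object",
  required: ["id", "title"],
  properties: {
    id: { type: "integer" },
    title: { type: "string" },
  },
};

// A response that gained an undocumented field, and one that changed a type.
const addedField = findDiscrepancies({ id: 42, title: "Earth", latest: { id: 1 } }, documented);
const changedType = findDiscrepancies({ id: "42" }, documented);
```

In a CI job, a non-empty result would fail the build, so that either the code or the documented response has to be corrected before the change ships.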
_WE5.1.X_: If we represent all content module endpoint responses (10 in total) in our MediaWiki REST API OpenAPI spec definitions, we will be able to implement programmatic validation to guarantee that our generated documentation matches the actual responses returned in code.
- Conditions of acceptance
- Define reusable objects and endpoint response bodies as JSON schemas.
- Update generated OpenAPI schemas to reference the object and response bodies as inline definitions.
- Create reusable code for fetching the response specification, for validation against the actual response body in the test framework.
- Create reusable code for validating the specification against the actual response body. This will most likely use an off-the-shelf JavaScript-based schema validator.
- For each endpoint, add the validation check to the relevant end-to-end endpoint tests in Mocha.
- Generate a discovery document that provides a directory of all modules currently installed on a given MediaWiki instance.
- Publish a canonical URL for the discovery document schema.
- [Stretch] Update the Sandbox to use the discovery document to determine which extensions and external services should be listed.
- [Stretch] Add MediaWiki REST API schemas to schema.wikimedia.org.
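The "fetch the response specification" step from the conditions above could be sketched as follows. The lookup follows the standard OpenAPI 3 layout (`paths` → path → method → `responses` → status → `content` → media type → `schema`); the `getResponseSchema` helper, the `OpenApiSpec` type, and the sample spec contents are hypothetical and assume the spec has already been loaded into memory, with any `$ref` entries already resolved.

```typescript
// Loose type for a schema fragment; its contents are left opaque here.
type JsonSchema = { [keyword: string]: unknown };

// Only the parts of an OpenAPI 3 document that this lookup needs.
type OpenApiSpec = {
  paths: {
    [path: string]: {
      [method: string]: {
        responses: {
          [status: string]: {
            description?: string;
            content?: { [mediaType: string]: { schema?: JsonSchema } };
          };
        };
      };
    };
  };
};

// Look up the documented response schema for one endpoint, method, and status.
// Returns undefined when the spec does not document that combination.
function getResponseSchema(
  spec: OpenApiSpec,
  path: string,
  method: string,
  status: number,
  mediaType: string = "application/json"
): JsonSchema | undefined {
  const operation = spec.paths[path]?.[method.toLowerCase()];
  const response = operation?.responses[String(status)];
  return response?.content?.[mediaType]?.schema;
}

// Illustrative spec fragment; the response fields are assumptions.
const sampleSpec: OpenApiSpec = {
  paths: {
    "/v1/page/{title}": {
      get: {
        responses: {
          "200": {
            description: "Page found",
            content: {
              "application/json": {
                schema: {
                  type: "object",
                  required: ["id", "title"],
                  properties: {
                    id: { type: "integer" },
                    title: { type: "string" },
                  },
                },
              },
            },
          },
        },
      },
    },
  },
};

const pageSchema = getResponseSchema(sampleSpec, "/v1/page/{title}", "GET", 200);
const undocumented = getResponseSchema(sampleSpec, "/v1/page/{title}", "GET", 404);
```

In an end-to-end Mocha test, the schema returned by a helper like this would be handed to the chosen validator together with the actual response body, and the test would fail if the lookup returns `undefined` or the validator reports errors.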
Out of scope
This work only captures response body definition and validation. This means that changes in input (e.g. a new query parameter option) will not be caught. This is by design; a separate follow-up item will allow us to automatically detect and respond to input changes that alter the behavior of the endpoint.