API developer creates automated documentation
Open, Low, Public

Description

"As an API developer, I want changes I make to API definitions in my code to be updated automatically in the documentation on the developer portal, so that it's always up-to-date with the latest interfaces in production."

Generation

The best documentation is generated from the code itself (either source or tests) or from as close to the code as possible.

Requirements

  • Versioning by major version
  • Support for localization

Minimum API docs

Information included per endpoint:

  • Group
  • Name
  • Description
  • Path
  • Method
  • Example request URL
  • Request parameters table
    • Parameter name
    • Required/optional
    • Example
    • Description
  • Headers (if available, including the name, example, and description)
  • Request body example (if required)
  • Request body schema (if required, including the name, type, required/optional, and a description)
  • Responses
    • Response code
    • Response body example (if applicable)
    • Response body schema (if applicable, including the name, type, required/optional, and a description)
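As a rough illustration of the per-endpoint checklist above, here is a minimal sketch of a record type that a generator could fill in and validate. The class names, the field names, the `missing_fields` helper, the "Page" group, and the example.org URL are all assumptions made for illustration, not part of any existing tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    name: str
    required: bool
    example: str
    description: str


@dataclass
class Response:
    code: int
    body_example: Optional[str] = None


@dataclass
class EndpointDoc:
    """One endpoint's documentation, mirroring the minimum field list."""
    group: str
    name: str
    description: str
    path: str
    method: str
    example_url: str
    parameters: List[Parameter] = field(default_factory=list)
    responses: List[Response] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of mandatory fields that are still empty."""
        required = ["group", "name", "description", "path", "method", "example_url"]
        missing = [f for f in required if not getattr(self, f).strip()]
        if not self.responses:
            missing.append("responses")
        return missing


# Hypothetical values; the group and URL are placeholders.
history = EndpointDoc(
    group="Page",
    name="Get page history",
    description="Returns information about the 20 latest revisions to a wiki page.",
    path="/page/{title}/history",
    method="GET",
    example_url="https://example.org/v1/page/My_Wiki_Page/history",
    parameters=[
        Parameter("title", True, "My_Wiki_Page",
                  "The title of the wiki page being accessed."),
    ],
    responses=[Response(200, '{"my json": "blob"}')],
)

print(history.missing_fields())
```

A check like `missing_fields()` could run in CI so that an endpoint without the minimum information fails the build rather than shipping with incomplete docs.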

Better API docs

Automation:

  • Docs are automatically generated, ensuring that they are always 100% accurate.

Output format

We’d like users to access the docs within MediaWiki. However, the reference docs should not be editable within MediaWiki since they’re being generated automatically.

References:

  • Docs for the Action API can be pulled into a wiki page per module with API:Help (for example: {{Api help|parse}}). This means that the page structure is maintained manually and the surrounding content is indexed by the wiki search.
  • In contrast, most API docs are served as a single long page with specialized navigation (example). This makes it easier for users to CTRL+F for what they need.

Event Timeline

This is a placeholder for discussion about how we'll generate reference documentation for the REST API. I'd probably put these rough requirements on this functionality:

  • URL template
  • Parameters: names, types, location (query, path, post, ...), required?
  • Available on the Web "somewhere"
  • i18n

As a stretch goal, I'd love to see about as much definition as we have in our user stories:

  • Status codes and meanings
  • Notable request headers
  • Notable response headers
  • Response type and structure (usually a JSON object with a particular schema)

I think there are two ways we could support this:

  • Online, like the Action API. Some kind of URL modification would give you an HTML document for that endpoint, like https://www.mediawiki.org/w/api.php?action=help&modules=block does for the block module in the Action API. It's a bit trickier with the REST API, but maybe inserting "doc" into the path somewhere would work: for the endpoint /v1/revision/12345, we could have /v1/doc/revision/12345, /doc/v1/revision/12345, or /v1/revision/12345?doc. It's harder to say how to handle HTTP methods other than GET.
  • Offline, a la the doxygen documentation. We could generate something automatically that goes to the Documentation Portal, or just to mediawiki.org or another location "for now".
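To make the three "online" URL shapes concrete, here is a small sketch that derives each candidate from an endpoint path. The function name and dictionary keys are made up for illustration, and it assumes the first path segment is always the version.

```python
def doc_url_variants(endpoint_path: str) -> dict:
    """Return the three candidate documentation URLs for a versioned path.

    Assumes the path looks like /<version>/<rest>, e.g. /v1/revision/12345.
    """
    version, _, rest = endpoint_path.lstrip("/").partition("/")
    return {
        "doc_after_version": f"/{version}/doc/{rest}",
        "doc_before_version": f"/doc/{version}/{rest}",
        "doc_query_flag": f"/{version}/{rest}?doc",
    }


variants = doc_url_variants("/v1/revision/12345")
for label, url in variants.items():
    print(label, url)
```

None of these shapes says which HTTP method is being documented, which is the open question noted above for non-GET endpoints.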

Fire away!

@apaskulin if you have some ideas about this, please jump in!

Always happy to discuss this! Here are my notes:

Endpoint elements

Name
Example: Get page history

Description
Example: Returns information about the 20 latest revisions to a wiki page, starting with the latest revision. The returned revision segment includes API routes for the next oldest, next newest, and latest revisions, letting you scroll through page history. This endpoint responds to the presence of a logged-in user and displays content appropriate to that user's permissions.

Path
Example: /page/{title}/history

Method
Example: GET

Request parameters (including the name, location, required/optional, an example, and a description)
Example: title, path parameter, required, My_Wiki_Page, The title of the wiki page being accessed.

Headers (including the name, example, and description)
Example: If-Modified-Since, If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT, Returns the response only if the content has changed since the provided timestamp. Takes a timestamp in the format <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT.
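For anyone writing the header example by hand, this timestamp format is the standard HTTP date format, and Python's standard library can produce it. A minimal sketch follows; the helper name is an assumption, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def if_modified_since(moment: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date ending in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


header_value = if_modified_since(datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc))
print("If-Modified-Since:", header_value)
```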



Responses (including the status code, description, and an example or sandbox)
Example:
200 Success
{ "my json": "blob" }
404 Not found
{ "my error": "blob" }

Response schema (including the name, type, required/optional, and a description)
Example: latest, required, string, The API route to get the 20 latest revisions.

Generation methods

Ideally, endpoint docs would be generated from the code itself. For example, getting the path from coreRoutes.json or getting a response example by calling the API. When this isn't possible, the second best option is to generate information from code comments. For example, documentation for a request parameter could be generated from comments such as @parametername title @parameterstatus required etc.
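Here is a rough sketch of the comment-based fallback, assuming one annotation per line inside a doc comment. The `@parametername` and `@parameterstatus` tags come from the example above; `@parameterdescription` and the function name are hypothetical additions for illustration.

```python
import re
from typing import Dict, List


def parse_parameter_annotations(comment: str) -> List[Dict[str, str]]:
    """Collect @parameter* tags into one dict per documented parameter.

    A new dict starts at each @parametername; later @parameter<key> tags
    attach to the most recent parameter.
    """
    params: List[Dict[str, str]] = []
    for line in comment.splitlines():
        # Drop comment decoration such as "/**", " * " and " */".
        line = line.strip().lstrip("/*").strip()
        match = re.match(r"@parameter(\w+)\s+(.*)", line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "name":
            params.append({"name": value})
        elif params:
            params[-1][key] = value
    return params


doc_comment = """
/**
 * @parametername title
 * @parameterstatus required
 * @parameterdescription The title of the wiki page being accessed.
 */
"""

parsed = parse_parameter_annotations(doc_comment)
print(parsed)
```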

Sandbox and examples

Ideally, the API would support a sandbox, similar to the Action API sandbox and the sandbox included with swagger-ui. This would, on a per endpoint basis, allow the user to input parameters, see the request, and get a live response from the server. We should think about how to handle sandboxes for write endpoints, since this could lead to accidental vandalism.

Availability

Ideally the docs would function similarly to the Action API docs: they’d be available as static pages and embeddable within a wiki page.

@apaskulin all good stuff! I think the sandbox is definitely a whole different task. It'd be great to have it, and have it integrated right into the documentation.

Another aspect I want to discuss is whether we generate HTML or other documentation formats directly, or use an intermediate structured format.

Generating an OpenAPI definition means we can use any of dozens of tools for generating documentation and even client libraries from the definition. However, it's not very readable in its regular form.

I'd prefer generating OpenAPI.

Oh, and as @apaskulin mentioned, since having a sandbox would be great, generating an OpenAPI definition would allow us to use tools like the swagger-ui frontend.
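To show what the intermediate format might look like, here is a minimal sketch that turns endpoint descriptions into an OpenAPI 3.0 document using only the standard library. The input dictionary shape, the function name, and the title and version strings are assumptions; a real generator would also emit request and response schemas.

```python
import json


def to_openapi(title: str, version: str, endpoints: list) -> dict:
    """Build a minimal OpenAPI 3.0 document from plain endpoint dicts."""
    paths = {}
    for ep in endpoints:
        operation = {
            "summary": ep["name"],
            "description": ep["description"],
            "parameters": [
                {
                    "name": p["name"],
                    "in": p["location"],
                    "required": p["required"],
                    "description": p["description"],
                    "schema": {"type": "string"},
                    "example": p["example"],
                }
                for p in ep.get("parameters", [])
            ],
            "responses": {
                str(code): {"description": text}
                for code, text in ep["responses"].items()
            },
        }
        paths.setdefault(ep["path"], {})[ep["method"].lower()] = operation
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": paths,
    }


# Hypothetical input; the title and version are placeholders.
spec = to_openapi("Example REST API", "1.0.0", [{
    "name": "Get page history",
    "description": "Returns information about the 20 latest revisions to a wiki page.",
    "path": "/page/{title}/history",
    "method": "GET",
    "parameters": [{
        "name": "title",
        "location": "path",
        "required": True,
        "example": "My_Wiki_Page",
        "description": "The title of the wiki page being accessed.",
    }],
    "responses": {200: "Success", 404: "Not found"},
}])

print(json.dumps(spec, indent=2))
```

The resulting JSON is the kind of file that swagger-ui and similar tools take as input, so the rendering step can be chosen separately from the generation step.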

eprodromou added a subscriber: Eevans.

We discussed this in our kickoff meeting today.

There was a lot of resistance to the idea of having an endpoint for OpenAPI 3.0 definitions of the (other) endpoints. I like the idea of using OpenAPI since there are a lot of other tools that would benefit, such as client code generators. The feeling in the room was that client code generation is unrealistic, and that the OpenAPI toolset is wonky.

We couldn't come to a decision on how to make it work otherwise. One suggestion was using the same mechanism as the Action API, but I am a hard no on generating any HTML in the API space (which is how Action API works). I think it's too hard to make it interleave with a RESTful API.

@Eevans suggested a mechanism using doc comments, which would probably be great. But we didn't know what the output would be or where it would go.

I'm moving this task back to the backlog until we can come to some consensus on what to do.

Nice find, @Pchelolo! It seems like the docs site for that library is down, but the issue pointed at this web archive link to view the docs in the meantime. I think using a library like this would meet our requirements, especially if we can find a way to render the spec within MediaWiki.

We discussed this in our kickoff meeting today.

There was a lot of resistance to the idea of having an endpoint for OpenAPI 3.0 definitions of the (other) endpoints. I like the idea of using OpenAPI since there are a lot of other tools that would benefit, such as client code generators.

Just to be clear, are there tools you have in mind other than code generators?

The feeling in the room was that client code generation is unrealistic, and that the OpenAPI toolset is wonky.

You can certainly generate stubs, but as a developer I probably wouldn't. I doubt I could be bothered to set up the tools just to generate stubs; it just doesn't provide value. When I looked at this before, I could find no evidence of OpenAPI/Swagger users actually doing this in practice.

We couldn't come to a decision on how to make it work otherwise. One suggestion was using the same mechanism as the Action API, but I am a hard no on generating any HTML in the API space (which is how Action API works). I think it's too hard to make it interleave with a RESTful API.

Could you expound on this? Doesn't every (proposed) method require turning something machine-readable into HTML?

@Eevans suggested a mechanism using doc comments, which would probably be great. But we didn't know what the output would be or where it would go.

If the idea is that you'd view the documentation in a web browser, then ultimately it needs to be turned into HTML somehow, and served.

I'm moving this task back to the backlog until we can come to some consensus on what to do.


To be clear: My understanding (taken from the text of the user story) is that we should have published, current docs for the API that are maintained by the devs. OpenAPI is one way to do that, but as has been discussed elsewhere, it has its problems (and to be fair, it's meant to solve a broader set of problems). I agree with @apaskulin: in a perfect world the documentation would be generated directly from the code that implements the API, but I don't think that'll work here (in part because we're talking about an arbitrary number of code bases written in arbitrary languages). The next-best alternative IMO is one that is as simple and code-adjacent as possible.

TL;DR If the requirement is simply to go from "something machine-readable" to HTML, then there are almost certainly alternatives to OpenAPI

There was a lot of resistance to the idea of having an endpoint for OpenAPI 3.0 definitions of the (other) endpoints.

There was NOT a lot of resistance to going with OpenAPI. There was resistance to rushing into implementing anything without a firm evaluation process.

We need to:
a) Establish requirements. These are just a couple of questions off the top of my head; I'm sure I'm forgetting a bunch of important ones.

  • What's the primary use-case?
  • What other use-cases do we have in mind right now?
  • How often would we update the public-facing docs? Once per release or on every commit?
  • Do we want to have an archive of older, per-API-version release docs available to the public?
  • Do we want to have an executable snippet inside the documentation?
  • Do we want to have a big page with all the APIs like the one RESTBase provides?
  • From the presentation of the developer portal I've heard that it would eventually be open to the public for editing. If so, how do we want the autogenerated docs to work with edits on that wiki? Is the documentation editing for developers only, or do we want to integrate our tech writer and community into it?
  • What are the i18n requirements? Do we want to have multilingual documentation? If so, do we want to integrate with translatewiki and standard mechanisms for translations?
  • What do we want on the doc? @apaskulin has mostly answered this one though.

b) Go shop around for existing options and frameworks and make a decision matrix, estimating the capabilities of each potential option with each requirement.

c) Make a choice and start implementing it.

I'm not saying we need to implement all the features right away. But we at least need to look a little bit into the future and avoid selecting a technology just to throw it away as soon as we realize some critical requirement cannot be satisfied with it.

@Pchelolo beat me to commenting so I'll truncate mine. I would also strongly suggest doing a requirements matrix.

Two concerns I have are:

  1. I would be slightly concerned about annotations being closer to code but adding complexity, because the docs for a single endpoint are potentially split over multiple files. I don't know if this would end up any easier than maintaining the json/yaml directly.
  2. This user story doesn't technically cover where the docs are intended to be published, but I think that should be a consideration. If the developer portal should be able to wrap or redirect to the docs UI, the existence of an easy-to-generate UI is likely a requirement.

I'm happy to take on building out the matrix and the requirements and then we can review from there. I'll assign myself the ticket unless @eprodromou strenuously objects?

A couple more thoughts:

  • As a potential use-case, we could think about how to integrate it with the api-testing tool, or how to make automated tests of the documentation, if we want that. It would be pretty cool, and at least an easy variant of tests for documentation could be achieved. For example, if we decide on OpenAPI as a medium for specifications, we can require that there's an 'example' for each of the parameters and autogenerate requests and execute them within tests.
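As a sketch of that idea, the snippet below walks an OpenAPI-style dictionary, insists that every parameter carries an 'example', and builds the request URLs a test could then execute. The function name, the base URL, and the inline spec are assumptions for illustration; it only handles path and query parameters and does not send any requests.

```python
from urllib.parse import quote, urlencode


def example_requests(spec: dict, base_url: str) -> list:
    """Build (METHOD, URL) pairs from parameter examples in the spec.

    Raises ValueError if any parameter lacks an 'example'.
    """
    requests = []
    for path, operations in spec["paths"].items():
        for method, operation in operations.items():
            url_path, query = path, {}
            for param in operation.get("parameters", []):
                if "example" not in param:
                    raise ValueError(
                        f"{method.upper()} {path}: parameter "
                        f"{param['name']!r} has no example"
                    )
                value = str(param["example"])
                if param["in"] == "path":
                    placeholder = "{" + param["name"] + "}"
                    url_path = url_path.replace(placeholder, quote(value, safe=""))
                elif param["in"] == "query":
                    query[param["name"]] = value
            suffix = "?" + urlencode(query) if query else ""
            requests.append((method.upper(), base_url + url_path + suffix))
    return requests


# Hypothetical spec fragment and base URL.
sample_spec = {
    "paths": {
        "/page/{title}/history": {
            "get": {
                "parameters": [
                    {"name": "title", "in": "path", "required": True,
                     "example": "My_Wiki_Page"},
                ],
            },
        },
    },
}

generated = example_requests(sample_spec, "https://example.org/v1")
print(generated)
```

A test runner could send each generated request and compare the status code against the documented responses, which would catch docs drifting away from the implementation.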

Just to be clear, are there tools you have in mind other than code generators?

IDE integration is another use for OpenAPI that I know of.

We couldn't come to a decision on how to make it work otherwise. One suggestion was using the same mechanism as the Action API, but I am a hard no on generating any HTML in the API space (which is how Action API works). I think it's too hard to make it interleave with a RESTful API.

Could you expound on this? Doesn't every (proposed) method require turning something machine-readable into HTML?

Yes. My main concern is that the Action API has a special method for generating HTML output. That's probably not too bad for an RPC-style API. I think interleaving HTML into our REST API is going to be more complicated, and I'd rather there was a clear distinction between the functional REST API URLs and the documentation.

In general, I would like people to be looking in one place -- the developer portal -- for API documentation. Not trying to cut-and-paste the REST API URLs in their browsers to get the right combination to show HTML rather than JSON.

@Eevans suggested a mechanism using doc comments, which would probably be great. But we didn't know what the output would be or where it would go.

If the idea is that you'd view the documentation in a web browser, then ultimately it needs to be turned into HTML somehow, and served.

Agreed!

TL;DR If the requirement is simply to go from "something machine-readable" to HTML, then there are almost certainly alternatives to OpenAPI

I'm fine with that. I've broken out the requirement for a machine-readable API definition to T239752: Client Developer downloads machine-readable definition of the API. That should decouple us from getting documentation done. We can use whatever techniques make the most sense to get documentation out.

I've updated the task description with the requirements as I see them. Feel free to update as needed!

  1. This user story doesn't technically cover where the docs are intended to be published

Good point! I updated the user story to say that it should be on the developer portal.

I'm happy to take on building out the matrix and the requirements and then we can review from there. I'll assign myself the ticket unless @eprodromou strenuously objects?

IMHO implementation options should be up to the tech lead, but you and @BPirkle can work that out!

Here would be my main hopes:

  • As automated as possible to minimize people forgetting to generate docs; fitting it into the CI process would be 💋👌 chef's kiss
  • Output that fits into the look-and-feel of the developer portal; wikitext is probably best here, but embedding options may work
  • Translated or translatable
  • Satisfying T239752: Client Developer downloads machine-readable definition of the API at the same time would be nice but not required
  • We don't need to deal with incompatible API versions for a while

One last thing: it would be really cool if we dogfood the Core REST API for updating pages on the developer portal.

To clarify: The Core REST API needs to be documented on mediawiki.org, but it will not be included in the dev portal. The Unified Wikimedia API needs to be documented on the dev portal but not on mediawiki.org. My assumption was that the automated API doc solution would be reusable for both these APIs.

Great point.

Here are my notes on using swagger-php to generate API docs from code annotations, as well as notes on the apiDoc tool.

Adding the missing MediaWiki-REST-API code project tag, as the Core Platform Team Initiatives (MW REST API in PHP) team tag is archived and its parent Platform Engineering team no longer exists.