Page MenuHomePhabricator

The Action API should allow input in JSON
Closed, DuplicatePublic

Description

The API takes its input through parameters provided by the query string or from a POST entity in application/x-www-form-urlencoded or multipart/form-data format.

https://www.mediawiki.org/w/index.php?title=API:Data_formats&oldid=2956458

It would be helpful if you could POST the parameters with a body that is application/json. It also would be be helpful if, like the existing application/x-www-form-urlencoded, you could mix/match params in the URL.
example:

curl --request POST \
  --url 'http://localhost:8888/api.php?action=query&format=json' \
  --header 'content-type: application/json' \
  --data '{
	"titles": "Gotham"
}'

Event Timeline

dbarratt created this task.Nov 21 2018, 7:29 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptNov 21 2018, 7:29 PM
Anomie added a subscriber: Anomie.

You say "it would be helpful", but you don't say what it would actually help. It seems to me that it would be as likely to complicate and confuse things.

The code review on https://gerrit.wikimedia.org/r/c/mediawiki/core/+/388486 seems relevant to this request.

Also potentially relevant is some of the discussion on T187603: Add JSON parameter type to the action API.

This comment was removed by dbarratt.

You say "it would be helpful", but you don't say what it would actually help. It seems to me that it would be as likely to complicate and confuse things.

For some background, I was working on T209133 and I initially created a special page (and a non-localized custom route). But then I was thinking it might be better if it was a module in the action API (which would allow different return formats, etc.). But the GraphQL IDEs (that I know of) expect to be able to make a POST in JSON (though, as far as I know, this is not part of the GraphQL spec). Anyways, it would be best if you could specify the route to the Action API like: /api.php?action=graphql and then specify the query in JSON body. The work-around is to do what I'm doing and make a custom special page (which has other downsides, but maybe that's the proper way to do it?). Alternatively, I could just reject the de-facto standard and not provide compatibility with GraphQL IDEs.

Anyways, feels like there are probably more use-cases than the one I have, so I abstracted the description.

Anomie added a comment.EditedNov 21 2018, 9:30 PM

I'm not too familiar with GraphQL. Ignoring details about format for a moment, what parameters would a GraphQL endpoint need as input, and what pieces of data would it output?

Glancing through https://graphql.org/, it seems like it would take one input field for the query, and outputs a blob of structured data (that could be represented in any of the API's output formats). Is that right?

If that's about right, what would these IDEs want to post as their JSON?

dbarratt added a comment.EditedNov 21 2018, 9:58 PM

I'm not too familiar with GraphQL. Ignoring details about format for a moment, what parameters would a GraphQL endpoint need as input, and what pieces of data would it output?
Glancing through https://graphql.org/, it seems like it would take one input field for the query, and outputs a blob of structured data (that could be represented in any of the API's output formats). Is that right?

The input would be three parameters:

  1. query which is a GraphQL string,
  2. variables which is dictionary of parameters (typically in JSON, but could be anything)
  3. operationName witch is string

You are correct on the output, it is a blob of structured data that can be represented in any format (although, most people use JSON as it is the closest representation to the query itself, but that is not a requirement).

If that's about right, what would these IDEs want to post as their JSON?

It would look like this (from Insomnia, as an example):

curl --request POST \
  --url http://localhost:8888/graphql \
  --header 'accept-language: ja' \
  --header 'content-type: application/json' \
  --data '{"query":"query getTitle($title: String) {\n  page(title: $title) {\n    title\n  }\n}","variables":{"title":"Gotham"},"operationName":"getTitle"}'

here is the POST body for clarity:

{
  "query": "query getTitle($title: String) {\n  page(title: $title) {\n    title\n  }\n}",
  "variables": {
    "title": "Gotham"
  },
  "operationName": "getTitle"
}

What do you think? Do you think it would be appropriate to make it an API module (with three params) and mark this as a "nice to have" or would it be better to make it a special page (and a non-localized route) like I have now?

It seems like it would be better for it to be an API module overall, but there's a few small issues, but I admit, those could just be resolved in the future, there's no reason it wouldn't work as-is right now.

Tell me more about the 'variables' field. Is it always a map of scalar values that get substituted into the query, or can it have values that are arrays of scalars, or can the values themselves be basically anything representable in JSON?

I'm trying to get an idea of the requirements so I can try to make a good recommendation as to how this would best fit into the way the API does things.

Tell me more about the 'variables' field. Is it always a map of scalar values that get substituted into the query, or can it have values that are arrays of scalars, or can the values themselves be basically anything representable in JSON?
I'm trying to get an idea of the requirements so I can try to make a good recommendation as to how this would best fit into the way the API does things.

It's a key/value store that could be anything (scalars, arrays, objects, array of objects, etc.) basically anything you can represent in JSON (which is probably why most people use JSON, but I don't think that's a requirement.

I should also mention that the variable "keys" are defined in the user's query, so they cannot be predetermined.

To summarize the choices (that I can think of):

  1. Make the GraphQL Server an Action API module (i.e. action=graphql) with 3 parameters. This would not break any of the spec, but would have some changes from the de facto. We could resolve T210045 to allow an in-browser IDE within the ApiSandbox and T210107 could be resolved which would bring us more inline with the de facto standards. Having the GraphQL server available at /graphql is another de facto standard, but every IDE I've seen has allowed the URL to be customized, so I wouldn't worry about that.
  2. Make the GraphQL Server a Special Page (i.e. Special:GraphQL) with a non-localized route of /graphql. This can meet all of the de facto standards. The special page could use content-negotiation to display an in-browser IDE for users who request text/html (but the default would be a JSON response). This has obvious drawbacks though (no changing format, global tokens, specifying the CORS policy, etc.)

It's a key/value store that could be anything (scalars, arrays, objects, array of objects, etc.) basically anything you can represent in JSON (which is probably why most people use JSON, but I don't think that's a requirement.

Ick. In that case then there's probably little point in not using JSON for it.

If allowed values were just scalars or arrays of scalars, we could have used templated parameters for it. "variables=foo|bar|baz&var-foo=123&var-bar=Some+string&var-baz=a|b|c|d".


So, if we were going to implement this as an API module, I'd recommend:

  • The module should have three parameters:
    • query would be text-type. I wouldn't wait on T210045 to be able to make it more fancy.
    • variables would be text-type, or we could resolve T187603 and let it be explicitly JSON-type.
    • operationName would be a string.
  • We could also support that de-facto standard in the API module. It wouldn't be recommended or supported by ApiSandbox.
    • Do what Gergő recommended in code review on I8285ddf5: add a "getJson()" method to WebRequest that fetches the body as a JSON object when the post's content type is appropriate.
    • Add a fourth parameter, jsonpost, as a boolean indicating that the API will expect the POST body to be JSON data.
      • If the jsonpost parameter is given, then the API should require that $this->extractRequestParams() does not contain the other three parameters. And it should require that $this->getRequest()->getJson() returns an object with the three keys.
      • If the jsonpost parameter is not given, then the API should require that $this->extractRequestParams() does contain the other three parameters. And it should probably raise an error if the POST content type is JSON.
    • The endpoint used for the de-facto standard would therefore probably be /w/api.php?action=graphql&format=json&formatversion=2&jsonpost=1 (or any other value for jsonpost would work too).

As for this specific task, I'm inclined to decline it in favor of what's described above for the GraphQL module special case.

So, if we were going to implement this as an API module, I'd recommend:

Would you recommend using an API module or a special page? (see T210107#4767038). I can see the pros & cons to both.

  • Do what Gergő recommended in code review on I8285ddf5: add a "getJson()" method to WebRequest that fetches the body as a JSON object when the post's content type is appropriate.

That is what I would do. I think it should use content negotion to determine if the Content-Type is application/json and if so, merge the object (assume it is one) with the existing POST/GET vars (after processing those). If it's not an object, I think it should put it in some sort of generic var (i.e. data ?)

As for this specific task, I'm inclined to decline it in favor of what's described above for the GraphQL module special case.

Sure. Do we need a task for https://gerrit.wikimedia.org/r/c/mediawiki/core/+/388486 ? If so, feel free to take this one over, or close and create a new one for it. :)

Anomie added a comment.EditedNov 30 2018, 7:32 PM

Would you recommend using an API module or a special page? (see T210107#4767038). I can see the pros & cons to both.

I'm not sure, I see pros and cons both ways too. For example, a special page could provide something more specific to the task than might be reasonable in ApiSandbox, but couldn't be as easily used with existing libraries that provide action API access since it would be a different endpoint.

I think it should use content negotion to determine if the Content-Type is application/json and if so, merge the object (assume it is one) with the existing POST/GET vars (after processing those).

I don't like that idea. application/x-www-form-urlencoded can't really do data types other than string[1] and (kind of[2]) arrays, and multipart/form-data isn't much better.[3] For example, would you make JSON booleans work with ->getCheck() and not be able to differentiate "false" and "omitted", or only ->getBool()? What would you do about JSON null?

The benefit of WebRequest::getJson() is that it avoids all the mess and potential hackiness of trying to munge JSON into something resembling what all the existing code expecting traditional form GETs/POSTs is prepared to deal with.

[1]: Although PHP's automatic type munging lets you pretend you can get numbers and bools too, with some caveats.
[2]: Using the PHP interpretation of field names like "foo[bar][]".
[3]: You'd need to define content types other than the default text/plain, and have clients actually use them. Anything actually doing that sort of thing isn't likely to be using WebRequest::getVal().

If it's not an object, I think it should put it in some sort of generic var (i.e. data ?)

Again, I think it would be better to do without that sort of hack.

Sure. Do we need a task for https://gerrit.wikimedia.org/r/c/mediawiki/core/+/388486 ? If so, feel free to take this one over, or close and create a new one for it. :)

Either one works. Personally I'd lean towards closing this and filing a fresh task, but that's only because the fresh task wouldn't have to carry around all the discussion we had above about GraphQL to get to this point.