Page MenuHomePhabricator

Improve GET History Filter
Open, Needs TriagePublic

Description

Task T231597: Implement GET History Filter added a page history endpoint to the core REST API.

This code still needs several fixes and improvements:

Event Timeline

BPirkle created this task.Thu, Oct 10, 4:31 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptThu, Oct 10, 4:31 AM

Change 542006 had a related patch set uploaded (by BPirkle; owner: BPirkle):
[mediawiki/core@master] Improvements to Core REST API GET page history handler

https://gerrit.wikimedia.org/r/542006

Bill, if you're making changes, can you return a 200 status code when there are 0 results? I updated the user story to be more specific that the array is 0-20 elements.

I understand that some folks in the group think it should be a 404; I'd really like it to be a 200. I haven't been able to find any other public APIs that return 404s for empty lists. In an informal poll on Twitter, 89% of respondents expect empty results to come with a 200 status code.

Bill, if you're making changes, can you return a 200 status code when there are 0 results? I updated the user story to be more specific that the array is 0-20 elements.

@eprodromou There's 3 distinct cases where we return 404 now.

  • If there's no results for existing title and revision - I agree an empty list would be much better here.
  • If the title supplied in the parameters doesn't exist - I think this should stay 404
  • If the revision supplied in the parameters for paging does not exist - I think this should stay 404

A couple more ideas:

  • If the comment is empty - should we even bother adding a comment field?
  • If the user is anon, the user id will always be 0, and it's kinda an implementation detail - should we omit the user id for anons?

@Pchelolo I agree on the cases for 404s as you said. 👍

I see three options for "comment" if there is not one:

  • Leave out the property entirely
  • set it to null
  • set it to empty string "" (I think this is how it works now)

Of the three, I prefer a null value. For clients, it's kind of irritating to have properties blink in and out of existence. The empty string is not really a good way to indicate "does not exist".

I'm fine with null rather than zero for user IDs for anons. Clearer for me, too.

Change 542006 merged by jenkins-bot:
[mediawiki/core@master] Improvements to Core REST API GET page history handler

https://gerrit.wikimedia.org/r/542006

@BPirkle one last thing (sorry): the original requirements had "delta" and "size" in "characters". What I meant was, "do it exactly like the Web UI does it". Reviewing, the size and delta are in bytes in the Web UI, and I think we should do the same here. (I tried adding multi-byte chars to pages and it looks like it's definitely a difference in bytes).

I don't know if you did it that way in the first place, but I wanted to call it out here.

@BPirkle one last thing (sorry): the original requirements had "delta" and "size" in "characters". What I meant was, "do it exactly like the Web UI does it". Reviewing, the size and delta are in bytes in the Web UI, and I think we should do the same here. (I tried adding multi-byte chars to pages and it looks like it's definitely a difference in bytes).
I don't know if you did it that way in the first place, but I wanted to call it out here.

It's using rev->rev_len field directly. This is the same as what the UI and Action API is using. It's documented in code as 'bogo-bytes'. For TextContent that will be the size of the text in bytes. But when we consider other content types, things get a bit more interesting. For Flow talk pages that would always be 1. For some of the wikibase content it will be the byte size of serialized PHP objects.

TLDR, this is already the same thing as Web UI is showing, but it's not bytes. I guess documentation should reflect 'bogo-bytes' nature of this number.

Of the three, I prefer a null value. For clients, it's kind of irritating to have properties blink in and out of existence. The empty string is not really a good way to indicate "does not exist".

I will sound like a broken record... but I disagree :) Here's my reasoning:

  • Google style guide
  • Microsoft doesn't state preference regarding individual properties, but ask for omitting the resources where permissions are insufficient, which is close enough to our case.
  • In RESTBase APIs we are using JSON-Schema to document responses. It's a very useful tool and it's pretty much a standard way to document responses, so I imagine we might want to schema our responses down the road. JSON Schema doesn't really have a good abstraction for nullable types, so we would only be able to document it as ugly polymorthic types.
  • If we were to use openAPI for describing the REST APIs, support for polymorphic types there is even worse.
  • On our scale saving 25 bytes per revision might be of a value :)))

I do not have a very strong opinion here though. Current merged patch has it omitted.

I appreciate the links and understand the If your opinion is not strong, I'd like to stick with null.

For client developers, this kind of pseudocode:

obj = JSON::parse(HTTP::get(url))

if (has_property(obj, "comment")) {
  print obj->comment
}

...is weirder than:

obj = JSON::parse(HTTP::get(url))

if (obj->comment) {
  print obj->comment
}

And, yes, I know that lots of languages don't treat null as falsy, nor is it really a good idea to just use a decoded JSON object directly, but you get the point.

So, tl;dr it's easiest for client developers to have a null property rather than a missing (undefined) property. If you don't feel strongly about it, I'd like to use null.

Change 542292 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/core@master] Set unknown/restricted properies to null.

https://gerrit.wikimedia.org/r/542292

The above patch should include nulls instead of omitting the properties.

Change 542292 merged by jenkins-bot:
[mediawiki/core@master] Set unknown/restricted properies to null.

https://gerrit.wikimedia.org/r/542292

Change 542450 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/core@master] REST: Improve access checks for history and revision handlers.

https://gerrit.wikimedia.org/r/542450

Change 542450 merged by jenkins-bot:
[mediawiki/core@master] REST History and compare endpoints followups.

https://gerrit.wikimedia.org/r/542450