Page MenuHomePhabricator

Viewing first revision via API gives baddiff error
Closed, ResolvedPublic

Description

Probably related to T201218 . Using the compare API on a page's first revision gives a baddiff error.

Example: https://en.wikipedia.org/w/api.php?action=compare&format=json&fromrev=12636721&prop=user%7Cids%7Ccomment%7Cparsedcomment&torelative=prev

Event Timeline

Yair_rand created this task.Sep 4 2018, 8:01 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 4 2018, 8:01 AM
Anomie added a subscriber: Anomie.

This isn't related to T201218.

There are three ways we could go here:

  1. Improve the error message to say that there's no previous revision to diff to.
  2. Follow the lead of the web UI and generate a somewhat silly empty diff.
  3. Try to generate a diff that shows everything being added.

I note that the same thing happens with torelative=next, but in that case there's nothing that would make sense for option #3.

Personally I'm inclined to go with option #1. For the web UI it makes sense to still show the parsed version of the page's first revision and the header with the user name and edit summary. But for the API you have to use action=parse to get that parsed version anyway, so there's not too much action=compare could usefully return. But I could be convinced otherwise.

This worked until recently. It showed a diff of everything that was added, which was quite useful. (My userscript relied on it, and is now getting errors as a result of this change.) Requiring a different API call to get all the added lines from the first revision as from all other revisions seems needlessly complicated.

(Are you sure this isn't related to T201218? They began to fail at the same time and are very similar issues on the surface.)

I agree with @Yair_rand, this did work until recently, so I'm not sure if the feature was unintentional, but it was being used that way in InteractionTimeline until it broke a few weeks ago.

Anomie moved this task from Needs details or plan to Needs Review on the MediaWiki-API board.

(Are you sure this isn't related to T201218? They began to fail at the same time and are very similar issues on the surface.)

I'm sure. It's due to rMW1ab2f7a56ba5: ApiComparePages: Update for MCR rather than to any change in DifferenceEngine as that task was.

Change 469041 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] ApiComparePages: Don't error with no prev/next rev

https://gerrit.wikimedia.org/r/469041

I'm sure. It's due to rMW1ab2f7a56ba5: ApiComparePages: Update for MCR rather than to any change in DifferenceEngine as that task was.

Thanks so much for finding the problem and writing a solution!

@Anomie I've been thinking about what the API should do in this scenario, and I'm not sure a warning is the best way to go. A warning implies to the user, that they did something wrong, or there is something they need to change about their request. This seems consistent with our own documentation that states:

Warnings are thrown for non-fatal conditions such as invalid parameters, whereas errors are only thrown for fatal conditions.

https://www.mediawiki.org/wiki/API:Errors_and_warnings?oldid=2891928

In the same way, the user should be able to "do something" in order to remove the warning.

There isn't anything about a request like this one:
https://test.wikipedia.org/w/api.php?action=compare&fromrev=338497&torelative=prev&formatversion=2&format=json&origin=*&prop=diff|user|ids
that is incorrect. There isn't anything that needs to be changed or altered in order to remove the warning.

Perhaps it would be more appropriate to either return a null-state (perhaps null or 0)? We could also add another field like tomissing: true and frommissing: true in order to make it explicit that the to or from is missing. Would either of these solutions be acceptable?

Legoktm added a subscriber: Legoktm.Nov 1 2018, 3:55 AM

(I haven't had the time to deeply look into this issue yet, so this is more a surface level comment since this is a 1.32 regression)

This isn't related to T201218.
There are three ways we could go here:

  1. Improve the error message to say that there's no previous revision to diff to.
  2. Follow the lead of the web UI and generate a somewhat silly empty diff.
  3. Try to generate a diff that shows everything being added.

I note that the same thing happens with torelative=next, but in that case there's nothing that would make sense for option #3.
Personally I'm inclined to go with option #1. For the web UI it makes sense to still show the parsed version of the page's first revision and the header with the user name and edit summary. But for the API you have to use action=parse to get that parsed version anyway, so there's not too much action=compare could usefully return. But I could be convinced otherwise.

I'd agree with you if this were a new feature, except this is functionally a regression, in which case I think the correct thing to do is to restore the previous behavior (or emulate it as close to reasonably possible). So I'd prioritize back-compat first since AIUI it wasn't a intended breaking change.

@Anomie I've been thinking about what the API should do in this scenario, and I'm not sure a warning is the best way to go. A warning implies to the user, that they did something wrong, or there is something they need to change about their request. This seems consistent with our own documentation that states:

Warnings are thrown for non-fatal conditions such as invalid parameters, whereas errors are only thrown for fatal conditions.

https://www.mediawiki.org/wiki/API:Errors_and_warnings?oldid=2891928
In the same way, the user should be able to "do something" in order to remove the warning.

Agreed with the principle. But I think Anomie is saying that the request is just conceptually flawed?

I think Anomie is saying that the request is just conceptually flawed?

Sure. And that's a valid argument. Clearly we disagree, but I don't think that disagreement should hold up fixing the regression. My request is that the warning is removed (since that introduces a new behavior) and a discussion about the validity of the request can be had after the regression is fixed.

Anomie added a comment.Nov 1 2018, 3:26 PM

I'd agree with you if this were a new feature, except this is functionally a regression, in which case I think the correct thing to do is to restore the previous behavior (or emulate it as close to reasonably possible). So I'd prioritize back-compat first since AIUI it wasn't a intended breaking change.

When I looked into it deeper to write the patch, I came to agree with you on this. I think when I wrote that earlier comment I hadn't realized that it was indeed a regression.

Which is why the patch up for review does restore the previous behavior.

If I read the discussion on the patch correctly, this ticket currently blocks the release of 1.32 on the question of whether the API should issue a warning when producing a diff against a revision that does not exist. This seems disproportionate. I think neither solution (having or not having the warning) is so problematic that it should be prevented from being in the release. I would default to restoring the old behavior as-is, even if I didn't particularly like it. On the other hand, if the patch author insists on having the warning, I wouldn't block on that either. In theory, this should be decided by the product owner of the API - which is the job that @EvanProdromou just started. But really, it just doesn't seem big enough of a deal to me to block on it.

For the record, here is my own prefference on how this should behave

  • torelative=prev on the first revision should diff against an empty revision, with no warning: the desire to see the diff that represents the first edit is perfectly cromulent, and this is the easiest way to get it. There is nothing to warn about.
  • torelative=next on the last revision should issue a warning. It may return an empty diff, or nothing. I'd even be fine with an error, but perhaps that would break client expectations. In any case, asking for the diff of the latest revision against the next revision is pointless, and the information the client should receive in some way is "hey, there is nothing to diff against".

In summary, this seems to be a disagreement with minor impact, in need of someone to make a call. If I was to make the call, it would be to restore behavior to what it was previously, make the release, and then discuss further, after thinking hard on whether that's a good use of our time.

Change 469041 merged by jenkins-bot:
[mediawiki/core@master] ApiComparePages: Don't error with no prev/next rev

https://gerrit.wikimedia.org/r/469041

Change 480504 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@REL1_32] ApiComparePages: Don't error with no prev/next rev

https://gerrit.wikimedia.org/r/480504

Anomie closed this task as Resolved.Dec 18 2018, 2:48 PM
Anomie claimed this task.

Change 480504 merged by jenkins-bot:
[mediawiki/core@REL1_32] ApiComparePages: Don't error with no prev/next rev

https://gerrit.wikimedia.org/r/480504

Change 480662 had a related patch set uploaded (by Mooeypoo; owner: Mooeypoo):
[mediawiki/core@master] Followup Ie81b58c: Remove API warning when at the end/start of revs

https://gerrit.wikimedia.org/r/480662