Page MenuHomePhabricator

Remove deprecation notice for parsing and diff options in ApiQueryRevisionsBase
Closed, DeclinedPublic

Description

The deprecation appears to be a hasty attempt to close a number of tickets by ignoring the underlaying issue. It was not an advancement of the API in any way.

As a result of this deprecation, clients need to hammer Wikimedia servers with many, many requests in order to retrieve data that was previously obtained via cache and in bulk.

I propose that we remove the deprecation warning and fix the underlaying issues with caching so that, as intended, ApiQueryRevisionsBase returns the diffs from a given list of revisions.

See: T164106

Event Timeline

Anomie added a subscriber: Anomie.

T164106: Deprecate parsing and diff options in ApiQueryRevisionsBase explains the reasoning for this deprecation, including the fact that the deprecated features work inconsistently with limits and continuation and the fact that it has historically resulted in code duplication and mismatch of features.

As a result of this deprecation, clients need to hammer Wikimedia servers with many, many requests in order to retrieve data that was previously obtained via cache and in bulk.

Except they don't, as rvparse forced rvlimit=1 and attempts to fetch more than one diff at a time would return "notcached" for all after the first thanks to $wgAPIMaxUncachedDiffs being 1.

Also action=compare uses cache in the same ways that rvdiffto did. And action=parse will use the parser cache where rvparse did not!

The deprecation appears to be a hasty attempt to close a number of tickets by ignoring the underlaying issue.

It wasn't hasty. There was analysis of usage and a post to the mediawiki-api mailing list, and after leaving time for feedback it was implemented.

The "a number of tickets" assertion is also incorrect by the usual definition of "a number of" meaning "several".

  • The only task that was closed thanks to the deprecation instead of doing work was T31223: Querying for rvdiffto=prev fails for many revids: "notcached". I can't see any way that could have been solved directly beyond having it force rvlimit=1 (or $wgAPIMaxUncachedDiffs) like rvparse did.
  • T157183 and T28534 were asking for cleanup of duplicated code, and were closed because the duplicated code was cleaned up (just via deprecation of the duplicate code rather than as proposed in those tasks).
  • The rest of the relevant tasks linked to T164106 weren't closed by the deprecation, although their descriptions were changed to refer to the replacement actions rather than prop=revisions. Several related to action=compare were implemented and closed, though, as part of updating that module to be a fully functional replacement for the diffing in prop=revisions.

Your assertion that attempts to fetch more than one diff at a time will fail is demonstrably untrue. I was using bulk queries of 20 diffs at a time, and I fully understood the cache issue. The cache issue would affect the query only if one of the given diffs were not in the cache.

There was an analysis of usage. What did the analysis show for rvdiffto, which was also deprecated? It looks like that ApiAction was called 15640856 times, which would translate to 15640856 * X calls to action=compare.

You said on mediawiki-api that you would rethink if anyone had "major objections".

Also, looking through the comment history, there are already objections in threads where the deprecation was suggested:
https://phabricator.wikimedia.org/T31223
https://phabricator.wikimedia.org/T164106

The objections mirror my own.

Your assertion that attempts to fetch more than one diff at a time will fail is demonstrably untrue. I was using bulk queries of 20 diffs at a time, and I fully understood the cache issue. The cache issue would affect the query only if one of the given diffs were not in the cache.

You're right, I misread the code and when I tested it on my own wiki I ran into T216554 without realizing it.

OTOH, it's not terribly often that diffs are already cached unless you're hitting the same revisions multiple times. Why would you be doing that?

You said on mediawiki-api that you would rethink if anyone had "major objections".

I said that almost two years ago, yes. You're a bit late.

And you've yet to raise any major issues, just minor ones to which you're ascribing undue importance.

You could be right.

My concern is that I don't want to cause undue load on Wikipedia or appear to be running a denial-of-service attack.

At this moment, there are 146 edits per minute on English Wikipedia. If you honestly think that hitting the compare endpoint that many times is a minor concern, I will modify my application to work in that manner.

You had at least two objections at the time, and apparently you added the deprecation notice despite not realizing people were using it to make multiple diff queries.

I think that alone could be reason to remove the deprecation notice then. People weren't exactly scrambling to add a deprecation notice to the endpoint.

But again, I'm saying all this under the presumption that hitting compare a large number of times has a non-negligible amount of overhead compared to the endpoints that were deprecated.

@Elephanthunter: Please read https://www.mediawiki.org/wiki/Bug_management/Phabricator_etiquette and stick to technical arguments instead of getting personal in the initial version of your last comment. Thanks a lot.

The Action API on Wikimedia sites is currently handling around 407000 requests per minute (around 150000 of those just for enwiki). An extra 146 isn't likely to even be noticed.

Just follow the usual etiquette recommendation to send your requests serially rather than in parallel and you should be fine.

@Anomie Those are some honestly impressive stats. Pretty cool. It looks like the API change isn't a problem then.