Page MenuHomePhabricator

[API] WikibaseClient: wblistentityusage API module adds its results to the `query.pages` key in response
Closed, ResolvedPublic5 Estimated Story PointsPRODUCTION ERROR

Description

Problem:

When using wblistentityusage as a value of the list parameter in an API query action, results are added to the query.pages key in the response object. However, this key is reserved for usage with any of the following parameters: generator / titles / pageids / revids. This could lead to a collision when using alongside one of the parameters listed and potentially cause a runtime exception.

{
    "error": {
        "code": "internal_api_error_RuntimeException",
        "info": "[267db349-f489-482e-9a01-1d2e6e66654d] Caught exception of type RuntimeException",
        "errorclass": "RuntimeException"
    },
    "servedby": "mw1392"
}

Example:

See this output for list=wblistentityusage, list=allpages, titles=Q1. It is evident from the response that list=allpages’ output is under allpages; pages contains two pages that look different and come from different sources, Q1 from titles=Q1 and User:Yiyi from list=wblistentityusage.

Desired behavior:

Results from usage of wblistentityusage are added under a different key than query.pages, potentially: query.entityusage.


Error
normalized_message
[{reqId}] {exception_url}   RuntimeException: Conflicting keys (pageid, ns, title) when attempting to merge element 63735107
exception.trace
from /srv/mediawiki/php-1.38.0-wmf.19/includes/api/ApiResult.php(304)
#0 /srv/mediawiki/php-1.38.0-wmf.19/includes/api/ApiResult.php(412): ApiResult::setValue(array, integer, array, integer)
#1 /srv/mediawiki/php-1.38.0-wmf.19/extensions/Wikibase/client/includes/Api/ApiListEntityUsage.php(132): ApiResult->addValue(array, integer, array)
#2 /srv/mediawiki/php-1.38.0-wmf.19/extensions/Wikibase/client/includes/Api/ApiListEntityUsage.php(111): Wikibase\Client\Api\ApiListEntityUsage->formatPageData(stdClass, string, array, ApiResult)
#3 /srv/mediawiki/php-1.38.0-wmf.19/extensions/Wikibase/client/includes/Api/ApiListEntityUsage.php(57): Wikibase\Client\Api\ApiListEntityUsage->formatResult(Wikimedia\Rdbms\MysqliResultWrapper, integer, array, NULL)
#4 /srv/mediawiki/php-1.38.0-wmf.19/extensions/Wikibase/client/includes/Api/ApiListEntityUsage.php(46): Wikibase\Client\Api\ApiListEntityUsage->run()
#5 /srv/mediawiki/php-1.38.0-wmf.19/includes/api/ApiQuery.php(629): Wikibase\Client\Api\ApiListEntityUsage->execute()
#6 /srv/mediawiki/php-1.38.0-wmf.19/includes/api/ApiMain.php(1889): ApiQuery->execute()
#7 /srv/mediawiki/php-1.38.0-wmf.19/includes/api/ApiMain.php(868): ApiMain->executeAction()
#8 /srv/mediawiki/php-1.38.0-wmf.19/includes/api/ApiMain.php(839): ApiMain->executeActionWithErrorHandling()
#9 /srv/mediawiki/php-1.38.0-wmf.19/api.php(90): ApiMain->execute()
#10 /srv/mediawiki/php-1.38.0-wmf.19/api.php(45): wfApiMain()
#11 /srv/mediawiki/w/api.php(3): require(string)
#12 {main}
NOTE: This is a breaking change. Announcement is being prepared here.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Another issue I happened to notice while looking through APIs for m3api work, like T300458; and like that one, fixing this task would probably require a breaking API change, which I’m not sure is worth it.

Probably also worth keeping in mind another known issue with this module, which is that its wbeu parameter prefix actually collides with another module (T196962). If we’re making a breaking change to this module, we might as well make several at once and kill two birds with one stone, I’d say.

Lucas_Werkmeister_WMDE changed the subtype of this task from "Task" to "Production Error".Jan 31 2022, 12:35 PM
Lucas_Werkmeister_WMDE set Request URL to https://www.wikidata.org/w/api.php.
Lucas_Werkmeister_WMDE updated the task description. (Show Details)
Lucas_Werkmeister_WMDE set Release Version to 1.38.0-wmf.19.
Lucas_Werkmeister_WMDE set Phatality ID to db75dd6d58c18f6c1926c722b160d8be7d119f22c4109539342efdf95349acb9.

Prio Notes:

  • Affects Users / Live Systems (Consult product)
  • Doesn't affect development efforts
  • Doesn't affect onboarding efforts
  • Does not affect additional stakeholders
ItamarWMDE renamed this task from WikibaseClient wblistentityusage API module adds to pages output when not used as generator, potentially throws RuntimeException to [SW] WikibaseClient wblistentityusage API module adds to pages output when not used as generator, potentially throws RuntimeException.Feb 15 2023, 8:59 AM
ItamarWMDE moved this task from Incoming to [DOT] Prioritized on the wmde-wikidata-tech board.
ItamarWMDE added a project: Wikidata Dev Team.

Story Writing Notes:

  • Reduce the amount of information in the ticket description by moving it to the comments
  • Clarify the meaning of using a list action api module as a generator

Clarify the meaning of using a list action api module as a generator

I’ll try to explain it :)

In the action=query API, when you use a list module normally, its output gets added to a separate field in the response – e.g. an API request for list=allpages|random|recentchanges (i.e. the allpages, random and recentchanges lists) will have three arrays in the response, query.allpages, query.random and query.recentchanges. Each module is responsible for what the output looks like, and they often have some kind of prop parameter controlling which properties are included in the output for each result: compare default list=recentchanges, recentchanges with empty rcprop, and recentchanges with full rcprop. (Also, some lists return results that aren’t pages, e.g. logevents or tags.)

Most lists that return pages can also be used as generators, which means you pass them into the generator parameter instead of the list parameter. In this case, the list module has no control over what the output looks like (and parameters like rcprop above have no effect): the module only generates a set of pages internally, and then another part of the action=query API is responsible for adding these pages to the output, under query.pages. The output of these pages will then depend on the overall prop parameter, and the possible prop values are another kind of API module: these modules get given a page and produce some properties to add to the output. This way, prop and list API modules are decoupled from one another, but can still be used together: for example, generator=geosearch&prop=cirrusdoc returns the CirrusSearch documents for pages near another page, even though the API modules are provided by different extensions that don’t know anything about each other – the GeoData extension generated a list of pages, and then the CirrusSearch extension added a prop for each of those pages.

I think that’s the most important part; some details that I’ve glossed over, but which you can look into if you’re interested, include the fact that some prop modules can also be used as generators, that generators and non-generators can be used in the same API request, that generators can also generate revisions instead of pages, and that generators can also be used in some other API modules (other than action=query, that is). I also recommend playing around with the different modules in the API sandbox – there you can see which modules are available, and what their parameters are.

As far as I can tell, this task and T196962 are currently the only problems with wblistentityusage that need to be fixed with user compatibility in mind. (We thought T254334 might need to be fixed at the same time, but managed to find a different fix for that. T322431 is a minor issue that can certainly be fixed internally with no impact on API users, at a later time.) Previously, I thought that the output code of this module was generally broken, and that it would be best to rewrite it as a new API module, which would also provide a way to update the API without breaking users (see the plan in T254334#8570415). After looking at the code a bit more, and also realizing that the new API module would have more issues than I anticipated (T254334#8570745), I no longer advocate for a new API module. Instead, I now have an idea for how we can solve both issues at the same time within the existing API module.

To solve the issue of the wbeu parameter prefix being ambiguous (T196962), I propose we temporarily use both prefixes for the API module: wbeu and wbleu. (The new prefix could also be anything else, but wbleu for wblistentityusage makes sense to me.) Technically, this would mean that we register the API module with the empty string as the parameter prefix, add wbeu to the name of every individual parameter, and then also add a copy of the parameters where the name begins with wbleu instead. And then we check which set of parameters a given request used, and use this information for the output format. If a request specified wbeu parameters, we respond with the old output format (pages in query.pages), and include a deprecation warning. If a request specified wbleu parameters, we respond with the new output format (pages in a separate container, e.g. query.entityusage). (Mixing wbeu and wbleu parameters in the same request should result in a hard error.) API users can migrate from wbeu to wbleu, and read response.query.entityusage instead of response.query.pages, when it’s convenient for them; after a while, we drop support for the wbeu parameters and the old output format.

How does this plan sound?

ItamarWMDE renamed this task from [SW] WikibaseClient wblistentityusage API module adds to pages output when not used as generator, potentially throws RuntimeException to WikibaseClient: wblistentityusage API module adds to results to the `query.pages` key in response.Feb 21 2023, 8:45 AM
ItamarWMDE updated the task description. (Show Details)
Original Ticket

(as written by @Lucas_Werkmeister)

WikibaseClient wblistentityusage API module adds to pages output when not used as generator, potentially throws RuntimeException

I’ve noticed that the list=wblistentityusage API, when it’s not used as a generator, still adds its pages to the pages output container in the result. This is confusing – usually, list modules have their own output container, and pages is reserved for titles/pageids/revids and generator. See this output for list=wblistentityusage, list=allpages, titles=Q1:

{
    "batchcomplete": "",
    "continue": {
        "wbeucontinue": "708|Q1|C.P969",
        "apcontinue": "Q100",
        "continue": "-||"
    },
    "query": {
        "pages": {
            "129": {
                "pageid": 129,
                "ns": 0,
                "title": "Q1"
            },
            "578": {
                "ns": 2,
                "title": "User:Yiyi",
                "pageid": 578,
                "wblistentityusage": {
                    "Q1": {
                        "aspects": [
                            "L"
                        ]
                    }
                }
            }
        },
        "allpages": [
            {
                "pageid": 129,
                "ns": 0,
                "title": "Q1"
            }
        ]
    }
}

list=allpages’ output is under allpages; pages contains two pages that look different and come from different sources, Q1 from titles=Q1 and User:Yiyi from list=wblistentityusage.

If you force a collision between the two kinds of pages, it gets worse. Item Q1000011 happens to only be used on a single page, User talk:Epìdosis/Archive/2019; wbeuentities=Q1000011&titles=User_talk:Epìdosis/Archive/2019, which would therefore add the page to pages from both sources, produces an exception:

{
    "error": {
        "code": "internal_api_error_RuntimeException",
        "info": "[267db349-f489-482e-9a01-1d2e6e66654d] Caught exception of type RuntimeException",
        "errorclass": "RuntimeException"
    },
    "servedby": "mw1392"
}

Note that this API module as already known to crash when used as a generator: T254334: WikibaseClient wblistentityusage API module throws RuntimeException when used as generator

ItamarWMDE renamed this task from WikibaseClient: wblistentityusage API module adds to results to the `query.pages` key in response to WikibaseClient: wblistentityusage API module adds its results to the `query.pages` key in response.Feb 21 2023, 8:47 AM
ItamarWMDE updated the task description. (Show Details)
ItamarWMDE updated the task description. (Show Details)

Change 893507 had a related patch set uploaded (by Hoo man; author: Hoo man):

[mediawiki/extensions/Wikibase@master] ApiListEntityUsage: Very minor clean up

https://gerrit.wikimedia.org/r/893507

Change 893508 had a related patch set uploaded (by Hoo man; author: Hoo man):

[mediawiki/extensions/Wikibase@master] wblistentityusage: Deprecate wbeu prefix, new output format

https://gerrit.wikimedia.org/r/893508

Change 893507 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] ApiListEntityUsage: Very minor clean up

https://gerrit.wikimedia.org/r/893507

ItamarWMDE renamed this task from WikibaseClient: wblistentityusage API module adds its results to the `query.pages` key in response to [API] WikibaseClient: wblistentityusage API module adds its results to the `query.pages` key in response.Mar 14 2023, 1:31 PM
karapayneWMDE subscribed.

From Sprint 5 Review- Task is reviewed and ready. Blocked until the announcement (https://phabricator.wikimedia.org/T330130) is ready

Change 914297 had a related patch set uploaded (by Michael Große; author: Hoo man):

[mediawiki/extensions/Wikibase@wmf/1.41.0-wmf.7] wblistentityusage: Deprecate wbeu prefix, new output format

https://gerrit.wikimedia.org/r/914297

Change 893508 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] wblistentityusage: Deprecate wbeu prefix, new output format

https://gerrit.wikimedia.org/r/893508

Change 914437 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Hoo man):

[mediawiki/extensions/Wikibase@wmf/1.41.0-wmf.6] wblistentityusage: Deprecate wbeu prefix, new output format

https://gerrit.wikimedia.org/r/914437

Change 914297 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@wmf/1.41.0-wmf.7] wblistentityusage: Deprecate wbeu prefix, new output format

https://gerrit.wikimedia.org/r/914297

Mentioned in SAL (#wikimedia-operations) [2023-05-03T10:00:32Z] <lucaswerkmeister-wmde@deploy1002> Started scap: Backport for [[gerrit:914297|wblistentityusage: Deprecate wbeu prefix, new output format (T300460 T196962)]]

Mentioned in SAL (#wikimedia-operations) [2023-05-03T10:18:46Z] <lucaswerkmeister-wmde@deploy1002> lucaswerkmeister-wmde and migr: Backport for [[gerrit:914297|wblistentityusage: Deprecate wbeu prefix, new output format (T300460 T196962)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-05-03T10:35:25Z] <lucaswerkmeister-wmde@deploy1002> Finished scap: Backport for [[gerrit:914297|wblistentityusage: Deprecate wbeu prefix, new output format (T300460 T196962)]] (duration: 34m 53s)

Change 914437 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@wmf/1.41.0-wmf.6] wblistentityusage: Deprecate wbeu prefix, new output format

https://gerrit.wikimedia.org/r/914437

Mentioned in SAL (#wikimedia-operations) [2023-05-03T13:24:30Z] <lucaswerkmeister-wmde@deploy1002> Started scap: Backport for [[gerrit:914437|wblistentityusage: Deprecate wbeu prefix, new output format (T300460 T196962)]]

Mentioned in SAL (#wikimedia-operations) [2023-05-03T13:41:59Z] <lucaswerkmeister-wmde@deploy1002> lucaswerkmeister-wmde: Backport for [[gerrit:914437|wblistentityusage: Deprecate wbeu prefix, new output format (T300460 T196962)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-05-03T13:52:25Z] <lucaswerkmeister-wmde@deploy1002> Finished scap: Backport for [[gerrit:914437|wblistentityusage: Deprecate wbeu prefix, new output format (T300460 T196962)]] (duration: 27m 54s)

This is done, isn’t it? (The cleanup ApiListEntityUsage: Remove legacy prefix was attached to the sibling task T196962.)