Page MenuHomePhabricator

WikibaseClient wblistentityusage API module throws RuntimeException when used as generator
Closed, ResolvedPublic3 Estimated Story Points

Description

As a Wikidata editor, I want to be able to query for entity usage in SPARQL, through MWAPI. (Brought up here.)
As a developer, I want API modules that advertise themselves as being possible generators to be usable as generators, instead of producing errors in production.

Problem:
Wikibase Client’s wblistentityusage Action API module produces a RuntimeException when it’s used as a generator.

Example:
https://en.wikipedia.org/w/api.php?action=query&format=json&generator=wblistentityusage&gwbeuentities=Q1&gwbeulimit=1

RuntimeException from line 306 of /srv/mediawiki/php-1.35.0-wmf.34/includes/api/ApiResult.php: Conflicting keys (31880) when attempting to merge element pages
#0 /srv/mediawiki/php-1.35.0-wmf.34/includes/api/ApiResult.php(412): ApiResult::setValue(array, string, array, integer)
#1 /srv/mediawiki/php-1.35.0-wmf.34/includes/api/ApiQuery.php(426): ApiResult->addValue(string, string, array)
#2 /srv/mediawiki/php-1.35.0-wmf.34/includes/api/ApiQuery.php(251): ApiQuery->outputGeneralPageInfo()
#3 /srv/mediawiki/php-1.35.0-wmf.34/includes/api/ApiMain.php(1583): ApiQuery->execute()
#4 /srv/mediawiki/php-1.35.0-wmf.34/includes/api/ApiMain.php(523): ApiMain->executeAction()
#5 /srv/mediawiki/php-1.35.0-wmf.34/includes/api/ApiMain.php(494): ApiMain->executeActionWithErrorHandling()
#6 /srv/mediawiki/php-1.35.0-wmf.34/api.php(84): ApiMain->execute()
#7 /srv/mediawiki/w/api.php(3): require(string)
#8 {main}

The number of conflicting keys (apparently page IDs) reported varies with the gwbeulimit:
https://en.wikipedia.org/w/api.php?action=query&format=json&generator=wblistentityusage&gwbeuentities=Q1&gwbeulimit=5

Conflicting keys (31880, 23674715) when attempting to merge element pages

Screenshots/mockups:

BDD
GIVEN
AND
WHEN
AND
THEN
AND

Acceptance criteria:

  • The API module can be used as a generator.

Open questions:

Event Timeline

Change 608920 had a related patch set uploaded (by Umherirrender; owner: Umherirrender):
[mediawiki/extensions/Wikibase@master] Fix result for api module wblistentityusage

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/ /608920

To be usable as generator you have to extends ApiQueryGeneratorBase, that is done. You have to implment executeGenerator, that is done.

Conflicting keys (31880, 23674715) when attempting to merge element pages

31880 is the id of the page you have to list

23674715 seems to be the pageid of the next page.
31880 has currently 4 aspects, so setting the limit to 4 gives a continue with 23674715 - https://en.wikipedia.org/w/api.php?action=query&format=json&list=wblistentityusage&wbeuentities=Q1&wbeulimit=4

It seems you are mixing the current and the next pageids, when using limit = 4 it gives the wrong title/pageid in the result. (Key for query.pages is 31880, but the title/pageid shows 23674715)

{
    "batchcomplete": "",
    "continue": {
        "wbeucontinue": "23674715|Q1|L.en",
        "continue": "-||"
    },
    "query": {
        "pages": {
            "31880": {
                "ns": 2,
                "title": "User:GAllegre",
                "pageid": 23674715,
                "wblistentityusage": {
                    "Q1": {
                        "aspects": [
                            "C",
                            "O",
                            "S",
                            "T"
                        ]
                    }
                }
            }
        }
    }
}

The last group is using the wrong $row:

		if ( $entry ) {
			$this->formatPageData( $row, $currentPageId, $entry, $result );
		}

Have not tested the patch set

Change 608920 abandoned by Umherirrender:
[mediawiki/extensions/Wikibase@master] WIP: Fix result for api module wblistentityusage

Reason:
WIP

https://gerrit.wikimedia.org/r/608920

Prio Notes:

  • Affects Users / Production systems
  • Does not affect development efforts
  • Does not affect onboarding efforts
  • Might affect additional stakeholders (SRE, for logspam)

My memory of the WIP Gerrit change mentioned above is that this was fairly hard to understand, and I don’t think it will be an easy fix within the context of the existing API module; I suspect the error might also be due to this module’s unusual use of the output (using pages even when not used as a generator), which we can’t easily change.

I suggest that we solve this task, T300460 and T196962 together by creating a new “list entity usage” API module, separate from the existing wblistentityusage module. We can have both deployed side-by-side for a while, and tell users to switch to the new one, before eventually removing the old one. In the new module, we’ll take care that it can be used as a generator without error, avoiding this bug; we’ll change the output to not use pages when not used as a generator, fixing T300460; and we’ll use a different parameter prefix from wbeu, fixing T196962.

The naming gets a bit awkward, but I suggest:

  • We currently have prop=wbentityusage and list=wblistentityusage, both with prefix wbeu.
  • The new module should be list=wbentityusage, with prefix wbleu.

This means that we’ll have prop=wbentityusage and list=wbentityusage, but I think that’s okay, and still better than e.g. list=wblistentityusage2 or list=wbnewlistentityusage.

This means that we’ll have prop=wbentityusage and list=wbentityusage, but I think that’s okay

Hm, it might be a problem after all :/ the API usage message keys look like apihelp-query+wbentityusage-description – they don’t distinguish between prop=wbentityusage and list=wbentityusage. So we might have to find a different name for the list after all – but ideally one that won’t look too weird once the old module is gone…

Okay, there’s actually a really simple fix for this error in particular. I still think we should introduce a new version of the API module with a better output format, and try to find a solid naming scheme for it – but that doesn’t have to block the simpler fix I found, which adds literally one line to the API module (plus some more in the tests). Pulling into the sprint for review.

Edit: For the record, I changed my mind on introducing a new version of the API module – see T300460#8625492.

Change 884967 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Fix wblistentityusage generator usage

https://gerrit.wikimedia.org/r/884967

Change 884967 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Fix wblistentityusage generator usage

https://gerrit.wikimedia.org/r/884967

karapayneWMDE set the point value for this task to 3.Feb 7 2023, 9:35 AM

\o/ example now works instead of errors. Thank you!