
Fix wikidata entity usage tracking and access count
Open, Needs Triage, Public

Description

As a technical contributor, I want to see an accurate count of Wikibase entities accessed to assess the page based on the parser performance report.

Problem:
In the parser performance report, the count resets to zero when it reaches the maximum limit (400) and begins a new count which only reports the number after the reset. E.g. if there are 411 entities, it would report only 11.
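The symptom can be modelled with a small sketch. This is only a hypothetical illustration of a counter that starts over once the limit is reached; it is not the Wikibase code, and the class and method names are made up for the example.

```python
class ResettingCounter:
    """Hypothetical model of the reported symptom; not Wikibase code."""

    def __init__(self, limit=400):
        self.limit = limit
        self.count = 0

    def record_access(self):
        # Assumed faulty behaviour: once the limit has been reached,
        # the count starts again from zero before the next access.
        if self.count >= self.limit:
            self.count = 0
        self.count += 1

    def report(self):
        # Same "current/limit" shape as the line in the limit report.
        return f"{self.count}/{self.limit}"


counter = ResettingCounter()
for _ in range(411):
    counter.record_access()

print(counter.report())  # 11/400, although 411 entities were accessed
```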

Example:

BDD
GIVEN a user navigates to a page with more than 400 Wikidata entity accesses (e.g. very kind sandbox example)

  • User sees the Lua error message: Too many Wikidata entities accessed.
  • User can confirm that there are lots of Wikidata entities cited in the page (in this example, Wolff's Revier (Q566440) has 464 cast members (P161).)
  • User navigates to the NewPP limit report in the page source, and sees Number of Wikibase entities loaded: 0/400

As the page attempts to access the first 400 cast members of Wolff's Revier, we can see that the count of 0 is incorrect: it was reset after the Lua error was issued.

Acceptance criteria:
When a user has accessed more than 400 Wikidata entities in one page, the count is not reset, and the NewPP limit report shows the real number out of 400.
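As a rough sketch of the behaviour described in the acceptance criteria (hypothetical names, not the actual Wikibase implementation): the counter below never resets, raises an error for every access beyond the limit, and still reports the real number of accesses.

```python
class EntityAccessLimitExceeded(Exception):
    """Hypothetical stand-in for the 'Too many Wikidata entities accessed' error."""


class MonotonicCounter:
    """Sketch of a counter that is never reset; names are assumptions."""

    def __init__(self, limit=400):
        self.limit = limit
        self.count = 0

    def record_access(self):
        # Count the access first, so that failed accesses still show up
        # in the report, then signal the error if the limit is exceeded.
        self.count += 1
        if self.count > self.limit:
            raise EntityAccessLimitExceeded(
                f"{self.count} entities accessed, limit is {self.limit}"
            )

    def report(self):
        return f"{self.count}/{self.limit}"


counter = MonotonicCounter()
errors = 0
for _ in range(411):
    try:
        counter.record_access()
    except EntityAccessLimitExceeded:
        errors += 1

print(counter.report(), errors)  # 411/400 11
```

Counting before raising is a design choice of this sketch: it keeps the accesses that end in an error visible in the reported total.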

Important Notes

  1. Risks identified: If we make the count operate correctly, it means that after the limit has been exceeded, there will be no reset and any further attempt to access an entity will cause an error.
  2. In cases where a single call tries to access more than 400 entities (e.g. inflows of the Pacific Ocean), those accesses are valid additions to the entity access count even if the page shows an error instead of the entity. We believe very few pages have more than 400 Wikibase entities accessed.

However, we do not expect many new errors after the fix, because a page that exceeds the entity access limit would already show an error today.

Notes about entity properties
We previously believed that statement values are not represented in the count when the values are also Wikibase entities.

However, we found out that whether the entity is included in the count depends on how the entities are being accessed.

In the implementation that uses the template from user @Strainu, {{Listă de la Wikidata/test|pid=P6|qid=Q1}}, each value of the property (P6) is checked for its own statements (whether it "has list (P2354)" or is "coextensive with (P3403)"). This counts as accessing the entity, so it is correct to include these values in the entity access count.

However, in the implementation where we directly access a property of an entity, {{ #property: P6 | from = Q1}}, the property values are not checked for their statements. That is why these entities are not represented in the count.
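The difference between the two access patterns can be illustrated with a toy model. Everything here is an assumption made for the example (the data, the `CountingLookup` class and both helper functions); it only mirrors the idea that reading a value's own statements requires loading that entity, while listing the values does not.

```python
# Toy data: entity ID -> {property ID -> list of value entity IDs}.
ENTITIES = {
    "Q1": {"P6": ["Q2", "Q3", "Q4"]},
    "Q2": {"P2354": ["Q5"]},
    "Q3": {},
    "Q4": {"P3403": ["Q6"]},
}


class CountingLookup:
    """Hypothetical lookup that records every entity it loads."""

    def __init__(self):
        self.loaded = set()

    def get_entity(self, entity_id):
        self.loaded.add(entity_id)
        return ENTITIES.get(entity_id, {})


def property_values(lookup, qid, pid):
    # Analogous to {{#property:...}}: only the subject entity is loaded.
    return lookup.get_entity(qid).get(pid, [])


def values_with_statement_check(lookup, qid, pid):
    # Analogous to the list template: each value is loaded as well,
    # because its own statements (P2354 / P3403) are inspected.
    values = lookup.get_entity(qid).get(pid, [])
    flagged = []
    for value_id in values:
        statements = lookup.get_entity(value_id)
        if "P2354" in statements or "P3403" in statements:
            flagged.append(value_id)
    return values, flagged


direct = CountingLookup()
property_values(direct, "Q1", "P6")

checked = CountingLookup()
values_with_statement_check(checked, "Q1", "P6")

print(len(direct.loaded), len(checked.loaded))  # 1 4
```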

We believe no work is required to change what is considered an entity access.

Currently Affected Pages

  1. Tour de Berna
  2. Küneş
  3. Папуа — Çĕнĕ Гвиней
  4. Alpecin-Deceuninck
  5. UAE Team Emirates
  6. Team Visma-Lease a Bike
  7. Laghi del Trentino
  8. Campeonatos nacionais de ciclismo em estrada de 2019

--Previous description--
This page (using this version of Infocaseta film and this version of Modul:Wikidata) throws a "Too many Wikidata entities accessed" because the P161 (distribution) property at Wikidata has hundreds of items and the Infobox tries to access them all.

While the error is correct, when checking the parser data in preview, I see "Number of Wikibase entities loaded 9/400", which is confusing. I believe there are 2 improvements that can be made:

  1. in the error message, explicitly mention the limit and the current value
  2. correctly report the number in the parser performance data

Event Timeline

Restricted Application added a subscriber: Aklapper.

This situation is far from ideal. Wikidata_IB, one of the recommended packages, can trigger this error simply by having a script call checkvalue, even on a random item that looks empty in the Wikidata interface. To say this is confusing is an understatement.

Is it possible to at least state the property that triggered this error? Not providing any information in the error message is not fine by any standard.

Michael subscribed.

This wrong number of entities accessed sounds suspiciously like T341957: Wrong number of Wikibase entities in NewPP limit report.

Hi @Strainu, We're looking into this ticket and trying to replicate the error, but we're having difficulty. Could you please share any useful information regarding this? Thanks a lot!

Thank you @Strainu. We'll share some updates on our work on this soon :)

JoelyRooke-WMDE renamed this task from Confusing Lua error "Too many Wikidata entities accessed" to Fix wikidata entity usage tracking and access count. Sep 2 2024, 1:57 PM
JoelyRooke-WMDE updated the task description.

Change #1078679 had a related patch set uploaded (by Seanleong-wmde; author: Seanleong-wmde):

[mediawiki/extensions/Wikibase@master] Add tracking category for exceeded entity limit

https://gerrit.wikimedia.org/r/1078679

Change #1078679 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Add tracking category for exceeded entity limit

https://gerrit.wikimedia.org/r/1078679

seanleong-WMDE updated the task description.

Hey @seanleong-WMDE, I'm tagging the Product Platform team which could investigate the API issue during their planning meeting. Could you please briefly describe the API issue and let us know if it's the action API? Thank you

Change #1087936 had a related patch set uploaded (by Seanleong-wmde; author: Seanleong-wmde):

[mediawiki/extensions/Wikibase@master] Created RestrictedEntityLookupFactory as a service instead of RestrictedEntityLookup. Each parser will each have its respective RestrictedEntityLookup.

https://gerrit.wikimedia.org/r/1087936
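To illustrate the idea named in the patch title (a factory service that hands each parser its own restricted lookup), here is a loose Python analogue. The real change is in PHP inside Wikibase; the classes, method names and the weak-reference bookkeeping below are assumptions made for this sketch only.

```python
import weakref


class Parser:
    """Hypothetical stand-in for a parser instance."""


class RestrictedLookup:
    """Hypothetical per-parser lookup that counts distinct entity accesses."""

    def __init__(self, limit):
        self.limit = limit
        self.accessed = set()

    def get_entity(self, entity_id):
        self.accessed.add(entity_id)
        if len(self.accessed) > self.limit:
            raise RuntimeError("Too many Wikidata entities accessed.")
        return entity_id


class RestrictedLookupFactory:
    """Hands out one lookup per parser, so counts are not shared."""

    def __init__(self, limit=400):
        self.limit = limit
        # Weak keys: a lookup disappears together with its parser.
        self._lookups = weakref.WeakKeyDictionary()

    def for_parser(self, parser):
        if parser not in self._lookups:
            self._lookups[parser] = RestrictedLookup(self.limit)
        return self._lookups[parser]


factory = RestrictedLookupFactory(limit=400)
parser_a, parser_b = Parser(), Parser()

factory.for_parser(parser_a).get_entity("Q1")
factory.for_parser(parser_a).get_entity("Q2")
factory.for_parser(parser_b).get_entity("Q1")

print(len(factory.for_parser(parser_a).accessed))  # 2
print(len(factory.for_parser(parser_b).accessed))  # 1
```

In this sketch the counts of different parsers are kept separate and are not added together; whether and how the real implementation combines them is not stated here.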

Change #1091251 had a related patch set uploaded (by Seanleong-wmde; author: Seanleong-wmde):

[mediawiki/extensions/Wikibase@master] Fix wikidata entity usage tracking and access count

https://gerrit.wikimedia.org/r/1091251

Change #1087936 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Fix wikidata entity usage tracking and access count

https://gerrit.wikimedia.org/r/1087936

Change #1088225 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Wikibase@master] Streamline RestrictedEntityLookup implementation

https://gerrit.wikimedia.org/r/1088225

What happens when multiple parsers are created? Are these usages summed into one count?

This patch doesn't consider non-parser accesses to Wikidata Items. One extension is using it this way (and using limits introduced in I70497b61ba8c45bc322d9818e735f65aaa69f052).

@seanleong-WMDE please continue work on this and fix how the additional entities are shown, so that users can see from which part of the page onwards the maximum limit has been reached and which additional entities are not being loaded.

Change #1088225 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Streamline RestrictedEntityLookup implementation

https://gerrit.wikimedia.org/r/1088225