Page MenuHomePhabricator

Remove legacy EntityInfo based prefetching
Closed, ResolvedPublic

Description

This ticket was created after investigation done as part of T253125: Federated Properties, ApiEntityLookup, trying to get data that was not prefetched..
The bulk of the content of this ticket comes from the following 2 comments T253125#6162897 T253125#6163636 (more details there)

EntityInfo as a concept is not currently intentionally used and should be removed.
EntityInfo was introduced back in the day to "prefetch" entity info needed for rendering a page, this includes terms needed for rendering a page
Introduction of formattercache for term lookups stopped using EntityInfo in most cases
Despite not using the EntityInfo, it hasn't since been cleaned up / removed (I seem to remember there being a ticket relating to this? but I can't find it)

The EntityInfo lookup does appear to be used for some things such as formatting of coordinate globes or also numeric units.
These cases should just use the new thing.

Impact of removal

The current rate of requests to this EntityInfo builder stuff: https://grafana.wikimedia.org/d/u5wAugyik/wikibase-statsdrecordingsimplecache?panelId=2&fullscreen&orgId=1&from=now-1h&to=now
The rate of requests for EntityInfo is essentially the same as the # of parser output generations for wikibase data.
Looking at the cache metrics
In a 12 hour period this is ~120million calls, so 10 million an hour, or 166k per min or ~2700 calls per second
The cache hits come from local server cache, but the misses result in lookups from the terms storage db tables.

We also track the db calls for this code path that may not be needed on https://grafana.wikimedia.org/d/000000548/wikibase-sql-term-storage?orgId=1&refresh=30s
You can see these in the top panel for the "select.EntityInfoBuilder_collectTermsForEntities" metric.
Averaging 4.7k ops a second/minute (TBA confirm this??)

Needless to say that not doing all of this stuff that is not needed should speed things up and reduce memory consumption.
It will also remove an old outdated concept that should have been killed some while ago.

Event Timeline

Addshore created this task.Jun 2 2020, 9:22 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJun 2 2020, 9:22 PM

Change 601866 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Deprecate everything relating to EntityInfo

https://gerrit.wikimedia.org/r/601866

Change 601878 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/WikibaseLexeme@master] DNM WIP Remove usage of EntityInfo

https://gerrit.wikimedia.org/r/601878

Change 601898 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Don't use EntityInfo in ViewFactory

https://gerrit.wikimedia.org/r/601898

Addshore claimed this task.Jun 2 2020, 11:15 PM

When trying to remove things I ran into an issue with FullEntityParserOutputGeneratorTest.php::testGetParserOutputIncludesLabelsOfRedirectEntity

It seems some of the stuff in T96553 is related.

It seems the old EntityInfo based lookup stuff is used in this integration test, (even if it is not used in production).
The situation being tested is...

  • Q1 redirects to Q2.
  • Q2 has a label of "foo"
  • Q3 has a statement using Q1 as a value
  • The expected behaviour is that the value uses the label from Q2 (redirect target) even though the value on the entity is still Q1

Checking on test.wikidata.org this is currently the case https://test.wikidata.org/wiki/Q212354
Checking locally (with the patch removing entityinfo from view classes) this is also the case... just the test is evil?

Change 603572 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/core@master] CloneDatabase, add domain alias when prefix is changed

https://gerrit.wikimedia.org/r/603572

Addshore moved this task from incoming to in progress on the Wikidata board.Jun 9 2020, 9:12 AM

Change 604109 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/WikibaseMediaInfo@master] Remove usage of EntityInfo

https://gerrit.wikimedia.org/r/604109

Change 604121 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Stop generating EntityInfo in ParserOutputGenerator

https://gerrit.wikimedia.org/r/604121

Change 604124 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/WikibaseLexeme@master] Remove $unused from VIEW_FACTORY_CALLBACK

https://gerrit.wikimedia.org/r/604124

Change 604125 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/WikibaseMediaInfo@master] Remove $unused from VIEW_FACTORY_CALLBACK

https://gerrit.wikimedia.org/r/604125

Change 601888 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Remove EntityInfo and Builders everywhere

https://gerrit.wikimedia.org/r/601888

Change 601866 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Deprecate everything relating to EntityInfo

https://gerrit.wikimedia.org/r/601866

Change 604307 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Remove Formatter OPT_LABEL_DESCRIPTION_LOOKUP option

https://gerrit.wikimedia.org/r/604307

Change 604309 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Prepare to remove $unused from ViewFactory

https://gerrit.wikimedia.org/r/604309

Change 604310 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/WikibaseLexeme@master] Stop passing $unused to ViewFactory

https://gerrit.wikimedia.org/r/604310

Change 604312 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Remove $unused backcompat in ViewFactory

https://gerrit.wikimedia.org/r/604312

Change 604109 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Remove usage of EntityInfo

https://gerrit.wikimedia.org/r/604109

Change 603572 merged by jenkins-bot:
[mediawiki/core@master] CloneDatabase, add domain alias when prefix is changed

https://gerrit.wikimedia.org/r/603572

Change 601898 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Don't use EntityInfo in ViewFactory

https://gerrit.wikimedia.org/r/601898

Change 601878 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Remove usage of EntityInfo

https://gerrit.wikimedia.org/r/601878

Change 604121 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Stop generating EntityInfo in ParserOutputGenerator

https://gerrit.wikimedia.org/r/604121

Change 604125 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Remove $unused from VIEW_FACTORY_CALLBACK and EntityInfo use

https://gerrit.wikimedia.org/r/604125

Change 604124 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Remove $unused from VIEW_FACTORY_CALLBACK

https://gerrit.wikimedia.org/r/604124

eprodromou added a subscriber: eprodromou.

Moving to Clinic Duty for external review.

Change 601888 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Remove EntityInfo and Builders everywhere

https://gerrit.wikimedia.org/r/601888

Change 604307 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Remove Formatter OPT_LABEL_DESCRIPTION_LOOKUP option

https://gerrit.wikimedia.org/r/604307

Change 604309 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Prepare to remove $unused from ViewFactory

https://gerrit.wikimedia.org/r/604309

Change 604310 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Stop passing $unused to ViewFactory

https://gerrit.wikimedia.org/r/604310

Change 604312 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Remove $unused backcompat in ViewFactory

https://gerrit.wikimedia.org/r/604312

The code reading from the cache has already been deployed.

Yay less complexity!

It may also have had a small impact in the parser output generation times (but hard to say and probably not worth diving into too deeply)

The last code changes for this ticket will be deployed this week with (.39)