Make EntityParserOutputDataUpdater configurable
Closed, Resolved (Public)

Event Timeline

Pablo-WMDE renamed this task from Make ReferencedEntitiesDataUpdater configurable to Make EntityParserOutputDataUpdater configurable.
Pablo-WMDE triaged this task as Medium priority.
Jakob_WMDE subscribed.

Random notes from an initial investigation:

  • EntityParserOutputDataUpdater should probably not know about StatementDataUpdater and SiteLinkDataUpdater.
  • ReferencedEntitiesDataUpdater
    • split into three classes: the actual ReferencedEntitiesDataUpdater, a service that collects entity ids from statements, and a service that collects entity ids from SiteLink badges
    • could either take a list of entity ids from per-entity-type collector classes, or take a per-entity-type service that takes the entity and scrapes its entity ids
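
A rough sketch of the second option from the notes above (the updater gets per-entity-type collectors injected and only deals with entity ids). This is Python purely for illustration, not Wikibase code: entities are plain dicts, entity ids are strings, and the collector function names, the dict layout, and the "links" key on the parser output are all assumptions made up for this example.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical stand-ins: an entity is a dict, an entity id is a string.
Entity = dict
Collector = Callable[[Entity], Iterable[str]]


def collect_statement_entity_ids(entity: Entity) -> List[str]:
    """Collect ids of entities used as statement values (assumed layout)."""
    return [
        statement["value"]
        for statement in entity.get("statements", [])
        if statement.get("type") == "entity-id"
    ]


def collect_badge_item_ids(entity: Entity) -> List[str]:
    """Collect item ids used as sitelink badges (assumed layout)."""
    return [
        badge
        for link in entity.get("sitelinks", [])
        for badge in link.get("badges", [])
    ]


class ReferencedEntitiesDataUpdater:
    """Knows nothing about statements or sitelinks, only about entity ids."""

    def __init__(self, collectors_by_type: Dict[str, List[Collector]]):
        self._collectors_by_type = collectors_by_type
        self._entity_ids: Set[str] = set()

    def process_entity(self, entity: Entity) -> None:
        # Only the collectors registered for this entity type are run.
        for collector in self._collectors_by_type.get(entity["type"], []):
            self._entity_ids.update(collector(entity))

    def update_parser_output(self, parser_output: dict) -> None:
        # Stand-in for adding links to the real ParserOutput.
        parser_output.setdefault("links", []).extend(sorted(self._entity_ids))
```

With this shape, a new entity type would only need to register its own collectors; the updater itself would not change.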

More concrete plan:

  • rename EntityParserOutputDataUpdater to EntityParserOutputDataUpdaterCollection (naming suggestions welcome)
    • it should only be used to register EntityParserOutputDataUpdaters, pass them the entity they should operate on, and call updateParserOutput on them
    • processStatements and processSitelinks should go into concrete EntityParserOutputDataUpdater implementations
  • make EntityParserOutputDataUpdater an interface that operates on entities and updates the ParserOutput
    • has methods processEntity and updateParserOutput
  • existing StatementDataUpdaters should either be changed to implement EntityParserOutputDataUpdater instead, or they need an adapter (is that better?)
    • note: this is worse performance-wise. Instead of iterating over all statements once and passing each statement to every StatementDataUpdater, we would need to iterate over all statements once per StatementDataUpdater
    • everything concerning SiteLinks and Statements is removed from ReferencedEntitiesDataUpdater; it should only deal with EntityIds and add their respective links to ParserOutput
      • ReferencedEntitiesDataUpdater gets a per-entity-type collection of EntityIdCollectors (naming suggestions welcome) injected
      • the SiteLink bit of it becomes a SiteLinkBadgeItemIdCollector
      • the Statement bit becomes StatementReferencedEntityIdCollector
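
To make the plan above a bit more tangible, here is a minimal sketch of the interface, the collection, and the adapter variant for existing StatementDataUpdaters. It is Python for illustration only and not the code from the patches: the method names process_entity and update_parser_output mirror the ones proposed above, while the update method on the collection, StatementDataUpdaterAdapter, StatementCountUpdater, and the dict-based entity and parser output are assumptions invented for this example.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class EntityParserOutputDataUpdater(ABC):
    """Proposed interface: operates on entities, updates the parser output."""

    @abstractmethod
    def process_entity(self, entity: dict) -> None: ...

    @abstractmethod
    def update_parser_output(self, parser_output: dict) -> None: ...


class EntityParserOutputDataUpdaterCollection:
    """Only registers updaters, hands them the entity, and triggers the update."""

    def __init__(self, updaters: Iterable[EntityParserOutputDataUpdater]):
        self._updaters: List[EntityParserOutputDataUpdater] = list(updaters)

    def update(self, entity: dict, parser_output: dict) -> None:
        for updater in self._updaters:
            updater.process_entity(entity)
        for updater in self._updaters:
            updater.update_parser_output(parser_output)


class StatementCountUpdater:
    """Hypothetical existing StatementDataUpdater, used only for this sketch."""

    def __init__(self) -> None:
        self.count = 0

    def process_statement(self, statement: dict) -> None:
        self.count += 1

    def update_parser_output(self, parser_output: dict) -> None:
        parser_output["statement-count"] = self.count


class StatementDataUpdaterAdapter(EntityParserOutputDataUpdater):
    """Wraps one StatementDataUpdater so it fits the entity-level interface."""

    def __init__(self, statement_updater) -> None:
        self._statement_updater = statement_updater

    def process_entity(self, entity: dict) -> None:
        # Each adapter walks the statements itself, so N adapters mean
        # N passes over the statements: the performance cost noted above.
        for statement in entity.get("statements", []):
            self._statement_updater.process_statement(statement)

    def update_parser_output(self, parser_output: dict) -> None:
        self._statement_updater.update_parser_output(parser_output)
```

The adapter keeps existing StatementDataUpdaters unchanged at the price of one statement iteration per wrapped updater; changing them to implement the interface directly would have the same iteration cost but without the extra wrapper class.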

Change 442864 had a related patch set uploaded (by Jakob; owner: Jakob):
[mediawiki/extensions/Wikibase@master] Refactor ParserOutputDataUpdater

https://gerrit.wikimedia.org/r/442864

Change 443104 had a related patch set uploaded (by Jakob; owner: Jakob):
[mediawiki/extensions/Wikibase@master] Make EntityReferenceExtractors configurable per entity type

https://gerrit.wikimedia.org/r/443104

Change 442864 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Refactor ParserOutputDataUpdater

https://gerrit.wikimedia.org/r/442864

Change 443451 had a related patch set uploaded (by Jakob; owner: Jakob):
[mediawiki/extensions/Wikibase@master] Clean up around EntityParserOutputDataUpdater related code

https://gerrit.wikimedia.org/r/443451

Change 443451 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Clean up around EntityParserOutputDataUpdater related code

https://gerrit.wikimedia.org/r/443451

Change 443104 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Make EntityReferenceExtractors configurable per entity type

https://gerrit.wikimedia.org/r/443104