
Investigate and implement avoiding inserting not-needed entries in wb_changes
Closed, Resolved, Public

Description

Currently, every edit inserts an entry into the wb_changes table. This made sense when Wikidata items were mostly used on Wikipedia, but a significant proportion of edits now happen on items that have no wiki subscribed to them, so dispatching those changes serves no purpose.

This would:

  • Reduce the size of wb_changes and binlog in s8
  • Make the prune script faster
  • Reduce the load on master of s8
  • Reduce the replication load
  • Make the deferred update for it faster (or, if it is part of saving the edit, which it shouldn't be, reduce edit saving time)
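
The idea above can be sketched in a few lines. This is only an illustration in Python, not the Wikibase (PHP) implementation: `InMemoryChangeStore`, `record_change`, and the subscription mapping are hypothetical names, and the real subscription lookup is not described in this task.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryChangeStore:
    """Hypothetical stand-in for the wb_changes table."""
    rows: list = field(default_factory=list)

    def insert(self, change: dict) -> None:
        self.rows.append(change)


def record_change(change: dict, subscribers_by_entity: dict, store: InMemoryChangeStore) -> bool:
    """Insert the change only if at least one wiki subscribes to the entity.

    Returns True when a row was written and False when the write was skipped.
    """
    subscribers = subscribers_by_entity.get(change["object_id"], set())
    if not subscribers:
        # No client wiki would ever receive this change, so skip the write.
        return False
    store.insert(change)
    return True


# Assumed example data: Q42 is used on two wikis, Q592574 on none.
subscriptions = {"Q42": {"enwiki", "dewiki"}}
store = InMemoryChangeStore()

wrote_q42 = record_change(
    {"object_id": "Q42", "type": "wikibase-item~update"}, subscriptions, store
)
wrote_new = record_change(
    {"object_id": "Q592574", "type": "wikibase-item~add"}, subscriptions, store
)
```

In this sketch only the edit to the subscribed item produces a row; the edit to the unsubscribed item is dropped before it reaches the store.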

Event Timeline

Why do I look into stuff?

Every edit causes two write queries. The first runs right after the save completes (through WikibaseRepo::getChangeNotifier() functions) and the second runs after the RC entry is injected, which adds the bot flag (I kid you not). I got this from the binlog of the beta cluster, for a single edit:

INSERT /* Wikibase\Lib\Store\Sql\SqlChangeStore::insertChange  */ INTO `wb_changes` (change_type,change_time,change_object_id,change_revision_id,change_user_id,change_info) VALUES ('wikibase-item~add','20210818130538','Q592574',1257533,0,'{\"compactDiff\":\"{\\\"arrayFormatVersion\\\":1,\\\"labelChanges\\\":[],\\\"descriptionChanges\\\":[],\\\"statementChanges\\\":[],\\\"siteLinkChanges\\\":[],\\\"otherChanges\\\":false}\",\"metadata\":{\"page_id\":859938,\"rev_id\":1257533,\"parent_id\":0,\"comment\":\"\\/* wbeditentity-create-item:0| *\\/\",\"user_text\":\"172.16.3.105\",\"central_user_id\":0}}')

UPDATE /* Wikibase\Lib\Store\Sql\SqlChangeStore::updateChange  */  `wb_changes` SET change_type = 'wikibase-item~add',change_time = '20210818130538',change_object_id = 'Q592574',change_revision_id = 1257533,change_user_id = 0,change_info = '{\"compactDiff\":\"{\\\"arrayFormatVersion\\\":1,\\\"labelChanges\\\":[],\\\"descriptionChanges\\\":[],\\\"statementChanges\\\":[],\\\"siteLinkChanges\\\":[],\\\"otherChanges\\\":false}\",\"metadata\":{\"page_id\":859938,\"rev_id\":1257533,\"parent_id\":0,\"comment\":\"\\/* wbeditentity-create-item:0| *\\/\",\"user_text\":\"172.16.3.105\",\"central_user_id\":0,\"bot\":0}}' WHERE change_id = 1776189

This seems to be known, as the updater hook handler (RecentChangeSaveHookHandler) starts with:

Nasty hack to inject information from RC into the change...

Removing this nasty hack by itself would help simplify the code and reduce the DB load.
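
One general way to drop the second write is to keep the change in memory until the recent-changes information is available, and only then write it once. The Python sketch below illustrates that pattern and compares write counts; it is an assumption about the approach, not the actual Wikibase code, and `CountingStore`, `save_then_patch`, and `HeldChanges` are hypothetical names.

```python
class CountingStore:
    """Hypothetical stand-in for wb_changes that counts write queries."""

    def __init__(self):
        self.rows = {}
        self.write_queries = 0

    def insert(self, change: dict) -> int:
        self.write_queries += 1
        change_id = len(self.rows) + 1
        self.rows[change_id] = dict(change)
        return change_id

    def update(self, change_id: int, change: dict) -> None:
        self.write_queries += 1
        self.rows[change_id] = dict(change)


def save_then_patch(store: CountingStore, change: dict, bot: int) -> None:
    """Current behaviour: INSERT on save, then UPDATE to add the bot flag."""
    change_id = store.insert(change)
    store.update(change_id, {**change, "bot": bot})


class HeldChanges:
    """Keeps changes in memory until the bot flag is known, then writes once."""

    def __init__(self):
        self._pending = []

    def hold(self, change: dict) -> None:
        self._pending.append(change)

    def flush(self, store: CountingStore, bot: int) -> None:
        for change in self._pending:
            store.insert({**change, "bot": bot})
        self._pending.clear()


change = {"object_id": "Q592574", "rev_id": 1257533}

old_store = CountingStore()
save_then_patch(old_store, change, bot=0)

new_store = CountingStore()
holder = HeldChanges()
holder.hold(change)               # at save time: nothing is written yet
holder.flush(new_store, bot=0)    # once the RC entry exists: a single INSERT
```

Both stores end up with the same row, but the held variant issues one write query instead of two.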

Change 719253 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/extensions/Wikibase@master] Use UserIndetity directly instead of User object

https://gerrit.wikimedia.org/r/719253

Change 719284 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/extensions/Wikibase@master] [WIP] Introduce ChangeHolder and use that to store the change later

https://gerrit.wikimedia.org/r/719284

Change 719253 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Use UserIndetity directly instead of User object

https://gerrit.wikimedia.org/r/719253

Change 719337 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/extensions/Wikibase@master] Avoid inserting wb_changes entry when there is no subscriber

https://gerrit.wikimedia.org/r/719337

Change 719284 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Introduce ChangeHolder and use that to store the change later

https://gerrit.wikimedia.org/r/719284

Change 719337 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Avoid inserting wb_changes entry when there is no subscriber

https://gerrit.wikimedia.org/r/719337

Ladsgroup claimed this task.