
[Task] Avoid unserializing metadata, especially diffs, when processing changes
Closed, Invalid
Public

Description

DiffChange objects tend to have a lot of data in the info blob. We should avoid:

  • loading the info blob from the database, if it is not needed
  • unserializing the info blob
  • unserializing the diff array in the info blob
  • instantiating Diff objects, if not needed
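The last three points can be sketched as lazy accessors. This is only an illustration, not the Wikibase implementation: the class name `LazyChange`, the `decode_count` counter, and the use of JSON in place of the real serialization format are all assumptions, as is the idea that the diff is stored as a nested serialized string under a `diff` key.

```python
import json
from functools import cached_property


class LazyChange:
    """Hypothetical change record that defers decoding of its info blob."""

    def __init__(self, change_id, raw_info):
        self.change_id = change_id
        self._raw_info = raw_info  # serialized string, left untouched until needed
        self.decode_count = 0      # illustration only: counts info blob decodes

    @cached_property
    def info(self):
        # Decoded on first access, then cached on the instance.
        self.decode_count += 1
        return json.loads(self._raw_info)

    @cached_property
    def diff(self):
        # Assumption: the diff is a nested serialized string under "diff",
        # so it is decoded only when a caller actually asks for it.
        return json.loads(self.info["diff"])


change = LazyChange(1, json.dumps({"diff": json.dumps({"links": ["add"]})}))
```

Code that only needs `change.change_id` never triggers a decode. Avoiding the database load itself (the first point) would additionally require leaving the info column out of the query, which this sketch does not cover.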

It seems like the only reason to look into the diff when dispatching changes is to find out whether the change added a sitelink. We could store that information separately, e.g. in a separate field in the info blob.
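A minimal sketch of that idea, assuming a flag is computed once when the change is written. The field name `adds_sitelink`, the helper names, and the diff layout are hypothetical, and JSON stands in for the actual serialization format.

```python
import json


def diff_adds_sitelink(diff):
    # Hypothetical diff layout: {"links": {site_id: {"op": "add" | "remove" | ...}}}
    return any(op.get("op") == "add" for op in diff.get("links", {}).values())


def build_info(diff, metadata):
    """Serialize the info blob, recording the sitelink flag next to the diff."""
    return json.dumps({
        "metadata": metadata,
        "adds_sitelink": diff_adds_sitelink(diff),  # precomputed at write time
        "diff": json.dumps(diff),  # nested string, so readers can skip decoding it
    })


def change_adds_sitelink(raw_info):
    """Dispatcher-side check: decodes the outer blob but never the diff."""
    return json.loads(raw_info)["adds_sitelink"]
```

With this layout the dispatcher still unserializes the outer info blob, but it no longer has to decode the diff or build Diff objects to answer the sitelink question.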

See T109088: [Task] Profiling and investigate how to improve performance of change dispatcher for the rationale.

NOTE: We may not need to unserialize change info at all if we can remove the need for programmatic filtering. See in particular T111161: Subscribe client wikis immediately after adding sitelinks.

Event Timeline

daniel raised the priority of this task from to Needs Triage.
daniel updated the task description.
daniel subscribed.
Lydia_Pintscher renamed this task from Avoid unserializing metadata, especially diffs, when processing changes to [Task] Avoid unserializing metadata, especially diffs, when processing changes. Sep 1 2015, 12:09 PM
Lydia_Pintscher triaged this task as High priority.
Addshore subscribed.

Dispatching is now quite fast and we touched loads of stuff