Spike: How will we migrate to Dumps 2.0?
Open, Needs Triage, Public


The migration path should be incremental, so that work can be paced, risk stays low, and service isn't disrupted. There is a tension to manage: the old dumps can only be phased out once the replacement has proven reliable, but running both systems redundantly for long would be expensive.

Here's one straw person proposal for how to migrate:

  • Pick a pilot dump format.
  • Write a module which can process an arbitrary chunk of that dump's work.
  • Write a lightweight Celery integration which can spawn these pilot jobs and consolidate their output.
  • Milestone: Produces a completed dump for some wiki, in some format.
  • Optimize filesystem operations.
  • Milestone: Completes large wiki dump in reasonable time. Feasibility is demonstrated.
  • Create a shareable development environment and write documentation.
  • Put some love into the pluggable architecture.
  • Write a second and third dump format plugin.
  • Milestone: Regular jobs run for multiple formats. We can begin to deprecate Dumps 1.0 now.
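
As a rough illustration of the first few steps (a module that processes an arbitrary chunk, a thin layer that spawns jobs and consolidates their output, and format plugins), here is a minimal sketch. Everything in it is an assumption made for illustration: the names `dump_format`, `fetch_pages`, `process_chunk`, `chunk_ranges` and `run_dump` are hypothetical, the "tsv-titles" format is invented, and the standard library's `ThreadPoolExecutor` stands in for Celery so that the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical plugin registry: format name -> function rendering one chunk.
FORMATS: Dict[str, Callable[[List[dict]], str]] = {}


def dump_format(name: str):
    """Decorator that registers a renderer under a format name."""
    def register(func):
        FORMATS[name] = func
        return func
    return register


@dump_format("tsv-titles")
def render_tsv(pages: List[dict]) -> str:
    # Invented pilot format: one "id<TAB>title" line per page.
    return "".join(f"{p['id']}\t{p['title']}\n" for p in pages)


def fetch_pages(start: int, end: int) -> List[dict]:
    """Stand-in for a database read: pages with start <= id < end."""
    return [{"id": i, "title": f"Page_{i}"} for i in range(start, end)]


def process_chunk(fmt: str, start: int, end: int) -> str:
    """Render one arbitrary chunk of the dump in the given format."""
    return FORMATS[fmt](fetch_pages(start, end))


def chunk_ranges(first: int, last: int, size: int) -> List[Tuple[int, int]]:
    """Split [first, last) into half-open ranges of at most `size` ids."""
    return [(s, min(s + size, last)) for s in range(first, last, size)]


def run_dump(fmt: str, first: int, last: int, size: int, workers: int = 4) -> str:
    """Spawn one job per chunk and consolidate the outputs in id order."""
    ranges = chunk_ranges(first, last, size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so the joined output
        # stays sorted even if chunks finish out of order.
        parts = pool.map(lambda r: process_chunk(fmt, *r), ranges)
        return "".join(parts)


if __name__ == "__main__":
    print(run_dump("tsv-titles", 1, 6, 2), end="")
```

Because each chunk is a pure function of `(format, start, end)`, the same `process_chunk` could in principle be wrapped as a task in a real job queue, and a second or third format would only need another registered renderer.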

Event Timeline

awight raised the priority of this task from to Needs Triage.
awight updated the task description.
awight added a project: Dumps-Rewrite.
awight subscribed.

I also thought about replacing small pieces of the existing dumps with pieces that could be reused in Dumps 2.0. We can't do that for all of it, but at least part of the work could be snuck in that way. Example: job management.