
skip existing page content dump output files in bz2 and 7z generation jobs for each batch of commands
Closed, Resolved (Public)

Description

This is prep work for splitting page content jobs, whether bz2 or 7z, across multiple hosts. We need this for the Wikidata run ASAP.
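A minimal sketch of the idea, not the actual operations/dumps code: before each batch of commands is run, the output file for every command in the batch is checked, and commands whose output already exists are dropped. Checking at the start of each batch, rather than once up front, means files produced in the meantime (for example by another host) are also skipped. The names `DumpCommand`, `run_in_batches`, `run_batch` and `exists` are hypothetical and used only for illustration.

```python
import os
from typing import Callable, List, NamedTuple


class DumpCommand(NamedTuple):
    """Hypothetical pairing of a command line with the file it produces."""
    argv: List[str]
    output_path: str


def run_in_batches(
    commands: List[DumpCommand],
    batch_size: int,
    run_batch: Callable[[List[DumpCommand]], None],
    exists: Callable[[str], bool] = os.path.exists,
) -> List[str]:
    """Run commands in batches, skipping any whose output file already exists.

    The existence check is repeated at the beginning of every batch.
    Returns the list of output paths that were skipped.
    """
    skipped: List[str] = []
    for start in range(0, len(commands), batch_size):
        batch = commands[start:start + batch_size]
        todo: List[DumpCommand] = []
        for cmd in batch:
            if exists(cmd.output_path):
                skipped.append(cmd.output_path)
            else:
                todo.append(cmd)
        if todo:
            # Only the commands that still need to produce output are run.
            run_batch(todo)
    return skipped
```

Here `run_batch` stands in for whatever actually executes the bz2 or 7z commands, and `exists` can be swapped out for testing. A real job would likely also need to guard against partially written files, which this sketch does not handle.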

Event Timeline

ArielGlenn created this task.

Change 565301 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/dumps@master] for 7z production in batches, skip files that exist at beginning of each batch

https://gerrit.wikimedia.org/r/565301

Change 589032 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/dumps@master] check bz2 page content files for existence before running command batch

https://gerrit.wikimedia.org/r/589032

Change 565301 merged by ArielGlenn:
[operations/dumps@master] for 7z production in batches, skip files that exist at beginning of each batch

https://gerrit.wikimedia.org/r/565301

Change 589032 merged by ArielGlenn:
[operations/dumps@master] check bz2 page content files for existence before running command batch

https://gerrit.wikimedia.org/r/589032

Keeping this open until the code path gets used, probably during the May run.

Well. This got kept open rather longer than needed. This code has been running in production for months; closing!