
Consider skipping or modifying recombine step for page content dumps for wikidata
Closed, Resolved (Public)

Description

This job is starting to take longer and longer. We might want to serve up 2 files for pages-meta-current, rather than the one large 27GB file. Same goes for the multistream dumps.

Event Timeline

ArielGlenn triaged this task as Medium priority. Oct 26 2017, 9:29 AM
ArielGlenn created this task.

Growth of pages is accelerating on wikidata, as has been noted elsewhere. You can see it in the file sizes. Compare:

ariel@snapshot1001:~$ ls -lh /mnt/data/xmldatadumps/public/wikidatawiki/2017*/wikidatawiki-2017*-pages-meta-current.xml.bz2*
-rw-rw-r-- 1 datasets ganglia 15G Jul  4 08:16 /mnt/data/xmldatadumps/public/wikidatawiki/20170701/wikidatawiki-20170701-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 16G Jul 23 14:04 /mnt/data/xmldatadumps/public/wikidatawiki/20170720/wikidatawiki-20170720-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 17G Aug  4 21:52 /mnt/data/xmldatadumps/public/wikidatawiki/20170801/wikidatawiki-20170801-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 20G Aug 24 17:11 /mnt/data/xmldatadumps/public/wikidatawiki/20170820/wikidatawiki-20170820-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 21G Sep  5 19:25 /mnt/data/xmldatadumps/public/wikidatawiki/20170901/wikidatawiki-20170901-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 24G Sep 25 18:26 /mnt/data/xmldatadumps/public/wikidatawiki/20170920/wikidatawiki-20170920-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 27G Oct  7 09:16 /mnt/data/xmldatadumps/public/wikidatawiki/20171001/wikidatawiki-20171001-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 24G Oct 26 09:24 /mnt/data/xmldatadumps/public/wikidatawiki/20171020/wikidatawiki-20171020-pages-meta-current.xml.bz2.inprog

The current run is not yet complete; of course it will be over 27GB.

Now look at enwiki:

ariel@snapshot1001:~$ ls -lh /mnt/data/xmldatadumps/public/enwiki/2017*/enwiki-2017*-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 26G Jul  3 08:28 /mnt/data/xmldatadumps/public/enwiki/20170701/enwiki-20170701-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 26G Jul 22 01:13 /mnt/data/xmldatadumps/public/enwiki/20170720/enwiki-20170720-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 26G Aug  3 03:53 /mnt/data/xmldatadumps/public/enwiki/20170801/enwiki-20170801-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 26G Aug 22 04:49 /mnt/data/xmldatadumps/public/enwiki/20170820/enwiki-20170820-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 26G Sep  3 07:28 /mnt/data/xmldatadumps/public/enwiki/20170901/enwiki-20170901-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 27G Sep 22 10:56 /mnt/data/xmldatadumps/public/enwiki/20170920/enwiki-20170920-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 27G Oct  3 10:23 /mnt/data/xmldatadumps/public/enwiki/20171001/enwiki-20171001-pages-meta-current.xml.bz2
-rw-rw-r-- 1 datasets ganglia 27G Oct 22 12:04 /mnt/data/xmldatadumps/public/enwiki/20171020/enwiki-20171020-pages-meta-current.xml.bz2

I'm quite concerned about where we will be in a year's time. In any case, something needs to be done soon.

It appears to take over two days to recombine the pages-articles files, and over two days again for the meta-current files. I'm going to skip this step when the combined size of the files to recombine is 20 GB or more (the threshold will be configurable).
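As an illustration of the kind of size check involved (a sketch only; the actual threshold handling lives in the dumps code, and the file glob and cutoff below are assumptions for this example):

total=$(du -cb wikidatawiki-*-pages-meta-current*.xml-p*.bz2 | awk 'END {print $1}')
if [ "$total" -ge $((20 * 1024 * 1024 * 1024)) ]; then
    echo "parts total ${total} bytes, skipping recombine"
fi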

In the meantime I should see if we can get more speed out of a dedicated C utility instead of a bzcat | head | tail pipeline for each of these files in the recombine step. I'll need this for the multistream dumps regardless.
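For context, a sketch of what the current recombine pipeline amounts to per output file (file names made up; the 40-line header offset is just the value used in the test commands later in this task, while the real code computes the header length per file):

(bzcat part1.xml.bz2 | head -n -1; \
 bzcat part2.xml.bz2 | tail -n +40 | head -n -1; \
 bzcat part3.xml.bz2 | tail -n +40) | bzip2 > pages-meta-current.xml.bz2

That is: keep the first part's <mediawiki> header, strip the header and the closing </mediawiki> from the middle parts, keep the last part's footer, and recompress the whole thing.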

The description mentions that you will also disable creation of the multistream dumps. I think that creation of the multistream dumps should not take very long if you have already created the multiple numbered files because you can simply cat them all together.
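A rough sketch of that suggestion (file names hypothetical, and it glosses over the per-part XML headers and the multistream index): bzip2 treats a file made up of several complete bz2 streams as one archive, so concatenation alone already yields a decompressible file:

cat stream1.bz2 stream2.bz2 stream3.bz2 > combined.bz2
bzcat combined.bz2 | wc -l    # decompresses all of the concatenated streams in order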

Also, how can I check if a new full dump of the split article files is ready? With the single file dump, I could just check the Last-Modified header of https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2 and compare it to what I already have to decide whether I have to reprocess the dump.
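For reference, that check is just a HEAD request on the file, something along these lines:

curl -sI https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2 | grep -i '^last-modified'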

> The description mentions that you will also disable creation of the multistream dumps. I think that creation of the multistream dumps should not take very long if you have already created the multiple numbered files because you can simply cat them all together.

To be clear, the multistream dumps job will probably be left untouched. I had planned only to stop producing the pages-articles and pages-meta-current combined files; hence the need for a dedicated utility, as mentioned in comment T179059#4024669.

> Also, how can I check if a new full dump of the split article files is ready? With the single file dump, I could just check the Last-Modified header of https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2 and compare it to what I already have to decide whether I have to reprocess the dump.

I would suggest checking the dumpstatus.json file in the directory of the run date for the wiki that interests you, e.g. https://dumps.wikimedia.org/wikidatawiki/20180301/dumpstatus.json
If the status is 'done', then you know that the files are ready; they are also in the json output for easy retrieval.
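A minimal sketch of that check with curl and jq (the job name 'articlesdump' and the exact JSON layout are assumptions; inspect the dumpstatus.json for the job you actually care about):

curl -s https://dumps.wikimedia.org/wikidatawiki/20180301/dumpstatus.json | jq -r '.jobs.articlesdump.status'
# prints "done" once that job has finished; the output file names should be listed under .jobs.articlesdump.files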

Well, I"m not going to make the 20th deadline. I need to do a pile of timing tests with this code to see if it's faster than the horrid head/tail pipeline I have been using until now, for combining files. There's a bunch of variants needed: do the compression in the same C program? Pipe it, does it make much difference? Buffer size? Etc. When I have those results we'll have a decision about the way forward.

Change 421011 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/dumps/mwbzutils@master] command to recombine page content xml files into one

https://gerrit.wikimedia.org/r/421011

Time for an update.

Here are the results of a few basic tests of various dump-related operations over NFS vs. local disk.

ariel@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | tail -n +40 | head  -n -1 > /dev/null ; date
Tue Mar 20 17:22:59 UTC 2018
Tue Mar 20 17:56:15 UTC 2018

Very fast; the overhead from the tail | head pipeline is negligible, and bzcat over NFS doesn't seem to be too awful. A rerun of this command takes about the same length of time, so the results of later tests won't be skewed by disk caches.

ariel@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | bzip2 > /dev/null ; date
Tue Mar 20 19:20:42 UTC 2018
Wed Mar 21 03:10:45 UTC 2018

Horrible. With the tail | head pipeline inserted, the time is about the same.
Note that all of the above are on the NFS share actively being used by an ongoing dump run.

ariel@dumpsdata1002:/data/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | bzip2 > /dev/null ; date
Wed Mar 21 07:16:25 UTC 2018
Wed Mar 21 11:46:57 UTC 2018

A few hours faster. We'd like to get those hours back if possible. Note that this was run directly on the inactive dumps NFS share host, using that filesystem locally.

root@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180301# mount -t nfs dumpsdata1002.eqiad.wmnet:/data /mnt/datatesting -o "bg,hard,tcp,rsize=8192,wsize=8192,intr,nfsvers=3"
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 |tail -n +40 | head  -n -1 | bzip2 > /dev/null ; date
Wed Mar 21 15:18:18 UTC 2018
Wed Mar 21 23:12:09 UTC 2018

This run is back over NFS but using the inactive host as the NFS server. 8 hours again, just like using the active host.
I'm playing with mbuffer right now to see if there's any difference; after that I'm going to look at local disk performance on snapshot1001, plus the time to copy the file wholesale from the NFS share to local disk.

Note also that none of these actually try to *write* to the filesystem, whether NFS or local.

All tests are shown below. Summary: NFS vs. non-NFS does not make the difference, as shown by running bzcat | bzip2 > /dev/null on the local host without NFS and comparing to the run done directly on the NFS server.
Using bzip2 -1 is not tenable: even though it's nearly 50% faster, the output is nearly double the size.
This means there is, at the moment, no good way to speed up the recombine step, whether by copying the file to local disk before processing it or anything else.

If we don't generate the combined pages-articles bz2 file, we'll still have to do the equivalent work to write the pages-articles multistream file as one file, so we won't gain any time there.
We can stop providing the combined pages-meta-current file without taking a hit, since nothing in the dump production pipeline uses it.


Tests:
NFS, bzcat -> tail,head pipeline
ariel@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | tail -n +40 | head -n -1 > /dev/null ; date
Tue Mar 20 17:22:59 UTC 2018
Tue Mar 20 17:56:15 UTC 2018

NFS, bzcat -> tail,head pipeline repeat, to check impact of any caching buffers
ariel@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | tail -n +40 | head -n -1 > /dev/null ; date
Tue Mar 20 18:36:21 UTC 2018
Tue Mar 20 19:09:35 UTC 2018

NFS, bzcat -> bzip2 -> /dev/null (no tail,head pipeline), to check impact of the tail,head pipeline
ariel@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | bzip2 > /dev/null ; date
Tue Mar 20 19:20:42 UTC 2018
Wed Mar 21 03:10:45 UTC 2018

NFS, bzcat -> tail,head pipeline -> bzip2 -> /dev/null
ariel@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 |tail -n +40 | head -n -1 | bzip2 > /dev/null ; date
Wed Mar 21 05:56:48 UTC 2018
Wed Mar 21 13:59:08 UTC 2018

Secondary (inactive) server w/o NFS, bzcat -> bzip2 -> /dev/null
ariel@dumpsdata1002:/data/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | bzip2 > /dev/null ; date
Wed Mar 21 07:16:25 UTC 2018
Wed Mar 21 11:46:57 UTC 2018

root@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180301# mount -t nfs dumpsdata1002.eqiad.wmnet:/data /mnt/datatesting -o "bg,hard,tcp,rsize=8192,wsize=8192,intr,nfsvers=3"

NFS from secondary (inactive) server, bzcat -> tail,head pipeline -> bzip2 -> /dev/null
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 |tail -n +40 | head -n -1 | bzip2 > /dev/null ; date
Wed Mar 21 15:18:18 UTC 2018
Wed Mar 21 23:12:09 UTC 2018

NFS from secondary (inactive) server, bzcat -> mbuffer -> bzip2 -> /dev/null
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | mbuffer -m 10M | bzip2 > /dev/null ; date
Thu Mar 22 07:49:45 UTC 2018
summary: 68.7 GByte in 7 h 26 min 2688 kB/s
Thu Mar 22 15:16:17 UTC 2018

Copy to local disks
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ date; cp wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 /home/ariel; date
Thu Mar 22 16:50:12 UTC 2018
Thu Mar 22 16:50:17 UTC 2018

On local disks, bzcat -> bzip2 -> /dev/null
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat /home/ariel/wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | bzip2 > /dev/null ; date
Thu Mar 22 16:51:46 UTC 2018
Fri Mar 23 00:38:57 UTC 2018

On local disks, bzcat -> bzip2 -> /dev/null with -5 (less memory, larger output files)
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat /home/ariel/wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | bzip2 -5 > /dev/null ; date
Fri Mar 23 09:35:11 UTC 2018
Fri Mar 23 16:26:32 UTC 2018

On local disks, bzcat -> bzip2 -> /dev/null with -1 (less memory, larger output files)
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat /home/ariel/wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | bzip2 -1 > /dev/null ; date
Fri Mar 23 17:09:17 UTC 2018
Fri Mar 23 21:50:50 UTC 2018

On local disks, bzcat -> bzip2 -> local file with -1 (less memory, larger output files)
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ date; bzcat /home/ariel/wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2 | bzip2 -1 > /home/ariel/pages-out.bz2 ; date
Sat Mar 24 07:19:13 UTC 2018
Sat Mar 24 12:01:01 UTC 2018

relative sizes from bzip2 default (-9) to bzip2 -1 (fastest, largest output files):
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ ls -l /home/ariel/pages-out.bz2 /home/ariel/wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2
-rw-rw-r-- 1 ariel wikidev 5053510601 Mar 24 12:01 /home/ariel/pages-out.bz2
-rw-rw-r-- 1 ariel wikidev 3001809958 Mar 22 16:50 /home/ariel/wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2
ariel@snapshot1001:/mnt/datatesting/xmldatadumps/public/wikidatawiki/20180301$ ls -lh /home/ariel/pages-out.bz2 /home/ariel/wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2
-rw-rw-r-- 1 ariel wikidev 4.8G Mar 24 12:01 /home/ariel/pages-out.bz2
-rw-rw-r-- 1 ariel wikidev 2.8G Mar 22 16:50 /home/ariel/wikidatawiki-20180301-pages-meta-current27.xml-p37586178p39086178.bz2

atop didn't show anything interesting, as far as bottlenecks go, during the time frames of these tests.

CPU info

snapshot1001:    AMD Opteron(tm) Processor 6134                             -- L2 8 x 512 KB, L3 12 MB, 2.3 GHz
dumpsdata hosts: Intel(R) Xeon(R) CPU E5-2623 v4 @ 2.60GHz (dumpsdata1002)  -- L1I$ 128 KiB, L1D$ 128 KiB, L2$ 1 MiB, L3$ 10 MiB, 2.60 GHz

Memory:
snapshot1001 has 64GB; the dumpsdata hosts have 32GB.


Note that snapshot1001 is a testbed, doing nothing else. dumpsdata1001 is the active NFS server, while dumpsdata1002 is in reserve, doing nothing but receiving regular updates via rsync.

Anyone who wants to play with these commands on either of the two hosts above may do so; just check that someone else isn't already testing (so they don't skew your stats), and make sure you won't fill the disk if you write a local file out.

Change 421858 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/dumps@master] Add ability to skip recombine of meta-current page content, per project

https://gerrit.wikimedia.org/r/421858

Change 421858 merged by ArielGlenn:
[operations/dumps@master] Add ability to skip recombine of meta-current page content, per project

https://gerrit.wikimedia.org/r/421858

Change 422914 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] turn off production of single meta-current page content dump for wikidata

https://gerrit.wikimedia.org/r/422914

Change 422914 merged by ArielGlenn:
[operations/puppet@production] turn off production of single meta-current page content dump for wikidata

https://gerrit.wikimedia.org/r/422914

Some rough wall-time numbers for lbzip2 vs bzip2 on hardware similar to dumpsdata1002.

That looks pretty good. I wonder how it performs on a file over NFS.

Change 428153 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] add lbzip2 to the snapshots

https://gerrit.wikimedia.org/r/428153

Change 428153 merged by ArielGlenn:
[operations/puppet@production] add lbzip2 to the snapshots

https://gerrit.wikimedia.org/r/428153

Change 428156 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/dumps@master] enable lbzip2 use for all decompression parts of recompression jobs

https://gerrit.wikimedia.org/r/428156

Messing with bzip2, pbzip2, lbzip2 on snapshot1001, looking at timing only (not archive size or non-default config).

Results are the slowest of multiple runs each. bzip2 and pbzip2 are distro packages; lbzip2 was built from source with -g -O2.
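For reference, the from-source build would have been something like the following (an assumption on my part; it leaves the binary at src/lbzip2, which is the path used in the commands below):

cd ~/src/lbzip2-2.5
./configure CFLAGS="-g -O2"
make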

bzip2 decompress baseline

springle@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180401$ time bzcat wikidatawiki-20180401-pages-meta-current27.xml-p37586178p39086178.bz2 | tail -n +40 | head -n -1 > /dev/null

real    47m21.883s
user    43m54.728s
sys     8m59.492s

pbzip2 decompress

pbzip2 decompression speed relies on the archive having also been compressed with pbzip2, so this run is essentially single-threaded. This seems like a great reason to avoid a pbzip2 dependency.

springle@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180401$ time pbzip2 -d --stdout wikidatawiki-20180401-pages-meta-current27.xml-p37586178p39086178.bz2 | tail -n +40 | head -n -1 > /dev/null

real    42m22.116s
user    43m18.140s
sys     6m17.380s

lbzip2 decompress

lbzip2 doesn't care how the archive was compressed, which is much nicer. All cores were in use.

springle@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180401$ time ~/src/lbzip2-2.5/src/lbzip2 -d --stdout wikidatawiki-20180401-pages-meta-current27.xml-p37586178p39086178.bz2 | tail -n +40 | head -n -1 > /dev/null

real    4m5.242s
user    77m17.736s
sys     8m40.092s

pbzip2 recompress (decompression via lbzip2)

pbzip2 didn't seem to want to use all cores here, even though compression is the part it controls in this pipeline (decompression is handled by lbzip2, since pbzip2 decompression of this archive would have been single-threaded). Possibly some config could be tweaked.

springle@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180401$ time ~/src/lbzip2-2.5/src/lbzip2 -d --stdout wikidatawiki-20180401-pages-meta-current27.xml-p37586178p39086178.bz2 | pbzip2 > /dev/null

real    20m15.246s
user    635m8.480s
sys     10m39.828s

lbzip2 decompress / recompress

Again, all cores in use.

springle@snapshot1001:/mnt/dumpsdata/xmldatadumps/public/wikidatawiki/20180401$ time ~/src/lbzip2-2.5/src/lbzip2 -d --stdout wikidatawiki-20180401-pages-meta-current27.xml-p37586178p39086178.bz2 | ~/src/lbzip2-2.5/src/lbzip2 > /dev/null

real    8m17.457s
user    260m2.544s
sys     4m34.056s

lbzip2 beats the socks off pbzip2 and so far works seamlessly with bzip2.

That's good news about lbzip2. Should I set you up with a test directory so you can run dump steps using it and see how it compares to bzip2?

On a side note, we alert on screen/tmux sessions running longer than 20 days (see https://gerrit.wikimedia.org/r/#/c/427195/), so you might want to start fresh every so often.

@Springle, the new testbed host is snapshot1009; you should have access and I moved your src directory over there from your home dir.

Also @Springle, you now have a scratch area at /mnt/dumpsdata/temp/springle, available from snapshot1009. This is on the same filesystem as the regular dumps, i.e. storage on the NFS server, so please don't run 20 compressions at once :-) If you do want to test 20 at once for some reason, holler and I'll mount the share from the hot idle spare for you.

Group ownership might need fiddling with if you want actual dump jobs to write there. Generally you would run those as www-data, which I believe is not in the wikidev group, so let me know if you need that. I would want to talk with you about custom config files (so output goes to the scratch area) and such anyway.

I've done some follow-up testing with lbzip2: using multiple processors while reading from stdin via a pipe, examining the low-level format of the output files as compared to bzip2 output, and checking memory use. A sample memory use comparison:

bzcat wikidatawiki-20180920-pages-meta-current27.xml-p37586178p39086178.bz2 | head -n -10 | tail -n +10 | time lbzip2 -n 4 > /mnt/dumpsdata/temp/ariel/wdstuff-lbzip2-3.bz2
4929.98 user   81.65 system   33:11.75 elapsed   251 %CPU (0 avgtext + 0 avgdata 23976 maxresident) k

bzcat wikidatawiki-20180920-pages-meta-current27.xml-p37586178p39086178.bz2 | head -n -10 | tail -n +10 | time bzip2 > /mnt/dumpsdata/temp/ariel/wdstuff-bzip2-3.bz2
(figures below were collected after completion; while it was still running, ps showed an RSS of 13608k)
15195.56 user   55.02 system   4:50:19 elapsed   87 %CPU   (0 avgtext + 0 avgdata 7456 maxresident) k

This is well below the per-core usage for 7za:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
dumpsgen 27227 46.0  2.0 4231116 1343312 ?     Sl   08:50  32:53 /usr/lib/p7zip/7za a -mx=4 -si /mnt/dumpsdata/xmldatadumps/public/enwiki/20181001/enwiki-20181001-pages-meta-history27.xml-p42663462p42893893.7z.inprog

So I'm going to start using lbzip2 for compressing the output of the page content bz2 recombine dump step, and we'll see how it looks. I'll hold off on the decompression piece for now.
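Concretely, that means swapping lbzip2 in only on the compression side of the recombine pipeline, along these lines (a sketch; thread count and file names are illustrative):

(bzcat part1.xml.bz2 | head -n -1; \
 bzcat part2.xml.bz2 | tail -n +40 | head -n -1; \
 bzcat part3.xml.bz2 | tail -n +40) | lbzip2 -n 4 > pages-articles.xml.bz2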

Change 466344 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/dumps@master] use lbzip2 for recombining page content dumps, if available and configured

https://gerrit.wikimedia.org/r/466344

The above has been tested and is ready for merge and deployment as soon as the current dump run completes, probably in 2-3 days. I'll prep a changeset for the config setting update in puppet too.

Change 466554 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] dumps config settings to use lbzip2 for recombining page content files

https://gerrit.wikimedia.org/r/466554

Change 466344 merged by ArielGlenn:
[operations/dumps@master] use lbzip2 for recombining page content dumps, if available and configured

https://gerrit.wikimedia.org/r/466344

Change 466554 merged by ArielGlenn:
[operations/puppet@production] dumps config settings to use lbzip2 for recombining page content files

https://gerrit.wikimedia.org/r/466554

These changes are now live on the snapshot hosts and will be in effect for the Oct 20th run. The hosts should be monitored more closely than usual for load and disk I/O (especially on the NFS servers).

The runs have been going along just fine, so we can close this.

Change 428156 abandoned by ArielGlenn:
enable lbzip2 use for all decompression parts of recompression jobs

Reason:
Tossed in favor of I8f8f6e3dcfe519d3ccf51928743e3f4fc140d003, which applies only to 7z recompression.
In the other cases we are dealing with bz2 recompression; that will be slower than the decompression, so there's no point in swapping in lbzip2 there.

https://gerrit.wikimedia.org/r/428156