command to recombine page content XML files into one
[WIP]
This little program takes a list of bz2-compressed files, in
order, and writes the combined decompressed output to a specified
file or to stdout, stripping the intermediate mediawiki headers (the
siteinfo block) and footers (the closing </mediawiki> tag), so that the
combined file has only the header of the first file at the beginning
and a single standard footer at the end.
It still needs some speed testing, as well as a fixup so that it only
looks for intermediate footers in the last bz2 block of each file.
Bug: T179059
Change-Id: I35cdf754ce4eea151dad3ead6bdd62a976a5870e