command to recombine page content xml files into one
fc74d1aae61d (Unpublished)

Unpublished Commit

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.
This commit has been deleted in the repository: it is no longer reachable from any branch, tag, or ref.

Description

command to recombine page content xml files into one

[WIP]
This small program takes an ordered list of bz2-compressed files and
writes their combined, decompressed output to a specified file or to
stdout. It strips intermediate MediaWiki headers (siteinfo) and footers
(the </mediawiki> tag), so that the combined file has only the header
of the first file at the beginning and a standard footer at the end.
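
The recombination described above can be illustrated with a minimal Python sketch. This is not the code from this commit; the function names (`strip_header`, `strip_footer`, `recombine`) are hypothetical, and it assumes that each file's header ends with a `</siteinfo>` line and that the footer is a trailing `</mediawiki>` tag. For brevity it reads each file fully into memory, which a real tool for large dumps would likely avoid.

```python
import bz2

SITEINFO_END = b"</siteinfo>\n"   # assumed end of the per-file header
FOOTER_TAG = b"</mediawiki>"      # assumed per-file footer tag


def strip_header(data: bytes) -> bytes:
    """Drop everything up to and including the closing siteinfo line."""
    idx = data.find(SITEINFO_END)
    return data if idx == -1 else data[idx + len(SITEINFO_END):]


def strip_footer(data: bytes) -> bytes:
    """Drop a trailing </mediawiki> tag (and whitespace after it), if present."""
    stripped = data.rstrip()
    if stripped.endswith(FOOTER_TAG):
        return stripped[: -len(FOOTER_TAG)]
    return data


def recombine(paths, out):
    """Write the decompressed contents of paths, in order, to the binary stream out."""
    for i, path in enumerate(paths):
        with bz2.open(path, "rb") as f:
            data = f.read()
        if i > 0:
            # Only the first file keeps its header.
            data = strip_header(data)
        # Every file loses its footer; one standard footer is written at the end.
        out.write(strip_footer(data))
    out.write(FOOTER_TAG + b"\n")
```

To write to stdout, something like `recombine(paths, sys.stdout.buffer)` would work; to write to a file, pass a file object opened with mode `"wb"`.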

It still needs speed testing, as well as a fix so that it looks for
intermediate footers only in the last bz2 block of each file.
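
One way to avoid scanning the whole stream for the footer is to hold back only a short tail of the output and inspect that. The Python sketch below shows this idea on the decompressed stream; it is an analogous technique, not the bz2-block-level fix mentioned above, and the names `copy_without_footer`, `strip_footer_from_bz2`, and `TAIL_SIZE` are hypothetical. It assumes the footer, plus any trailing whitespace, fits in the last `TAIL_SIZE` bytes.

```python
import bz2

FOOTER_TAG = b"</mediawiki>"
TAIL_SIZE = 64  # assumed upper bound on footer length plus trailing whitespace


def copy_without_footer(src, out, chunk_size=65536):
    """Stream src to out, checking only the final TAIL_SIZE bytes for the footer."""
    tail = b""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        # Under the assumption above, only the last TAIL_SIZE bytes can hold
        # the footer, so everything before them is written straight through.
        out.write(buf[:-TAIL_SIZE])
        tail = buf[-TAIL_SIZE:]
    # Inspect the held-back tail once, at end of input.
    idx = tail.rfind(FOOTER_TAG)
    out.write(tail if idx == -1 else tail[:idx])


def strip_footer_from_bz2(path, out):
    """Decompress one bz2 file to out, dropping its trailing footer if present."""
    with bz2.open(path, "rb") as src:
        copy_without_footer(src, out)
```

The main loop never searches the bulk of the data, so the cost of footer detection stays constant per file regardless of file size.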

Bug: T179059
Change-Id: I35cdf754ce4eea151dad3ead6bdd62a976a5870e

Details

Provenance
ArielGlenn authored on Mar 8 2018, 2:05 PM
Parents
R1891:86c94729516f: gcc warning are now fatals (-Werror)
Branches
Unknown
Tags
Unknown
ChangeId
I35cdf754ce4eea151dad3ead6bdd62a976a5870e