The checkpointing is incompatible with the gzip format. Either something in the output is not finalized, or we are appending to a truncated file without first ending the gzip block safely.
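A minimal sketch of the second hypothesis, not a reproduction of the scraper itself: appending a complete gzip member to another complete member gives a valid stream, but appending to a member that lost its trailer fails the integrity check. The temp directory, file names, and the 8-byte cut (the CRC32 and ISIZE trailer) are assumptions chosen for illustration.

```shell
# Illustrative only: file names and the truncation point are hypothetical.
tmp=$(mktemp -d)
printf 'row 1\n' | gzip > "$tmp/a.gz"
printf 'row 2\n' | gzip > "$tmp/b.gz"

# Concatenating complete members yields a valid multi-member gzip stream.
cat "$tmp/a.gz" "$tmp/b.gz" > "$tmp/good.gz"
good_status=0
gzip -t "$tmp/good.gz" 2>/dev/null || good_status=$?

# Simulate a crash: drop the 8-byte trailer of the first member, then append.
size=$(wc -c < "$tmp/a.gz")
head -c "$((size - 8))" "$tmp/a.gz" > "$tmp/bad.gz"
cat "$tmp/b.gz" >> "$tmp/bad.gz"
bad_status=0
gzip -t "$tmp/bad.gz" 2>/dev/null || bad_status=$?

echo "good.gz exit status: $good_status"
echo "bad.gz exit status: $bad_status"
```

If this matches what the scraper does on resume, one option would be to start a new output member only after confirming the previous one was closed cleanly.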
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T335411 Scraper: produce spreadsheet of scraped statistics for comparing wikis
Open | | None | T332032 Create baseline statistics for reference usage
Open | | None | T337450 Scraper: page-summary .gz can become corrupted after crash
Event Timeline
Find corrupted outputs:
for i in reports/*.gz; do gzip -t "$i" 2>/dev/null || echo "$i"; done > bad.txt