adds-changes dumps don't handle missing runs properly
Closed, Resolved (Public)

Description

Reported by solrize:

The September 30 and October 1 runs failed (see https://lists.wikimedia.org/pipermail/xmldatadumps-l/2018-October/001437.html). The October 2 run does not contain the missing revisions, and the October 3 run repeats the revisions already in the October 2 run, at least for enwiki.

Event Timeline

ArielGlenn triaged this task as Medium priority. Oct 5 2018, 9:43 AM
ArielGlenn created this task.

There are a few things I've found on looking at the code.

  • It doesn't backfill automatically. We should be able to run the job manually for the missing dates and it should 'just work', but this needs testing: the main index.html file for the adds-changes dumps might be rewritten with info from these older runs.
  • Logging at verbose level is broken (patch coming).
  • It correctly finds the max rev id for the previous day's run from the db, even if the previous day's run did not complete. But it then writes that value to the current run's max rev id file, which means that the next day's run will contain duplicate revisions (patch coming).
  • dryrun mode still writes out a top-level index.html file.
  • dryrun mode still creates the directory for the date of the run if it does not exist.
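
To make the duplicate-revision problem above concrete, here is a minimal sketch in Python. It is a simplified model, not the actual code in operations/dumps: the function names, the `buggy` flag, and the rev id values are all made up for illustration. It assumes each run dumps the revisions after the last recorded max rev id, up to the current max in the db.

```python
def revision_range(prev_max_rev_id, current_max_rev_id):
    """Rev ids in the interval (prev_max_rev_id, current_max_rev_id]."""
    return list(range(prev_max_rev_id + 1, current_max_rev_id + 1))


def run_day(stored_max, db_max_today, buggy):
    """Simulate one daily run (hypothetical model of the behavior described).

    stored_max:   max rev id found for the previous day's run
    db_max_today: max rev id in the db at the time of this run
    Returns (revisions dumped, value written to this run's max rev id file).
    """
    dumped = revision_range(stored_max, db_max_today)
    # Buggy behavior: the previous day's value is written to the current
    # run's file, so the next run starts from a rev id that is too old.
    written = stored_max if buggy else db_max_today
    return dumped, written


# Buggy case: the second run repeats everything the first run dumped.
first, written = run_day(100, 110, buggy=True)
second, _ = run_day(written, 120, buggy=True)
duplicates = sorted(set(first) & set(second))  # rev ids 101 through 110

# Fixed case: the second run starts where the first one ended.
first_ok, written_ok = run_day(100, 110, buggy=False)
second_ok, _ = run_day(written_ok, 120, buggy=False)
no_duplicates = sorted(set(first_ok) & set(second_ok))  # empty list
```

In this model the only difference between the two cases is which value ends up in the current run's max rev id file.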

Change 465415 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/dumps@master] fix misc dumps generation when some previous runs are missing

https://gerrit.wikimedia.org/r/465415

Tests look good so far. I want to run one more tomorrow, which will reproduce the circumstances that led to the duplicate revisions in the Oct 3 run; if that pans out then this can be deployed.

Today's test run looks good. Merging and deploying.

Change 465415 merged by ArielGlenn:
[operations/dumps@master] fix misc dumps generation when some previous runs are missing

https://gerrit.wikimedia.org/r/465415

Going to wait for today's run to complete to make sure everything's ok before closing the ticket.

The run is working fine. We still don't backfill missing runs; I'm going to say that this should be a manual process, which can be done with one command from a screen session should we have such a problem again. That way it can be run on an idle host without doubling the length of the regular daily runs, which already take 15 hours to complete.
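
As a rough illustration of the first step of a manual backfill, the sketch below lists which dates in a range have no completed run. The actual backfill command is not shown in this task, so this only covers finding the gaps; the function name and the idea of passing in a set of completed dates are assumptions made for the example.

```python
from datetime import date, timedelta


def missing_run_dates(first, last, completed):
    """Return the dates in [first, last] with no completed run, oldest first.

    completed is a set of datetime.date objects for runs that finished
    (hypothetical input; how these are discovered is not covered here).
    """
    total_days = (last - first).days
    all_dates = [first + timedelta(days=n) for n in range(total_days + 1)]
    return [day for day in all_dates if day not in completed]


# The situation from the task description: Sep 30 and Oct 1 failed.
completed_runs = {date(2018, 9, 29), date(2018, 10, 2), date(2018, 10, 3)}
to_backfill = missing_run_dates(date(2018, 9, 29), date(2018, 10, 3), completed_runs)
# to_backfill == [date(2018, 9, 30), date(2018, 10, 1)]
```

Each date in the result would then be passed to the job by hand, oldest first, on an idle host.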

ArielGlenn renamed this task from "adds-changes dumps don't fill in missing runs properly" to "adds-changes dumps don't handle missing runs properly". Oct 11 2018, 7:48 AM
ArielGlenn closed this task as Resolved.